Commit dc49b0b

add envFrom (#192)
Signed-off-by: Garrett Goon <[email protected]>
1 parent cfe55f4 commit dc49b0b

4 files changed: 30 additions, 0 deletions

tools/pytorchjob-generator/chart/README.md

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ customize the Jobs generated by the tool.
 | Key | Type | Default | Description |
 |-----|------|---------|-------------|
 | environmentVariables | array | `nil` | List of variables/values to be defined for all the ranks. Values can be literals or references to Kubernetes secrets or configmaps. See [values.yaml](values.yaml) for examples of supported syntaxes. NOTE: The following standard [PyTorch Distributed environment variables](https://pytorch.org/docs/stable/distributed.html#environment-variable-initialization) are set automatically and can be referenced in the commands without being set manually: WORLD_SIZE, RANK, MASTER_ADDR, MASTER_PORT. |
+| envFrom | array | `nil` | List of ConfigMaps or Secrets specifying environment variables. See [values.yaml](values.yaml) for examples of supported syntaxes. NOTE: the environmentVariables field takes precedence over envFrom. mlbatch also performs some automatic checks on the environmentVariables passed by the user, such as checking that the user does not specify NCCL_TOPO_FILE when topologyFileConfigMap is also provided. These checks are *not* performed on any environment variables inherited from envFrom. |
 | sshGitCloneConfig | object | `nil` | Private GitHub clone support. See [values.yaml](values.yaml) for additional instructions. |
 | setupCommands | array | no custom commands are executed | List of custom commands to be run at the beginning of the execution. Use `setupCommands` to clone code, download data, and change directories. |
 | mainProgram | string | `nil` | Name of the PyTorch program to be executed by `torchrun`. Please provide your program name here and NOT in "setupCommands" as this helm template provides the necessary "torchrun" arguments for the parallel execution. WARNING: this program is relative to the current path set by change-of-directory commands in "setupCommands". If no value is provided, then only `setupCommands` are executed and torchrun is elided. |
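
As a minimal sketch of the precedence rule stated in the `envFrom` row, a user values file might combine both fields as shown below. The ConfigMap name `training-defaults` and the variable `LOG_LEVEL` are hypothetical, and the literal `name`/`value` form of `environmentVariables` is an assumption based on the chart's description of literal values; check values.yaml for the supported syntaxes.

```yaml
# Hypothetical fragment of a user values file (all names are illustrative).
envFrom:
  # Assumes a ConfigMap "training-defaults" already exists in the job's
  # namespace and defines LOG_LEVEL among its keys.
  - configMapRef:
      name: training-defaults

environmentVariables:
  # Set explicitly, so this value is expected to win over any LOG_LEVEL
  # imported through envFrom.
  - name: LOG_LEVEL
    value: "debug"
```

This mirrors standard Kubernetes behavior, where a container's explicit `env` entries override keys with the same name imported through `envFrom`.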

tools/pytorchjob-generator/chart/templates/appwrapper.yaml

Lines changed: 8 additions & 0 deletions
@@ -116,6 +116,10 @@ spec:
 {{- include "mlbatch.volumes" . | indent 38 }}
 containers:
 - name: pytorch
+{{- if .Values.envFrom }}
+envFrom:
+{{- toYaml .Values.envFrom | nindent 46 }}
+{{- end }}
 image: {{ required "Please specify a 'containerImage' in the user file" .Values.containerImage }}
 imagePullPolicy: {{ .Values.imagePullPolicy | default "IfNotPresent" }}
 {{- include "mlbatch.securityContext" . | indent 44 }}
@@ -139,6 +143,10 @@ spec:
 {{- include "mlbatch.volumes" . | indent 38 }}
 containers:
 - name: pytorch
+{{- if .Values.envFrom }}
+envFrom:
+{{- toYaml .Values.envFrom | nindent 46 }}
+{{- end }}
 image: {{ required "Please specify a 'containerImage' in the user file" .Values.containerImage }}
 imagePullPolicy: {{ .Values.imagePullPolicy | default "IfNotPresent" }}
 {{- include "mlbatch.securityContext" . | indent 44 }}
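
For illustration, with `envFrom` set to a single `secretRef` entry, each of the two `pytorch` container specs would be expected to render roughly as below. This is a hand-written sketch, not actual `helm template` output: indentation is shown relative to the container entry rather than at the template's real depth, the image is a placeholder, and `my-secrets` is the example name used in values.yaml.

```yaml
containers:
  - name: pytorch
    envFrom:
      - secretRef:
          name: my-secrets
    image: example.com/my-image:latest  # placeholder, not from the chart
    imagePullPolicy: IfNotPresent
```

When `envFrom` is left as `nil` (or is an empty list), the `if` guard evaluates to false and the `envFrom:` key is omitted entirely.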

tools/pytorchjob-generator/chart/values.schema.json

Lines changed: 4 additions & 0 deletions
@@ -42,6 +42,10 @@
 { "type": "null" },
 { "type": "array" }
 ]},
+"envFrom": { "oneOf": [
+{ "type": "null" },
+{ "type": "array" }
+]},
 "sshGitCloneConfig": { "oneOf": [
 { "type": "null" },
 {

tools/pytorchjob-generator/chart/values.yaml

Lines changed: 17 additions & 0 deletions
@@ -101,6 +101,23 @@ environmentVariables:
 # name: configmap-name
 # key: configmap-key
 
+
+# -- (array) List of ConfigMaps or Secrets specifying environment variables. See
+# [values.yaml](values.yaml) for examples of supported syntaxes.
+#
+# NOTE: the environmentVariables field takes precedence over envFrom. mlbatch also performs some
+# automatic checks on the environmentVariables passed by the user, such as checking that the user
+# does not specify NCCL_TOPO_FILE when topologyFileConfigMap is also provided. These checks are
+# *not* performed on any environment variables inherited from envFrom.
+# @section -- Workload Specification
+envFrom:
+# - secretRef:
+#     name: my-secrets
+# - secretRef:
+#     name: my-other-secrets
+# - configMapRef:
+#     name: my-config-map
+
 # Private GitHub clone support.
 #
 # 0) Create a secret and configMap to enable Private GitHub cloning as documented for your organization.
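
The `envFrom` entries in the values.yaml example refer to objects that must already exist in the namespace where the job runs. As a minimal sketch, a ConfigMap that the `configMapRef` entry could point at might look like the following. The name `my-config-map` comes from the example; the key and value are hypothetical.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-config-map
data:
  # Each key/value pair under "data" is exposed as an environment
  # variable in any container that imports this ConfigMap via envFrom.
  NCCL_DEBUG: "INFO"
```

Because variables imported this way bypass the automatic mlbatch checks on `environmentVariables`, it is worth reviewing the keys in such objects by hand.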
