Commit d125b34

Document deploying DRA to OpenShift
* Document the differences on OpenShift
* Include useful setup scripts

Signed-off-by: Vitaliy Emporopulo <[email protected]>
1 parent ac31d61 commit d125b34

4 files changed: +175 -1 lines changed
README.md

+1 -1
@@ -12,7 +12,7 @@ A document and demo of the DRA support for GPUs provided by this repo can be fou

 ## Demo

-This section describes using `kind` to demo the functionality of the NVIDIA GPU DRA Driver.
+This section describes using `kind` to demo the functionality of the NVIDIA GPU DRA Driver. For Red Hat OpenShift, refer to [running the NVIDIA DRA driver on OpenShift](demo/clusters/openshift/README.md).

 First since we'll launch kind with GPU support, ensure that the following prerequisites are met:
 1. `kind` is installed. See the official documentation [here](https://kind.sigs.k8s.io/docs/user/quick-start/#installation).

demo/clusters/openshift/README.md

+147
@@ -0,0 +1,147 @@
# Running the NVIDIA DRA Driver on Red Hat OpenShift

This document explains how deploying the NVIDIA DRA driver on OpenShift differs from deploying it on upstream Kubernetes or its flavors.

## Prerequisites

Install a recent build of OpenShift 4.16 (e.g. 4.16.0-ec.4). You can use the Assisted Installer to install on bare metal, or obtain an IPI installer binary (`openshift-install`) from the [Release Status](https://amd64.ocp.releases.ci.openshift.org/) page. Note that a development version of OpenShift requires access to [an internal CI registry](https://docs.ci.openshift.org/docs/how-tos/use-registries-in-build-farm/) in the pull secret. Refer to the [OpenShift documentation](https://docs.openshift.com/container-platform/4.15/installing/index.html) for the different installation methods.

## Enabling DRA on OpenShift

Enable the `TechPreviewNoUpgrade` feature set as explained in [Enabling features using FeatureGates](https://docs.openshift.com/container-platform/4.15/nodes/clusters/nodes-cluster-enabling-features.html), either during installation or post-install. The feature set includes the `DynamicResourceAllocation` feature gate.

Update the cluster scheduler to enable the DRA scheduling plugin:

```console
$ oc patch --type merge -p '{"spec":{"profile": "HighNodeUtilization", "profileCustomizations": {"dynamicResourceAllocation": "Enabled"}}}' scheduler cluster
```
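
To confirm that both settings took effect, the feature set and the scheduler customization can be read back. The following is a minimal sketch, not part of the commit: `dra_ready` is a hypothetical helper name, and the JSONPath used for the scheduler simply mirrors the fields set by the patch above.

```shell
# Hypothetical helper: succeeds only when both observed values match
# what this section configures.
dra_ready() {
  [ "$1" = "TechPreviewNoUpgrade" ] && [ "$2" = "Enabled" ]
}

# Query the cluster only when `oc` is available on this machine.
if command -v oc >/dev/null 2>&1; then
  feature_set="$(oc get featuregate cluster -o jsonpath='{.spec.featureSet}')"
  dra_setting="$(oc get scheduler cluster -o jsonpath='{.spec.profileCustomizations.dynamicResourceAllocation}')"
  if dra_ready "$feature_set" "$dra_setting"; then
    echo "DRA is enabled"
  else
    echo "DRA is not fully enabled (featureSet='${feature_set}', scheduler='${dra_setting}')"
  fi
fi
```

Keeping the comparison in a small function separates the decision from the cluster queries, so it can be reused in other setup scripts.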

## NVIDIA GPU Drivers

The easiest way to install NVIDIA GPU drivers on OpenShift nodes is via the NVIDIA GPU Operator.

**Be careful to disable the device plugin so it does not conflict with the DRA plugin**:

```yaml
devicePlugin:
  enabled: false
```
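
When the operator is installed from the OperatorHub, this fragment typically sits under `spec` of the operator's `ClusterPolicy`. A minimal sketch of applying it from the command line follows. The resource name `gpu-cluster-policy` is an assumption (`oc get clusterpolicy` lists the actual name), and `disable_device_plugin` is a hypothetical helper name.

```shell
# Assumption: the ClusterPolicy is named "gpu-cluster-policy"; override via CLUSTER_POLICY.
cluster_policy="${CLUSTER_POLICY:-gpu-cluster-policy}"

# Merge patch equivalent to the YAML fragment above, placed under .spec.
device_plugin_patch='{"spec":{"devicePlugin":{"enabled":false}}}'

# Hypothetical helper; call it once you are logged in to the cluster.
disable_device_plugin() {
  oc patch clusterpolicy "$cluster_policy" --type merge -p "$device_plugin_patch"
}
```

Running `disable_device_plugin` applies the patch; defining it as a function keeps the snippet safe to source before the cluster is reachable.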

Keep in mind that the NVIDIA GPU Operator is needed here only to install NVIDIA binaries on the cluster nodes.

The operator might not be available through the OperatorHub in a pre-production version of OpenShift. In this case, deploy the operator from a bundle or add a certified catalog index from an earlier version of OpenShift, e.g.:

```yaml
kind: CatalogSource
apiVersion: operators.coreos.com/v1alpha1
metadata:
  name: certified-operators-v415
  namespace: openshift-marketplace
spec:
  displayName: Certified Operators v4.15
  image: registry.redhat.io/redhat/certified-operator-index:v4.15
  priority: -100
  publisher: Red Hat
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 10m0s
```

Then follow the installation steps in [NVIDIA GPU Operator on Red Hat OpenShift Container Platform](https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/index.html).

## NVIDIA Binaries on RHCOS

The location of some NVIDIA binaries on an OpenShift node differs from the defaults. Make sure to pass the following values when installing the Helm chart:

```yaml
nvidiaDriverRoot: /run/nvidia/driver
nvidiaCtkPath: /var/usrlocal/nvidia/toolkit/nvidia-ctk
```
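
One way to pass these values is through a dedicated values file. The sketch below writes such a file and defines a helper that installs the chart with it. The file name, release name, namespace, and the `CHART` variable (the path or reference to the driver's Helm chart) are all assumptions for illustration; none of them come from this commit.

```shell
# Hypothetical file name for the OpenShift-specific overrides.
values_file="dra-values-openshift.yaml"

cat > "$values_file" <<'EOF'
nvidiaDriverRoot: /run/nvidia/driver
nvidiaCtkPath: /var/usrlocal/nvidia/toolkit/nvidia-ctk
EOF

# Assumptions: release name, namespace, and CHART (location of the Helm chart).
install_dra_driver() {
  helm upgrade --install nvidia-dra-driver "${CHART:?set CHART to the Helm chart location}" \
    --namespace nvidia-dra-driver --create-namespace \
    --values "$values_file"
}
```

Calling `install_dra_driver` with `CHART` set runs the installation; additional overrides can be appended to the same values file.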

## OpenShift Security

OpenShift generally requires more stringent security settings than upstream Kubernetes. If you see a warning about security context constraints when deploying the DRA plugin, pass the following to the Helm chart, either via an in-line variable or a values file:

```yaml
kubeletPlugin:
  containers:
    plugin:
      securityContext:
        privileged: true
        seccompProfile:
          type: Unconfined
```
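
For the in-line form, Helm's `--set` flag addresses nested keys with dots. The sketch below collects the equivalent flags in a variable (the variable name is hypothetical) so they can be appended to whichever `helm install` or `helm upgrade` command you already use.

```shell
# Hypothetical variable holding the in-line equivalents of the values above.
scc_flags="--set kubeletPlugin.containers.plugin.securityContext.privileged=true \
--set kubeletPlugin.containers.plugin.securityContext.seccompProfile.type=Unconfined"

# Print the flags for review; pass them unquoted ($scc_flags) to helm
# so that the shell splits them into separate words.
echo "$scc_flags"
```

A values file is usually easier to review and version, while `--set` is convenient for one-off experiments.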

If you see security context constraint errors or warnings when deploying a sample workload, update the workload's security settings according to the [OpenShift documentation](https://docs.openshift.com/container-platform/4.15/operators/operator_sdk/osdk-complying-with-psa.html). For non-privileged workloads, applying the following `securityContext` at the container level usually works (`runAsNonRoot` and `seccompProfile` may also be set at the pod level, whereas `allowPrivilegeEscalation` and `capabilities` are container-only fields).

```yaml
securityContext:
  runAsNonRoot: true
  seccompProfile:
    type: RuntimeDefault
  allowPrivilegeEscalation: false
  capabilities:
    drop:
    - ALL
```

## Using Multi-Instance GPU (MIG)

Workloads that use the Multi-Instance GPU (MIG) feature require MIG to be enabled on the worker nodes with [MIG-supported GPUs](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#supported-gpus), e.g. A100.

First, make sure to stop any custom pods that might be using the GPU. Disable the DCGM and DCGM Exporter of the NVIDIA GPU Operator by editing the operator's cluster policy (set `enabled: false`).

Enable MIG via the MIG manager of the NVIDIA GPU Operator. **Do not configure MIG devices yourself; the DRA driver creates them automatically on the fly.**

1. In the GPU operator namespace, create a `ConfigMap` with the key `config.yaml` and the following content:

   ```yaml
   version: v1
   mig-configs:
     all-enabled:
       - devices: all
         mig-enabled: true
         mig-devices: {}
     all-disabled:
       - devices: all
         mig-enabled: false
   ```

2. Update the cluster policy to point the MIG manager to the new `ConfigMap`:

   ```yaml
   migManager:
     config:
       name: <configmap_name>
   ```

3. Label the target nodes with `nvidia.com/mig.config=all-enabled`:

   ```console
   $ oc label node <node> nvidia.com/mig.config=all-enabled --overwrite
   ```

MIG will be automatically enabled on the labeled nodes. For additional information, see [MIG Support in OpenShift Container Platform](https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/mig-ocp.html).
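
The three steps above can be sketched as a small script. This is an illustration under stated assumptions rather than a script from this commit: the `ConfigMap` name `dra-mig-config`, the `ClusterPolicy` name `gpu-cluster-policy`, and the `enable_mig` helper are hypothetical, and the namespace is the one used in the `oc exec` example in this document.

```shell
# Assumptions: ConfigMap and ClusterPolicy names; adjust to your cluster.
mig_configmap="${MIG_CONFIGMAP:-dra-mig-config}"
cluster_policy="${CLUSTER_POLICY:-gpu-cluster-policy}"
operator_ns="nvidia-gpu-operator"

# Step 1 input: the MIG manager configuration shown in step 1.
cat > config.yaml <<'EOF'
version: v1
mig-configs:
  all-enabled:
    - devices: all
      mig-enabled: true
      mig-devices: {}
  all-disabled:
    - devices: all
      mig-enabled: false
EOF

# Hypothetical helper running steps 1-3 for the nodes given as arguments.
enable_mig() {
  oc create configmap "$mig_configmap" --from-file=config.yaml -n "$operator_ns" &&
  oc patch clusterpolicy "$cluster_policy" --type merge \
    -p "{\"spec\":{\"migManager\":{\"config\":{\"name\":\"$mig_configmap\"}}}}" &&
  for node in "$@"; do
    oc label node "$node" nvidia.com/mig.config=all-enabled --overwrite || return 1
  done
}
```

For example, `enable_mig worker-0 worker-1` (hypothetical node names) would create the `ConfigMap`, patch the cluster policy, and label both nodes.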

You can verify the MIG status using the `nvidia-smi` command from a GPU driver pod:

```console
$ oc exec -ti nvidia-driver-daemonset-<suffix> -n nvidia-gpu-operator -- nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14      CUDA Version: N/A      |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100 80GB PCIe          On  |   00000000:17:00.0 Off |                   On |
| N/A   35C    P0             45W /  300W |       0MiB /  81920MiB |     N/A      Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+
```

If the MIG status is marked with an asterisk (i.e. `Enabled*`), it means that the setting could not be fully applied and you may need to reboot the node.
This can happen on some cloud service providers (CSP) where the CSP blocks GPU reset for the GPUs passed into a VM.
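
For a scripted check, `nvidia-smi` can report the current and pending MIG modes as CSV through the `mig.mode.current` and `mig.mode.pending` query fields. The sketch below assumes that a mismatch between the two corresponds to the asterisk state described above; `mig_needs_reset` is a hypothetical helper name.

```shell
# Hypothetical helper: takes one CSV line "current, pending" and succeeds
# when the two modes differ.
mig_needs_reset() {
  current="${1%%,*}"        # text before the first comma
  pending="${1#*,}"         # text after the first comma
  pending="${pending# }"    # drop the space that follows the comma
  [ "$current" != "$pending" ]
}

# Example with literal input. On a cluster, feed it each line printed by
#   nvidia-smi --query-gpu=mig.mode.current,mig.mode.pending --format=csv,noheader
# run inside the GPU driver pod.
if mig_needs_reset "Enabled, Enabled"; then
  echo "MIG mode change pending: reset the GPU or reboot the node"
else
  echo "MIG mode fully applied"
fi
```

With the literal input shown, both modes match and the script reports that the MIG mode is fully applied.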

See the [NVIDIA Multi-Instance GPU User Guide](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html) for more information about MIG.
@@ -0,0 +1,21 @@
#!/usr/bin/env bash

set -ex
set -o pipefail

oc create -f - <<EOF
kind: CatalogSource
apiVersion: operators.coreos.com/v1alpha1
metadata:
  name: certified-operators-v415
  namespace: openshift-marketplace
spec:
  displayName: Certified Operators v4.15
  image: registry.redhat.io/redhat/certified-operator-index:v4.15
  priority: -100
  publisher: Red Hat
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 10m0s
EOF
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

set -ex
set -o pipefail

oc patch --type merge -p '{"spec":{"profile": "HighNodeUtilization", "profileCustomizations": {"dynamicResourceAllocation": "Enabled"}}}' scheduler cluster
