
Commit 0c32dfa

committed Dec 22, 2024
Document deploying DRA to OpenShift
* Document the differences on OpenShift
* Include useful setup scripts

Signed-off-by: Vitaliy Emporopulo <[email protected]>
1 parent b1fe289 commit 0c32dfa

4 files changed, +151 -1 lines changed

README.md

+1-1
@@ -12,7 +12,7 @@ A document and demo of the DRA support for GPUs provided by this repo can be fou
1212

1313
## Demo
1414

15-
This section describes using `kind` to demo the functionality of the NVIDIA GPU DRA Driver.
15+
This section describes using `kind` to demo the functionality of the NVIDIA GPU DRA Driver. For Red Hat OpenShift, refer to [running the NVIDIA DRA driver on OpenShift](demo/clusters/openshift/README.md).
1616

1717
First, since we'll launch kind with GPU support, ensure that the following prerequisites are met:
1818
1. `kind` is installed. See the official documentation [here](https://kind.sigs.k8s.io/docs/user/quick-start/#installation).

demo/clusters/openshift/README.md

+138
@@ -0,0 +1,138 @@
1+
# Running the NVIDIA DRA Driver on Red Hat OpenShift
2+
3+
This document explains how deploying the NVIDIA DRA driver on OpenShift differs from deploying it on upstream Kubernetes or its flavors.
4+
5+
## Prerequisites
6+
7+
Install OpenShift 4.16 or later. You can use the Assisted Installer to install on bare metal, or obtain an IPI installer binary (`openshift-install`) from the [OpenShift clients page](https://mirror.openshift.com/pub/openshift-v4/clients/ocp/). Refer to the [OpenShift documentation](https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html/installation_overview/ocp-installation-overview) for the different installation methods.
8+
9+
## Enabling DRA on OpenShift
10+
11+
Enable the `TechPreviewNoUpgrade` feature set as explained in [Enabling features using FeatureGates](https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html/nodes/working-with-clusters#nodes-cluster-enabling-features-about_nodes-cluster-enabling), either during the installation or post-install. The feature set includes the `DynamicResourceAllocation` feature gate.
12+
13+
Update the cluster scheduler to enable the DRA scheduling plugin:
14+
15+
```console
16+
$ oc patch --type merge -p '{"spec":{"profile": "HighNodeUtilization", "profileCustomizations": {"dynamicResourceAllocation": "Enabled"}}}' scheduler cluster
17+
```
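To confirm that the patch took effect, the value can be read back from the `Scheduler` resource. The following is a minimal sketch and not part of the original commit: the helper names `current_dra_setting` and `is_dra_enabled` are hypothetical, and it assumes `oc` is already logged in to the cluster.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read the DRA profile customization back from the
# cluster-scoped Scheduler resource named "cluster" (the object patched above).
current_dra_setting() {
  oc get scheduler cluster \
    -o jsonpath='{.spec.profileCustomizations.dynamicResourceAllocation}'
}

# Hypothetical helper: succeed only when the given value is exactly "Enabled".
is_dra_enabled() {
  [[ "$1" == "Enabled" ]]
}
```

A possible use is `is_dra_enabled "$(current_dra_setting)" && echo "DRA scheduling plugin is enabled"`.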
18+
19+
## NVIDIA GPU Drivers
20+
21+
The easiest way to install NVIDIA GPU drivers on OpenShift nodes is via the NVIDIA GPU Operator with the device plugin disabled. Follow the installation steps in [NVIDIA GPU Operator on Red Hat OpenShift Container Platform](https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/index.html), and **_be careful to disable the device plugin so it does not conflict with the DRA plugin_**:
22+
23+
```yaml
24+
devicePlugin:
25+
  enabled: false
26+
```
27+
28+
## NVIDIA Binaries on RHCOS
29+
30+
The location of some NVIDIA binaries on an OpenShift node differs from the defaults. Make sure to pass the following values when installing the Helm chart:
31+
32+
```yaml
33+
nvidiaDriverRoot: /run/nvidia/driver
34+
nvidiaCtkPath: /var/usrlocal/nvidia/toolkit/nvidia-ctk
35+
```
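One way to keep these settings together is to write them to a values file and pass that file to Helm. This is a sketch under assumptions: the helper name `write_openshift_values` and the default file name `openshift-values.yaml` are hypothetical and do not come from the repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper: write the two RHCOS-specific chart values quoted above
# into a values file. The default file name is an assumption.
write_openshift_values() {
  local out="${1:-openshift-values.yaml}"
  cat > "$out" <<'EOF'
nvidiaDriverRoot: /run/nvidia/driver
nvidiaCtkPath: /var/usrlocal/nvidia/toolkit/nvidia-ctk
EOF
}
```

The resulting file can then be passed with `-f`, for example `helm upgrade --install <release> <chart> -f openshift-values.yaml`, where `<release>` and `<chart>` are placeholders for the release name and chart location you use.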
36+
37+
## OpenShift Security
38+
39+
OpenShift generally requires more stringent security settings than Kubernetes. If you see a warning about security context constraints when deploying the DRA plugin, pass the following to the Helm chart, either via an in-line variable or a values file:
40+
41+
```yaml
42+
kubeletPlugin:
43+
  containers:
44+
    plugin:
45+
      securityContext:
46+
        privileged: true
47+
        seccompProfile:
48+
          type: Unconfined
49+
```
50+
51+
If you see security context constraint errors or warnings when deploying a sample workload, update the workload's security settings according to the
52+
[OpenShift documentation](https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html/operators/developing-operators#osdk-complying-with-psa). Usually, applying the following `securityContext` definition at a pod or container level works for non-privileged workloads:
53+
54+
```yaml
55+
securityContext:
56+
  runAsNonRoot: true
57+
  seccompProfile:
58+
    type: RuntimeDefault
59+
  allowPrivilegeEscalation: false
60+
  capabilities:
61+
    drop:
62+
      - ALL
63+
```
64+
65+
## Using Multi-Instance GPU (MIG)
66+
67+
Workloads that use the Multi-Instance GPU (MIG) feature require MIG to be enabled on worker nodes with [MIG-supported GPUs](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#supported-gpus), such as the A100.
68+
69+
First, make sure to stop any custom pods that might be using the GPU to avoid disruption when the new MIG configuration is applied.
70+
71+
Disable the CUDA validator so that it does not try to find a MIG partition (or a GPU) to run on, because none will be available until dynamically created by the DRA driver.
72+
73+
```yaml
74+
validator:
75+
  <...>
76+
  cuda:
77+
    env:
78+
      - name: WITH_WORKLOAD
79+
        value: 'false'
80+
  <...>
81+
```
82+
83+
Enable MIG via the MIG Manager of the NVIDIA GPU Operator. **Do not configure MIG devices, because the DRA driver creates them automatically on the fly**:
84+
85+
```console
86+
$ oc label node <node> nvidia.com/mig.config=all-enabled --overwrite
87+
```
88+
89+
MIG will be automatically enabled on the labeled nodes, and all existing MIG partitions will be deleted. For additional information, see [MIG Support in OpenShift Container Platform](https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/mig-ocp.html).
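After labeling a node, it can help to check whether the MIG Manager has finished applying the configuration. The sketch below assumes that the MIG Manager publishes its progress in the `nvidia.com/mig.config.state` node label and that a finished run is reported as `success`; verify both against your GPU Operator version. The helper names are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the state the MIG Manager reports for a node.
# Assumption: the state is published in the nvidia.com/mig.config.state
# node label; check this against your GPU Operator version.
mig_config_state() {
  local node="$1"
  oc get node "$node" \
    -o jsonpath='{.metadata.labels.nvidia\.com/mig\.config\.state}'
}

# Hypothetical helper: succeed only when the reported state is "success".
mig_config_applied() {
  [[ "$(mig_config_state "$1")" == "success" ]]
}
```

For example, `mig_config_applied <node> || echo "MIG configuration not applied yet"`.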
90+
91+
**Note:**
92+
The `all-enabled` MIG configuration profile is available out of the box in the NVIDIA GPU Operator starting with v24.3. With an earlier version, you may need to [create a custom profile](https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/mig-ocp.html#creating-and-applying-a-custom-mig-configuration).
93+
94+
Update the cluster policy to include:
95+
96+
```yaml
97+
migManager:
98+
  ...
99+
  env:
100+
    - name: MIG_PARTED_MODE_CHANGE_ONLY
101+
      value: 'true'
102+
  ...
103+
```
104+
105+
Setting `MIG_PARTED_MODE_CHANGE_ONLY=true` will prevent the MIG Manager from interfering with the DRA driver.
106+
107+
You can verify the MIG status using the `nvidia-smi` command from a GPU driver pod:
108+
109+
```console
110+
$ oc exec -ti nvidia-driver-daemonset-<suffix> -n nvidia-gpu-operator -- nvidia-smi
111+
+-----------------------------------------------------------------------------------------+
112+
| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14      CUDA Version: N/A      |
113+
|-----------------------------------------+------------------------+----------------------+
114+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
115+
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
116+
|                                         |                        |               MIG M. |
117+
|=========================================+========================+======================|
118+
|   0  NVIDIA A100 80GB PCIe          On  |   00000000:17:00.0 Off |                   On |
119+
| N/A   35C    P0             45W /  300W |       0MiB /  81920MiB |     N/A      Default |
120+
|                                         |                        |              Enabled |
121+
+-----------------------------------------+------------------------+----------------------+
122+
```
123+
124+
**Note:**
125+
Some cloud service providers (CSPs) block GPU reset for GPUs passed into a VM. In this case, [ensure](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/gpu-operator-mig.html#enabling-mig-during-installation) that the `WITH_REBOOT` environment variable is set to `true`:
126+
127+
```yaml
128+
migManager:
129+
  ...
130+
  env:
131+
    - name: WITH_REBOOT
132+
      value: 'true'
133+
  ...
134+
```
135+
136+
If the MIG settings cannot be fully applied, the MIG status is marked with an asterisk (that is, `Enabled*`) and you need to reboot the nodes manually.
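A pending MIG mode change can also be detected without reading the table by eye. The sketch below uses the `mig.mode.current` and `mig.mode.pending` query fields of `nvidia-smi`. It assumes that the asterisk corresponds to a pending mode that differs from the current one. The helper names are hypothetical.

```shell
#!/usr/bin/env bash

# Query current and pending MIG mode, one CSV line per GPU,
# for example "Enabled, Enabled". Run this where nvidia-smi is available,
# such as inside the GPU driver pod.
query_mig_modes() {
  nvidia-smi --query-gpu=mig.mode.current,mig.mode.pending --format=csv,noheader
}

# Hypothetical helper: read those CSV lines on stdin and succeed if any GPU
# has a pending mode that differs from its current mode.
mig_change_pending() {
  local current pending
  while IFS=',' read -r current pending || [[ -n "$current" ]]; do
    # Trim the space that follows the comma in the CSV output.
    pending="${pending# }"
    [[ "$current" != "$pending" ]] && return 0
  done
  return 1
}
```

For example, piping `oc exec nvidia-driver-daemonset-<suffix> -n nvidia-gpu-operator -- nvidia-smi --query-gpu=mig.mode.current,mig.mode.pending --format=csv,noheader` into `mig_change_pending` returns success when a node still needs a reset or reboot.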
137+
138+
See the [NVIDIA Multi-Instance GPU User Guide](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html) for more information about MIG.
@@ -0,0 +1,6 @@
1+
apiVersion: config.openshift.io/v1
2+
kind: FeatureGate
3+
metadata:
4+
  name: cluster
5+
spec:
6+
  featureSet: TechPreviewNoUpgrade
@@ -0,0 +1,6 @@
1+
#!/usr/bin/env bash
2+
3+
set -ex
4+
set -o pipefail
5+
6+
oc patch --type merge -p '{"spec":{"profile": "HighNodeUtilization", "profileCustomizations": {"dynamicResourceAllocation": "Enabled"}}}' scheduler cluster
