Commit e3b002f

Document deploying DRA to OpenShift
* Document the differences on OpenShift
* Include useful setup scripts
1 parent 1646035 commit e3b002f

5 files changed: +198 −1 lines changed

README.md (+1 −1)

```diff
@@ -12,7 +12,7 @@ A document and demo of the DRA support for GPUs provided by this repo can be fou
 
 ## Demo
 
-This section describes using `kind` to demo the functionality of the NVIDIA GPU DRA Driver.
+This section describes using `kind` to demo the functionality of the NVIDIA GPU DRA Driver. For Red Hat OpenShift, refer to [running the NVIDIA DRA driver on OpenShift](demo/clusters/openshift/README.md).
 
 First since we'll launch kind with GPU support, ensure that the following prerequisites are met:
 1. `kind` is installed. See the official documentation [here](https://kind.sigs.k8s.io/docs/user/quick-start/#installation).
```

demo/clusters/openshift/README.md (new file, +140)

# Running the NVIDIA DRA Driver on Red Hat OpenShift

This document explains how deploying the NVIDIA DRA driver on Red Hat OpenShift differs from deploying it on upstream Kubernetes and its flavors.

## Prerequisites

Install a recent build of OpenShift 4.16 (e.g. 4.16.0-ec.3). You can obtain an IPI installer binary (`openshift-install`) from the [Release Status](https://amd64.ocp.releases.ci.openshift.org/) page, or use the Assisted Installer to install on bare metal. Refer to the [OpenShift documentation](https://docs.openshift.com/container-platform/4.15/installing/index.html) for other installation methods.

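For example, a minimal IPI invocation (the asset directory name here is illustrative; the installer either prompts for platform details or uses an existing `install-config.yaml` in that directory):

```console
$ openshift-install create cluster --dir ocp-dra-cluster
```
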
## Enabling DRA on OpenShift

Enable the `TechPreviewNoUpgrade` feature set as explained in [Enabling features using FeatureGates](https://docs.openshift.com/container-platform/4.15/nodes/clusters/nodes-cluster-enabling-features.html), either during installation or post-install. The feature set includes the `DynamicResourceAllocation` feature gate.

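If the cluster is already installed, the feature set can be enabled with a patch such as the one below (note that `TechPreviewNoUpgrade` is irreversible and blocks cluster upgrades):

```console
$ oc patch featuregate cluster --type merge -p '{"spec": {"featureSet": "TechPreviewNoUpgrade"}}'
```
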
Update the cluster scheduler to enable the DRA scheduling plugin:

```console
$ oc patch --type merge -p '{"spec":{"profile": "HighNodeUtilization", "profileCustomizations": {"dynamicResourceAllocation": "Enabled"}}}' scheduler cluster
```

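You can read the customization back to confirm it was applied (a quick sanity check):

```console
$ oc get scheduler cluster -o jsonpath='{.spec.profileCustomizations}{"\n"}'
```
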
## NVIDIA GPU Drivers

The easiest way to install NVIDIA GPU drivers on OpenShift nodes is via the NVIDIA GPU Operator.

**Be careful to disable the device plugin so that it does not conflict with the DRA plugin.** It is recommended to leave only the NVIDIA GPU driver and driver toolkit enabled, and disable everything else:

```yaml
<...>
  devicePlugin:
    enabled: false
<...>
  driver:
    enabled: true
<...>
  toolkit:
    enabled: true
<...>
```

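If the GPU Operator is already deployed, the device plugin can also be switched off in place; a sketch, assuming the default ClusterPolicy name `gpu-cluster-policy`:

```console
$ oc patch clusterpolicy gpu-cluster-policy --type merge -p '{"spec": {"devicePlugin": {"enabled": false}}}'
```
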
The NVIDIA GPU Operator might not be available through the OperatorHub in a pre-production version of OpenShift. In this case, deploy the operator from a bundle or add a certified catalog index from an earlier version of OpenShift, e.g.:

```yaml
kind: CatalogSource
apiVersion: operators.coreos.com/v1alpha1
metadata:
  name: certified-operators-v415
  namespace: openshift-marketplace
spec:
  displayName: Certified Operators v4.15
  image: registry.redhat.io/redhat/certified-operator-index:v4.15
  priority: -100
  publisher: Red Hat
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 10m0s
```

Then follow the installation steps in [NVIDIA GPU Operator on Red Hat OpenShift Container Platform](https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/index.html).
## NVIDIA Binaries on RHCOS

The location of some NVIDIA binaries on an OpenShift node differs from the defaults. Make sure to pass the following values when installing the Helm chart:

```yaml
nvidiaDriverRoot: /run/nvidia/driver
nvidiaCtkPath: /var/usrlocal/nvidia/toolkit/nvidia-ctk
```

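For example, when installing from a local checkout of this repository (the chart path, release name, and namespace below are illustrative):

```console
$ helm install nvidia-dra-driver ./deployments/helm/k8s-dra-driver \
    --namespace nvidia-dra-driver --create-namespace \
    --set nvidiaDriverRoot=/run/nvidia/driver \
    --set nvidiaCtkPath=/var/usrlocal/nvidia/toolkit/nvidia-ctk
```
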
## OpenShift Security

OpenShift generally requires more stringent security settings than upstream Kubernetes. If you see a warning about security context constraints when deploying the DRA plugin, pass the following to the Helm chart, either via an in-line variable or a values file:

```yaml
kubeletPlugin:
  containers:
    plugin:
      securityContext:
        privileged: true
        seccompProfile:
          type: Unconfined
```

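For instance, if the snippet above is saved as `openshift-values.yaml` (a file name chosen here for illustration), pass it alongside the other values at install time:

```console
$ helm install nvidia-dra-driver ./deployments/helm/k8s-dra-driver \
    --namespace nvidia-dra-driver --create-namespace \
    -f openshift-values.yaml
```
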
If you see security context constraint errors or warnings when deploying a sample workload, update the workload's security settings according to the [OpenShift documentation](https://docs.openshift.com/container-platform/4.15/operators/operator_sdk/osdk-complying-with-psa.html). Usually, applying the following `securityContext` definition at the pod or container level works for non-privileged workloads:

```yaml
securityContext:
  runAsNonRoot: true
  seccompProfile:
    type: RuntimeDefault
  allowPrivilegeEscalation: false
  capabilities:
    drop:
    - ALL
```

If you see the following error when trying to deploy a workload:

```console
Warning FailedScheduling 21m default-scheduler running Reserve plugin "DynamicResources": podschedulingcontexts.resource.k8s.io "gpu-example" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
```

apply the following RBAC configuration (this should be fixed in newer OpenShift builds):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:kube-scheduler:podfinalizers
rules:
- apiGroups:
  - ""
  resources:
  - pods/finalizers
  verbs:
  - update
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:kube-scheduler:podfinalizers:crbinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-scheduler:podfinalizers
subjects:
- kind: User
  name: system:kube-scheduler
```

## Using Multi-Instance GPU (MIG)

Workloads that use the Multi-Instance GPU (MIG) feature require MIG to be [enabled](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#enable-mig-mode) on worker nodes with [MIG-supported GPUs](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#supported-gpus), e.g. A100.

You can enable MIG via the driver daemon set pod running on a GPU node as follows (here, the GPU ID is 0, i.e. `-i 0`):

```console
$ oc exec -ti nvidia-driver-daemonset-416.94.202402160025-0-g45bd -n nvidia-gpu-operator -- nvidia-smi -i 0 -mig 1
Enabled MIG Mode for GPU 00000000:0A:00.0
All done.
```
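
If the GPU was busy, the mode change may stay pending until a GPU reset or node reboot; you can check the current mode afterwards (the pod name is illustrative):

```console
$ oc exec -ti <nvidia-driver-daemonset-pod> -n nvidia-gpu-operator -- \
    nvidia-smi -i 0 --query-gpu=mig.mode.current --format=csv,noheader
Enabled
```
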
Setup script (new file, +21), creating the certified operator catalog source:

```bash
#!/usr/bin/env bash

set -ex
set -o pipefail

oc create -f - <<EOF
kind: CatalogSource
apiVersion: operators.coreos.com/v1alpha1
metadata:
  name: certified-operators-v415
  namespace: openshift-marketplace
spec:
  displayName: Certified Operators v4.15
  image: registry.redhat.io/redhat/certified-operator-index:v4.15
  priority: -100
  publisher: Red Hat
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 10m0s
EOF
```

Setup script (new file, +6), enabling the DRA scheduling plugin in the cluster scheduler:

```bash
#!/usr/bin/env bash

set -ex
set -o pipefail

oc patch --type merge -p '{"spec":{"profile": "HighNodeUtilization", "profileCustomizations": {"dynamicResourceAllocation": "Enabled"}}}' scheduler cluster
```

Setup script (new file, +30), applying the kube-scheduler RBAC workaround:

```bash
#!/usr/bin/env bash

set -ex
set -o pipefail

oc apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:kube-scheduler:podfinalizers
rules:
- apiGroups:
  - ""
  resources:
  - pods/finalizers
  verbs:
  - update
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:kube-scheduler:podfinalizers:crbinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-scheduler:podfinalizers
subjects:
- kind: User
  name: system:kube-scheduler
EOF
```
