Commit d096544

fixup! PRR review
1 parent 86824bf commit d096544

File tree

2 files changed, +22 -6 lines changed
  • keps/sig-scheduling
    • 5027-dra-admin-controlled-device-attributes
    • 5055-dra-device-taints-and-tolerations

keps/sig-scheduling/5027-dra-admin-controlled-device-attributes/README.md

+14 -6
@@ -211,12 +211,15 @@ part of this proposal.
 Perhaps `kubectl describe resourceslices` can be extended to include the
 additional information. For now this is out of scope.

-Creating a ResourceSlicePatch is racing with on-going scheduling attempts,
-which is unavoidable. Removing a device from a ResourceSlice has the same
-problem.
-
 ### Risks and Mitigations

+Creating a ResourceSlicePatch is racing with on-going scheduling attempts.
+This is unavoidable. Removing a device from a ResourceSlice has the same
+problem: updates need to reach the scheduler before it can consider them.
+Evaluating a patch on the client-side instead of [having a controller update
+slices](#storing-result-of-patching-in-resourceslice) mitigates this risk by
+shortening the time window where updates must be sent to the scheduler.
+
 From a security perspective, permission to patch device attributes is
 expected to be limited to privileged users who already have the ability to add
 or remove DRA drivers, so there won't be a substantial difference.
@@ -277,9 +280,14 @@ type DevicePatch struct {
 // capacity, the value of the DevicePatch is used. If multiple
 // different DevicePatches match the same device, then the one with
 // the highest priority wins. If priorities are equal, the older
-// patch wins. This ensures that adding a new patch does not
+// patch wins, where "older" is determined based on the creation time.
+// This ensures that adding a new patch does not
 // accidentally change the effect of some existing patch unless
-// that is clearly intended according to the priority.
+// that is clearly intended according to the priority. Updates
+// do not change the creation time, so it could still happen that
+// a more recent change is preferred because it happens to be in
+// an older DevicePatch. Overall it is better to set the
+// priority to different values to avoid such ambiguities.
 //
 // +optional
 Priority *int

keps/sig-scheduling/5055-dra-device-taints-and-tolerations/README.md

+8
@@ -207,6 +207,14 @@ ResourceSlicePatch with a unique name, then remember to remove that
 ResourceSlicePatch again. For beta, support in `kubectl` for common
 operations may be needed.

+Users might be tempted to tolerate taints to get their pods running. They do
+that at their own risk. Depending on the taint, the application then may not
+get the performance it needs (degraded hardware) or may fail at runtime
+(hardware gets turned off). Admission controllers or validating admission
+policies could be deployed to limit which tolerations may be used, but as
+taints are not defined by Kubernetes itself, none of that is part of Kubernetes
+itself.
+
 ## Design Details

 The feature is following the approach and APIs taken for node taints and
