Add freeze API #68

Open. Wants to merge 2 commits into base: main

82 changes: 82 additions & 0 deletions DOC.md
@@ -94,6 +94,13 @@ ApplicationDisruptionBudgetSpec defines the desired state of ApplicationDisrupti
A NodeDisruption is allowed if at most "maxDisruptions" nodes selected by selectors are unavailable after the disruption.<br/>
</td>
<td>true</td>
</tr><tr>
<td><b><a href="#applicationdisruptionbudgetspecfreeze">freeze</a></b></td>
<td>object</td>
<td>
Defines the freeze status of the budget. A frozen budget rejects all disruptions, ignoring any other constraints.<br/>
</td>
<td>false</td>
</tr><tr>
<td><b><a href="#applicationdisruptionbudgetspechealthhook">healthHook</a></b></td>
<td>object</td>
@@ -121,6 +128,40 @@ Maintenance will proceed only if the endpoint responds 2XX.<br/>
</table>


### ApplicationDisruptionBudget.spec.freeze
<sup><sup>[↩ Parent](#applicationdisruptionbudgetspec)</sup></sup>



Defines the freeze status of the budget. A frozen budget rejects all disruptions, ignoring any other constraints.

<table>
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th>Description</th>
<th>Required</th>
</tr>
</thead>
<tbody><tr>
<td><b>enabled</b></td>
<td>boolean</td>
<td>
Freeze the budget to prevent any disruptions<br/>
</td>
<td>false</td>
</tr><tr>
<td><b>reason</b></td>
<td>string</td>
<td>
Reason for the freeze<br/>
</td>
<td>false</td>
</tr></tbody>
</table>


### ApplicationDisruptionBudget.spec.healthHook
<sup><sup>[↩ Parent](#applicationdisruptionbudgetspec)</sup></sup>

@@ -492,6 +533,13 @@ NodeDisruptionBudgetSpec defines the desired state of NodeDisruptionBudget
A NodeDisruption is allowed if at most "minUndisruptedNodes" nodes selected by selectors are unavailable after the disruption.<br/>
</td>
<td>true</td>
</tr><tr>
<td><b><a href="#nodedisruptionbudgetspecfreeze">freeze</a></b></td>
<td>object</td>
<td>
Defines the freeze status of the budget. A frozen budget rejects all disruptions, ignoring any other constraints.<br/>
</td>
<td>false</td>
</tr><tr>
<td><b><a href="#nodedisruptionbudgetspecnodeselector">nodeSelector</a></b></td>
<td>object</td>
@@ -503,6 +551,40 @@ NodeDisruptionBudgetSpec defines the desired state of NodeDisruptionBudget
</table>


### NodeDisruptionBudget.spec.freeze
<sup><sup>[↩ Parent](#nodedisruptionbudgetspec)</sup></sup>



Defines the freeze status of the budget. A frozen budget rejects all disruptions, ignoring any other constraints.

<table>
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th>Description</th>
<th>Required</th>
</tr>
</thead>
<tbody><tr>
<td><b>enabled</b></td>
<td>boolean</td>
<td>
Freeze the budget to prevent any disruptions<br/>
</td>
<td>false</td>
</tr><tr>
<td><b>reason</b></td>
<td>string</td>
<td>
Reason for the freeze<br/>
</td>
<td>false</td>
</tr></tbody>
</table>


### NodeDisruptionBudget.spec.nodeSelector
<sup><sup>[↩ Parent](#nodedisruptionbudgetspec)</sup></sup>

2 changes: 1 addition & 1 deletion Makefile
@@ -75,7 +75,7 @@ test: manifests generate fmt vet envtest lint ## Run tests.
##@ Build

.PHONY: build
-build: manifests generate fmt vet ## Build manager binary.
+build: manifests generate fmt vet gen-doc ## Build manager binary.
 	CGO_ENABLED=0 go build -o bin/manager cmd/main.go

.PHONY: run
6 changes: 5 additions & 1 deletion README.md
@@ -162,7 +162,6 @@ In some cases, an application can be unhealthy even if all its pods are running.

You can select Pods and/or PVCs.


##### PVC selector

The main reason for using a PVC selector is to ensure that nodes that contain data don't enter maintenance
@@ -182,6 +181,11 @@ The hook will be called with a POST method containing the JSON encoded NodeDisru

Note: It is not a replacement for readiness probes but a complement.

##### Freeze

Budgets support freezing disruptions. When `spec.Freeze.Enabled` is set, the budget rejects all disruptions and reports the reason specified in `spec.Freeze.Reason`.
This is equivalent to setting the maximum number of disruptions to 0, but it provides clearer messages.
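
As a rough illustration of the behaviour described above, the sketch below decodes a budget spec (in JSON form) and reports whether it is frozen. `FreezeSpec` mirrors the type added in this PR; `BudgetSpec`, `parseSpec`, and the sample reason are hypothetical stand-ins for illustration, not part of the operator.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FreezeSpec mirrors the FreezeSpec type added in this PR.
type FreezeSpec struct {
	Enabled bool   `json:"enabled,omitempty"`
	Reason  string `json:"reason,omitempty"`
}

// BudgetSpec is a hypothetical, trimmed-down stand-in for a budget spec.
type BudgetSpec struct {
	MaxDisruptions int        `json:"maxDisruptions"`
	Freeze         FreezeSpec `json:"freeze,omitempty"`
}

// parseSpec decodes the JSON form of a budget spec.
func parseSpec(raw string) (BudgetSpec, error) {
	var spec BudgetSpec
	err := json.Unmarshal([]byte(raw), &spec)
	return spec, err
}

func main() {
	// Sample input; the reason text is made up for the example.
	raw := `{"maxDisruptions": 1, "freeze": {"enabled": true, "reason": "datacenter migration"}}`
	spec, err := parseSpec(raw)
	if err != nil {
		panic(err)
	}
	if spec.Freeze.Enabled {
		fmt.Printf("frozen: %s\n", spec.Freeze.Reason)
	} else {
		fmt.Println("not frozen")
	}
}
```

A spec that omits `freeze` entirely decodes to the zero value, which this sketch treats as "not frozen".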

#### Sample object

```yaml
```
10 changes: 10 additions & 0 deletions api/v1alpha1/applicationdisruptionbudget_types.go
@@ -43,6 +43,16 @@ type ApplicationDisruptionBudgetSpec struct {
	// Maintenance will proceed only if the endpoint responds 2XX.
	// +kubebuilder:validation:Optional
	HealthHook HealthHookSpec `json:"healthHook,omitempty"`
	// Defines the freeze status of the budget. A frozen budget rejects all disruptions, ignoring any other constraints.
	Freeze FreezeSpec `json:"freeze,omitempty"`
}

// FreezeSpec defines the freeze status of the budget
type FreezeSpec struct {
	// Freeze the budget to prevent any disruptions
	Enabled bool `json:"enabled,omitempty"`
	// Reason for the freeze
	Reason string `json:"reason,omitempty"`
}
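
One detail worth noting about these struct tags: with Go's standard `encoding/json`, `omitempty` has no effect on a struct-valued field such as `Freeze`, so an unfrozen budget still serializes an empty `freeze` object. The sketch below illustrates this with a hypothetical `miniSpec` wrapper and `encode` helper (neither is part of the PR); how the API server stores the object is outside its scope.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FreezeSpec mirrors the FreezeSpec type added in this PR.
type FreezeSpec struct {
	Enabled bool   `json:"enabled,omitempty"`
	Reason  string `json:"reason,omitempty"`
}

// miniSpec is a hypothetical wrapper that keeps only the freeze field.
type miniSpec struct {
	Freeze FreezeSpec `json:"freeze,omitempty"`
}

// encode returns the JSON form of a miniSpec.
func encode(s miniSpec) string {
	out, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	// Zero value: the inner fields are omitted, the outer object is not.
	fmt.Println(encode(miniSpec{}))
	// Frozen budget with a made-up reason.
	fmt.Println(encode(miniSpec{Freeze: FreezeSpec{Enabled: true, Reason: "audit"}}))
}
```

If omitting the whole object mattered, a pointer field (`*FreezeSpec`) would be one option, though that is a design choice the PR does not make.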

type HealthHookSpec struct {
2 changes: 2 additions & 0 deletions api/v1alpha1/nodedisruptionbudget_types.go
@@ -37,6 +37,8 @@ type NodeDisruptionBudgetSpec struct {
	MinUndisruptedNodes int `json:"minUndisruptedNodes"`
	// NodeSelector query over pods whose nodes are managed by the disruption budget.
	NodeSelector metav1.LabelSelector `json:"nodeSelector,omitempty"`
	// Defines the freeze status of the budget. A frozen budget rejects all disruptions, ignoring any other constraints.
	Freeze FreezeSpec `json:"freeze,omitempty"`
}

//+kubebuilder:object:root=true
17 changes: 17 additions & 0 deletions api/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default.

11 changes: 11 additions & 0 deletions chart/templates/applicationdisruptionbudget-crd.yaml
@@ -54,6 +54,17 @@ spec:
description: ApplicationDisruptionBudgetSpec defines the desired state of
  ApplicationDisruptionBudget
properties:
  freeze:
    description: Defines the freeze status of the budget. A frozen budget rejects
      all disruptions, ignoring any other constraints.
    properties:
      enabled:
        description: Freeze the budget to prevent any disruptions
        type: boolean
      reason:
        description: Reason for the freeze
        type: string
    type: object
  healthHook:
    description: |-
      Define an optional hook to call when validating a NodeDisruption.
11 changes: 11 additions & 0 deletions chart/templates/nodedisruptionbudget-crd.yaml
@@ -56,6 +56,17 @@ spec:
spec:
  description: NodeDisruptionBudgetSpec defines the desired state of NodeDisruptionBudget
  properties:
    freeze:
      description: Defines the freeze status of the budget. A frozen budget rejects
        all disruptions, ignoring any other constraints.
      properties:
        enabled:
          description: Freeze the budget to prevent any disruptions
          type: boolean
        reason:
          description: Reason for the freeze
          type: string
      type: object
    maxDisruptedNodes:
      description: A NodeDisruption is allowed if at most "maxDisruptedNodes"
        nodes selected by selectors are unavailable after the disruption.
@@ -53,6 +53,17 @@ spec:
description: ApplicationDisruptionBudgetSpec defines the desired state
  of ApplicationDisruptionBudget
properties:
  freeze:
    description: Defines the freeze status of the budget. A frozen budget
      rejects all disruptions, ignoring any other constraints.
    properties:
      enabled:
        description: Freeze the budget to prevent any disruptions
        type: boolean
      reason:
        description: Reason for the freeze
        type: string
    type: object
  healthHook:
    description: |-
      Define an optional hook to call when validating a NodeDisruption.
@@ -55,6 +55,17 @@ spec:
spec:
  description: NodeDisruptionBudgetSpec defines the desired state of NodeDisruptionBudget
  properties:
    freeze:
      description: Defines the freeze status of the budget. A frozen budget
        rejects all disruptions, ignoring any other constraints.
      properties:
        enabled:
          description: Freeze the budget to prevent any disruptions
          type: boolean
        reason:
          description: Reason for the freeze
          type: string
      type: object
    maxDisruptedNodes:
      description: A NodeDisruption is allowed if at most "maxDisruptedNodes"
        nodes selected by selectors are unavailable after the disruption.
59 changes: 51 additions & 8 deletions internal/controller/applicationdisruptionbudget_controller.go
@@ -115,6 +115,12 @@ func PruneADBMetrics(ref nodedisruptionv1alpha1.NamespacedName) {
// UpdateADBMetrics updates metrics for an ADB
func UpdateADBMetrics(ref nodedisruptionv1alpha1.NamespacedName, adb *nodedisruptionv1alpha1.ApplicationDisruptionBudget) {
	DisruptionBudgetMaxDisruptions.WithLabelValues(ref.Namespace, ref.Name, ref.Kind).Set(float64(adb.Spec.MaxDisruptions))
	if adb.Spec.Freeze.Enabled {
		DisruptionBudgetFrozen.WithLabelValues(ref.Namespace, ref.Name, ref.Kind).Set(1)
	} else {
		DisruptionBudgetFrozen.WithLabelValues(ref.Namespace, ref.Name, ref.Kind).Set(0)
	}

	UpdateBudgetStatusMetrics(ref, adb.Status)
}

@@ -199,8 +205,29 @@ func (r *ApplicationDisruptionBudgetResolver) IsImpacted(disruptedNodes resolver
}

 // Return the number of disruption allowed considering a list of current node disruptions
-func (r *ApplicationDisruptionBudgetResolver) TolerateDisruption(_ resolver.NodeSet) bool {
-	return r.ApplicationDisruptionBudget.Status.DisruptionsAllowed-1 >= 0
+func (r *ApplicationDisruptionBudgetResolver) TryValidateDisruptionFromBudgetConstraints(_ resolver.NodeSet) nodedisruptionv1alpha1.DisruptedBudgetStatus {
+	if r.ApplicationDisruptionBudget.Spec.Freeze.Enabled {
+		return nodedisruptionv1alpha1.DisruptedBudgetStatus{
+			Reference: r.GetNamespacedName(),
+			Reason:    fmt.Sprintf("Budget frozen: %s", r.ApplicationDisruptionBudget.Spec.Freeze.Reason),
+			Ok:        false,
+		}
+	}
+
+	if r.ApplicationDisruptionBudget.Status.DisruptionsAllowed-1 >= 0 {
+		return nodedisruptionv1alpha1.DisruptedBudgetStatus{
+			Reference: r.GetNamespacedName(),
+			Reason:    "",
+			Ok:        true,
+		}
+	} else {
+		return nodedisruptionv1alpha1.DisruptedBudgetStatus{
+			Reference: r.GetNamespacedName(),
+			Reason: fmt.Sprintf("Number of allowed disruptions exceeded (Remaining allowed disruptions: %d, current disruptions: %d)",
+				r.ApplicationDisruptionBudget.Status.DisruptionsAllowed, r.ApplicationDisruptionBudget.Status.CurrentDisruptions),
+			Ok: false,
+		}
+	}
 }
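
The ordering of the checks in the new function can be sketched in isolation: the freeze flag is consulted first and overrides the remaining-disruptions count. The types `budget` and `verdict` and the function `validate` below are simplified, hypothetical stand-ins (the real code returns a `DisruptedBudgetStatus` with a `Reference`), so treat this as an illustration rather than the controller's implementation.

```go
package main

import "fmt"

// budget is a hypothetical flattened view of the fields the check reads.
type budget struct {
	FreezeEnabled      bool
	FreezeReason       string
	DisruptionsAllowed int
	CurrentDisruptions int
}

// verdict is a simplified stand-in for DisruptedBudgetStatus (no Reference).
type verdict struct {
	Reason string
	Ok     bool
}

// validate applies the freeze check before the disruption count check.
func validate(b budget) verdict {
	if b.FreezeEnabled {
		return verdict{Reason: fmt.Sprintf("Budget frozen: %s", b.FreezeReason)}
	}
	if b.DisruptionsAllowed-1 >= 0 {
		return verdict{Ok: true}
	}
	return verdict{Reason: fmt.Sprintf(
		"Number of allowed disruptions exceeded (Remaining allowed disruptions: %d, current disruptions: %d)",
		b.DisruptionsAllowed, b.CurrentDisruptions)}
}

func main() {
	cases := []budget{
		{FreezeEnabled: true, FreezeReason: "release freeze", DisruptionsAllowed: 3},
		{DisruptionsAllowed: 1},
		{DisruptionsAllowed: 0, CurrentDisruptions: 2},
	}
	for _, c := range cases {
		v := validate(c)
		fmt.Printf("ok=%t reason=%q\n", v.Ok, v.Reason)
	}
}
```

Note that a frozen budget is rejected even when disruptions remain, which is what distinguishes freezing from an exhausted budget in the reported reason.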

func (r *ApplicationDisruptionBudgetResolver) UpdateStatus(ctx context.Context) error {
@@ -215,8 +242,24 @@ func (r *ApplicationDisruptionBudgetResolver) GetNamespacedName() nodedisruption
}
}

+func (r *ApplicationDisruptionBudgetResolver) TryValidateDisruptionFromHealthHook(ctx context.Context, nd nodedisruptionv1alpha1.NodeDisruption) nodedisruptionv1alpha1.DisruptedBudgetStatus {
+	err := r.callHealthHook(ctx, nd)
+	if err != nil {
+		return nodedisruptionv1alpha1.DisruptedBudgetStatus{
+			Reference: r.GetNamespacedName(),
+			Reason:    fmt.Sprintf("Failed to validate with healthHook: %s", err),
+			Ok:        false,
+		}
+	}
+	return nodedisruptionv1alpha1.DisruptedBudgetStatus{
+		Reference: r.GetNamespacedName(),
+		Reason:    "",
+		Ok:        true,
+	}
+}
+
 // Call a lifecycle hook in order to synchronously validate a Node Disruption
-func (r *ApplicationDisruptionBudgetResolver) CallHealthHook(ctx context.Context, nd nodedisruptionv1alpha1.NodeDisruption) error {
+func (r *ApplicationDisruptionBudgetResolver) callHealthHook(ctx context.Context, nd nodedisruptionv1alpha1.NodeDisruption) error {
 	if r.ApplicationDisruptionBudget.Spec.HealthHook.URL == "" {
 		return nil
 	}
@@ -226,15 +269,15 @@ func (r *ApplicationDisruptionBudgetResolver) CallHealthHook(ctx context.Context

 	data, err := json.Marshal(nd)
 	if err != nil {
-		return err
+		return fmt.Errorf("controller error: Failed to serialize node disruption: %w", err)
 	}

 	namespacedName := r.GetNamespacedName()

 	req, err := http.NewRequestWithContext(ctx, http.MethodPost, r.ApplicationDisruptionBudget.Spec.HealthHook.URL, bytes.NewReader(data))
 	if err != nil {
 		DisruptionBudgetCheckHealthHookErrorTotal.WithLabelValues(namespacedName.Namespace, namespacedName.Name, namespacedName.Kind).Inc()
-		return err
+		return fmt.Errorf("controller error: Error while building request: %w", err)
 	}

 	headers["Content-Type"] = []string{"application/json"}
@@ -244,21 +287,21 @@ func (r *ApplicationDisruptionBudgetResolver) CallHealthHook(ctx context.Context
 	resp, err := client.Do(req)
 	if err != nil {
 		DisruptionBudgetCheckHealthHookErrorTotal.WithLabelValues(namespacedName.Namespace, namespacedName.Name, namespacedName.Kind).Inc()
-		return err
+		return fmt.Errorf("error while performing request on healthHook: %w", err)
 	}

 	body, err := io.ReadAll(resp.Body)
 	if err != nil {
 		DisruptionBudgetCheckHealthHookErrorTotal.WithLabelValues(namespacedName.Namespace, namespacedName.Name, namespacedName.Kind).Inc()
-		return err
+		return fmt.Errorf("error while reading response from healthHook: %w", err)
 	}

 	DisruptionBudgetCheckHealthHookStatusCodeTotal.WithLabelValues(namespacedName.Namespace, namespacedName.Name, namespacedName.Kind, strconv.Itoa(resp.StatusCode)).Inc()

 	if resp.StatusCode >= 200 && resp.StatusCode < 300 {
 		return nil
 	}
-	return fmt.Errorf("http server responded with non 2XX status code: %s", string(body))
+	return fmt.Errorf("HealthHook responded with non 2XX status code: %s", string(body))
 }
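
The hook flow above (POST the disruption as JSON, accept only a 2XX answer, surface the response body otherwise) can be exercised without a network by stubbing the HTTP transport. The sketch below is an assumption-laden illustration: `stubTransport`, `callHook`, `hookResult`, the payload, and the URL are all hypothetical names introduced here, and metrics are left out.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// stubTransport is a hypothetical http.RoundTripper that answers every
// request with a fixed status code and body, so no network is needed.
type stubTransport struct {
	status int
	body   string
}

func (s stubTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	return &http.Response{
		StatusCode: s.status,
		Header:     make(http.Header),
		Body:       io.NopCloser(strings.NewReader(s.body)),
		Request:    req,
	}, nil
}

// callHook POSTs payload as JSON and succeeds only on a 2XX response.
func callHook(ctx context.Context, client *http.Client, url string, payload interface{}) error {
	if url == "" {
		return nil // no hook configured, nothing to validate
	}
	data, err := json.Marshal(payload)
	if err != nil {
		return fmt.Errorf("failed to serialize payload: %w", err)
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(data))
	if err != nil {
		return fmt.Errorf("error while building request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("error while performing request: %w", err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return fmt.Errorf("error while reading response: %w", err)
	}
	if resp.StatusCode >= 200 && resp.StatusCode < 300 {
		return nil
	}
	return fmt.Errorf("hook responded with non 2XX status code: %s", string(body))
}

// hookResult runs callHook against a stubbed endpoint.
func hookResult(status int, body string) error {
	client := &http.Client{Transport: stubTransport{status: status, body: body}}
	payload := map[string]string{"kind": "NodeDisruption"} // made-up payload
	return callHook(context.Background(), client, "http://hook.invalid/health", payload)
}

func main() {
	fmt.Println(hookResult(204, ""))
	fmt.Println(hookResult(503, "rebalancing in progress"))
}
```

Wrapping each failure with `%w`, as the PR does, keeps the underlying error available to `errors.Is` and `errors.As` while giving the rejection reason some context.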

func (r *ApplicationDisruptionBudgetResolver) GetSelectedNodes(ctx context.Context) (resolver.NodeSet, error) {