Commit 77252e3

sagor999, spohner, and adastleyatvi authored
Add node affinity support (zalando#1166)
* Adding nodeaffinity support alongside node_readiness_label
* add documentation for node affinity
* add node affinity e2e test
* add unit test for node affinity

Co-authored-by: Steffen Pøhner Henriksen <[email protected]>
Co-authored-by: Adrian Astley <[email protected]>
1 parent f28706e commit 77252e3

10 files changed, with 505 additions and 17 deletions

charts/postgres-operator/crds/postgresqls.yaml

Lines changed: 91 additions & 0 deletions
@@ -396,6 +396,97 @@ spec:
                     type: string
                   caSecretName:
                     type: string
+              nodeAffinity:
+                type: object
+                properties:
+                  preferredDuringSchedulingIgnoredDuringExecution:
+                    type: array
+                    items:
+                      type: object
+                      required:
+                        - weight
+                        - preference
+                      properties:
+                        preference:
+                          type: object
+                          properties:
+                            matchExpressions:
+                              type: array
+                              items:
+                                type: object
+                                required:
+                                  - key
+                                  - operator
+                                properties:
+                                  key:
+                                    type: string
+                                  operator:
+                                    type: string
+                                  values:
+                                    type: array
+                                    items:
+                                      type: string
+                            matchFields:
+                              type: array
+                              items:
+                                type: object
+                                required:
+                                  - key
+                                  - operator
+                                properties:
+                                  key:
+                                    type: string
+                                  operator:
+                                    type: string
+                                  values:
+                                    type: array
+                                    items:
+                                      type: string
+                        weight:
+                          format: int32
+                          type: integer
+                  requiredDuringSchedulingIgnoredDuringExecution:
+                    type: object
+                    required:
+                      - nodeSelectorTerms
+                    properties:
+                      nodeSelectorTerms:
+                        type: array
+                        items:
+                          type: object
+                          properties:
+                            matchExpressions:
+                              type: array
+                              items:
+                                type: object
+                                required:
+                                  - key
+                                  - operator
+                                properties:
+                                  key:
+                                    type: string
+                                  operator:
+                                    type: string
+                                  values:
+                                    type: array
+                                    items:
+                                      type: string
+                            matchFields:
+                              type: array
+                              items:
+                                type: object
+                                required:
+                                  - key
+                                  - operator
+                                properties:
+                                  key:
+                                    type: string
+                                  operator:
+                                    type: string
+                                  values:
+                                    type: array
+                                    items:
+                                      type: string
               tolerations:
                 type: array
                 items:
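
The `nodeAffinity` schema in this hunk marks `weight` and `preference` as required for each preferred entry, `nodeSelectorTerms` as required under `requiredDuringSchedulingIgnoredDuringExecution`, and `key` and `operator` as required for each selector requirement. The sketch below mirrors only those required-field rules in plain Python. The function names are hypothetical and this is not the operator's or the API server's validation code.

```python
from typing import Any, Dict, List


def _check_requirements(term: Dict[str, Any], path: str) -> List[str]:
    """Report selector requirements that lack `key` or `operator`."""
    problems = []
    for field in ("matchExpressions", "matchFields"):
        for i, expr in enumerate(term.get(field, [])):
            for required in ("key", "operator"):
                if required not in expr:
                    problems.append(f"{path}.{field}[{i}]: missing {required}")
    return problems


def missing_required_fields(node_affinity: Dict[str, Any]) -> List[str]:
    """Return one message per required field missing from a nodeAffinity dict."""
    problems = []

    preferred_key = "preferredDuringSchedulingIgnoredDuringExecution"
    for i, item in enumerate(node_affinity.get(preferred_key, [])):
        path = f"{preferred_key}[{i}]"
        for required in ("weight", "preference"):
            if required not in item:
                problems.append(f"{path}: missing {required}")
        problems += _check_requirements(item.get("preference", {}), f"{path}.preference")

    required_key = "requiredDuringSchedulingIgnoredDuringExecution"
    block = node_affinity.get(required_key)
    if block is not None:
        if "nodeSelectorTerms" not in block:
            problems.append(f"{required_key}: missing nodeSelectorTerms")
        for i, term in enumerate(block.get("nodeSelectorTerms", [])):
            problems += _check_requirements(term, f"{required_key}.nodeSelectorTerms[{i}]")

    return problems
```

An empty list means that no required field is missing; type checks such as `weight` being an integer are deliberately left out of this sketch.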

docs/user.md

Lines changed: 23 additions & 1 deletion
@@ -517,7 +517,7 @@ manifest the operator will raise the limits to the configured minimum values.
 If no resources are defined in the manifest they will be obtained from the
 configured [default requests](reference/operator_parameters.md#kubernetes-resource-requests).
 
-## Use taints and tolerations for dedicated PostgreSQL nodes
+## Use taints, tolerations and node affinity for dedicated PostgreSQL nodes
 
 To ensure Postgres pods are running on nodes without any other application pods,
 you can use [taints and tolerations](https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/)
@@ -531,6 +531,28 @@ spec:
     effect: NoSchedule
 ```
 
+If you need the pods to be scheduled on specific nodes you may use [node affinity](https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/)
+to specify a set of label(s), of which a prospective host node must have at least one. This could be used to
+place pods on nodes with certain hardware capabilities (e.g. SSD drives), or in certain environments or network segments,
+e.g. for PCI compliance.
+
+```yaml
+apiVersion: "acid.zalan.do/v1"
+kind: postgresql
+metadata:
+  name: acid-minimal-cluster
+spec:
+  teamId: "ACID"
+  nodeAffinity:
+    requiredDuringSchedulingIgnoredDuringExecution:
+      nodeSelectorTerms:
+        - matchExpressions:
+            - key: environment
+              operator: In
+              values:
+                - pci
+```
+
 ## How to clone an existing PostgreSQL cluster
 
 You can spin up a new cluster as a clone of the existing one, using a `clone`
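
In Kubernetes, the entries of `nodeSelectorTerms` are ORed, while the `matchExpressions` inside a single term are ANDed. The sketch below illustrates that evaluation for the label operators `In`, `NotIn`, `Exists` and `DoesNotExist`. It is an illustration with hypothetical function names, not scheduler or operator code, and it ignores `matchFields` and the `Gt`/`Lt` operators.

```python
from typing import Dict, List


def expression_matches(labels: Dict[str, str], expr: dict) -> bool:
    """Evaluate one matchExpressions entry against a node's labels."""
    key = expr["key"]
    operator = expr["operator"]
    values = expr.get("values", [])
    if operator == "In":
        return key in labels and labels[key] in values
    if operator == "NotIn":
        # A node without the label also satisfies NotIn.
        return key not in labels or labels[key] not in values
    if operator == "Exists":
        return key in labels
    if operator == "DoesNotExist":
        return key not in labels
    raise ValueError(f"operator not covered by this sketch: {operator}")


def node_matches(labels: Dict[str, str], node_selector_terms: List[dict]) -> bool:
    """True if at least one term matches; all expressions of a term must hold."""
    for term in node_selector_terms:
        expressions = term.get("matchExpressions", [])
        # A term without any expressions selects no nodes.
        if expressions and all(expression_matches(labels, e) for e in expressions):
            return True
    return False
```

With the manifest shown in the documentation hunk, a node labelled `environment=pci` would pass `node_matches`, while a node without that label would not.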

e2e/tests/test_e2e.py

Lines changed: 106 additions & 0 deletions
@@ -929,6 +929,112 @@ def test_zzz_taint_based_eviction(self):
         new_master_node = nm[0]
         self.assert_distributed_pods(new_master_node, new_replica_nodes, cluster_label)
 
+    @timeout_decorator.timeout(TEST_TIMEOUT_SEC)
+    def test_node_affinity(self):
+        '''
+        Add label to a node and update postgres cluster spec to deploy only on a node with that label
+        '''
+        k8s = self.k8s
+        cluster_label = 'application=spilo,cluster-name=acid-minimal-cluster'
+
+        # verify we are in good state from potential previous tests
+        self.eventuallyEqual(lambda: k8s.count_running_pods(), 2, "No 2 pods running")
+        self.eventuallyEqual(lambda: len(k8s.get_patroni_running_members("acid-minimal-cluster-0")), 2, "Postgres status did not enter running")
+        self.eventuallyEqual(lambda: self.k8s.get_operator_state(), {"0": "idle"}, "Operator does not get in sync")
+
+        # get nodes of master and replica(s)
+        master_node, replica_nodes = k8s.get_pg_nodes(cluster_label)
+
+        self.assertNotEqual(master_node, [])
+        self.assertNotEqual(replica_nodes, [])
+
+        # label node with node-affinity-test=postgres
+        node_label_body = {
+            "metadata": {
+                "labels": {
+                    "node-affinity-test": "postgres"
+                }
+            }
+        }
+
+        try:
+            # patch current master node with the label
+            print('patching master node: {}'.format(master_node))
+            k8s.api.core_v1.patch_node(master_node, node_label_body)
+
+            # add node affinity to cluster
+            patch_node_affinity_config = {
+                "spec": {
+                    "nodeAffinity" : {
+                        "requiredDuringSchedulingIgnoredDuringExecution": {
+                            "nodeSelectorTerms": [
+                                {
+                                    "matchExpressions": [
+                                        {
+                                            "key": "node-affinity-test",
+                                            "operator": "In",
+                                            "values": [
+                                                "postgres"
+                                            ]
+                                        }
+                                    ]
+                                }
+                            ]
+                        }
+                    }
+                }
+            }
+
+            k8s.api.custom_objects_api.patch_namespaced_custom_object(
+                group="acid.zalan.do",
+                version="v1",
+                namespace="default",
+                plural="postgresqls",
+                name="acid-minimal-cluster",
+                body=patch_node_affinity_config)
+            self.eventuallyEqual(lambda: self.k8s.get_operator_state(), {"0": "idle"}, "Operator does not get in sync")
+
+            # node affinity change should cause replica to relocate from replica node to master node due to node affinity requirement
+            k8s.wait_for_pod_failover(master_node, 'spilo-role=replica,' + cluster_label)
+            k8s.wait_for_pod_start('spilo-role=replica,' + cluster_label)
+
+            podsList = k8s.api.core_v1.list_namespaced_pod('default', label_selector=cluster_label)
+            for pod in podsList.items:
+                if pod.metadata.labels.get('spilo-role') == 'replica':
+                    self.assertEqual(master_node, pod.spec.node_name,
+                        "Sanity check: expected replica to relocate to master node {}, but found on {}".format(master_node, pod.spec.node_name))
+
+                    # check that pod has correct node affinity
+                    key = pod.spec.affinity.node_affinity.required_during_scheduling_ignored_during_execution.node_selector_terms[0].match_expressions[0].key
+                    value = pod.spec.affinity.node_affinity.required_during_scheduling_ignored_during_execution.node_selector_terms[0].match_expressions[0].values[0]
+                    self.assertEqual("node-affinity-test", key,
+                        "Sanity check: expect node selector key to be equal to 'node-affinity-test' but got {}".format(key))
+                    self.assertEqual("postgres", value,
+                        "Sanity check: expect node selector value to be equal to 'postgres' but got {}".format(value))
+
+            patch_node_remove_affinity_config = {
+                "spec": {
+                    "nodeAffinity" : None
+                }
+            }
+            k8s.api.custom_objects_api.patch_namespaced_custom_object(
+                group="acid.zalan.do",
+                version="v1",
+                namespace="default",
+                plural="postgresqls",
+                name="acid-minimal-cluster",
+                body=patch_node_remove_affinity_config)
+            self.eventuallyEqual(lambda: self.k8s.get_operator_state(), {"0": "idle"}, "Operator does not get in sync")
+
+            # with node affinity removed, the replica should move away from the master node
+            nm, new_replica_nodes = k8s.get_cluster_nodes()
+            new_master_node = nm[0]
+            self.assert_distributed_pods(new_master_node, new_replica_nodes, cluster_label)
+
+        except timeout_decorator.TimeoutError:
+            print('Operator log: {}'.format(k8s.get_operator_log()))
+            raise
+
     @timeout_decorator.timeout(TEST_TIMEOUT_SEC)
     def test_zzzz_cluster_deletion(self):
         '''
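
The test builds two patch bodies by hand: one that sets a single `In` requirement under `spec.nodeAffinity`, and one that sets `nodeAffinity` to `None` to remove it again. As a sketch, both could come from one small helper. The name `build_node_affinity_patch` is hypothetical and not part of the e2e test helpers; the body shapes are copied from the test.

```python
import json
from typing import List, Optional


def build_node_affinity_patch(key: Optional[str] = None,
                              values: Optional[List[str]] = None) -> dict:
    """Build a patch body for the postgresql custom resource.

    With a key, the body requires nodes whose label `key` is in `values`.
    Without a key, the body sets nodeAffinity to None, which serializes to
    JSON null, the same value the test sends to remove the affinity.
    """
    if key is None:
        return {"spec": {"nodeAffinity": None}}
    return {
        "spec": {
            "nodeAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [
                        {
                            "matchExpressions": [
                                {"key": key, "operator": "In", "values": list(values or [])}
                            ]
                        }
                    ]
                }
            }
        }
    }


if __name__ == "__main__":
    # Print both bodies as the JSON that would be sent to the API server.
    print(json.dumps(build_node_affinity_patch("node-affinity-test", ["postgres"]), indent=2))
    print(json.dumps(build_node_affinity_patch()))
```

Either result could be passed as `body=` to `patch_namespaced_custom_object`, in the same way the test passes `patch_node_affinity_config` and `patch_node_remove_affinity_config`.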

manifests/complete-postgres-manifest.yaml

Lines changed: 11 additions & 0 deletions
@@ -172,3 +172,14 @@ spec:
 # When TLS is enabled, also set spiloFSGroup parameter above to the relevant value.
 # if unknown, set it to 103 which is the usual value in the default spilo images.
 # In Openshift, there is no need to set spiloFSGroup/spilo_fsgroup.
+
+# Use node affinity to allow postgres pods to schedule only on nodes that
+# have the label "postgres-operator: enabled" set.
+#  nodeAffinity:
+#    requiredDuringSchedulingIgnoredDuringExecution:
+#      nodeSelectorTerms:
+#        - matchExpressions:
+#            - key: postgres-operator
+#              operator: In
+#              values:
+#                - enabled

manifests/postgresql.crd.yaml

Lines changed: 91 additions & 0 deletions
@@ -392,6 +392,97 @@ spec:
                     type: string
                   caSecretName:
                     type: string
+              nodeAffinity:
+                type: object
+                properties:
+                  preferredDuringSchedulingIgnoredDuringExecution:
+                    type: array
+                    items:
+                      type: object
+                      required:
+                        - weight
+                        - preference
+                      properties:
+                        preference:
+                          type: object
+                          properties:
+                            matchExpressions:
+                              type: array
+                              items:
+                                type: object
+                                required:
+                                  - key
+                                  - operator
+                                properties:
+                                  key:
+                                    type: string
+                                  operator:
+                                    type: string
+                                  values:
+                                    type: array
+                                    items:
+                                      type: string
+                            matchFields:
+                              type: array
+                              items:
+                                type: object
+                                required:
+                                  - key
+                                  - operator
+                                properties:
+                                  key:
+                                    type: string
+                                  operator:
+                                    type: string
+                                  values:
+                                    type: array
+                                    items:
+                                      type: string
+                        weight:
+                          format: int32
+                          type: integer
+                  requiredDuringSchedulingIgnoredDuringExecution:
+                    type: object
+                    required:
+                      - nodeSelectorTerms
+                    properties:
+                      nodeSelectorTerms:
+                        type: array
+                        items:
+                          type: object
+                          properties:
+                            matchExpressions:
+                              type: array
+                              items:
+                                type: object
+                                required:
+                                  - key
+                                  - operator
+                                properties:
+                                  key:
+                                    type: string
+                                  operator:
+                                    type: string
+                                  values:
+                                    type: array
+                                    items:
+                                      type: string
+                            matchFields:
+                              type: array
+                              items:
+                                type: object
+                                required:
+                                  - key
+                                  - operator
+                                properties:
+                                  key:
+                                    type: string
+                                  operator:
+                                    type: string
+                                  values:
+                                    type: array
+                                    items:
+                                      type: string
               tolerations:
                 type: array
                 items:
