Commit 7dd7691

Update Kubernetes related docs (#2258)
Update k8s peer discovery to 4.1
Move Kubernetes DIY guidelines to a separate page
1 parent acf0163 commit 7dd7691

6 files changed

+326
-490
lines changed

docs/cluster-formation.md

Lines changed: 27 additions & 244 deletions
@@ -133,7 +133,7 @@ cluster_formation.registration.enabled = false
133133
```
134134

135135
When configured this way, the node has to be registered manually or using another mechanism,
136-
e.g. by a container orchestrator such as [Nomad](https://developer.hashicorp.com/nomad/integrations/hashicorp/rabbitmq) or [Kubernetes](https://www.rabbitmq.com/kubernetes/operator/operator-overview).
136+
e.g. by a container orchestrator such as [Nomad](https://developer.hashicorp.com/nomad/integrations/hashicorp/rabbitmq).
137137

138138
If peer discovery isn't configured, or it [repeatedly fails](#discovery-retries),
139139
or no peers are reachable, a node that wasn't a cluster member in the past
@@ -462,268 +462,51 @@ cluster_formation.aws.use_private_ip = true
462462

463463
## Peer Discovery on Kubernetes {#peer-discovery-k8s}
464464

465-
### Kubernetes Peer Discovery Overview
466-
467-
A [Kubernetes](https://kubernetes.io/)-based discovery mechanism
468-
is available via [a plugin](https://github.com/rabbitmq/rabbitmq-server/tree/main/deps/rabbitmq_peer_discovery_k8s).
469-
470-
As with any [plugin](./plugins), it must be enabled before it
471-
can be used. For peer discovery plugins it means they must be [enabled](./plugins#basics)
472-
or [preconfigured](./plugins#enabled-plugins-file) before first node boot:
473-
474-
```bash
475-
rabbitmq-plugins --offline enable rabbitmq_peer_discovery_k8s
476-
```
477-
478-
### Important: Prerequisites and Deployment Considerations
465+
:::tip
479466

480-
:::important
481-
The recommended option for deploying RabbitMQ to Kubernetes is the [RabbitMQ Kubernetes Cluster Operator](/kubernetes/operator/operator-overview).
467+
In most cases you don't need to worry about peer discovery, when deploying to Kubernetes.
482468

483-
It follows the recommendations listed below.
469+
[Cluster Operator](/kubernetes/operator/operator-overview) (the recommended way of deploying to Kubernetes)
470+
as well as popular Helm charts, pre-configure peer discovery for you.
484471
:::
485472

486-
With this mechanism, nodes fetch a list of their peers from
487-
a Kubernetes API endpoint using a set of configured values:
488-
a URI scheme, host, port, as well as the token and certificate paths.
489-
490-
If the recommended option of the [RabbitMQ Kubernetes Cluster Operator](/kubernetes/operator/operator-overview) cannot be used,
491-
there are several prerequisites and deployment choices that must be taken into
492-
account when deploying RabbitMQ to Kubernetes, with this peer discovery mechanism
493-
and in general.
494-
495-
#### Use a Stateful Set
496-
497-
A RabbitMQ cluster deployed to Kubernetes will use a set of pods. The set must be a [stateful set](https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/#statefulset).
498-
A [headless service](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#limitations) must be used to
499-
control [network identity of the pods](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/)
500-
(their hostnames), which in turn affect RabbitMQ node names.
501-
On the headless service `spec`, field `publishNotReadyAddresses` must be set to `true` to propagate SRV DNS records for its Pods for the purpose of peer discovery.
502-
503-
In addition, since RabbitMQ nodes [resolve their own and peer hostnames during boot](./clustering#hostname-resolution-requirement),
504-
CoreDNS [caching timeout may need to be decreased](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#stable-network-id) from default 30 seconds
505-
to a value in the 5-10 second range.
506-
507-
:::important
508-
509-
CoreDNS [caching timeout may need to be decreased](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#stable-network-id)
510-
from default 30 seconds to a value in the 5-10 second range
511-
512-
:::
513-
514-
If a stateless set is used recreated nodes will not have their persisted data and will start as blank nodes.
515-
This can lead to data loss and higher network traffic volume due to more frequent
516-
data synchronisation of both [quorum queues](./quorum-queues)
517-
and [streams](./streams) on newly joining nodes.
518-
519-
#### Use Persistent Volumes
520-
521-
How [storage is configured](https://kubernetes.io/docs/concepts/storage/persistent-volumes/)
522-
is generally orthogonal to peer discovery. However, it does not make sense to run a stateful
523-
data service such as RabbitMQ with [node data directory](./relocate) stored on a transient volume.
524-
Use of transient volumes can lead nodes to not have their persisted data after a restart.
525-
This has the same consequences as with stateless sets covered above.
526-
527-
#### Make Sure `/etc/rabbitmq` is Mounted as Writeable
528-
529-
RabbitMQ nodes and images may need to update a file under `/etc/rabbitmq`, the default [configuration file location](./configure#config-location) on Linux. This may involve configuration file generation
530-
performed by the image used, [enabled plugins file](./plugins#enabled-plugins-file) updates,
531-
and so on.
532-
533-
It is therefore highly recommended that `/etc/rabbitmq` is mounted as writeable and owned by
534-
RabbitMQ's effective user (typically `rabbitmq`).
535-
536-
#### Use Parallel podManagementPolicy
537-
538-
`podManagementPolicy: "Parallel"` is the recommended option for RabbitMQ clusters.
539-
540-
Because of [how nodes rejoin their cluster](./clustering#restarting), `podManagementPolicy` set to `OrderedReady`
541-
can lead to a deployment deadlock with certain readiness probes:
542-
543-
* Kubernetes will expect the first node to pass a readiness probe
544-
* The readiness probe may require a fully booted node
545-
* The node will fully boot after it detects that its peers have come online
546-
* Kubernetes will not start any more pods until the first one boots
547-
* The deployment therefore is deadlocked
548-
549-
`podManagementPolicy: "Parallel"` avoids this problem, and the Kubernetes peer discovery plugin
550-
then deals with the [natural race condition present during parallel cluster formation](#initial-formation-race-condition).
551-
552-
553-
#### Use Most Basic Health Checks for RabbitMQ Pod Readiness Probes
554-
555-
A readiness probe that expects the node to be fully booted and have rejoined its cluster peers
556-
can deadlock a deployment that restarts all RabbitMQ pods and relies on the `OrderedReady` pod management policy.
557-
Deployments that use the `Parallel` pod management policy
558-
will not be affected.
559-
560-
One health check that does not expect a node to be fully booted and have schema tables synced is
561-
562-
```bash
563-
# a very basic check that will succeed for the nodes that are currently waiting for
564-
# a peer to sync schema from
565-
rabbitmq-diagnostics ping
566-
```
567-
568-
This basic check would allow the deployment to proceed and the nodes to eventually rejoin each other,
569-
assuming they are [compatible](./upgrade).
570-
571-
See [Schema Syncing from Online Peers](./clustering#restarting-schema-sync) in the [Clustering guide](./clustering).
473+
### Kubernetes Peer Discovery Overview
572474

475+
A [Kubernetes](https://kubernetes.io/)-based discovery mechanism
476+
is available via [a plugin](https://github.com/rabbitmq/rabbitmq-server/tree/main/deps/rabbitmq_peer_discovery_k8s).
573477

574-
### Examples
478+
Since peer discovery happens early during node boot, you should add `rabbitmq_peer_discovery_k8s` to the
479+
[`enabled_plugins` file](https://www.rabbitmq.com/docs/plugins#enabled-plugins-file).
480+
In case of a Kubernetes deployment, it is usually a `ConfigMap`.
575481

576-
A minimalistic [runnable example of Kubernetes peer discovery](https://github.com/rabbitmq/diy-kubernetes-examples)
577-
mechanism can be found on GitHub.
482+
Since RabbitMQ 4.1, this plugin only allows the node with the lowest ordinal index (almost always the pod with the `-0` suffix)
483+
to form a new cluster. This node is referred to as the seed node.
578484

579-
The example can be run using either MiniKube or Kind.
485+
All other nodes will join the seed node, or will forever keep trying to join it, if they can't.
580486

487+
In the most common scenario, this means that:
488+
* the pod with `-0` suffix will start immediately, effectively forming a new single-node cluster
489+
* any other pod will join the pod with `-0` suffix and synchronize the cluster metadata with it
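
Reviewer's illustration (not part of the commit): the seed node rule described above can be sketched in a few lines of Python. This is only a model of the documented behaviour, not the plugin's actual implementation, which is written in Erlang. The function name `seed_node_name`, the `lowest_ordinal` parameter, and the example pod names are hypothetical; the `rabbit@` node name prefix follows the convention used elsewhere in this diff.

```python
import re

# A StatefulSet pod hostname looks like "<name>-<ordinal>", optionally
# followed by a DNS domain, e.g. "rabbitmq-server-2.rabbitmq-nodes.default".
_POD_HOSTNAME = re.compile(r"^(?P<prefix>[^.]+)-(?P<ordinal>\d+)(?P<domain>\..*)?$")


def seed_node_name(own_hostname: str, lowest_ordinal: int = 0) -> str:
    """Return the node name of the pod with the lowest ordinal index.

    Assumption: every pod can derive the seed node from its own hostname
    by swapping its ordinal for the lowest one in the StatefulSet.
    """
    match = _POD_HOSTNAME.match(own_hostname)
    if match is None:
        raise ValueError(f"not a StatefulSet-style hostname: {own_hostname!r}")
    domain = match.group("domain") or ""
    return f"rabbit@{match.group('prefix')}-{lowest_ordinal}{domain}"


def is_seed_node(own_hostname: str, lowest_ordinal: int = 0) -> bool:
    """True if this pod is the one allowed to form a new cluster."""
    return f"rabbit@{own_hostname}" == seed_node_name(own_hostname, lowest_ordinal)
```

Under these assumptions, `is_seed_node("rabbitmq-server-0")` is true, so that pod would form a new cluster, while `rabbitmq-server-1` and `rabbitmq-server-2` would both resolve `rabbit@rabbitmq-server-0` as the node to join.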
581490

582491
### Configuration
583492

584-
To use Kubernetes for peer discovery, set the `cluster_formation.peer_discovery_backend`
585-
to `k8s` or `kubernetes` or its module name, `rabbit_peer_discovery_k8s`
586-
(note: the name of the module is slightly different from plugin name):
587-
588-
```ini
589-
cluster_formation.peer_discovery_backend = k8s
590-
591-
# the backend can also be specified using its module name
592-
# cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
593-
594-
# Kubernetes API hostname (or IP address). Default value is kubernetes.default.svc.cluster.local
595-
cluster_formation.k8s.host = kubernetes.default.example.local
596-
```
597-
598-
#### Kubernetes API Endpoint
599-
600-
It is possible to configure Kubernetes API port and URI scheme:
493+
**In most cases, no configuration should be necessary beyond enabling this plugin.**
601494

602-
```ini
603-
cluster_formation.peer_discovery_backend = k8s
604-
605-
cluster_formation.k8s.host = kubernetes.default.example.local
606-
# 443 is used by default
607-
cluster_formation.k8s.port = 443
608-
# https is used by default
609-
cluster_formation.k8s.scheme = https
495+
If you use [a different ordinal start value in your StatefulSet](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#ordinal-index),
496+
you have to configure this plugin to use it:
610497
```
611-
612-
#### Kubernetes API Access Token
613-
614-
Kubernetes token file path is configurable via `cluster_formation.k8s.token_path`:
615-
616-
```ini
617-
cluster_formation.peer_discovery_backend = k8s
618-
619-
cluster_formation.k8s.host = kubernetes.default.example.local
620-
# default value is /var/run/secrets/kubernetes.io/serviceaccount/token
621-
cluster_formation.k8s.token_path = /var/run/secrets/kubernetes.io/serviceaccount/token
498+
cluster_formation.k8s.ordinal_start = N
622499
```
500+
where `N` matches the `.spec.ordinals.start` value of the StatefulSet.
623501

624-
It must point to a local file that exists and is readable by RabbitMQ.
625-
626-
#### Kubernetes Namespace
627-
628-
`cluster_formation.k8s.namespace_path` controls when the K8S namespace is loaded from:
629-
630-
```ini
631-
cluster_formation.peer_discovery_backend = k8s
632-
633-
cluster_formation.k8s.host = kubernetes.default.example.local
634-
635-
# ...
636-
637-
# Default value: /var/run/secrets/kubernetes.io/serviceaccount/namespace
638-
cluster_formation.k8s.namespace_path = /var/run/secrets/kubernetes.io/serviceaccount/namespace
502+
If the plugin doesn't work for any reason (a very unusual Kubernetes configuration or issues with hostname resolution)
503+
and you have to force RabbitMQ to use a different seed node than it would automatically, you can do this:
639504
```
640-
641-
Just like with the token path key, `cluster_formation.k8s.namespace_path` must point to a local
642-
file that exists and is readable by RabbitMQ.
643-
644-
#### Kubernetes API CA Certificate Bundle
645-
646-
Kubernetes API [CA certificate bundle](./ssl#certificates-and-keys) file path is
647-
configured using `cluster_formation.k8s.cert_path`:
648-
649-
```ini
650-
cluster_formation.peer_discovery_backend = k8s
651-
652-
cluster_formation.k8s.host = kubernetes.default.example.local
653-
654-
# Where to load the K8S API access token from.
655-
# Default value: /var/run/secrets/kubernetes.io/serviceaccount/token
656-
cluster_formation.k8s.token_path = /var/run/secrets/kubernetes.io/serviceaccount/token
657-
658-
# Where to load K8S API CA bundle file from. It will be used when issuing requests
659-
# to the K8S API using HTTPS.
660-
#
661-
# Default value: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
662-
cluster_formation.k8s.cert_path = /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
505+
cluster_formation.k8s.seed_node = rabbit@seed-node-hostname
663506
```
664507

665-
Just like with the token path key, `cluster_formation.k8s.cert_path` must point to a local
666-
file that exists and is readable by RabbitMQ.
667-
668-
#### Peer Node Pods Can Use Hostnames or IP Addresses
669-
670-
When a list of peer nodes is computed from a list of pod containers returned by Kubernetes,
671-
either hostnames or IP addresses can be used. This is configurable using the
672-
`cluster_formation.k8s.address_type` key:
673-
674-
```ini
675-
cluster_formation.peer_discovery_backend = k8s
676-
677-
cluster_formation.k8s.host = kubernetes.default.example.local
678-
679-
cluster_formation.k8s.token_path = /var/run/secrets/kubernetes.io/serviceaccount/token
680-
cluster_formation.k8s.cert_path = /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
681-
cluster_formation.k8s.namespace_path = /var/run/secrets/kubernetes.io/serviceaccount/namespace
682-
683-
# should result set use hostnames or IP addresses
684-
# of Kubernetes API-reported containers?
685-
# supported values are "hostname" and "ip"
686-
cluster_formation.k8s.address_type = hostname
687-
```
688-
689-
Supported values are `ip` or `hostname`. `hostname` is
690-
the recommended option but has limitations: it can only be used with [stateful sets](https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/#statefulset) (also highly recommended)
691-
and [headless services](https://kubernetes.io/docs/concepts/services-networking/service/#headless-services).
692-
`ip` is used by default for better compatibility.
693-
694-
##### Peer Node Pod Name Suffix
695-
696-
It is possible to append a suffix to peer hostnames returned by Kubernetes using
697-
`cluster_formation.k8s.hostname_suffix`:
698-
699-
```ini
700-
cluster_formation.peer_discovery_backend = k8s
701-
702-
cluster_formation.k8s.host = kubernetes.default.example.local
703-
704-
cluster_formation.k8s.token_path = /var/run/secrets/kubernetes.io/serviceaccount/token
705-
cluster_formation.k8s.cert_path = /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
706-
cluster_formation.k8s.namespace_path = /var/run/secrets/kubernetes.io/serviceaccount/namespace
707-
708-
# no suffix is appended by default
709-
cluster_formation.k8s.hostname_suffix = rmq.eng.example.local
710-
```
711-
712-
Service name is `rabbitmq` by default but can be overridden using the
713-
`cluster_formation.k8s.service_name` key if needed:
714-
715-
```ini
716-
cluster_formation.peer_discovery_backend = k8s
717-
718-
cluster_formation.k8s.host = kubernetes.default.example.local
719-
720-
cluster_formation.k8s.token_path = /var/run/secrets/kubernetes.io/serviceaccount/token
721-
cluster_formation.k8s.cert_path = /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
722-
cluster_formation.k8s.namespace_path = /var/run/secrets/kubernetes.io/serviceaccount/namespace
723-
724-
# overrides Kubernetes service name. Default value is "rabbitmq".
725-
cluster_formation.k8s.service_name = rmq-qa
726-
```
508+
If `cluster_formation.k8s.seed_node` is configured, this plugin will just use this value as the seed node.
509+
If you do this, please open a GitHub issue and explain why the plugin didn't work for you, so we can improve it.
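
Reviewer's illustration (not part of the commit): the two settings introduced above interact in a simple order of precedence, which the following Python sketch models. It is an assumption-laden illustration rather than the plugin's code: `parse_conf` handles only plain `key = value` lines, `resolve_seed_node` and the `statefulset_name` argument are hypothetical, and a default ordinal start of `0` is assumed when `cluster_formation.k8s.ordinal_start` is not set.

```python
def parse_conf(text: str) -> dict:
    """Parse simple `key = value` lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def resolve_seed_node(settings: dict, statefulset_name: str) -> str:
    """Pick the seed node: an explicit override wins, otherwise derive it."""
    explicit = settings.get("cluster_formation.k8s.seed_node")
    if explicit:
        # An explicitly configured seed node is used as-is.
        return explicit
    # Assumed default: ordinals start at 0 unless configured otherwise.
    start = int(settings.get("cluster_formation.k8s.ordinal_start", "0"))
    return f"rabbit@{statefulset_name}-{start}"


conf = parse_conf("""
# StatefulSet created with .spec.ordinals.start = 3
cluster_formation.k8s.ordinal_start = 3
""")
print(resolve_seed_node(conf, "rabbitmq-server"))
```

With the sample configuration this resolves to the pod with the `-3` suffix; adding a `cluster_formation.k8s.seed_node` line would take precedence over the derived value.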
727510

728511
## Peer Discovery Using Consul {#peer-discovery-consul}
729512
