Add multi cluster CDVN page #500

184 changes: 184 additions & 0 deletions docs/adv/advanced/multi-cluster-setup.mdx
---
sidebar_position: 7
description: Multi cluster setup
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Multi cluster setup

:::caution
Multi cluster setup should be used with caution, as it is still in an experimental phase.
:::

To spin up multiple clusters that use a single consensus layer client (beacon node) and execution layer client, the multi cluster setup in the [charon-distributed-validator-node](https://github.com/ObolNetwork/charon-distributed-validator-node) (CDVN) repository can be used. Multi cluster setup is for power users who want to spin up multiple clusters **for the same network** on the same machine. Some reasons for doing this might be:

- squad staking with multiple squads;
- receiving delegated stake from different parties, separating clusters to keep stake separate;
- combinations of the above.

## Concerns

Each cluster requires its own Charon and validator client instances. The Charon P2P port must be different for each cluster, each validator client must point to its own Charon instance, and any other changes to the accompanying infrastructure (Prometheus, Grafana, etc.) should be taken into account.
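
As an illustration, the per-cluster `.env` files could differ only in the Charon P2P port. The variable name below is taken from the CDVN `.env` sample and should be checked against your copy; the ports and cluster names are examples:

```shell
# Hypothetical excerpts - variable name, ports and cluster names are illustrative.

# clusters/squad-a/.env
CHARON_PORT_P2P_TCP=3610

# clusters/squad-b/.env
CHARON_PORT_P2P_TCP=3611
```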

## Setup

Scripts in the [CDVN](https://github.com/ObolNetwork/charon-distributed-validator-node) repo can set up and manage a multi cluster CDVN directory.

Those scripts separate out the shared resources: the consensus layer client (beacon node), the execution layer client, and Grafana. Only the services with the ["cluster"](https://github.com/ObolNetwork/charon-distributed-validator-node/blob/ad4044faf78bbe972437abb5dfb3b1e856776c22/docker-compose.yml#L82) profile are run per cluster. The cluster-specific resources are separated into folders in a `clusters/` directory:
> **Reviewer note (Contributor Author):** We shall change the link once we merge the CDVN PR.

```directory
clusters
└───{CLUSTER_NAME} # cluster name
│ │ .charon # folder including secret material used by charon
│ │ data # data from the validator client and Prometheus
│ │ lodestar # scripts used by lodestar
│ │ prometheus # scripts and configs used by Prometheus
│ │ .env # environment variables used by the cluster
│ │ docker-compose.yml # docker compose used by the cluster
└───{CLUSTER_NAME_2}
└───{CLUSTER_NAME_...}
└───{CLUSTER_NAME_N}
```
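
As a rough sketch of what the per-cluster compose usage amounts to (an assumption about what the make targets wrap, not the exact script contents), only the services tagged with the `cluster` profile are started from a cluster's own compose file:

```shell
# CLUSTER_NAME is a placeholder for one of the folders under clusters/.
docker compose --profile cluster \
  -f clusters/CLUSTER_NAME/docker-compose.yml up -d
```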

### Setup Multi cluster CDVN

<Tabs groupId="cluster-creation-stage">
<TabItem value="already running" label="I already have single cluster running" default>

Run the setup with the `make` command, specifying the `CLUSTER_NAME`:

```shell
make name=CLUSTER_NAME
```

As a cluster was already set up and running, all the cluster-specific data from the root directory will be moved to the first cluster in the `clusters/` directory. Expect a few seconds of downtime when running the setup command, while the cluster-specific containers are stopped in the root docker compose and started from inside the cluster-specific docker compose. This usually takes 2-5 seconds and is highly unlikely to cause an issue.

The setup command carries out the following actions:
- `clusters/` directory is created
- `.charon/` is copied to `clusters/{CLUSTER_NAME}/.charon/`
- `.env` is copied to `clusters/{CLUSTER_NAME}/.env`
- `data/lodestar/` is copied to `clusters/{CLUSTER_NAME}/data/lodestar/`
- `data/prometheus/` is copied to `clusters/{CLUSTER_NAME}/data/prometheus/`
- `lodestar/` is copied to `clusters/{CLUSTER_NAME}/lodestar/`
- `prometheus/` is copied to `clusters/{CLUSTER_NAME}/prometheus/`
- `docker-compose.yml` is copied to `clusters/{CLUSTER_NAME}/docker-compose.yml`
- `.charon/` is renamed to `.charon-migrated-to-multi/` and a README is added to it with details about the migration
- `data/lodestar/` is renamed to `data/lodestar-migrated-to-multi/` and a README is added to it with details about the migration
- `data/prometheus/` is renamed to `data/prometheus-migrated-to-multi/` and a README is added to it with details about the migration
- docker containers from `docker-compose.yml` for Charon, VC and Prometheus are stopped (if they are running)
- docker containers from `clusters/{CLUSTER_NAME}/docker-compose.yml` for Charon, VC and Prometheus are started (if they were running)
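
A quick way to sanity-check the migration afterwards (paths and the compose file location are as described above; `CLUSTER_NAME` is a placeholder):

```shell
# The secret material should now live under the cluster folder...
ls clusters/CLUSTER_NAME/.charon

# ...and the cluster's containers should be running from the cluster-specific compose file.
docker compose -f clusters/CLUSTER_NAME/docker-compose.yml ps
```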
</TabItem>
<TabItem value="just starting" label="I am just starting with CDVN" default>

This section is for the case where you have only cloned the [CDVN repo](https://github.com/ObolNetwork/charon-distributed-validator-node), but have not yet set up your ENR and validator keys or started your node.

Run the setup with the `make` command, specifying the `CLUSTER_NAME`:

```shell
make name=CLUSTER_NAME
```

The setup command carries out the following actions:
- `clusters/` directory is created
- `.env` is copied to `clusters/{CLUSTER_NAME}/.env`
- `lodestar/` is copied to `clusters/{CLUSTER_NAME}/lodestar/`
- `prometheus/` is copied to `clusters/{CLUSTER_NAME}/prometheus/`
- `docker-compose.yml` is copied to `clusters/{CLUSTER_NAME}/docker-compose.yml`
- `data/lodestar/` is renamed to `data/lodestar-migrated-to-multi/` and a README is added to it with details about the migration
- `data/prometheus/` is renamed to `data/prometheus-migrated-to-multi/` and a README is added to it with details about the migration
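
Once the command completes, the new layout can be checked with something like the following (the listed contents are illustrative):

```shell
ls -a clusters/CLUSTER_NAME
# expected (roughly): .env  docker-compose.yml  lodestar/  prometheus/
```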

To continue setting up your node, refer to the [Quickstart guide](../../run/start/quickstart_group), keeping in mind that all Charon-specific data (for example, the `.charon` folder and modifications to the `.env` file) should live in the `clusters/{CLUSTER_NAME}/` directory instead of the root directory.

</TabItem>
</Tabs>

## Manage

As there are now multiple clusters, each one with its own Charon and VC, management becomes a bit more complex.

The private keys and ENRs of each Charon node should be separated, as should the data from each VC, and potentially each Prometheus instance as well.

The base containers (consensus layer client, execution layer client, etc.) should also be managed with caution, as they now impact multiple clusters.

### Manage clusters

#### Add new cluster

You can add a new cluster to the `clusters/` directory by running the following command, specifying a name in place of `NEW_CLUSTER_NAME`. A new folder with that name will be created, and a free port is automatically chosen for the new cluster's libp2p port.

```shell
make multi-cluster-add-cluster name=NEW_CLUSTER_NAME
```

The structure of the new folder will look like this:

```directory
{NEW_CLUSTER_NAME}
│ data # initially empty. Once the node is started, the validator client and Prometheus data folders will be created inside this folder.
│ lodestar # scripts used by lodestar, copied from the root directory
│ prometheus # scripts and configs used by Prometheus, copied from the root directory
│ .env # environment variables used by the cluster, copied from the root directory
│ docker-compose.yml # docker compose used by the cluster, copied from the root directory
```

A few things can be configured, if desired:

- The `.env` file found in `clusters/{NEW_CLUSTER_NAME}/.env` can be configured with some cluster-specific variables (e.g. the choice of Charon relays) - see the sketch after this list;
- The Prometheus config found in `clusters/{NEW_CLUSTER_NAME}/prometheus/prometheus.yml.example` can be adjusted (e.g. to write metrics to a different remote server);
- The Docker compose found in `clusters/{NEW_CLUSTER_NAME}/docker-compose.yml` can be modified (e.g. to change the configuration of the validator client). Keep in mind that only containers with the `"cluster"` profile are started from here - if you make changes to any other container, they won't be taken into account.
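
For instance, a per-cluster relay override in the `.env` file might look like the sketch below. `CHARON_P2P_RELAYS` is the usual Charon environment variable for this, but the exact name, format and default should be checked against the CDVN `.env` sample; the relay URL is only an example:

```shell
# clusters/NEW_CLUSTER_NAME/.env (illustrative excerpt)
# Point this cluster at a specific relay instead of the default list.
CHARON_P2P_RELAYS=https://0.relay.obol.tech
```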

After the new cluster is created, all Charon-specific tasks, like creating an ENR, should be done **from inside the cluster's directory**.
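
For example, creating an ENR might look like the following sketch, assuming the standard CDVN workflow and the `obolnetwork/charon` image (replace `<version>` with the release pinned elsewhere in your setup):

```shell
# Run charon from inside the cluster's directory so .charon/ is created there,
# not in the repository root.
cd clusters/CLUSTER_NAME
docker run --rm -v "$(pwd):/opt/charon" obolnetwork/charon:<version> create enr
```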

#### Delete cluster

Clusters can also be deleted by running the command below and specifying the `CLUSTER_NAME`. This is useful after the cluster's validators have completed their voluntary exits.

:::danger
By deleting a cluster you delete all private key material associated with it as well. Delete only if you know what you are doing.
:::

```shell
make multi-cluster-delete-cluster name=CLUSTER_NAME
```
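
If in doubt, one option is to take an offline backup of the cluster's key material before deleting - a sketch, not an official step of the delete target:

```shell
# Archive the cluster's private key material somewhere safe first (illustrative).
tar -czf CLUSTER_NAME-charon-backup.tar.gz clusters/CLUSTER_NAME/.charon

make multi-cluster-delete-cluster name=CLUSTER_NAME
```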

#### Start cluster

Start a cluster from the `clusters/` directory by running the following command, specifying the `CLUSTER_NAME`:

```shell
make multi-cluster-start-cluster name=CLUSTER_NAME
```

This is to be done on the first startup of a new cluster, when the machine has been restarted, or when the cluster has stopped for any other reason.
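
To confirm the cluster came up, the logs of its Charon container can be followed directly from the cluster's compose file (the service name is assumed to be `charon`, as in the root compose file):

```shell
docker compose -f clusters/CLUSTER_NAME/docker-compose.yml logs -f charon
```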

#### Stop cluster

Stop a cluster from the `clusters/` directory by running the following command, specifying the `CLUSTER_NAME`.

This is to be done in cases of planned maintenance, version updates, etc.

```shell
make multi-cluster-stop-cluster name=CLUSTER_NAME
```

### Manage base

Now that the validator stack (Charon, validator client) is decoupled and managed per cluster, the "base" containers can be managed on their own as well. These include the consensus layer client, execution layer client, MEV-boost client, and Grafana containers. The actions here are simpler.

#### Start base

Start the base containers.

```shell
make multi-cluster-start-base
```

#### Stop base

Stop the base containers. Note that this impacts **all** of your clusters in `clusters/`.

```shell
make multi-cluster-stop-base
```
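
One possible maintenance flow, using only the targets above (the cluster names are examples and the ordering is a suggestion, not a documented requirement):

```shell
# Stop each cluster first, then the shared base containers.
make multi-cluster-stop-cluster name=squad-a
make multi-cluster-stop-cluster name=squad-b
make multi-cluster-stop-base

# Later, bring everything back up in reverse order.
make multi-cluster-start-base
make multi-cluster-start-cluster name=squad-a
make multi-cluster-start-cluster name=squad-b
```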