Add multi cluster CDVN page #500

184 changes: 184 additions & 0 deletions docs/adv/advanced/multi-cluster-setup.mdx
---
sidebar_position: 7
description: Multi cluster setup
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Multi cluster setup

:::caution
Multi cluster setup should be used with caution, as it is still in an experimental phase.
:::

To spin up multiple clusters that use a single consensus layer client (beacon node) and execution layer client, the multi cluster setup in the [charon-distributed-validator-node](https://github.com/ObolNetwork/charon-distributed-validator-node) (CDVN) repository can be used. Multi cluster setup is for power users who want to spin up multiple clusters **for the same network** on the same machine. Some reasons for doing this might be:

- squad staking with multiple squads;
- receiving delegated stake from different parties, separating clusters to keep stake separate;
- combinations of the above.

## Concerns

Each cluster requires its own Charon and validator client instances. The Charon P2P port must be different for each cluster, each validator client must point to its own Charon instance, and any other changes to the accompanying infrastructure (Prometheus, Grafana, etc.) should be taken into account.
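
As an illustration, the per-cluster `.env` files could differ only in the Charon P2P port. The variable name below is taken from the CDVN `.env` sample and should be checked against your copy; the ports and cluster names are examples:

```shell
# Hypothetical excerpts - variable name, ports and cluster names are illustrative.

# clusters/squad-a/.env
CHARON_PORT_P2P_TCP=3610

# clusters/squad-b/.env
CHARON_PORT_P2P_TCP=3611
```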

## Setup

Scripts in the [CDVN](https://github.com/ObolNetwork/charon-distributed-validator-node) repo can set up and manage a multi cluster CDVN directory.

Those scripts separate out the shared resources: the consensus layer client (beacon node), the execution layer client, and Grafana. Only the services with the ["cluster"](https://github.com/ObolNetwork/charon-distributed-validator-node/blob/ad4044faf78bbe972437abb5dfb3b1e856776c22/docker-compose.yml#L82) profile are run per cluster. The cluster-specific resources are separated into folders in a `clusters/` directory:
> **Reviewer note (Contributor Author):** We shall change the link once we merge the CDVN PR.

```directory
clusters
└───{CLUSTER_NAME} # cluster name
│ │ .charon # folder including secret material used by charon
│ │ data # data from the validator client and Prometheus
│ │ lodestar # scripts used by lodestar
│ │ prometheus # scripts and configs used by Prometheus
│ │ .env # environment variables used by the cluster
│ │ docker-compose.yml # docker compose used by the cluster
└───{CLUSTER_NAME_2}
└───{CLUSTER_NAME_...}
└───{CLUSTER_NAME_N}
```
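
As a rough sketch of what the per-cluster compose usage amounts to (an assumption about what the make targets wrap, not the exact script contents), only the services tagged with the `cluster` profile are started from a cluster's own compose file:

```shell
# CLUSTER_NAME is a placeholder for one of the folders under clusters/.
docker compose --profile cluster \
  -f clusters/CLUSTER_NAME/docker-compose.yml up -d
```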

### Setup Multi cluster CDVN

<Tabs groupId="cluster-creation-stage">
<TabItem value="already running" label="I already have single cluster running" default>

Run the setup with the `make` command, specifying the `CLUSTER_NAME`:

```shell
make name=CLUSTER_NAME
```

As a cluster was already set up and running, all the cluster-specific data from the root directory will be moved to the first cluster in the `clusters/` directory. Expect a few seconds of downtime when running the setup command, while the cluster-specific containers are stopped in the root docker compose and started from inside the cluster-specific docker compose. This usually takes 2-5 seconds and is highly unlikely to cause an issue.

The setup command carries out the following actions:
- `clusters/` directory is created
- `.charon/` is copied to `clusters/{CLUSTER_NAME}/.charon/`
- `.env` is copied to `clusters/{CLUSTER_NAME}/.env`
- `data/lodestar/` is copied to `clusters/{CLUSTER_NAME}/data/lodestar/`
- `data/prometheus/` is copied to `clusters/{CLUSTER_NAME}/data/prometheus/`
- `lodestar/` is copied to `clusters/{CLUSTER_NAME}/lodestar/`
- `prometheus/` is copied to `clusters/{CLUSTER_NAME}/prometheus/`
- `docker-compose.yml` is copied to `clusters/{CLUSTER_NAME}/docker-compose.yml`
- `.charon/` is renamed to `.charon-migrated-to-multi/` and a README is added to it with details about the migration
- `data/lodestar/` is renamed to `data/lodestar-migrated-to-multi/` and a README is added to it with details about the migration
- `data/prometheus/` is renamed to `data/prometheus-migrated-to-multi/` and a README is added to it with details about the migration
- docker containers from `docker-compose.yml` for Charon, VC and Prometheus are stopped (if they are running)
- docker containers from `clusters/{CLUSTER_NAME}/docker-compose.yml` for Charon, VC and Prometheus are started (if they were running)
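
A quick way to sanity-check the migration afterwards (paths and the compose file location are as described above; `CLUSTER_NAME` is a placeholder):

```shell
# The secret material should now live under the cluster folder...
ls clusters/CLUSTER_NAME/.charon

# ...and the cluster's containers should be running from the cluster-specific compose file.
docker compose -f clusters/CLUSTER_NAME/docker-compose.yml ps
```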
</TabItem>
<TabItem value="just starting" label="I am just starting with CDVN" default>

This section is for the case where you have only cloned the [CDVN repo](https://github.com/ObolNetwork/charon-distributed-validator-node), but have not yet set up your ENR and validator keys or started your node.

Run the setup with the `make` command, specifying the `CLUSTER_NAME`:

```shell
make name=CLUSTER_NAME
```

The setup command carries out the following actions:
- `clusters/` directory is created
- `.env` is copied to `clusters/{CLUSTER_NAME}/.env`
- `lodestar/` is copied to `clusters/{CLUSTER_NAME}/lodestar/`
- `prometheus/` is copied to `clusters/{CLUSTER_NAME}/prometheus/`
- `docker-compose.yml` is copied to `clusters/{CLUSTER_NAME}/docker-compose.yml`
- `data/lodestar/` is renamed to `data/lodestar-migrated-to-multi/` and a README is added to it with details about the migration
- `data/prometheus/` is renamed to `data/prometheus-migrated-to-multi/` and a README is added to it with details about the migration
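
Once the command completes, the new layout can be checked with something like the following (the listed contents are illustrative):

```shell
ls -a clusters/CLUSTER_NAME
# expected (roughly): .env  docker-compose.yml  lodestar/  prometheus/
```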

To continue setting up your node, refer to the [Quickstart guide](../../run/start/quickstart_group), keeping in mind that all Charon-specific data (for example, the `.charon` folder and modifications to the `.env` file) should live in the `clusters/{CLUSTER_NAME}/` directory instead of the root directory.

</TabItem>
</Tabs>

## Manage

As there are now multiple clusters, each one with its own Charon and VC, management becomes a bit more complex.

The private keys and ENRs of each Charon node should be separated, as should the data from each VC, and potentially each Prometheus instance as well.

The base containers (consensus layer client, execution layer client, etc.) should also be managed with caution, as they now impact multiple clusters.

### Manage clusters

#### Add new cluster

You can add a new cluster to the `clusters/` directory by running the following command, specifying a name in place of `NEW_CLUSTER_NAME`. A new folder with that name will be created, and a free port is automatically chosen for the new cluster's libp2p port.

```shell
make multi-cluster-add-cluster name=NEW_CLUSTER_NAME
```

The structure of the new folder will look like this:

```directory
{NEW_CLUSTER_NAME}
│ data # initially empty. Once the node is started, the validator client and Prometheus data folders will be created inside this folder.
│ lodestar # scripts used by lodestar, copied from the root directory
│ prometheus # scripts and configs used by Prometheus, copied from the root directory
│ .env # environment variables used by the cluster, copied from the root directory
│ docker-compose.yml # docker compose used by the cluster, copied from the root directory
```

A few things can be configured, if desired:

- The `.env` file found in `clusters/{NEW_CLUSTER_NAME}/.env` can be configured with some cluster-specific variables (e.g. the choice of Charon relays) - see the sketch after this list;
- The Prometheus config found in `clusters/{NEW_CLUSTER_NAME}/prometheus/prometheus.yml.example` can be adjusted (e.g. to write metrics to a different remote server);
- The Docker compose found in `clusters/{NEW_CLUSTER_NAME}/docker-compose.yml` can be modified (e.g. to change the configuration of the validator client). Keep in mind that only containers with the `"cluster"` profile are started from here - if you make changes to any other container, they won't be taken into account.
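
For instance, a per-cluster relay override in the `.env` file might look like the sketch below. `CHARON_P2P_RELAYS` is the usual Charon environment variable for this, but the exact name, format and default should be checked against the CDVN `.env` sample; the relay URL is only an example:

```shell
# clusters/NEW_CLUSTER_NAME/.env (illustrative excerpt)
# Point this cluster at a specific relay instead of the default list.
CHARON_P2P_RELAYS=https://0.relay.obol.tech
```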

After the new cluster is created, all Charon-specific tasks, like creating an ENR, should be done **from inside the cluster's directory**.
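
For example, creating an ENR might look like the following sketch, assuming the standard CDVN workflow and the `obolnetwork/charon` image (replace `<version>` with the release pinned elsewhere in your setup):

```shell
# Run charon from inside the cluster's directory so .charon/ is created there,
# not in the repository root.
cd clusters/CLUSTER_NAME
docker run --rm -v "$(pwd):/opt/charon" obolnetwork/charon:<version> create enr
```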

#### Delete cluster

Clusters can also be deleted by running the command below and specifying the `CLUSTER_NAME`. This is useful after the cluster's validators have completed their voluntary exits.

:::danger
By deleting a cluster you delete all private key material associated with it as well. Delete only if you know what you are doing.
:::

```shell
make multi-cluster-delete-cluster name=CLUSTER_NAME
```
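
If in doubt, one option is to take an offline backup of the cluster's key material before deleting - a sketch, not an official step of the delete target:

```shell
# Archive the cluster's private key material somewhere safe first (illustrative).
tar -czf CLUSTER_NAME-charon-backup.tar.gz clusters/CLUSTER_NAME/.charon

make multi-cluster-delete-cluster name=CLUSTER_NAME
```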

#### Start cluster

Start a cluster from the `clusters/` directory by running the following command, specifying the `CLUSTER_NAME`:

```shell
make multi-cluster-start-cluster name=CLUSTER_NAME
```

This is to be done on the first startup of a new cluster, when the machine has been restarted, or when the cluster has stopped for any other reason.
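
To confirm the cluster came up, the logs of its Charon container can be followed directly from the cluster's compose file (the service name is assumed to be `charon`, as in the root compose file):

```shell
docker compose -f clusters/CLUSTER_NAME/docker-compose.yml logs -f charon
```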

#### Stop cluster

Stop a cluster from the `clusters/` directory by running the following command, specifying the `CLUSTER_NAME`.

This is to be done in cases of planned maintenance, version updates, etc.

```shell
make multi-cluster-stop-cluster name=CLUSTER_NAME
```

### Manage base

Now that the validator stack (Charon, validator client) is decoupled and managed per cluster, the "base" containers can be managed on their own as well. These include the consensus layer client, execution layer client, MEV-boost client, and Grafana containers. The actions here are simpler.

#### Start base

Start the base containers.

```shell
make multi-cluster-start-base
```

#### Stop base

Stop the base containers. Note that this impacts **all** of your clusters in `clusters/`.

```shell
make multi-cluster-stop-base
```
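
One possible maintenance flow, using only the targets above (the cluster names are examples and the ordering is a suggestion, not a documented requirement):

```shell
# Stop each cluster first, then the shared base containers.
make multi-cluster-stop-cluster name=squad-a
make multi-cluster-stop-cluster name=squad-b
make multi-cluster-stop-base

# Later, bring everything back up in reverse order.
make multi-cluster-start-base
make multi-cluster-start-cluster name=squad-a
make multi-cluster-start-cluster name=squad-b
```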