diff --git a/README.md b/README.md
index 6260e45aa..2bfb09d92 100644
--- a/README.md
+++ b/README.md
@@ -10,6 +10,13 @@ Fleet Join/Leave is a feature that allows a member cluster to join and leave a f

 ## Quick Start

+**This guide has two parts:**
+
+1. Steps to run agents on Kind clusters
+2. Steps to run agents on AKS clusters
+
+**For the kind clusters, everything needed to run the agents is already defined in the Makefile, so no changes are needed. For the AKS clusters, we need to make some changes to files within the fleet repo.**
+
 ---

 ### Prerequisites
@@ -20,53 +27,58 @@ Fleet Join/Leave is a feature that allows a member cluster to join and leave a f
 - [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) version v1.22
 - [kind](https://kind.sigs.k8s.io/) version v0.12.0

-### Install
+## Steps to run agents on Kind clusters

-1. Clone the repo to your machine
+**1. Clone the repo to your machine**

 ```shell
 $ git clone https://github.com/Azure/fleet
 ```

-2. Navigate to fleet directory
+**2. Navigate to fleet directory**

 ```shell
 $ cd fleet
 ```

-3. Set up `hub` and `member` kind clusters
+**3. Set up `hub` and `member` kind clusters**
+
+The Makefile uses **kindest/node:v1.23.3**. If your local image is a different version, use the following command to pull the image for v1.23.3:
+
+```shell
+$ docker pull kindest/node:v1.23.3
+```
+
+Then run:

 ```shell
 $ make create-hub-kind-cluster create-member-kind-cluster
 ```

-4. Build and load images to kind clusters (only if you don't have access to [fleet packages](https://github.com/orgs/Azure/packages?repo_name=fleet))
+**4. Build and load images to kind clusters** (since we are testing locally, we don't have access to [fleet packages](https://github.com/orgs/Azure/packages?repo_name=fleet))

 ```shell
 $ OUTPUT_TYPE=type=docker make docker-build-hub-agent docker-build-member-agent docker-build-refresh-token
 $ make load-hub-docker-image load-member-docker-image
 ```

-5. Install hub and member agents helm charts
+**5. Install hub and member agents helm charts**

 ```shell
 $ make install-member-agent-helm
 ```

-### Demo
-
-1. Apply `memberCluster` to the hub cluster
+**6. Apply `memberCluster` to the hub cluster**

 ```shell
 $ kind export kubeconfig --name hub-testing
 $ kubectl apply -f examples/fleet_v1alpha1_membercluster.yaml
 ```

-2. Check to make sure the `memberCluster` & `internalMemberCluster` resources status have been updated to 'Joined'
+**7. Check to make sure the `memberCluster` & `internalMemberCluster` resource statuses have been updated to 'Joined'**

 ```shell
-$ kubectl describe memberCluster.fleet.azure.com kind-member-testing
-$ kubectl describe internalMemberCluster.fleet.azure.com kind-member-testing
+$ kubectl describe membercluster kind-member-testing
 ```
@@ -75,11 +87,77 @@ $ kubectl describe internalMemberCluster.fleet.azure.com kind-member-testing ```shell Name: kind-member-testing Namespace: - ... +Labels: +Annotations: +API Version: fleet.azure.com/v1alpha1 +Kind: MemberCluster +Metadata: + Creation Timestamp: 2022-07-08T01:42:35Z + Generation: 1 + Managed Fields: + API Version: fleet.azure.com/v1alpha1 + Fields Type: FieldsV1 + fieldsV1: + f:metadata: + f:annotations: + .: + f:kubectl.kubernetes.io/last-applied-configuration: + f:spec: + .: + f:identity: + .: + f:apiGroup: + f:kind: + f:name: + f:namespace: + f:leaseDurationSeconds: + f:state: + Manager: kubectl-client-side-apply + Operation: Update + Time: 2022-07-08T01:42:35Z + API Version: fleet.azure.com/v1alpha1 + Fields Type: FieldsV1 + fieldsV1: + f:status: + .: + f:allocatable: + .: + f:cpu: + f:memory: + f:capacity: + .: + f:cpu: + f:memory: + f:conditions: + Manager: 67cec7a4-3386-4fd5-9de2-20397e7b0029 + Operation: Update + Subresource: status + Time: 2022-07-08T01:42:36Z + Resource Version: 868 + UID: 67cec7a4-3386-4fd5-9de2-20397e7b0029 +Spec: + Identity: + API Group: + Kind: ServiceAccount + Name: member-agent-sa + Namespace: fleet-system + Lease Duration Seconds: 30 + State: Join +Status: + Allocatable: + Cpu: 8 + Memory: 2032532Ki + Capacity: + Cpu: 8 + Memory: 2032532Ki + Conditions: + Last Transition Time: 2022-07-08T01:42:36Z + Message: + Observed Generation: 1 Reason: InternalMemberClusterHeartbeatReceived Status: True Type: HeartbeatReceived - Last Transition Time: 2022-06-27T19:26:38Z + Last Transition Time: 2022-07-08T01:42:36Z Message: Observed Generation: 1 Reason: MemberClusterJoined @@ -88,29 +166,127 @@ Namespace: Events: Type Reason Age From Message ---- ------ ---- ---- ------- - Normal NamespaceCreated 77s memberCluster Namespace was created - Normal InternalMemberClusterCreated 77s memberCluster Internal member cluster was created - Normal RoleCreated 77s memberCluster role was created - Normal RoleBindingCreated 77s 
memberCluster role binding was created - Normal MemberClusterJoined 17s memberCluster member cluster is joined + Normal NamespaceCreated 81s memberCluster Namespace was created + Normal InternalMemberClusterCreated 81s memberCluster Internal member cluster was created + Normal RoleCreated 81s memberCluster role was created + Normal RoleBindingCreated 81s memberCluster role binding was created + Normal MemberClusterJoined 81s memberCluster member cluster is joined ``` +

+
+```shell
+$ kubectl describe internalmembercluster kind-member-testing -n fleet-kind-member-testing
+```
+
Result +```shell +Name: kind-member-testing +Namespace: fleet-kind-member-testing +Labels: +Annotations: +API Version: fleet.azure.com/v1alpha1 +Kind: InternalMemberCluster +Metadata: + Creation Timestamp: 2022-07-08T01:42:36Z + Generation: 1 + Managed Fields: + API Version: fleet.azure.com/v1alpha1 + Fields Type: FieldsV1 + fieldsV1: + f:metadata: + f:ownerReferences: + .: + k:{"uid":"67cec7a4-3386-4fd5-9de2-20397e7b0029"}: + f:spec: + .: + f:leaseDurationSeconds: + f:state: + Manager: 67cec7a4-3386-4fd5-9de2-20397e7b0029 + Operation: Update + Time: 2022-07-08T01:42:36Z + API Version: fleet.azure.com/v1alpha1 + Fields Type: FieldsV1 + fieldsV1: + f:status: + .: + f:allocatable: + .: + f:cpu: + f:memory: + f:capacity: + .: + f:cpu: + f:memory: + f:conditions: + Manager: memberagent + Operation: Update + Subresource: status + Time: 2022-07-08T01:42:36Z + Owner References: + API Version: fleet.azure.com/v1alpha1 + Controller: true + Kind: MemberCluster + Name: kind-member-testing + UID: 67cec7a4-3386-4fd5-9de2-20397e7b0029 + Resource Version: 865 + UID: 1b544873-81b8-4bac-9624-d4208aa21fd1 +Spec: + Lease Duration Seconds: 60 + State: Join +Status: + Allocatable: + Cpu: 8 + Memory: 2032532Ki + Capacity: + Cpu: 8 + Memory: 2032532Ki + Conditions: + Last Transition Time: 2022-07-08T01:42:36Z + Message: + Observed Generation: 1 + Reason: InternalMemberClusterHeartbeatReceived + Status: True + Type: HeartbeatReceived + Last Transition Time: 2022-07-08T01:42:36Z + Message: + Reason: ReconcileSuccess + Status: True + Type: Synced + Last Transition Time: 2022-07-08T01:42:36Z + Message: + Observed Generation: 1 + Reason: InternalMemberClusterJoined + Status: True + Type: Joined + Last Transition Time: 2022-07-08T01:42:36Z + Message: + Observed Generation: 1 + Reason: InternalMemberClusterHealthy + Status: True + Type: Healthy +Events: + Type Reason Age From Message + ---- ------ ---- ---- ------- + Normal InternalMemberClusterHeartbeatReceived 8s (x6 over 4m8s) 
InternalMemberClusterController internal member cluster heartbeat received + Normal InternalMemberClusterJoined 8s (x6 over 4m8s) InternalMemberClusterController internal member cluster has joined + Normal InternalMemberClusterHealthy 8s (x6 over 4m8s) InternalMemberClusterController internal member cluster healthy + +```

-3. Change the state for `memberCluster` yaml file to be `Leave` and apply the change.
+**8. Change the state for `memberCluster` yaml file to be `Leave` and apply the change**

 ```shell
 $ kubectl apply -f examples/fleet_v1alpha1_membercluster.yaml
 ```

-4. Check to make sure the `memberCluster` & `internalMemberCluster` resources status have been updated to 'Left'
+**9. Check to make sure the `memberCluster` resource status has been updated to 'Left'**

 ```shell
-$ kubectl describe memberCluster.fleet.azure.com kind-member-testing
-$ kubectl describe internalMemberCluster.fleet.azure.com kind-member-testing
+$ kubectl describe membercluster kind-member-testing
 ```
@@ -119,26 +295,105 @@ $ kubectl describe internalMemberCluster.fleet.azure.com kind-member-testing ```shell Name: kind-member-testing Namespace: - ... +Labels: +Annotations: +API Version: fleet.azure.com/v1alpha1 +Kind: MemberCluster +Metadata: + Creation Timestamp: 2022-07-08T01:42:35Z + Generation: 2 + Managed Fields: + API Version: fleet.azure.com/v1alpha1 + Fields Type: FieldsV1 + fieldsV1: + f:metadata: + f:annotations: + .: + f:kubectl.kubernetes.io/last-applied-configuration: + f:spec: + .: + f:identity: + .: + f:apiGroup: + f:kind: + f:name: + f:namespace: + f:leaseDurationSeconds: + Manager: kubectl-client-side-apply + Operation: Update + Time: 2022-07-08T01:42:35Z + API Version: fleet.azure.com/v1alpha1 + Fields Type: FieldsV1 + fieldsV1: + f:status: + .: + f:allocatable: + .: + f:cpu: + f:memory: + f:capacity: + .: + f:cpu: + f:memory: + f:conditions: + Manager: 67cec7a4-3386-4fd5-9de2-20397e7b0029 + Operation: Update + Subresource: status + Time: 2022-07-08T01:42:36Z + API Version: fleet.azure.com/v1alpha1 + Fields Type: FieldsV1 + fieldsV1: + f:spec: + f:state: + Manager: kubectl-edit + Operation: Update + Time: 2022-07-08T01:49:10Z + Resource Version: 1565 + UID: 67cec7a4-3386-4fd5-9de2-20397e7b0029 +Spec: + Identity: + API Group: + Kind: ServiceAccount + Name: member-agent-sa + Namespace: fleet-system + Lease Duration Seconds: 30 + State: Leave +Status: + Allocatable: + Cpu: 8 + Memory: 2032532Ki + Capacity: + Cpu: 8 + Memory: 2032532Ki + Conditions: + Last Transition Time: 2022-07-08T01:42:36Z + Message: + Observed Generation: 1 Reason: InternalMemberClusterHeartbeatReceived Status: True Type: HeartbeatReceived - Last Transition Time: 2022-06-27T19:26:38Z + Last Transition Time: 2022-07-08T01:49:10Z Message: - Observed Generation: 1 - Reason: MemberClusterJoined + Observed Generation: 2 + Reason: MemberClusterLeft Status: False Type: Joined + Last Transition Time: 2022-07-08T01:49:10Z + Message: + Reason: ReconcileSuccess + Status: True + Type: 
Synced
 Events:
-  Type    Reason                            Age    From           Message
-  ----    ------                            ----   ----           -------
-  Normal  NamespaceCreated                  77s    memberCluster  Namespace was created
-  Normal  InternalMemberClusterCreated      77s    memberCluster  Internal member cluster was created
-  Normal  RoleCreated                       77s    memberCluster  role was created
-  Normal  RoleBindingCreated                77s    memberCluster  role binding was created
-  Normal  MemberClusterJoined               17s    memberCluster  member cluster is joined
-  Normal  InternalMemberClusterSpecUpdated  3m10s  memberCluster  internal member cluster spec is marked as Leave
-  Normal  MemberClusterJoined               3m15s  memberCluster  member cluster is Left
+  Type    Reason                            Age    From           Message
+  ----    ------                            ----   ----           -------
+  Normal  NamespaceCreated                  6m56s  memberCluster  Namespace was created
+  Normal  InternalMemberClusterCreated      6m56s  memberCluster  Internal member cluster was created
+  Normal  RoleCreated                       6m56s  memberCluster  role was created
+  Normal  RoleBindingCreated                6m56s  memberCluster  role binding was created
+  Normal  MemberClusterJoined               6m56s  memberCluster  member cluster is joined
+  Normal  InternalMemberClusterSpecUpdated  22s    memberCluster  internal member cluster spec is marked as Leave
+  Normal  NamespaceDeleted                  22s    memberCluster  namespace is deleted for member cluster
+  Normal  MemberClusterLeft                 22s    memberCluster  member cluster has left
 ```
@@ -152,6 +407,190 @@ delete kind clusters setup
 $ make clean-e2e-tests
 ```

+## Steps to run agents on AKS clusters
+
+---
+
+Before starting, create a text file that lists the variables below and their associated values. Anything shown in double quotes in the commands is a placeholder; record its value in this text file to keep track, since most of the commands need to be edited before they can be used.
+
+Variables that should be in your text file as you go through the commands:
+
+- "RegistryName"
+- "RegistryLoginServer"
+- "ResourceGroupName"
+- "PRINCIPAL_ID"
+- "CLIENT_ID"
+- "HUB_URL"
+- "MemberClusterCRName"
+
+### Prerequisites
+
+- Valid Azure subscription to create AKS clusters [setup subscription](https://docs.microsoft.com/en-us/cli/azure/manage-azure-subscriptions-azure-cli)
+- Resource Group under subscription [setup resource group](https://docs.microsoft.com/en-us/azure/azure-resource-manager/management/manage-resource-groups-portal) ("ResourceGroupName" is the name of the resource group created)
+- ACR inside the resource group to build & push docker images [setup ACR](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-get-started-portal?tabs=azure-cli) ("RegistryLoginServer" is the login server field shown when you navigate to your registry on Azure portal, "RegistryName" is the name of the Registry created)
+
+### 1. Create hub cluster with AAD, RBAC enabled
+
+```shell
+$ az aks create --resource-group "ResourceGroupName" --name hubCluster --attach-acr "RegistryName" --node-count 1 --generate-ssh-keys --enable-aad --enable-azure-rbac
+```
+
+### 2. Create member cluster with managed identity enabled
+
+```shell
+$ az aks create --resource-group "ResourceGroupName" --name memberCluster --attach-acr "RegistryName" --node-count 1 --generate-ssh-keys --enable-managed-identity
+```
+
+### 3. Admin access to install helm charts and apply CRs
+
+```shell
+$ az aks get-credentials --resource-group "ResourceGroupName" --name hubCluster --admin
+$ az aks get-credentials --resource-group "ResourceGroupName" --name memberCluster --admin
+```
+
+### 4. Switching contexts between clusters
+
+Switching contexts provides access to the corresponding clusters. The contexts are defined in a config file; its usual path is /Users/username/.kube/config
+
+```shell
+$ kubectl config use-context hubCluster-admin
+$ kubectl config use-context memberCluster-admin
+```
+
+### 5. Login to ACR
+
+```shell
+$ az acr login -n "RegistryName"
+```
+
+### 6. Build and push docker images
+
+The Makefile is at **fleet/Makefile**, and the values.yaml files for both helm charts are at **fleet/charts/hub-agent/values.yaml** & **fleet/charts/member-agent/values.yaml**.
+
+Make the necessary changes to these files.
+
+![Makefile](screenshots/img.png)
+![hub-agent/values.yaml](screenshots/img1.png)
+![member-agent/values.yaml](screenshots/img2.png)
+
+From the fleet directory, run the following commands. They build the docker images from the local fleet directory and push the images to ACR.
+
+```shell
+$ make docker-build-hub-agent
+$ make docker-build-member-agent
+$ make docker-build-refresh-token
+```
+
+### 7. Install helm charts and CRs:
+
+From the fleet directory, run the following command:
+
+```shell
+$ helm install hub-agent ./charts/hub-agent/
+```
+
+Each time we create an AKS cluster, a resource group named **MC_ResourceGroupName_ClusterName_Location** is auto-generated for us. Find that resource group, click the **agent pool MSI object**, and get the **"PRINCIPAL_ID"**, which will be the name of the identity for the memberCluster CR. The **"CLIENT_ID"** can also be found here.
+
+#### Copy the code below into fleet/examples/fleet_v1alpha1_membercluster.yaml and replace the "PRINCIPAL_ID"
+
+```yaml
+apiVersion: fleet.azure.com/v1alpha1
+kind: MemberCluster
+metadata:
+  name: membercluster-sample
+spec:
+  state: Join
+  identity:
+    name: "PRINCIPAL_ID"
+    kind: User
+    namespace: fleet-system
+    apiGroup: rbac.authorization.k8s.io
+  leaseDurationSeconds: 30
+```
+
+Then apply the CR:
+
+```shell
+$ kubectl apply -f ./examples/fleet_v1alpha1_membercluster.yaml
+```
+
+Switch the cluster context to the member cluster and run the command below. **"CLIENT_ID"** is the clientId from the **agent pool MSI object**; **"HUB_URL"** can be found in the .kube/config file in the hub cluster context section.
+
+```shell
+$ helm install member-agent ./charts/member-agent/ --set azure.clientid="CLIENT_ID" --set config.provider=azure --set config.hubURL="HUB_URL" --set config.memberClusterName="MemberClusterCRName"
+```
+
+Check the events to see whether the member cluster has Joined.
+
+```shell
+$ kubectl describe membercluster "MemberClusterCRName"
+```
+
+After the member cluster CR is applied, the Join workflow completes and the member cluster gets marked as Joined with a condition.
+
+To trigger the leave workflow, either edit the member cluster CR in place and change the state from **Join** to **Leave**, or change the state to **Leave** in fleet/examples/fleet_v1alpha1_membercluster.yaml and apply the CR again.
+
+```shell
+$ kubectl edit membercluster "MemberClusterCRName"
+```
+
+Check the events to see whether the member cluster has Left.
+
+```shell
+$ kubectl describe membercluster "MemberClusterCRName"
+```
+
+### 8. Verify the token file exists in the member cluster
+
+Switch the cluster context to the member cluster.
+
+Upgrade the AKS member cluster to a Kubernetes version greater than 1.22, because we need ephemeral containers to access the token. We can check the Kubernetes version of the AKS cluster by running:
+
+```shell
+$ kubectl get nodes -A
+```
+
+If the cluster's Kubernetes version is lower than 1.23, run this command to list the possible upgrades for your cluster:
+
+```shell
+$ az aks get-upgrades --resource-group "ResourceGroupName" --name memberCluster --output table
+```
+
+Ephemeral debug containers are enabled by default only from 1.23 onward, so pick a version greater than 1.22, replace the KUBERNETES_VERSION variable in the command below with it, and run:
+
+```shell
+$ az aks upgrade --resource-group "ResourceGroupName" --name memberCluster --kubernetes-version KUBERNETES_VERSION
+```
+
+After the upgrade, run:
+
+```shell
+$ kubectl debug node/nodeName -it --image=busybox
+```
+
+This opens a shell with access to the files on the node. Then run:
+
+```shell
+$ find . -name provider-token
+```
+
+which returns something similar to this:
+
+
+Result
+
+```shell
+/host/var/lib/kubelet/pods/podName/volumes/kubernetes.io~empty-dir/provider-token
+```
+
+

+
+Navigate to that directory to find a file called token, and use vim to open it.
+
+### 9. Cleanup
+
+Delete the resource group under which every resource was created. This might take some time.
+
 ## Code of Conduct

 ---
diff --git a/charts/hub-agent/README.md b/charts/hub-agent/README.md
index f709ded23..4fc03fbcb 100644
--- a/charts/hub-agent/README.md
+++ b/charts/hub-agent/README.md
@@ -5,9 +5,6 @@
 ```console
 # Helm install with fleet-system namespace already created
 helm install hub-agent ./charts/hub-agent/
-
-# Helm install and create namespace
-helm install hub-agent ./charts/hubagent/ --namespace fleet-system --create-namespace
 ```

 ## Upgrade Chart
@@ -22,16 +19,17 @@ _See [helm install](https://helm.sh/docs/helm/helm_install/) for command documen

 ## Parameters

-| Parameter | Description | Default |
-|:----------------------|:--------------------------------------------------------------------|:-------------------------------------------------|
-| replicaCount | The number of hub-agent replicas to deploy | `1` |
-| image.repository | Image repository | `ghcr.io/azure/azure/fleet/hub-agent` |
-| image.pullPolicy | Image pullPolicy | `Always` |
-| image.tag | The image release tag to use | `v0.1.0` |
-| namespace | Namespace that this Helm chart is installed on | `fleet-system` |
-| serviceAccount.create | Whether to create service account | `true` |
-| serviceAccount.name | Service account name | `hub-agent-sa` |
-| resources | The resource request/limits for the container image | limits: 500m CPU, 1Gi, requests: 100m CPU, 128Mi |
-| affinity | The node affinity to use for pod scheduling | `{}` |
-| tolerations | The tolerations to use for pod scheduling | `[]` |
+| Parameter | Description | Default |
+|:----------------------|:----------------------------------------------------|:-------------------------------------------------|
+| replicaCount | The number of hub-agent replicas to deploy | `1` |
+| image.repository | Image repository | `ghcr.io/azure/azure/fleet/hub-agent` |
+| image.pullPolicy | Image pullPolicy | `Always` |
+| image.tag | The image release tag to use | `v0.1.0` |
+| namespace | Namespace that this Helm chart is installed on | `fleet-system` |
+| serviceAccount.create | Whether to create service account | `true` |
+| serviceAccount.name | Service account name | `hub-agent-sa` |
+| resources | The resource request/limits for the container image | limits: 500m CPU, 1Gi, requests: 100m CPU, 128Mi |
+| affinity | The node affinity to use for pod scheduling | `{}` |
+| tolerations | The tolerations to use for pod scheduling | `[]` |
+| logVerbosity | Log level. Uses V logs (klog) | `2` |
diff --git a/charts/member-agent/README.md b/charts/member-agent/README.md
index 6b49755e2..62f4010cc 100644
--- a/charts/member-agent/README.md
+++ b/charts/member-agent/README.md
@@ -29,15 +29,16 @@ helm upgrade member-agent member-agent/ --namespace fleet-system

 ## Parameters

-| Parameter | Description | Default |
-|:-------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------|
-| replicaCount | The number of member-agent replicas to deploy | `1` |
-| image.repository | Image repository | `ghcr.io/azure/azure/fleet/member-agent` |
-| image.pullPolicy | Image pullPolicy | `IfNotPresent` |
-| image.tag | The image tag to use | `v0.1.0` |
-| affinity | The node affinity to use for pod scheduling | `{}` |
-| tolerations | The toleration to use for pod scheduling | `[]` |
-| resources | The resource request/limits for the container image | limits: "2" CPU, 4Gi, requests: 100m CPU, 128Mi |
-| namespace | Namespace that this Helm chart is installed on. | `fleet-system` |
+| Parameter | Description | Default |
+|:-------------------------|:------------------------------------------------------|:------------------------------------------------|
+| replicaCount | The number of member-agent replicas to deploy | `1` |
+| image.repository | Image repository | `ghcr.io/azure/azure/fleet/member-agent` |
+| image.pullPolicy | Image pullPolicy | `IfNotPresent` |
+| image.tag | The image tag to use | `v0.1.0` |
+| affinity | The node affinity to use for pod scheduling | `{}` |
+| tolerations | The toleration to use for pod scheduling | `[]` |
+| resources | The resource request/limits for the container image | limits: "2" CPU, 4Gi, requests: 100m CPU, 128Mi |
+| namespace | Namespace that this Helm chart is installed on. | `fleet-system` |
+| logVerbosity | Log level. Uses V logs (klog) | `3` |

 ## Contributing Changes
diff --git a/examples/fleet_v1alpha1_membercluster.yaml b/examples/fleet_v1alpha1_membercluster.yaml
index 6c6d7c2eb..fa17f5c39 100644
--- a/examples/fleet_v1alpha1_membercluster.yaml
+++ b/examples/fleet_v1alpha1_membercluster.yaml
@@ -1,12 +1,12 @@
 apiVersion: fleet.azure.com/v1alpha1
 kind: MemberCluster
 metadata:
-  name: membercluster-sample
+  name: kind-member-testing
 spec:
   state: Join
   identity:
     name: member-agent-sa
     kind: ServiceAccount
     namespace: fleet-system
-    apiGroup: rbac.authorization.k8s.io
+    apiGroup: ""
   leaseDurationSeconds: 30
diff --git a/examples/fleet_v1alpha1_membership.yaml b/examples/fleet_v1alpha1_membership.yaml
deleted file mode 100644
index e69de29bb..000000000
diff --git a/screenshots/img.png b/screenshots/img.png
new file mode 100644
index 000000000..87acced16
Binary files /dev/null and b/screenshots/img.png differ
diff --git a/screenshots/img1.png b/screenshots/img1.png
new file mode 100644
index 000000000..cb148fc21
Binary files /dev/null and b/screenshots/img1.png differ
diff --git a/screenshots/img2.png b/screenshots/img2.png
new file mode 100644
index 000000000..38db17089
Binary files /dev/null and b/screenshots/img2.png differ
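
The README steps in this diff ask the reader to check by eye whether the `memberCluster` has reached 'Joined' or 'Left'. That check could be scripted; the following is only a minimal sketch, assuming the `kubectl describe` layout shown in the README results (each condition lists `Status:` before `Type:`). The helper name `joined_status` is hypothetical and not part of the fleet repo.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the fleet repo).
# Reads `kubectl describe membercluster ...` output on stdin and prints the
# Status value of the condition whose Type is "Joined".
# Assumption: within a condition, the "Status:" line comes before "Type:",
# as in the describe output shown in the README.
joined_status() {
  awk '
    $1 == "Status:" { status = $2 }                       # remember the latest Status value
    $1 == "Type:" && $2 == "Joined" { print status; exit } # report it for the Joined condition
  '
}

# Sample input copied from the "Left" result in the README.
printf '%s\n' \
  'Status:' \
  '  Conditions:' \
  '    Reason:                InternalMemberClusterHeartbeatReceived' \
  '    Status:                True' \
  '    Type:                  HeartbeatReceived' \
  '    Reason:                MemberClusterLeft' \
  '    Status:                False' \
  '    Type:                  Joined' | joined_status
# prints: False
```

Against a live hub cluster the same function would be used as `kubectl describe membercluster kind-member-testing | joined_status`, where `True` corresponds to a joined member and `False` to one that has left.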