Please use the `values.yaml` file from the `latest-config-files` folder in this repo for the above command.
@@ -58,133 +58,3 @@ The scaling setup is basically the same as the legacy documentation below, so pl
Also, Docker-in-Docker (dind) is set up, so our GitHub workflows can specify which images to use if we want (IREE uses the `cpubuilder_ubuntu_jammy` image, for example). Alternatively, as done in iree-turbine, we can run workflows directly on the preconfigured custom image here without further setup, and that works too.

And you're done (just make sure the label matches the installation name in the workflow) :)
Cert-Manager is a Kubernetes add-on that automates the management and issuance of TLS (Transport Layer Security) certificates.
It is installed for security reasons: ARC uses it to manage the certificates for its admission webhooks.

### Step 4: Install GitHub ARC and Authenticate

I do this using a personal access token. So, if you don't have one, create a GitHub token with these permissions:

```
repo (all)
admin:org (all) (mandatory for organization-wide runner)
admin:enterprise (all) (mandatory for enterprise-wide runner)
admin:public_key - read:public_key
admin:repo_hook - read:repo_hook
admin:org_hook
notifications
workflow
```

Installing the actions-runner-controller also deploys a webhook server, so we need to create a secret that the server uses to authenticate the incoming GitHub webhooks.
The YAML file used above configures the actions-runner-controller service and the webhook server. I've added the YAML file I used (`runner-controller.yaml`) to this repo.
Here we tell it to configure a bunch of things for the runner controller, and we give it a Docker image to use.
I've set it up to use `summerwind/actions-runner:ubuntu-22.04`, which is the latest image provided by the actions-runner-controller project, with dind enabled.
This works fine for us and passes all the iree-turbine jobs (which use no Docker) and the iree jobs (which use multiple Docker images and work through dind).

### Step 5: Configure GitHub Webhooks

I've set this up to use webhooks to drive the overall scaling of our cluster.
This scaling is performed based on the number of webhook events received from GitHub.
Here's an image of how that overall process works:

To configure this, we first need to expose the github-webhook server created above to the public, so it can receive events from GitHub.
To do this, get the current configuration of the server using this command:
`kubectl get svc actions-runner-controller-github-webhook-server -n actions-runner-system -o yaml > current-config.yaml`

Then open `current-config.yaml`, change the spec type from `ClusterIP` to `LoadBalancer`, and delete the following lines, which aren't necessary after the switch.
Also change `http` to `https` in the config.
```
clusterIP: 10.0.11.74
clusterIPs:
- 10.0.11.74
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
```
TODO(saienduri): Find a way to just configure it with a load balancer initially (just webhook server, not the service)

Then, to actually update the service to use the updated config:
```
kubectl apply -f current-config.yaml
```

Now that the server and webhook secret have been configured, you can go to the GitHub org/repo to set up the GitHub side of things.
Go to "Settings" -> "Webhooks".
Create a new webhook with the address `http://<external-ip>/webhooks` and the content type `application/json`.
Then, in the secret section, add the secret that we created earlier.
For events, you can pick "Let me select individual events" and then choose push, workflow, and workflow jobs.
If you don't know the external IP of the webhook server, you can run:

Specifically, we tell the actions runner controller how many resources we need (45 cores, 50 GB).
We also give it a runner label that we use in the actual workflow's `runs-on:` (I use `azure-linux` in the YAML).
You can use the YAML in this repo (`runner-deployment.yaml`) in the following command:

`kubectl apply -f runner-deployment.yaml`
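
For orientation, a RunnerDeployment along the lines described above might look roughly like the sketch below. This is not the repo's `runner-deployment.yaml`. It assumes the legacy `actions.summerwind.dev/v1alpha1` API, and the resource name, the organization, and the choice to express 45 cores / 50 GB as resource requests are illustrative assumptions.

```yaml
# Hypothetical sketch, not the repo's runner-deployment.yaml.
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: azure-linux-runners   # hypothetical name
spec:
  # `replicas` is left out on the assumption that an autoscaler manages the count.
  template:
    spec:
      organization: my-org    # hypothetical; use `repository: owner/repo` for a repo-level runner
      labels:
        - azure-linux         # the label workflows reference in `runs-on:`
      resources:
        requests:             # assumption: 45 cores / 50 GB expressed as requests
          cpu: "45"
          memory: 50Gi
```

Whether the real file uses requests, limits, or both is not stated here, so check the file in the repo before relying on these values.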

### Step 7: Configure HRA

This is to configure the GitHub Actions Runner Controller's HorizontalRunnerAutoscaler (HRA).
With the GitHub Actions Runner Controller in a Kubernetes cluster, each runner corresponds to a single container within a pod, and each pod runs only one runner.
This design makes sure that each runner operates in its own isolated environment, which gives the best security for concurrently running CI jobs.
So, you can think of HRA as a specialized version of the Kubernetes Horizontal Pod Autoscaler (HPA), which means we don't need HPA in the GitHub ARC context.
Here, we tell HRA to scale the GitHub Actions runners based on the webhooks we configured earlier.
Specifically, we trigger an autoscale every time there is a webhook event for a workflow, so a runner will be requested.
It will also downscale appropriately.
You can use the YAML in this repo (`horizontal-scale.yaml`) for the following command:

`kubectl apply -f horizontal-scale.yaml`
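
As a rough illustration of webhook-driven scaling, an HRA for the legacy ARC API might look like the sketch below. This is not the repo's `horizontal-scale.yaml`. The resource names, the replica bounds, and the `duration` value are assumptions chosen for the example.

```yaml
# Hypothetical sketch, not the repo's horizontal-scale.yaml.
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: azure-linux-autoscaler   # hypothetical name
spec:
  scaleTargetRef:
    kind: RunnerDeployment
    name: azure-linux-runners    # hypothetical; must match the RunnerDeployment's name
  minReplicas: 0                 # assumed lower bound
  maxReplicas: 10                # assumed upper bound
  scaleUpTriggers:
    - githubEvent:
        workflowJob: {}          # react to workflow_job webhook events
      duration: "30m"            # assumed: how long the added capacity is kept at most
```

With a trigger like this, a queued job should add a runner and a completed job should release it, which matches the upscale and downscale behavior described above.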

Basically, there are two levels of autoscaling.
HRA adjusts the number of pods to meet the runner demand.
If the number of pods increases beyond the capacity of the current nodes, the Cluster Autoscaler (the thing we set up at the very start) steps in to scale up the node pool, adding more nodes to provide the necessary resources for the additional pods.


Now, change your workflows appropriately to match the labels set in `runner-deployment.yaml` and enjoy the AKS + ARC magic :)
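
A minimal workflow that targets these runners could look like the sketch below. The workflow name, trigger, and steps are hypothetical. The only part taken from this setup is the `azure-linux` label.

```yaml
# Hypothetical workflow; only the `azure-linux` label comes from this setup.
name: Example CI
on: [push]
jobs:
  build:
    runs-on: azure-linux         # must match a label in runner-deployment.yaml
    steps:
      - uses: actions/checkout@v4
      - run: echo "Running on an ARC-managed runner"
```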