Kubernetes Deployment with Load Balancing

| Installing Microk8s | Adding Initial Nodes | Building and Deploying Services | Adding Nodes to Existing Deployment | Sending Requests | Uninstalling | Examples | Useful Commands | Limitations |

This sample demonstrates how to set up a Kubernetes cluster using MicroK8s, how to deploy Intel(R) Deep Learning Streamer (Intel(R) DL Streamer) Pipeline Server to the cluster, and how to use HAProxy to load balance requests.



| Term | Definition |
|---|---|
| Pipeline Server | Intel(R) DL Streamer Pipeline Server microservice that runs pipelines. |
| HAProxy | HAProxy open source load balancer and application delivery controller. |
| MicroK8s | MicroK8s minimal production Kubernetes distribution. |
| MQTT | MQTT open source message bus. |
| Node | Physical or virtual machine. |
| Leader | Node hosting the Kubernetes control plane to which worker nodes are added. The leader node can also host pipeline servers and run pipelines. |
| Worker | Nodes hosting pipeline servers. Worker nodes are added to increase the computational resources of the cluster. |
| Pod | The smallest deployable unit of computing in a Kubernetes cluster, typically a single container. |
| leader-ip | Host IP address of the leader node. |


The following steps, installation and deployment scripts have been tested on Ubuntu 20.04. Other operating systems may have additional requirements and are beyond the scope of this document.

Installing MicroK8s

On each node that will join the cluster, run the following commands to install MicroK8s and its dependencies. These steps must be performed on each node individually. Review the contents of microk8s/ and microk8s/ first, because these scripts install additional components on your system and change your groups and environment variables.

Step 1: Install MicroK8s Base


cd ./samples/kubernetes
sudo -E ./microk8s/

Expected Output

Assigning <user name> to microk8s group

NOTE: If you are running behind a proxy please ensure that your NO_PROXY and no_proxy environment variables are set correctly to allow cluster nodes to communicate directly. You can run these commands to set this up automatically:

UPDATE_NO_PROXY=true sudo -E ./microk8s/
su - $USER
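
If you prefer to adjust the variables by hand, the following is a minimal sketch of one way to do it. The helper name `append_no_proxy` and the addresses are illustrative assumptions, not part of the sample scripts; it merges extra hosts into a comma-separated list without adding duplicates.

```shell
# Hypothetical helper: merge extra hosts into a comma-separated no_proxy list,
# skipping entries that are already present.
append_no_proxy() {
  local current="$1"; shift
  local entry
  for entry in "$@"; do
    case ",${current}," in
      *",${entry},"*) ;;                                  # already listed
      *) current="${current:+${current},}${entry}" ;;     # append with a comma if needed
    esac
  done
  printf '%s\n' "$current"
}

# Example with placeholder addresses (replace with your own node IPs).
append_no_proxy "localhost,127.0.0.1" 192.0.2.10 192.0.2.11 127.0.0.1
```

To apply the result in the current shell you could run `export NO_PROXY="$(append_no_proxy "$NO_PROXY" <node-ip>)"` followed by `export no_proxy="$NO_PROXY"`.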

Step 2: Activate Group Membership

Your user is now a member of a newly added 'microk8s' group. However, the current terminal session will not be aware of this until you issue this command:


newgrp microk8s
groups | grep microk8s

Expected Output

<snip> microk8s <snip>

Step 3: Install MicroK8s Add-Ons

Next, install the add-on components into the cluster. These enable the Docker registry and DNS.



Note that this script may take several minutes to complete.

Expected Output

Metrics-Server is enabled
DNS is enabled
Ingress is enabled
Metrics-Server is enabled
DNS is enabled
The registry is enabled

Step 4: Wait for Kubernetes System Pods to Reach Running State

At this point we need to wait for the Kubernetes system pods to reach the running state. This may take a few minutes.

Check that the installation was successful by confirming that STATUS is Running for all pods. Pods will cycle through the ContainerCreating, Pending, and Waiting states, but all should eventually reach the Running state. If some pods have not reached the Running state after a few minutes, refer to application cluster troubleshooting tips for more help.

Troubleshooting Tip: If you see Pending or ContainerCreating after waiting more than a few minutes, you may need to modify your environment variables with respect to proxy settings and restart MicroK8s. Do this by running microk8s stop, modifying the environment variables in your shell, and then running microk8s start. Then check the status of pods by running this command again.


microk8s kubectl get pods --all-namespaces

Expected Output

NAMESPACE            NAME                                         READY   STATUS    RESTARTS   AGE
kube-system          calico-node-mhvlc                            1/1     Running   0          4m28s
kube-system          metrics-server-8bbfb4bdb-pl6g7               1/1     Running   0          3m1s
kube-system          calico-kube-controllers-f7868dd95-mkjjk      1/1     Running   0          4m30s
kube-system          dashboard-metrics-scraper-78d7698477-pgpkj   1/1     Running   0          86s
kube-system          coredns-7f9c69c78c-8vjr4                     1/1     Running   0          86s
ingress              nginx-ingress-microk8s-controller-rjcpr      1/1     Running   0          86s
kube-system          kubernetes-dashboard-85fd7f45cb-h82gk        1/1     Running   0          86s
kube-system          hostpath-provisioner-5c65fbdb4f-42pdn        1/1     Running   0          86s
container-registry   registry-9b57d9df8-vtmsj                     1/1     Running   0          86s
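
Rather than re-reading the listing by eye, you may want to filter it. The sketch below is an assumption-laden convenience, not part of the sample: the helper name `pods_not_running` is made up, and it relies on STATUS being the fourth column, which is the layout shown above for `--all-namespaces`.

```shell
# Hypothetical helper: read `kubectl get pods --all-namespaces` output on stdin
# and print "namespace/name status" for every pod whose STATUS is not Running.
pods_not_running() {
  awk 'NR > 1 && $4 != "Running" { print $1 "/" $2, $4 }'
}

# Self-contained demo on sample text (pod names are made up).
pods_not_running <<'EOF'
NAMESPACE     NAME              READY   STATUS              RESTARTS   AGE
kube-system   coredns-abc       1/1     Running             0          86s
kube-system   calico-node-xyz   0/1     ContainerCreating   0          20s
EOF
```

Against a live cluster the same filter can be fed from `microk8s kubectl get pods --all-namespaces | pods_not_running`; an empty result means every listed pod reports Running.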

Step 5: Set Up Proxy Server DNS

Note: This step is required only if you are running behind a proxy. Skip it otherwise.

Use the following steps to set up the MicroK8s DNS service correctly.

1. Identify host network’s configured DNS servers

systemd-resolve --status | grep "Current DNS" --after-context=3
Expected Output
Current DNS Server:
       DNS Servers: <ip1>

2. Disable MicroK8s DNS

microk8s disable dns
Expected Output
Disabling DNS
Reconfiguring kubelet
Removing DNS manifest
serviceaccount "coredns" deleted
configmap "coredns" deleted
deployment.apps "coredns" deleted
service "kube-dns" deleted
clusterrole.rbac.authorization.k8s.io "coredns" deleted
clusterrolebinding.rbac.authorization.k8s.io "coredns" deleted
DNS is disabled

3. Enable DNS with Host DNS Server

microk8s enable dns:<ip1>,<ip2>,<ip3>
Expected Output
Enabling DNS
Applying manifest
serviceaccount/coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
clusterrole.rbac.authorization.k8s.io/coredns created
clusterrolebinding.rbac.authorization.k8s.io/coredns created
Restarting kubelet
DNS is enabled

4. Confirm Update

sh -c "until microk8s.kubectl rollout status deployments/coredns -n kube-system -w; do sleep 5; done"
Expected Output
deployment "coredns" successfully rolled out

Adding Initial Nodes to the Cluster

Note: This step is only required if you have 2 or more nodes, skip otherwise.

Step 1: Select Leader and Add Nodes

Choose one of your nodes as the leader node.

For each additional node that will be in the cluster, issue the following command on the leader node. You will need to do this once for each node you want to add.


microk8s add-node

Expected Output

You should see output as follows, including the IP address of the primary/controller host and unique token for the node you are adding to use during connection to the cluster.

From the node you wish to join to this cluster, run the following:
microk8s join <leader-ip>:25000/02c66e66e811fe2c697b1cd5d31bfba2/023e49528889

If the node you are adding is not reachable through the default interface you can use one of the following:
 microk8s join <leader-ip>:25000/02c66e66e811fe2c697b1cd5d31bfba2/023e49528889
 microk8s join

Step 2: Join Nodes to Cluster

Run the join command shown in the output above on each worker node to be added.


microk8s join <leader-ip>:25000/02c66e66e811fe2c697b1cd5d31bfba2/023e49528889

Expected Output

Contacting cluster at <leader-ip>
Waiting for this node to finish joining the cluster. ..

If you encounter the error "Connection failed. Invalid token (500)", your token may have expired or you may have already used it for another node. To resolve this, run the add-node command on the leader node again to get a new token.

Step 3: Confirm Cluster Nodes

To confirm what nodes are running in your cluster, run:


microk8s kubectl get no

Expected Output

vcplab003    Ready    <none>   3d    v1.21.5-3+83e2bb7ee39726
vcplab002    Ready    <none>   84s   v1.21.5-3+83e2bb7ee39726

Building and Deploying Services to the Cluster

Follow the steps below to build and deploy the Pipeline Server, HAProxy and MQTT services to the cluster.

Step 1: Deploy MQTT

This enables listening to pipeline metadata through the MQTT broker.



Expected Output

MQTT instance is up and running

Step 2: Build and Deploy Pipeline Server(s)

Update Configuration with Number of Replicas

Update the number of replicas in the Pipeline Server deployment configuration pipeline-server-worker/pipeline-server.yaml#L8 to match the number of nodes in the cluster.
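
The edit can also be scripted. The following is a minimal sketch only: the helper name `set_replicas` is hypothetical, and it assumes the manifest contains a single line of the form `replicas: <number>`, which you should confirm against the actual file before using it.

```shell
# Hypothetical helper: set the `replicas:` value in a deployment manifest.
# Assumes the manifest has a single line of the form "  replicas: <number>".
set_replicas() {
  local file="$1" count="$2"
  sed -E "s/^([[:space:]]*replicas:[[:space:]]*)[0-9]+/\1${count}/" "$file" > "${file}.tmp" &&
    mv "${file}.tmp" "$file"
}

# Demo on a throwaway manifest fragment (not the real pipeline-server.yaml).
demo="$(mktemp)"
printf 'spec:\n  replicas: 1\n  selector: {}\n' > "$demo"
set_replicas "$demo" 2
cat "$demo"
```

On a real cluster, one option is `set_replicas pipeline-server-worker/pipeline-server.yaml "$(microk8s kubectl get no --no-headers | wc -l)"`, which uses the current node count as the replica count.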

Build and Deploy

This command adds host system proxy settings to pipeline-server-worker/pipeline-server.yaml and deploys it.

The following command uses the prebuilt Docker image intel/dlstreamer-pipeline-server:0.7.1. To use a local image instead, run BASE_IMAGE=dlstreamer-pipeline-server-gstreamer:latest ./pipeline-server-worker/

Expected Output
All Pipeline Server instances are up and running

Check Status

microk8s kubectl get pods
Expected Output
NAME                                          READY   STATUS
mqtt-deployment-7d85664dc7-f976h              1/1     Running
pipeline-server-deployment-7479f5d494-2wkkk   1/1     Running

Step 3: Build and Deploy HAProxy

This will enable load balancing of Pipeline Server REST Requests through the cluster on port 31000.

Note: Pipeline Server pod(s) must be up and running before building and deploying HAProxy.

Build and Deploy

Expected Output
HAProxy Service started

Check Status

Check status of all pods

microk8s kubectl get pods
Expected Output
NAME                                          READY   STATUS
mqtt-deployment-7d85664dc7-f976h              1/1     Running
pipeline-server-deployment-7479f5d494-2wkkk   1/1     Running
haproxy-deployment-7d79cf66f5-4d92n           1/1     Running

Adding Nodes to an Existing Deployment

Step 1: Prepare New Nodes

To add nodes to an existing deployment, first follow the steps outlined in Installing MicroK8s and Adding Initial Nodes to the Cluster for each node to be added.

Step 2: Update Pipeline Server Configuration with Number of Replicas

Update the number of replicas in the Pipeline Server deployment configuration pipeline-server-worker/pipeline-server.yaml#L8 to match the number of nodes in the cluster.

Step 3: Redeploy Pipeline Server

On the node selected as the leader, redeploy the Pipeline Server instances.



Expected Output

All Pipeline Server instances are up and running

Step 4: Check Status


microk8s kubectl get pods | grep 'pipeline-server'

Expected Output

pipeline-server-deployment-7479f5d494-2wkkk   1/1     Running
pipeline-server-deployment-7479f5d494-2knop   1/1     Running

Step 5: Rebuild and Redeploy HAProxy

This will add the new Pipeline Server pod(s) to the HAProxy config.

Note: Pipeline Server pod(s) must be up and running before building and deploying HAProxy.



Expected Output

HAProxy Service started

Step 6: Check Status


microk8s kubectl get pods

Expected Output

NAME                                          READY   STATUS
pipeline-server-deployment-7479f5d494-2wkkk   1/1     Running
pipeline-server-deployment-7479f5d494-2knop   1/1     Running
haproxy-deployment-7d79cf66f5-4d92n           1/1     Running
mqtt-deployment-7d85664dc7-f976h              1/1     Running

Sending Pipeline Server Requests to the Cluster

Once pods have been deployed, clients can send pipeline server requests to the cluster via the leader node. The HAProxy service is responsible for load balancing pipeline server requests across the cluster using a round-robin algorithm.

When pipeline servers are deployed, they can also be configured to stop taking new requests based on a MAX_RUNNING_PIPELINES setting and/or a TARGET_FPS setting.

Pipeline servers that reach the configured MAX_RUNNING_PIPELINES or have a pipeline instance running with an FPS below the TARGET_FPS become unavailable for new requests.

Once all the pipeline servers in the cluster become unavailable, clients receive a 503 Service Unavailable error from the load balancer. Both MAX_RUNNING_PIPELINES and TARGET_FPS are set in pipeline-server-worker/pipeline-server.yaml.
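
To make the interaction of these rules concrete, here is a toy model in shell. It is a sketch of the behaviour described above, not the HAProxy or Pipeline Server implementation: the function names, the `running:min_fps` encoding of each server, and the way availability is checked are all assumptions made for illustration.

```shell
# Toy model only. A server is available when it runs fewer pipelines than
# MAX_RUNNING_PIPELINES and none of its pipelines is below TARGET_FPS.
is_available() {
  local running="$1" min_fps="$2" max_running="$3" target_fps="$4"
  [ "$running" -ge "$max_running" ] && return 1                         # at capacity
  [ "$running" -gt 0 ] && [ "$min_fps" -lt "$target_fps" ] && return 1  # a stream is too slow
  return 0
}

# Pick the next available server in round-robin order.
# Args: index of the last server used, max_running, target_fps, then one
# "running:min_fps" pair per server. Prints the chosen index, or 503 if none.
next_server() {
  local last="$1" max_running="$2" target_fps="$3"
  shift 3
  local i=0 first="" pair
  for pair in "$@"; do
    if is_available "${pair%%:*}" "${pair##*:}" "$max_running" "$target_fps"; then
      # Prefer the first available server after the last one used...
      if [ "$i" -gt "$last" ]; then echo "$i"; return 0; fi
      # ...and remember the earliest one in case the search wraps around.
      [ -z "$first" ] && first="$i"
    fi
    i=$((i + 1))
  done
  if [ -n "$first" ]; then echo "$first"; return 0; fi
  echo 503
  return 1
}

# Three servers, limit of 2 pipelines each, target of 30 fps.
next_server 0 2 30 1:45 2:50 0:0      # server 1 is full, so server 2 is chosen
next_server 0 2 30 2:50 1:20 || true  # one full, one too slow: prints 503
```

In this model a request is only rejected with 503 once every server fails the availability check, which mirrors the behaviour described in the paragraph above.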

Step 1: Start Pipelines on the Cluster

As an example, the following curl request starts processing the homes_00425.mkv media file with the object_detection/person_vehicle_bike pipeline. This command can be issued multiple times to start multiple concurrent pipelines on the cluster.

In the command below, replace <leader-ip> in both places with the host IP address of the leader node.


 curl <leader-ip>:31000/pipelines/object_detection/person_vehicle_bike -X POST -H \
  'Content-Type: application/json' -d \
  '{
    "source": {
        "uri": "",
        "type": "uri"
    },
    "destination": {
        "metadata": {
            "type": "mqtt",
            "host": "<leader-ip>:31020",
            "topic": "inference-results"
        }
    }
  }'
Expected Output


Step 2: View Pipeline Results via MQTT


docker run -it  --entrypoint mosquitto_sub  eclipse-mosquitto:1.6 --topic inference-results -p 31020 -h <leader-ip>

Expected Output



Uninstalling

Step 1: Undeploy Pipeline Server, HAProxy and MQTT services

Remove Pipeline Server deployment

microk8s kubectl delete -f pipeline-server-worker/pipeline-server.yaml

Remove HAProxy deployment

microk8s kubectl delete -f haproxy/haproxy.yaml

Remove MQTT deployment

microk8s kubectl delete -f mqtt/mqtt.yaml

Step 2: Remove Node

Confirm running nodes

To confirm what nodes are running in your cluster, run:

microk8s kubectl get no
Expected Output
<node-name>    Ready      <none>   96d   v1.21.9-3+5bfa682137fad9

Drain Node

To drain the node, run the command below on the worker node you want to remove.

microk8s kubectl drain <node-name>
Expected Output
node/<node-name> drained

Leave Cluster

To leave the cluster, run the command below on the worker node you want to remove.

microk8s leave

Remove Node

Run the command below on the leader node.

microk8s remove-node <node-name/ip>

Step 3: Uninstall MicroK8s



Expected Output

Remove/Purge microk8s
microk8s removed


Examples

These examples will show the following with a target of 30fps per stream:

  • Running a single stream on a single node and exceeding target fps indicating a stream density of at least 1.
  • Running two streams on a single node and seeing both of them processing below target fps showing a stream density of 2 cannot be met.
  • Adding a second node to cluster and seeing two streams exceeding target fps, thus doubling stream density to 2.

The examples require vaclient so the container dlstreamer-pipeline-server-gstreamer must be built as per these instructions.

Single node with MQTT

Start stream as follows

vaclient/ run object_detection/person_vehicle_bike --server-address http://<leader-ip>:31000 --destination type mqtt --destination host <leader-ip>:31020 --destination topic person-vehicle-bike

Output should be like this (with different instance id and timestamps)

Starting pipeline object_detection/person_vehicle_bike, instance = e6846cce838311ecaf588a37d8d13e4f
Pipeline running - instance_id = e6846cce838311ecaf588a37d8d13e4f
Timestamp 1533000000
- vehicle (1.00) [0.39, 0.13, 0.89, 1.00]
- vehicle (0.99) [0.41, 0.01, 0.63, 0.17]
Timestamp 1567000000
- vehicle (1.00) [0.39, 0.13, 0.88, 1.00]
- vehicle (0.98) [0.41, 0.01, 0.63, 0.17]
Timestamp 1600000000
- vehicle (1.00) [0.39, 0.13, 0.88, 0.99]
- vehicle (0.98) [0.41, 0.01, 0.63, 0.17]

Now stop stream using CTRL+C

Stopping Pipeline...
Pipeline stopped
- vehicle (0.99) [0.39, 0.13, 0.89, 1.00]
- vehicle (0.99) [0.42, 0.00, 0.63, 0.17]
avg_fps: 52.32

Single Node with Two Streams

For two streams, we won't use MQTT but will measure fps to see if both streams can be processed at 30fps (i.e. can we attain a stream density of 2). Note the use of model-instance-id so pipelines can share resources.

vaclient/ run object_detection/person_vehicle_bike --server-address http://<leader-ip>:31000 --parameter detection-model-instance-id person-vehicle-bike-cpu --number-of-streams 2
Starting pipeline 1
Starting pipeline object_detection/person_vehicle_bike, instance = 646559b0860811ec839b1c697aaaa6b4
Pipeline 1 running - instance_id = 646559b0860811ec839b1c697aaaa6b4
Starting pipeline 2
Starting pipeline object_detection/person_vehicle_bike, instance = 65030b7e860811ec839b1c697aaaa6b4
Pipeline 2 running - instance_id = 65030b7e860811ec839b1c697aaaa6b4
2 pipelines running.
Pipeline status @ 7s
- instance=646559b0860811ec839b1c697aaaa6b4, state=RUNNING, 30fps
- instance=65030b7e860811ec839b1c697aaaa6b4, state=RUNNING, 26fps
Pipeline status @ 12s
- instance=646559b0860811ec839b1c697aaaa6b4, state=RUNNING, 29fps
- instance=65030b7e860811ec839b1c697aaaa6b4, state=RUNNING, 26fps
Pipeline status @ 17s
- instance=646559b0860811ec839b1c697aaaa6b4, state=RUNNING, 28fps
- instance=65030b7e860811ec839b1c697aaaa6b4, state=RUNNING, 27fps
Pipeline status @ 22s
- instance=646559b0860811ec839b1c697aaaa6b4, state=RUNNING, 28fps
- instance=65030b7e860811ec839b1c697aaaa6b4, state=RUNNING, 27fps
Pipeline status @ 27s
- instance=646559b0860811ec839b1c697aaaa6b4, state=RUNNING, 28fps
- instance=65030b7e860811ec839b1c697aaaa6b4, state=RUNNING, 27fps

Results show that we can't quite get to a stream density of 2.

Use CTRL+C to stop streams.

Stopping Pipeline...
Pipeline stopped
Stopping Pipeline...
Pipeline stopped
Pipeline status @ 26s
- instance=8db81ca8860d11ecb68672a0c3d9157b, state=ABORTED, 28fps
- instance=8ea33c42860d11ecb68672a0c3d9157b, state=ABORTED, 27fps
avg_fps: 26.78

Note: The avg_fps metric is determined by the last instance in the list; it is not the average across all instances.
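
Because avg_fps reflects only one instance, it can help to summarize a status block yourself. The sketch below is illustrative: the helper name `stream_summary` is hypothetical, and it assumes status lines in the `- instance=..., state=..., <n>fps` form shown above. It reports how many streams meet the target and the mean across all of them.

```shell
# Hypothetical helper: read status lines on stdin and print how many streams
# are at or above the target fps, plus the mean fps across all streams.
stream_summary() {
  awk -v target="$1" '
    /^- instance=/ {
      fps = $NF; sub(/fps$/, "", fps)      # "27fps" -> "27"
      n++; total += fps
      if (fps + 0 >= target + 0) ok++
    }
    END { if (n) printf "streams=%d at_target=%d mean_fps=%.1f\n", n, ok, total / n }
  '
}

# Last status block from the single-node run above, with a 30 fps target.
stream_summary 30 <<'EOF'
- instance=646559b0860811ec839b1c697aaaa6b4, state=RUNNING, 28fps
- instance=65030b7e860811ec839b1c697aaaa6b4, state=RUNNING, 27fps
EOF
```

For the block above, neither stream reaches the 30 fps target, which matches the conclusion that a stream density of 2 is not met on a single node.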

Two Streams on Two Nodes

We'll add a second node to see if we can get a stream density of 2.

First add a second node as per Adding Nodes to Existing Deployment.

Now we run two streams and monitor fps using the same request as before. This time the work should be shared across the two nodes so we anticipate a higher fps for both streams.

vaclient/ run object_detection/person_vehicle_bike --server-address http://<leader-ip>:31000 --parameter detection-model-instance-id cpu --number-of-streams 2 
Starting pipeline 1
Starting pipeline object_detection/person_vehicle_bike, instance = 1ddd102e861111ecb68672a0c3d9157b
Pipeline 1 running - instance_id = 1ddd102e861111ecb68672a0c3d9157b
Starting pipeline 2
Starting pipeline object_detection/person_vehicle_bike, instance = 0fd59b54861111ecbc0856b37602a80f
Pipeline 2 running - instance_id = 0fd59b54861111ecbc0856b37602a80f
2 pipelines running.
Pipeline status @ 7s
- instance=1ddd102e861111ecb68672a0c3d9157b, state=RUNNING, 54fps
- instance=0fd59b54861111ecbc0856b37602a80f, state=RUNNING, 53fps
Pipeline status @ 12s
- instance=1ddd102e861111ecb68672a0c3d9157b, state=RUNNING, 53fps
- instance=0fd59b54861111ecbc0856b37602a80f, state=RUNNING, 53fps
Pipeline status @ 17s
- instance=1ddd102e861111ecb68672a0c3d9157b, state=RUNNING, 53fps
- instance=0fd59b54861111ecbc0856b37602a80f, state=RUNNING, 53fps
Stopping Pipeline...
Pipeline stopped
Stopping Pipeline...
Pipeline stopped
Pipeline status @ 18s
- instance=1ddd102e861111ecb68672a0c3d9157b, state=ABORTED, 53fps
- instance=0fd59b54861111ecbc0856b37602a80f, state=ABORTED, 53fps
avg_fps: 53.27

See that both streams are over 30fps so a stream density of 2 has been achieved.

Useful Commands

# Check running nodes
microk8s kubectl get no

# Check running nodes with detailed information
microk8s kubectl get nodes -o wide

# Check running nodes information in yaml format
microk8s kubectl get nodes -o yaml

# Describe all nodes and details
microk8s kubectl describe nodes

# Describe specific node
microk8s kubectl describe nodes <node-name>

# Get nodes with details of pods running on them
microk8s kubectl get po -A -o wide | awk '{print $6,"\t",$4,"\t",$8,"\t",$2}'

# Delete a pod; Kubernetes may automatically start a new one based on the replica count
microk8s kubectl delete pod <pod-name>

# Add Service to kubernetes cluster using yaml file
microk8s kubectl apply -f <file.yaml>

# Delete an existing service from cluster
microk8s kubectl delete -f <file.yaml>

# Delete Pipeline Server from cluster
microk8s kubectl delete -f pipeline-server-worker/pipeline-server.yaml

# Delete HAProxy from cluster
microk8s kubectl delete -f haproxy/haproxy.yaml

# Get pods from all namespaces
microk8s kubectl get pods --all-namespaces

# Get pods from default namespace
microk8s kubectl get pods

# Get and follow logs of pod
microk8s kubectl logs -f <pod-name>

# Exec into pod
microk8s kubectl exec -it <pod-name> -- /bin/bash

# Restart a deployment
microk8s kubectl rollout restart deployment <service-name>-deployment

# Restart All Pipeline Server deployments
microk8s kubectl rollout restart deploy pipeline-server-deployment

# Restart HAProxy Service
microk8s kubectl rollout restart deploy haproxy-deployment

# Start microk8s
microk8s start

# Stop microk8s
microk8s stop

# Uninstall microk8s
microk8s reset
sudo snap remove --purge microk8s

# Remove a node from the cluster (run on the node that needs to be removed)
microk8s leave


Limitations

  • Every time a new Intel® DL Streamer Pipeline Server pod is added or an existing pod is restarted, HAProxy needs to be reconfigured and redeployed, as described in Rebuild and Redeploy HAProxy.

  • It is not yet possible to query the full set of pipeline statuses across all Pipeline Server pods. This means GET <leader-ip>:31000/pipelines/status may not return a complete list.