This page documents the non-standard Kubernetes (K8s) functionality that can be used with virtual node pods.

Pod Annotation | Short Summary | Doc Link |
---|---|---|
microsoft.containerinstance.virtualnode.ccepolicy | Run in Confidential ACI with provided policy | Confidential Containers |
microsoft.containerinstance.virtualnode.subnets.primary | Run within a specific Subnet | Subnet Override |
microsoft.containerinstance.virtualnode.identity | Run using a provided Azure Identity | Managed Identity |
microsoft.containerinstance.virtualnode.injectkubeproxy | Controlling Kube-Proxy Usage | Kube-Proxy |
microsoft.containerinstance.virtualnode.injectdns | Controlling K8s DNS Usage | K8s DNS |
microsoft.containerinstance.virtualnode.zones | Requesting Azure Zone Deployment | Zones |
microsoft.containerinstance.virtualnode.imagecachepod | Image caching request for Standby Pools | Image Caching |

virtual node Downlevel API | Short Summary | Doc Link |
---|---|---|
===VIRTUALNODE2.CC.THIM.ENDPOINT=== | Replaced with THIM Endpoint | THIM Downlevel APIs |
===VIRTUALNODE2.CC.THIM.ADDRESS=== | Replaced with THIM Address | THIM Downlevel APIs |

The general method for controlling non-K8s behavior of virtual nodes at the pod level is pod annotations.
GENERAL NOTE: All of the annotations below must be applied to the part of the K8s resource that ends up on the pods themselves. For a Pod YAML file, this is the top-level metadata of the file itself; for a Deployment, ReplicaSet, StatefulSet, or similar YAML, the annotation goes in the template's metadata.
Example of annotations for Pod YAML (it's in the main metadata!)
apiVersion: v1
kind: Pod
metadata:
  annotations:
    microsoft.containerinstance.virtualnode.injectdns: "false"
  name: demo-pod
spec:
  containers:
  - command:
    - /bin/bash
    - -c
    - 'counter=1; while true; do echo "Hello, World! Counter: $counter"; counter=$((counter+1)); sleep 1; done'
    image: mcr.microsoft.com/azure-cli
    name: hello-world-counter
    resources:
      limits:
        cpu: 2250m
        memory: 2256Mi
      requests:
        cpu: 100m
        memory: 128Mi
  nodeSelector:
    virtualization: virtualnode2
  tolerations:
  - effect: NoSchedule
    key: virtual-kubelet.io/provider
    operator: Exists
Example of annotations for Deployment YAML (it's in the template metadata!)
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
  labels:
    type: scaletest
  name: deploy-alpine
spec:
  replicas: 3
  selector:
    matchLabels:
      type: scaletest
  template:
    metadata:
      annotations:
        microsoft.containerinstance.virtualnode.injectkubeproxy: 'false'
      labels:
        type: scaletest
    spec:
      containers:
      - image: mcr.microsoft.com/oss/nginx/nginx:1.17.3-alpine
        name: mypod
        resources:
          limits:
            cpu: 2250m
            memory: 2256Mi
          requests:
            cpu: 100m
            memory: 128Mi
      nodeSelector:
        virtualization: virtualnode2
      tolerations:
      - effect: NoSchedule
        key: virtual-kubelet.io/provider
        operator: Exists
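The same rule applies to workload kinds that nest the pod template more deeply. The following is a minimal sketch, assuming a standard batch/v1 CronJob, which this page does not otherwise cover; the name demo-cronjob, the schedule, and the command are hypothetical values chosen for illustration. The point is only that the annotation sits in the metadata of the innermost pod template, not on the CronJob or the Job template.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: demo-cronjob                 # hypothetical name
spec:
  schedule: "*/5 * * * *"            # hypothetical schedule
  jobTemplate:
    spec:
      template:
        metadata:
          annotations:
            # Annotation goes on the pod template so it lands on the pods themselves
            microsoft.containerinstance.virtualnode.injectdns: "false"
        spec:
          restartPolicy: Never       # Jobs require Never or OnFailure
          containers:
          - name: hello
            image: mcr.microsoft.com/azure-cli
            command: ["/bin/bash", "-c", "echo Hello from a scheduled pod"]
            resources:
              limits:
                cpu: 2250m
                memory: 2256Mi
              requests:
                cpu: 100m
                memory: 128Mi
          nodeSelector:
            virtualization: virtualnode2
          tolerations:
          - effect: NoSchedule
            key: virtual-kubelet.io/provider
            operator: Exists
```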
Confidential containers are a high-security offering from ACI that gives customers a high degree of confidence in what they are running and what that image is allowed to do.
Overview of Confidential Containers on ACI
To have virtual node create your containers as Confidential, add a pod annotation containing the CCE policy that the pod will run under:
microsoft.containerinstance.virtualnode.ccepolicy
To generate that policy, use the confcom extension for the Azure CLI. To add it, run:
az extension add -n confcom
To use the tool with virtual nodes, provide your YAML file with the --virtual-node-yaml parameter:
az confcom acipolicygen --virtual-node-yaml <yourYamlFile>.yaml
This not only generates the CCE policy but also injects the policy annotation into the right section of the file.
Example Confidential YAML
apiVersion: v1
kind: Pod
metadata:
  annotations:
    microsoft.containerinstance.virtualnode.ccepolicy: package policy

import future.keywords.every
import future.keywords.in

api_version := "0.10.0"
framework_version := "0.2.3"

fragments := [
  {
    "feed": "mcr.microsoft.com/aci/aci-cc-infra-fragment",
    "includes": [
      "containers",
      "fragments"
    ],
    "issuer": "did:x509:0:sha256:I__iuL25oXEVFdTP_aBLx_eT1RPHbCQ_ECBQfYZpt9s::eku:1.3.6.1.4.1.311.76.59.1.3",
    "minimum_svn": "1"
  }
]

containers := [{"allow_elevated":false,"allow_stdio_access":true,"capabilities":{"ambient":[],"bounding":["CAP_AUDIT_WRITE","CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FOWNER","CAP_FSETID","CAP_KILL","CAP_MKNOD","CAP_NET_BIND_SERVICE","CAP_NET_RAW","CAP_SETFCAP","CAP_SETGID","CAP_SETPCAP","CAP_SETUID","CAP_SYS_CHROOT"],"effective":["CAP_AUDIT_WRITE","CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FOWNER","CAP_FSETID","CAP_KILL","CAP_MKNOD","CAP_NET_BIND_SERVICE","CAP_NET_RAW","CAP_SETFCAP","CAP_SETGID","CAP_SETPCAP","CAP_SETUID","CAP_SYS_CHROOT"],"inheritable":[],"permitted":["CAP_AUDIT_WRITE","CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FOWNER","CAP_FSETID","CAP_KILL","CAP_MKNOD","CAP_NET_BIND_SERVICE","CAP_NET_RAW","CAP_SETFCAP","CAP_SETGID","CAP_SETPCAP","CAP_SETUID","CAP_SYS_CHROOT"]},"command":["nginx","-g","daemon off;"],"env_rules":[{"pattern":"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin","required":false,"strategy":"string"},{"pattern":"NGINX_VERSION=1.17.3","required":false,"strategy":"string"},{"pattern":"NJS_VERSION=0.3.5","required":false,"strategy":"string"},{"pattern":"PKG_RELEASE=1","required":false,"strategy":"string"},{"pattern":"TERM=xterm","required":false,"strategy":"string"},{"pattern":"(?i)(FABRIC)_.+=.+","required":false,"strategy":"re2"},{"pattern":"HOSTNAME=.+","required":false,"strategy":"re2"},{"pattern":"T(E)?MP=.+","required":false,"strategy":"re2"},{"pattern":"FabricPackageFileName=.+","required":false,"strategy":"re2"},{"pattern":"HostedServiceName=.+","required":false,"strategy":"re2"},{"pattern":"IDENTITY_API_VERSION=.+","required":false,"strategy":"re2"},{"pattern":"IDENTITY_HEADER=.+","required":false,"strategy":"re2"},{"pattern":"IDENTITY_SERVER_THUMBPRINT=.+","required":false,"strategy":"re2"},{"pattern":"azurecontainerinstance_restarted_by=.+","required":false,"strategy":"re2"},{"pattern":"[A-Z0-9_]+_SERVICE_HOST=.+","required":false,"strategy":"re2"},{"pattern":"[A-Z0-9_]+_SERVICE_PORT=.+","required":false,"strategy":"re2"},{"pattern"
:"[A-Z0-9_]+_SERVICE_PORT_[A-Z0-9_]+=.+","required":false,"strategy":"re2"},{"pattern":"[A-Z0-9_]+_PORT=.+","required":false,"strategy":"re2"},{"pattern":"[A-Z0-9_]+_PORT_[0-9]+_TCP=.+","required":false,"strategy":"re2"},{"pattern":"[A-Z0-9_]+_PORT_[0-9]+_TCP_PROTO=.+","required":false,"strategy":"re2"},{"pattern":"[A-Z0-9_]+_PORT_[0-9]+_TCP_PORT=.+","required":false,"strategy":"re2"},{"pattern":"[A-Z0-9_]+_PORT_[0-9]+_TCP_ADDR=.+","required":false,"strategy":"re2"}],"exec_processes":[{"command":["/bin/sh"],"signals":[]},{"command":["/bin/bash"],"signals":[]}],"id":"mcr.microsoft.com/oss/nginx/nginx:1.17.3-alpine","layers":["7f062c5ebb3dc6d3df7f25fa687bbec0f61530536267ad6d6afa32501f5340a6","297dd26b51191f85928508fb368e6b064502c128be6f51fc5cb302d3b253d730"],"mounts":[{"destination":"/var/run/secrets/kubernetes.io/serviceaccount","options":["rbind","rshared","ro"],"source":"sandbox:///tmp/atlas/emptydir/.+","type":"bind"},{"destination":"/etc/hosts","options":["rbind","rshared","rw"],"source":"sandbox:///tmp/atlas/emptydir/.+","type":"bind"},{"destination":"/dev/termination-log","options":["rbind","rshared","rw"],"source":"sandbox:///tmp/atlas/emptydir/.+","type":"bind"},{"destination":"/etc/hostname","options":["rbind","rshared","rw"],"source":"sandbox:///tmp/atlas/emptydir/.+","type":"bind"},{"destination":"/etc/resolv.conf","options":["rbind","rshared","rw"],"source":"sandbox:///tmp/atlas/emptydir/.+","type":"bind"}],"name":"mypod","no_new_privileges":false,"seccomp_profile_sha256":"","signals":[15],"user":{"group_idnames":[{"pattern":"","strategy":"any"}],"umask":"0022","user_idname":{"pattern":"","strategy":"any"}},"working_dir":"/"},
{"allow_elevated":false,"allow_stdio_access":true,"capabilities":{"ambient":[],"bounding":["CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FSETID","CAP_FOWNER","CAP_MKNOD","CAP_NET_RAW","CAP_SETGID","CAP_SETUID","CAP_SETFCAP","CAP_SETPCAP","CAP_NET_BIND_SERVICE","CAP_SYS_CHROOT","CAP_KILL","CAP_AUDIT_WRITE"],"effective":["CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FSETID","CAP_FOWNER","CAP_MKNOD","CAP_NET_RAW","CAP_SETGID","CAP_SETUID","CAP_SETFCAP","CAP_SETPCAP","CAP_NET_BIND_SERVICE","CAP_SYS_CHROOT","CAP_KILL","CAP_AUDIT_WRITE"],"inheritable":[],"permitted":["CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FSETID","CAP_FOWNER","CAP_MKNOD","CAP_NET_RAW","CAP_SETGID","CAP_SETUID","CAP_SETFCAP","CAP_SETPCAP","CAP_NET_BIND_SERVICE","CAP_SYS_CHROOT","CAP_KILL","CAP_AUDIT_WRITE"]},"command":["/pause"],"env_rules":[{"pattern":"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin","required":true,"strategy":"string"},{"pattern":"TERM=xterm","required":false,"strategy":"string"}],"exec_processes":[],"layers":["16b514057a06ad665f92c02863aca074fd5976c755d26bff16365299169e8415"],"mounts":[],"no_new_privileges":false,"seccomp_profile_sha256":"","signals":[],"user":{"group_idnames":[{"pattern":"","strategy":"any"}],"umask":"0022","user_idname":{"pattern":"","strategy":"any"}},"working_dir":"/"}]

allow_properties_access := true
allow_dump_stacks := true
allow_runtime_logging := true
allow_environment_variable_dropping := true
allow_unencrypted_scratch := false
allow_capability_dropping := true

mount_device := data.framework.mount_device
unmount_device := data.framework.unmount_device
mount_overlay := data.framework.mount_overlay
unmount_overlay := data.framework.unmount_overlay
create_container := data.framework.create_container
exec_in_container := data.framework.exec_in_container
exec_external := data.framework.exec_external
shutdown_container := data.framework.shutdown_container
signal_container_process := data.framework.signal_container_process
plan9_mount := data.framework.plan9_mount
plan9_unmount := data.framework.plan9_unmount
get_properties := data.framework.get_properties
dump_stacks := data.framework.dump_stacks
runtime_logging := data.framework.runtime_logging
load_fragment := data.framework.load_fragment
scratch_mount := data.framework.scratch_mount
scratch_unmount := data.framework.scratch_unmount

reason := {"errors": data.framework.errors}
  name: confidential-alpine
spec:
  containers:
  - image: mcr.microsoft.com/oss/nginx/nginx:1.17.3-alpine
    name: mypod
    resources:
      limits:
        cpu: 2250m
        memory: 2256Mi
      requests:
        cpu: 100m
        memory: 128Mi
  nodeSelector:
    virtualization: virtualnode2
  tolerations:
  - effect: NoSchedule
    key: virtual-kubelet.io/provider
    operator: Exists
When managing K8s deployments where multiple containers need to be aligned with dynamic setups, some customers use variables in the YAML and have HELM replace them with real values at deployment time. This works well for more complex deployments, but Confidential Policies are designed to reject configurations that they do not recognize.
You can still use a process very similar to the policy generation above and keep the benefits of both HELM's dynamic chart capabilities and Confidential's security mechanisms.
Instead of deploying directly from HELM with something like helm install, use the helm template command, which outputs a static YAML file that no longer uses dynamic values. Running the policy generation command from the section above on that YAML updates it with the appropriate Confidential Policy, and you can then deploy it via kubectl. The static YAML generation and Confidential Policy generation steps must be re-run any time the HELM charts are updated in order to pick up those changes.
When testing or developing containers before the functionality is locked in, it is often useful to run with a very permissive policy. The most permissive policy is shown below. It provides effectively NO security guarantees: a container can run with any payload and debug execution is allowed, although it still runs inside the specialized confidential hardware with the attestation services running.
This should NOT be used for any production workloads, only as a tool for initial experimentation.
"microsoft.containerinstance.virtualnode.ccepolicy":"cGFja2FnZSBwb2xpY3kKCmFwaV9zdm4gOj0gIjAuMTAuMCIKCm1vdW50X2RldmljZSA6PSB7ImFsbG93ZWQiOiB0cnVlfQptb3VudF9vdmVybGF5IDo9IHsiYWxsb3dlZCI6IHRydWV9CmNyZWF0ZV9jb250YWluZXIgOj0geyJhbGxvd2VkIjogdHJ1ZSwgImVudl9saXN0IjogbnVsbCwgImFsbG93X3N0ZGlvX2FjY2VzcyI6IHRydWV9CnVubW91bnRfZGV2aWNlIDo9IHsiYWxsb3dlZCI6IHRydWV9IAp1bm1vdW50X292ZXJsYXkgOj0geyJhbGxvd2VkIjogdHJ1ZX0KZXhlY19pbl9jb250YWluZXIgOj0geyJhbGxvd2VkIjogdHJ1ZSwgImVudl9saXN0IjogbnVsbH0KZXhlY19leHRlcm5hbCA6PSB7ImFsbG93ZWQiOiB0cnVlLCAiZW52X2xpc3QiOiBudWxsLCAiYWxsb3dfc3RkaW9fYWNjZXNzIjogdHJ1ZX0Kc2h1dGRvd25fY29udGFpbmVyIDo9IHsiYWxsb3dlZCI6IHRydWV9CnNpZ25hbF9jb250YWluZXJfcHJvY2VzcyA6PSB7ImFsbG93ZWQiOiB0cnVlfQpwbGFuOV9tb3VudCA6PSB7ImFsbG93ZWQiOiB0cnVlfQpwbGFuOV91bm1vdW50IDo9IHsiYWxsb3dlZCI6IHRydWV9CmdldF9wcm9wZXJ0aWVzIDo9IHsiYWxsb3dlZCI6IHRydWV9CmR1bXBfc3RhY2tzIDo9IHsiYWxsb3dlZCI6IHRydWV9CnJ1bnRpbWVfbG9nZ2luZyA6PSB7ImFsbG93ZWQiOiB0cnVlfQpsb2FkX2ZyYWdtZW50IDo9IHsiYWxsb3dlZCI6IHRydWV9CnNjcmF0Y2hfbW91bnQgOj0geyJhbGxvd2VkIjogdHJ1ZX0Kc2NyYXRjaF91bm1vdW50IDo9IHsiYWxsb3dlZCI6IHRydWV9Cg=="
To slightly loosen the policy for a pod and allow certain debugging activities, such as an exec session that shells into the pod with sh or bash, generate the policy with the --debug-mode argument:
az confcom acipolicygen -k <yourYamlFile>.yaml --debug-mode
By default, virtual node pods will run in the subnet configured in the HELM chart as the default ACI subnet. However, some customers may want to run pods in their own isolated subnets (or in a subnet with only a specific set of other pods), and this can be achieved using the subnet override annotation.
microsoft.containerinstance.virtualnode.subnets.primary
Example: microsoft.containerinstance.virtualnode.subnets.primary: /subscriptions/000000-0000-0000-053ca49ab4b5/resourceGroups/definitely_a_fake_RG/providers/Microsoft.Network/virtualNetworks/the_VNET_For_This_Subnet/subnets/your_subnet_name
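As an illustration of where the subnet override sits in a full manifest, here is a minimal sketch of a Pod using it. The pod name subnet-override-demo is hypothetical, and the subnet resource ID is the placeholder from the example above, not a real resource; substitute the ID of a subnet you own.

```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    # Placeholder resource ID copied from the example above; replace with your own subnet
    microsoft.containerinstance.virtualnode.subnets.primary: /subscriptions/000000-0000-0000-053ca49ab4b5/resourceGroups/definitely_a_fake_RG/providers/Microsoft.Network/virtualNetworks/the_VNET_For_This_Subnet/subnets/your_subnet_name
  name: subnet-override-demo         # hypothetical name
spec:
  containers:
  - image: mcr.microsoft.com/oss/nginx/nginx:1.17.3-alpine
    name: mypod
    resources:
      limits:
        cpu: 2250m
        memory: 2256Mi
      requests:
        cpu: 100m
        memory: 128Mi
  nodeSelector:
    virtualization: virtualnode2
  tolerations:
  - effect: NoSchedule
    key: virtual-kubelet.io/provider
    operator: Exists
```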
For some Azure interactions it can be very convenient (and a good security practice) to utilize Azure Managed Identities to make the requests, rather than having your code deal with the unpleasantness of rotating credentials. virtual node can hook up to Azure Container Instances functionality for running containers with a Managed Identity via a pod annotation:
microsoft.containerinstance.virtualnode.identity
Example: microsoft.containerinstance.virtualnode.identity: /subscriptions/000000-0000-0000-053ca49ab4b5/resourceGroups/definitely_a_fake_RG/providers/Microsoft.ManagedIdentity/userAssignedIdentities/my_MI_name
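For context, a minimal sketch of a Pod that requests a Managed Identity could look like the following. The pod name identity-demo is hypothetical, and the identity resource ID is the placeholder from the example above; the sketch assumes the identity already exists and has whatever role assignments your workload needs.

```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    # Placeholder resource ID copied from the example above; replace with your own identity
    microsoft.containerinstance.virtualnode.identity: /subscriptions/000000-0000-0000-053ca49ab4b5/resourceGroups/definitely_a_fake_RG/providers/Microsoft.ManagedIdentity/userAssignedIdentities/my_MI_name
  name: identity-demo                # hypothetical name
spec:
  containers:
  - image: mcr.microsoft.com/azure-cli
    name: managed-identity-container
    command:
    - /bin/bash
    - -c
    - 'while true; do sleep 30; done'   # keeps the container running; replace with your workload
    resources:
      limits:
        cpu: 2250m
        memory: 2256Mi
      requests:
        cpu: 100m
        memory: 128Mi
  nodeSelector:
    virtualization: virtualnode2
  tolerations:
  - effect: NoSchedule
    key: virtual-kubelet.io/provider
    operator: Exists
```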
The Kube-Proxy is a standard K8s component that provides benefits like modifying local IP route tables for K8s internal network usage. However, if you do not require this functionality (or explicitly don't want it), the kube-proxy can be disabled for the virtual node pods via this annotation:
microsoft.containerinstance.virtualnode.injectkubeproxy: "false"
K8s includes the Kube-Proxy by default, so that is the behavior when the annotation is not provided.
Confidential containers do not support Kube-Proxy usage because it breaks some security guarantees, so regardless of the value provided for this annotation, a Confidential pod will ignore it and load without a Kube-Proxy.
By default, K8s pods are expected to use the K8s cluster's DNS. To avoid that interaction, add this annotation:
microsoft.containerinstance.virtualnode.injectdns: "false"
When the annotation is set to false, the pod uses ACI's default DNS instead of the K8s cluster's DNS.
Azure has a concept of Availability Zones, which are separated groups of datacenters that exist within the same region. If your scenario calls for it, you can specify a zone for your pod to be hosted on within your given region.
microsoft.containerinstance.virtualnode.zones: "<semicolon-delimited string of zones>"
NOTE: Today, ACI only supports a single zone in the request to allocate a sandbox for your pod. If you provide more than one, you should get an informative error saying that only one can be provided.
If a default zone is set in the node-level configuration, this pod annotation takes precedence over it when set. To have a particular pod use no zone when a node-level zone is set, set the pod-level annotation to an empty string.
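The two cases above can be sketched as follows: one pod pinned to a single zone, and one pod that opts out of a node-level default with an empty string. The pod names are hypothetical, and the zone value "1" is an assumption; use a zone that is actually available for ACI in your region.

```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    # Assumed zone value; only a single zone is supported today
    microsoft.containerinstance.virtualnode.zones: "1"
  name: zone-pinned-demo             # hypothetical name
spec:
  containers:
  - image: mcr.microsoft.com/oss/nginx/nginx:1.17.3-alpine
    name: mypod
  nodeSelector:
    virtualization: virtualnode2
  tolerations:
  - effect: NoSchedule
    key: virtual-kubelet.io/provider
    operator: Exists
---
apiVersion: v1
kind: Pod
metadata:
  annotations:
    # Empty string: use no zone even if a node-level default zone is configured
    microsoft.containerinstance.virtualnode.zones: ""
  name: zone-optout-demo             # hypothetical name
spec:
  containers:
  - image: mcr.microsoft.com/oss/nginx/nginx:1.17.3-alpine
    name: mypod
  nodeSelector:
    virtualization: virtualnode2
  tolerations:
  - effect: NoSchedule
    key: virtual-kubelet.io/provider
    operator: Exists
```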
virtual node has a couple of downlevel APIs, which do not behave quite like the K8s Downward API. They work as follows: if the VALUE of an environment variable on a pod is exactly equal to one of the virtual node Downlevel APIs, it is replaced server-side with the appropriate "real" value.
THIM (Trusted Hardware Identity Management) is part of the attestation service used for Confidential ACI. In order to avoid hardcoding the address to interact with the attestation service, customers can instead set an environment variable to either of the below and then use the value of that in their container to access THIM:
===VIRTUALNODE2.CC.THIM.ENDPOINT===, which will be replaced by something like http://169.254.128.1:2377/metadata/THIM/amd/certification
===VIRTUALNODE2.CC.THIM.ADDRESS===, which will be replaced by something like 169.254.128.1:2377
Example Pod YAML using the THIM Downlevel APIs:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    microsoft.containerinstance.virtualnode.injectkubeproxy: 'false'
  name: thim-downlevel
spec:
  containers:
  - command:
    - /bin/bash
    - -c
    - 'counter=1; while true; do echo "Hello, World! Counter: $counter"; counter=$((counter+1)); sleep 1; done'
    image: mcr.microsoft.com/azure-cli
    name: managed-identity-container
    env:
    - name: THIM_ENDPOINT
      value: ===VIRTUALNODE2.CC.THIM.ENDPOINT===
    - name: whateverNameYouWant
      value: ===VIRTUALNODE2.CC.THIM.ADDRESS===
    resources:
      limits:
        cpu: 2250m
        memory: 2256Mi
      requests:
        cpu: 100m
        memory: 128Mi
  nodeSelector:
    type: virtual-kubelet
    virtualization: virtualnode2
  tolerations:
  - effect: NoSchedule
    key: virtual-kubelet.io/provider
    operator: Exists
Assuming you are running a Confidential pod with an image that includes curl, you could then run something like this to get the THIM attestation:
curl "$THIM_ENDPOINT" -H "Metadata: true"