Describe the bug
If we delete a pod that used whereabouts to allocate an additional IP from a defined pool of addresses while the whereabouts pod was not running on that node, the IP address stays stuck until we delete it manually from several custom resources.
Expected behavior
On startup, the whereabouts garbage collector should check whether all IPs allocated in its custom resources are still attached to pods that exist, and release the ones that are not.
To Reproduce
Steps to reproduce the behavior:
Create the following net-attach-def; the range contains only 1 usable IP address so the issue can be reproduced as quickly as possible (a reconstructed example is shown below):
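The original YAML is not preserved in this copy of the report, so the following is only a sketch reconstructed from the events and error output below; the macvlan type, master interface, CNI version and the busybox pod are assumptions, while the name super-net, the namespace, the range 10.10.3.0/24 and the single address 10.10.3.30 come from the report itself:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: super-net
  namespace: default
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "eth0",
      "mode": "bridge",
      "ipam": {
        "type": "whereabouts",
        "range": "10.10.3.0/24",
        "range_start": "10.10.3.30",
        "range_end": "10.10.3.30"
      }
    }
---
# A pod that requests an additional interface from this network.
apiVersion: v1
kind: Pod
metadata:
  name: super-pod
  namespace: default
  annotations:
    k8s.v1.cni.cncf.io/networks: super-net
spec:
  containers:
    - name: main
      image: busybox          # assumed; any long-running container works
      command: ["sleep", "infinity"]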
Create the pod and check its events; we see that the additional IP address has been added:
Normal  Scheduled       5s  default-scheduler  Successfully assigned default/super-pod to master1-1
Normal  AddedInterface  3s  multus             Add eth0 [10.10.20.249/32] from k8s-pod-network
Normal  AddedInterface  3s  multus             Add net1 [10.10.3.30/24] from default/super-net
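The allocation is also recorded in the whereabouts custom resources, which can be confirmed with something like the following (these are the same resources that are cleaned up manually later in this report):

kubectl -n kube-system get overlappingrangeipreservations
kubectl -n kube-system get ippools.whereabouts.cni.cncf.io -o yaml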
Scale the whereabouts daemonset down to zero pods (here, by patching it with a nodeSelector that matches no node) and wait until its pods are gone:

kubectl -n kube-system patch daemonset whereabouts -p '{"spec": {"template": {"spec": {"nodeSelector": {"non-existing": "true"}}}}}'

Delete the pod while whereabouts is not running on the node:

kubectl delete pod -n default super-pod

Bring whereabouts back by reverting the nodeSelector patch:

kubectl -n kube-system patch daemonset whereabouts --type json -p='[{"op": "remove", "path": "/spec/template/spec/nodeSelector/non-existing"}]'

Create a new pod (super-pod2) that uses the same net-attach-def and check its events - the pod is stuck in the ContainerCreating state:
ERRORED: error configuring pod [default/super-pod2] networking: [default/super-pod2/340a8be4-0b0c-41dd-aadf-54af1bf052e6:super-net]: error adding container to network "super-net": error at storage engine: Could not allocate IP in range: ip: 10.10.3.30 / - 10.10.3.30 / range: 10.10.3.0/24 / excludeRanges: []
As we can see, whereabouts still thinks that this IP address is allocated to super-pod, but that pod does not exist anymore. The IP is stuck forever until we remove it manually: first delete the OverlappingRangeIPReservation for 10.10.3.30 in kube-system, and then remove the entry for the IP from the IPPool (under spec.allocations):

"30":
  id: a56b06006c6a3e3a1eb26db82b8cd5db008f20627576e3b1c7926776bffc9ed0
  ifname: net1
  podref: default/super-pod

Once we remove these two parts, a new pod is able to allocate the actually freed IP address.
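For reference, the manual cleanup amounts to something like the commands below. The object names are assumptions (the reservation appears to be named after the IP, the pool after the range), so list them first with the get commands shown above:

kubectl -n kube-system delete overlappingrangeipreservations 10.10.3.30
kubectl -n kube-system edit ippools.whereabouts.cni.cncf.io <pool-name>   # then delete the "30" entry under spec.allocations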
We also have a bash script to make this procedure more automated:
#!/usr/bin/env bash
set -u

# Optionally point kubectl at a specific cluster.
if [[ $# -gt 0 ]]; then
  export KUBECONFIG=$1
fi

# Walk all whereabouts IP reservations and find the ones whose pod no longer exists.
for IP in $(kubectl get overlappingrangeipreservations -n kube-system | cut "-d " -f1); do
  if [[ "${IP}" != "NAME" ]]; then
    POD=$(kubectl get overlappingrangeipreservations -n kube-system "${IP}" -o jsonpath='{.spec.podref}')
    RESULT=$(kubectl get pod -n "$(echo ${POD} | cut -d/ -f1)" "$(echo ${POD} | cut -d/ -f2)" 2>&1)
    if [[ $? -ne 0 ]]; then
      if echo "${RESULT}" | grep -q 'NotFound'; then
        echo "Pod ${POD} not found in the cluster. Deleting IP ${IP}"
        kubectl delete overlappingrangeipreservations -n kube-system "${IP}"
        if [[ $? -eq 0 ]]; then
          echo "OverlappingRangeIPReservation ${IP} deleted"
        fi
        # Drop the stale allocation entry from every IPPool that references the missing pod.
        for IPRANGE in $(kubectl get ippools.whereabouts.cni.cncf.io -n kube-system | cut "-d " -f1); do
          if [[ "${IPRANGE}" != "NAME" ]]; then
            KEY=$(kubectl get ippools.whereabouts.cni.cncf.io "${IPRANGE}" -n kube-system -o json | jq -crM --arg pod "${POD}" '.spec.allocations | map_values(select(.podref==$pod)) | keys[0]')
            kubectl get ippools.whereabouts.cni.cncf.io "${IPRANGE}" -n kube-system -o json | jq -crM --arg key "${KEY}" 'del(.spec.allocations[$key])' | kubectl replace ippools.whereabouts.cni.cncf.io -f -
            if [[ $? -eq 0 ]]; then
              echo "IPPool ${IPRANGE} replaced"
            fi
          fi
        done
      fi
    fi
  fi
done
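Save it as, for example, cleanup-stale-whereabouts-ips.sh (the file name is arbitrary) and run it against the cluster, optionally passing a kubeconfig path as the first argument:

./cleanup-stale-whereabouts-ips.sh /path/to/kubeconfig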
The core issue is that whereabouts does not check the allocated IP addresses at start, so it is possible to end up with IP addresses that are stuck.
Environment:
Whereabouts version: 0.8.0
Kubernetes version (use kubectl version): doesn't matter, reproduced on 1.30 and 1.31
Network-attachment-definition: see above
Whereabouts configuration (on the host): N/A
OS (e.g. from /etc/os-release): Ubuntu
Kernel (e.g. uname -a): Linux master1-1 6.8.0-1021-aws #23-Ubuntu SMP Mon Dec 9 23:59:34 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Others: N/A
Additional info / context
As far as I can see, there is only one predicate in the code for the deletion event, i.e. cleanup is triggered only when a pod is actually deleted from the cluster while whereabouts is running:
What I can suggest is either adding a finalizer to the pod, so the pod object is not removed until whereabouts has released its IP, or also checking for garbage on startup (see the illustration below).
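For illustration only, with a hypothetical finalizer name (whereabouts does not set one today), the finalizer approach would mean pods on a whereabouts-managed network carry something like:

metadata:
  finalizers:
    - whereabouts.cni.cncf.io/ip-deallocation   # hypothetical name; whereabouts would remove it after releasing the IP

so the pod object stays in the API until the IPAM records have been cleaned up.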