
[windows] Intermittent DNS resolution failure where DNS requests go to port 1024 #354

Open
mauriciopoppe opened this issue Jan 27, 2025 · 0 comments

Background

GKE uses Antrea and OVS on Windows nodes. A customer reported that DNS resolution failed intermittently: from a GKE cluster with Windows nodes they created a Pod that ran while true; do nslookup www.google.com; done and reported that the lookup sometimes failed. We noticed that the failing DNS requests went to port 1024 instead of 53.

Reproducible Scenario

GKE teams created the following reproduction scenario; while we reproduced it in a GKE cluster, it can also be reproduced in a vanilla Kubernetes distribution.

  1. Create a Kubernetes cluster with Windows nodes and install Antrea and OVS (non-containerized daemon) following https://github.com/antrea-io/antrea/blob/main/docs/windows.md
  2. Deploy the following Windows Pod, which runs a simple HTTP server:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: win-webserver
  name: win-webserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: win-webserver
  template:
    metadata:
      labels:
        app: win-webserver
      name: win-webserver
    spec:
     containers:
      - name: windowswebserver
        image: mcr.microsoft.com/windows/servercore:ltsc2019
        command:
        - powershell.exe
        - -command
        - "<#code used from https://gist.github.com/19WAS85/5424431#> ; $$listener = New-Object System.Net.HttpListener ; $$listener.Prefixes.Add('http://*:80/') ; $$listener.Start() ; $$callerCounts = @{} ; Write-Host('Listening at http://*:80/') ; while ($$listener.IsListening) { ;$$context = $$listener.GetContext() ;$$requestUrl = $$context.Request.Url ;$$clientIP = $$context.Request.RemoteEndPoint.Address ;$$response = $$context.Response ;Write-Host '' ;Write-Host('> {0}' -f $$requestUrl) ;  ;$$count = 1 ;$$k=$$callerCounts.Get_Item($$clientIP) ;if ($$k -ne $$null) { $$count += $$k } ;$$callerCounts.Set_Item($$clientIP, $$count) ;$$ip=(Get-NetAdapter | Get-NetIpAddress); $$header='<html><body><H1>Windows Container Web Server</H1>' ;$$callerCountsString='' ;$$callerCounts.Keys | % { $$callerCountsString+='<p>IP {0} callerCount {1} ' -f $$ip[1].IPAddress,$$callerCounts.Item($$_) } ;$$footer='</body></html>' ;$$content='{0}{1}{2}' -f $$header,$$callerCountsString,$$footer ;Write-Output $$content ;$$buffer = [System.Text.Encoding]::UTF8.GetBytes($$content) ;$$response.ContentLength64 = $$buffer.Length ;$$response.OutputStream.Write($$buffer, 0, $$buffer.Length) ;$$response.Close() ;$$responseStatus = $$response.StatusCode ;Write-Host('< {0}' -f $$responseStatus)  } ; "
     nodeSelector:
      kubernetes.io/os: windows
  3. The Pod takes some time to become ready; once it is, exec into it with kubectl exec -it POD_NAME -- cmd
  4. Inside the Pod, create the following batch file:
echo @echo off > infinite_nslookup.bat
echo :loopStart >> infinite_nslookup.bat
echo   nslookup www.google.com >> infinite_nslookup.bat
echo goto loopStart >> infinite_nslookup.bat
  5. Execute the previously created infinite_nslookup.bat. After around a minute DNS requests start failing, and the failed DNS requests can be seen going to port 1024.

Debugging

A GKE networking expert debugged the issue; we captured a trace, attached as ovs-trace.txt:

  1. Before the issue is observed there are 1274 open UDP connections; once the issue is observed there are 7849. That's a very noticeable increase in connections and likely confirms my theory: we are overflowing some table due to too many UDP connections. The number of TCP connections actually decreases slightly (233 -> 188).

  2. The flow dump reveals the smoking gun. Here we have a packet that is clearly being sent to port 53, yet OVS remaps it to port 1024:

recirc_id(0x1502),ct_state(new|trk|dnat),ct_zone(),ct_mark(),ct_label(),ct_tuple4(src=100.78.22.75,dst=100.112.128.74,proto=17,tp_src=55691,tp_dst=53),dp_hash(0x4211aca5),eth(src=00:15:5d:d6:bc:78,dst=00:15:5d:00:3a:01),in_port(5),eth_type(0x0800),ipv4(src=100.78.22.75,dst=100.78.16.72,proto=17,tos=0,ttl=128,frag=no),udp(src=55691,dst=1024), packets:0, bytes:0, used:never, actions:set(eth(src=00:15:5d:d6:bc:78,dst=00:15:5d:00:3a:02)),4

Notice how we initially have tp_dst=53, but then udp(src=55691,dst=1024). It is trivial to find more instances with egrep '=53.*=1024' ovs-trace.txt:

recirc_id(0x1502),ct_state(new|trk|dnat),ct_zone(),ct_mark(),ct_label(),ct_tuple4(src=100.78.22.75,dst=100.112.128.74,proto=17,tp_src=55691,tp_dst=53),dp_hash(0x4211aca5),eth(src=00:15:5d:d6:bc:78,dst=00:15:5d:00:3a:01),in_port(5),eth_type(0x0800),ipv4(src=100.78.22.75,dst=100.78.16.72,proto=17,tos=0,ttl=128,frag=no),udp(src=55691,dst=1024), packets:0, bytes:0, used:never, actions:set(eth(src=00:15:5d:d6:bc:78,dst=00:15:5d:00:3a:02)),4
recirc_id(0x1505),ct_state(new|trk|dnat),ct_zone(),ct_mark(),ct_label(),ct_tuple4(src=100.78.22.75,dst=100.112.128.74,proto=17,tp_src=55714,tp_dst=53),dp_hash(0xdf6ae13d),eth(src=00:15:5d:d6:bc:78,dst=00:15:5d:00:3a:01),in_port(5),eth_type(0x0800),ipv4(src=100.78.22.75,dst=100.78.18.76,proto=17,tos=0,ttl=128,frag=no),udp(src=55714,dst=1024), packets:0, bytes:0, used:never, actions:set(eth(src=00:15:5d:d6:bc:78,dst=00:15:5d:00:3a:02)),4
recirc_id(0x1508),ct_state(new|trk|dnat),ct_zone(),ct_mark(),ct_label(),ct_tuple4(src=100.78.22.75,dst=100.112.128.74,proto=17,tp_src=55693,tp_dst=53),dp_hash(0x125489c6),eth(src=00:15:5d:d6:bc:78,dst=00:15:5d:00:3a:01),in_port(5),eth_type(0x0800),ipv4(src=100.78.22.75,dst=100.78.16.73,proto=17,tos=0,ttl=128,frag=no),udp(src=55693,dst=1024), packets:0, bytes:0, used:never, actions:set(eth(src=00:15:5d:d6:bc:78,dst=00:15:5d:00:3a:02)),4
recirc_id(0x1507),ct_state(new|trk|dnat),ct_zone(),ct_mark(),ct_label(),ct_tuple4(src=100.78.22.75,dst=100.112.128.74,proto=17,tp_src=55706,tp_dst=53),dp_hash(0x208a0891),eth(src=00:15:5d:d6:bc:78,dst=00:15:5d:00:3a:01),in_port(5),eth_type(0x0800),ipv4(src=100.78.22.75,dst=100.78.17.202,proto=17,tos=0,ttl=128,frag=no),udp(src=55706,dst=1024), packets:0, bytes:0, used:never, actions:set(eth(src=00:15:5d:d6:bc:78,dst=00:15:5d:00:3a:02)),4
...

Source code debugging:

Ok, so I think I found the code path that sets the port to 1024. Here, MIN_NAT_EPHEMERAL_PORT is set to 1024:

https://github.com/openvswitch/ovs/blob/d92003120dd9213de9452dff3fca0b46af79c68a/datapath-windows/ovsext/Conntrack-nat.c#L288

And then here, the port is overridden with minPort, which is set to MIN_NAT_EPHEMERAL_PORT:

https://github.com/openvswitch/ovs/blob/d92003120dd9213de9452dff3fca0b46af79c68a/datapath-windows/ovsext/Conntrack-nat.c#L398

This code seems to be cycling through all ports from 1024 to 65535, and once it runs out of ports, the code to override the port with 1024 is triggered:

https://github.com/openvswitch/ovs/blob/d92003120dd9213de9452dff3fca0b46af79c68a/datapath-windows/ovsext/Conntrack-nat.c#L365
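
To make that cycle-and-fallback behavior concrete, here is a minimal, self-contained C sketch of the selection loop as I read it from Conntrack-nat.c. It is not the actual OVS code: tuple_in_use() and select_nat_port() are illustrative stand-ins for the real conntrack lookup and NAT translation, and the stub simulates a range with no free tuples so the fallback path is visible.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Constants named in the linked source; 1024..65535 is the range the
 * code cycles through. */
#define MIN_NAT_EPHEMERAL_PORT 1024
#define MAX_NAT_EPHEMERAL_PORT 65535

/* Stand-in for the conntrack lookup: pretend every candidate tuple is
 * already taken, i.e. the exhaustion scenario seen in the trace. */
static bool
tuple_in_use(uint16_t port)
{
    (void) port;
    return true;
}

/* Cycle through [minPort, maxPort] starting at firstPort, looking for a
 * free tuple; if the whole range is exhausted, fall back to minPort. */
static uint16_t
select_nat_port(uint16_t firstPort, uint16_t minPort, uint16_t maxPort)
{
    uint16_t port = firstPort;

    for (;;) {
        if (!tuple_in_use(port)) {
            return port;                          /* free tuple found */
        }
        port = (port == maxPort) ? minPort : (uint16_t) (port + 1);
        if (port == firstPort) {
            break;                                /* wrapped all the way around */
        }
    }

    /* Exhaustion fallback: this is where 1024 ends up in the tuple. */
    return minPort;
}

int
main(void)
{
    /* Simplification: start the search at the bottom of the range. */
    uint16_t chosen = select_nat_port(MIN_NAT_EPHEMERAL_PORT,
                                      MIN_NAT_EPHEMERAL_PORT,
                                      MAX_NAT_EPHEMERAL_PORT);
    printf("selected NAT port: %u\n", chosen);
    return 0;
}
```

Running this prints "selected NAT port: 1024", which matches the dst=1024 seen in the flow dump once the range is exhausted.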

Also, it seems weird that the destination port is being overridden. In a NAT scenario I would expect the source port to be overridden, but somehow we end up with a destination port override.
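
For reference, a generic sketch (not OVS code) of which side of a UDP tuple each NAT flavor normally rewrites; the addresses and ports come from the flow dump above, and apply_snat()/apply_dnat() are illustrative helpers, not OVS functions. Since the dumped flows carry ct_state(...dnat), the destination side is the one being rewritten, and the observed result looks as if the ephemeral-range fallback were applied to the DNAT'd destination port instead of an SNAT'd source port.

```c
#include <stdint.h>
#include <stdio.h>

struct udp_tuple {
    const char *src_ip;
    const char *dst_ip;
    uint16_t    src_port;
    uint16_t    dst_port;
};

/* SNAT: rewrite the source side (where an ephemeral port is normally picked). */
static void
apply_snat(struct udp_tuple *t, const char *new_src_ip, uint16_t new_src_port)
{
    t->src_ip = new_src_ip;
    t->src_port = new_src_port;
}

/* DNAT (Service ClusterIP -> endpoint): rewrite the destination side; for
 * DNS the port mapping should stay 53 -> 53. */
static void
apply_dnat(struct udp_tuple *t, const char *new_dst_ip, uint16_t new_dst_port)
{
    t->dst_ip = new_dst_ip;
    t->dst_port = new_dst_port;
}

static void
print_tuple(const char *label, const struct udp_tuple *t)
{
    printf("%-9s %s:%u -> %s:%u\n", label,
           t->src_ip, t->src_port, t->dst_ip, t->dst_port);
}

int
main(void)
{
    /* Original tuple from the flow dump: client -> ClusterIP:53. */
    struct udp_tuple expected = { "100.78.22.75", "100.112.128.74", 55691, 53 };
    struct udp_tuple observed = expected;

    /* Expected Service DNAT: endpoint IP, destination port still 53. */
    apply_dnat(&expected, "100.78.16.72", 53);
    print_tuple("expected:", &expected);

    /* Observed in the trace: destination port becomes 1024. */
    apply_dnat(&observed, "100.78.16.72", 1024);
    print_tuple("observed:", &observed);

    /* For contrast, an SNAT would have touched only the source side
     * (hypothetical gateway IP for illustration). */
    struct udp_tuple snat_example = { "100.78.22.75", "100.112.128.74", 55691, 53 };
    apply_snat(&snat_example, "100.78.22.1", 1024);
    print_tuple("snat:", &snat_example);

    return 0;
}
```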

Platform details

OS: Windows Server LTSC2019 or LTSC2022
OVS Version: 2.14.4.
Attached file: ovs-trace.txt

