-
-
Notifications
You must be signed in to change notification settings - Fork 967
[client] Use Go's resolver for darwin for control plane and ICE traffic #4833
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughIntroduces an environment- and platform-aware DNS resolver and wires it into dialing paths: adds Changes
Sequence Diagram(s)sequenceDiagram
actor Caller
participant DialerFactory as NewDialer / gRPC dialer
participant Resolver as NewResolver()
participant NetDialer as net.Dialer
participant DNS as net.Resolver.LookupNetIP
participant Remote as RemoteHost
Caller->>DialerFactory: request DialContext(...)
DialerFactory->>Resolver: NewResolver()
Resolver-->>DialerFactory: *net.Resolver
DialerFactory->>NetDialer: create with Resolver (assigned)
Caller->>NetDialer: DialContext(ctx, network, addr)
NetDialer->>DNS: Resolver.LookupNetIP(ctx, ipNet, host)
DNS-->>NetDialer: resolved IPs
NetDialer->>Remote: TCP connect to resolved IP
Remote-->>Caller: connection established / error
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
client/net/resolver.go (1)
24-43: Consider handling invalid environment variable values explicitly.The switch statement on lines 26-33 silently falls through for unrecognized values (e.g.,
NB_DNS_RESOLVER=foo), which will then proceed to the darwin check. While this may be intentional, it could lead to confusion during debugging. Consider either logging a warning for invalid values or documenting this fall-through behavior more explicitly in the function comment.Example approach with logging:
func NewResolver() *net.Resolver { if resolver := os.Getenv(EnvResolver); resolver != "" { switch strings.ToLower(resolver) { case "system": return net.DefaultResolver case "go": return &net.Resolver{ PreferGo: true, } + default: + // Invalid value, fall through to platform defaults + log.Debugf("Invalid %s value: %s, using platform defaults", EnvResolver, resolver) } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
client/grpc/dialer_generic.go(1 hunks)client/internal/stdnet/stdnet.go(5 hunks)client/net/dialer.go(1 hunks)client/net/resolver.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
client/grpc/dialer_generic.go (2)
client/net/dialer.go (1)
Dialer(9-11)client/net/resolver.go (1)
NewResolver(24-43)
client/internal/stdnet/stdnet.go (1)
client/net/resolver.go (1)
NewResolver(24-43)
client/net/dialer.go (1)
client/net/resolver.go (1)
NewResolver(24-43)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: Client / Unit (amd64)
- GitHub Check: Client / Unit (386)
- GitHub Check: Relay / Unit (386)
- GitHub Check: Relay / Unit (amd64, -race)
- GitHub Check: Management / Unit (amd64, sqlite)
- GitHub Check: Management / Unit (amd64, mysql)
- GitHub Check: Client (Docker) / Unit
- GitHub Check: Management / Unit (amd64, postgres)
- GitHub Check: Signal / Unit (amd64)
- GitHub Check: Windows
- GitHub Check: Linux
- GitHub Check: Darwin
- GitHub Check: JS / Lint
- GitHub Check: release_ui_darwin
- GitHub Check: Android / Build
- GitHub Check: Client / Unit
- GitHub Check: Client / Unit
- GitHub Check: release
- GitHub Check: Client / Unit
🔇 Additional comments (6)
client/net/resolver.go (1)
10-14: LGTM! Clear environment variable definition.The constant is well-named and properly documented with the allowed values.
client/net/dialer.go (1)
16-18: LGTM! Resolver integration is correct.The change properly initializes the dialer with the custom resolver, aligning with the PR objectives to use Go's resolver on darwin for control plane traffic.
client/internal/stdnet/stdnet.go (3)
21-21: LGTM! Resolver field and import added correctly.The import alias and resolver field are properly integrated into the Net struct. The field is safe without mutex protection since it's only set during initialization and remains immutable afterwards.
Also applies to: 47-47
58-58: LGTM! Consistent resolver initialization.Both constructors properly initialize the resolver using
nbnet.NewResolver(), ensuring the custom DNS resolution behavior is applied to ICE traffic as intended by the PR.Also applies to: 79-79
118-118: LGTM! Resolver usage is correct.The change from
net.DefaultResolver.LookupNetIPton.resolver.LookupNetIPproperly applies the custom resolver to DNS lookups. Since the resolver is initialized in both constructors, no nil check is needed here.client/grpc/dialer_generic.go (1)
32-34: LGTM! Consistent resolver usage in non-root path.The change ensures the non-root Linux dialing path also uses the custom resolver, maintaining consistency with the root path (line 39) which uses
nbnet.NewDialer()that was updated to use the same resolver in client/net/dialer.go. Both paths now properly apply the darwin-specific resolver workaround for control plane traffic.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
client/net/resolver.go (2)
27-38: Consider a more visible log level for invalid configuration.Invalid
NB_DNS_RESOLVERvalues are logged at Debug level, which may be too quiet for operators troubleshooting DNS issues. Consider usinglog.Warnfso configuration mistakes are more visible.Also, the function documentation mentions "GODEBUG" as an override mechanism but doesn't explain how to use it. Either document the GODEBUG usage pattern or remove the reference to avoid confusion.
Apply this diff to improve logging visibility:
- log.Debugf("Invalid %s value: %s, using platform defaults", EnvResolver, resolver) + log.Warnf("Invalid %s value: %s, using platform defaults", EnvResolver, resolver)
26-47: Verify the fix addresses the macOS getaddrinfo hang issue and add unit tests.This implementation assumes that switching to the pure Go resolver (
PreferGo: true) will prevent the macOS getaddrinfo hang after sleep/wake. Please verify this approach actually resolves the issue through testing or provide references.Additionally, consider adding unit tests to cover:
- Environment variable handling (valid values, invalid values, empty)
- Platform detection logic
- Return value correctness for each code path
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
client/net/resolver.go(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Client / Unit
- GitHub Check: Client / Unit
- GitHub Check: Windows
- GitHub Check: Darwin
- GitHub Check: Linux
- GitHub Check: Client / Unit
- GitHub Check: JS / Lint
- GitHub Check: Android / Build
- GitHub Check: release_ui_darwin
- GitHub Check: release_ui
- GitHub Check: release
🔇 Additional comments (1)
client/net/resolver.go (1)
12-16: LGTM! Clear constant definition.The environment variable constant is well-documented with acceptable values clearly specified.
| // NewResolver creates a DNS resolver with appropriate settings based on platform and configuration. | ||
| // On Darwin (macOS), it defaults to the pure Go resolver to avoid getaddrinfo hangs after sleep/wake. | ||
| // This is particularly important for connections using this package's Dialer, which bypasses the NetBird | ||
| // overlay network for control plane traffic. Since these connections target external infrastructure | ||
| // (management, signal, relay servers), it is safe to ignore split DNS configurations that would | ||
| // normally be provided by the system resolver. | ||
| // On other platforms, it uses the system resolver (cgo). | ||
| // This behavior can be overridden using the NB_DNS_RESOLVER environment variable or GODEBUG. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Verify the split DNS assumption for control plane traffic.
The comment states it's "safe to ignore split DNS configurations" for control plane traffic because these connections "target external infrastructure." However, in enterprise or self-hosted scenarios, management/signal/relay servers may be internal resources that rely on split DNS. Users with private deployments behind split DNS could experience connectivity issues.
Consider documenting this limitation or adding a configuration option for environments where control plane servers are internal.
---
🏁 Script executed:
```shell
#!/bin/bash
# Examine the NewResolver function implementation
cat client/net/resolver.go | head -60
Length of output: 1472
🏁 Script executed:
#!/bin/bash
# Search for NewResolver usage across the codebase
rg -n "NewResolver" -A3 -B3 --type=goLength of output: 15091
🏁 Script executed:
#!/bin/bash
# Search for split DNS references
rg -n "split.*dns|DNS.*split" -i --type=goLength of output: 174
🏁 Script executed:
#!/bin/bash
# Search for control plane related DNS or resolver usage
rg -n "control.*plane|management.*server|signal.*server|relay.*server" -i --type=go -C2 | head -100Length of output: 6138
Update documentation to guide users on split DNS scenarios in enterprise deployments.
The concern about split DNS is valid. While the code provides override via NB_DNS_RESOLVER=system environment variable, the documentation doesn't explain when or why users should use it. In self-hosted or enterprise deployments where management/signal/relay servers are internal and rely on split DNS, users will encounter connectivity issues without knowing how to resolve them.
Recommended improvements:
- Document in the comment that
NB_DNS_RESOLVER=systemshould be used for deployments with internal servers behind split DNS - Explain what the
GODEBUGoverride does and provide an example - Add troubleshooting guidance for split DNS scenarios
This allows users to self-serve when they encounter the issue rather than discovering it only through trial and error.
🤖 Prompt for AI Agents
In client/net/resolver.go around lines 18 to 25, expand the existing comment to
document split DNS guidance: add a short note that in self-hosted or enterprise
deployments where management/signal/relay servers are internal and reachable
only via split DNS, users should set NB_DNS_RESOLVER=system to force the system
resolver; briefly explain the GODEBUG override (e.g., how to force the pure Go
resolver via GODEBUG=netdns=go or use system via netdns=cgo) and include a
concrete example of setting the env var, and append a one-paragraph
troubleshooting tip listing symptoms (failed resolution to internal hostnames,
connection timeouts after suspend/resume) and the recommended steps (check split
DNS, try NB_DNS_RESOLVER=system, then GODEBUG variations, and where to file
logs).



Describe your changes
Under certain conditions (sleep/wake), macOS can get stuck on
getaddrinfosyscalls, rendering the system DNs broken.To avoid this, we use Go's resolver. Go's resolver has the drawback that it doesn't take split DNS or any other
scutil --dnssettings into account. So we can only use that when it is safe to use/etc/resolv.confdirectly, e.g., control plane traffic (our custom nbnet.Dialer plus stdnet's resolve methods internally used by ICE).This behavior can be overridden by using the NB_DNS_RESOLVER environment variable.
Issue ticket number and link
Stack
Checklist
Documentation
Select exactly one:
Docs PR URL (required if "docs added" is checked)
Paste the PR link from https://github.com/netbirdio/docs here:
https://github.com/netbirdio/docs/pull/__
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.