Skip to content

Conversation

@fahrishih
Copy link
Contributor

@fahrishih fahrishih commented Nov 27, 2025

Added

  • Automatic WireGuard interface reconfiguration when peer IP changes
  • The NetBird client now detects changes to the self peer’s IP address delivered by the Management Service and applies them live without requiring netbird down / netbird up. This is a more scalable approach for someone who has many peers and impossible to restart 1 by 1.

Improved

  • Enhanced client engine logic to watch for self-IP updates from network map diffs.
  • Added logic to update interface address, routes, and internal state when the IP changes.
  • Reduced downtime during CIDR migrations and tenant-level IP reassignment.
  • Improved consistency between Management-assigned IPs and the active local WireGuard configuration.

Fixed

  • Fixed issue where the client received new IP configuration from Management but did not reconfigure the local interface until a manual reconnect.
  • Fixed mismatch between API IP and active wtX/utunX interface, resulting in routing inconsistencies.

Notes

  • Implementation tested on macOS and Linux client builds.
  • WG interface name handling remains platform-specific:
  • macOS uses utun*
  • Linux uses wg0 / kernel interface path

This enhancement significantly improves NetBird’s dynamic IP behavior and avoids downtime during network CIDR changes.

Checklist

  • Is it a bug fix
  • Is a typo/documentation fix
  • Is a feature enhancement
  • It is a refactor
  • Created tests that fail without the change (if possible)

By submitting this pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement.

Documentation

Select exactly one:

  • I added/updated documentation for this change
  • Documentation is not needed for this change (explain why)

Summary by CodeRabbit

  • New Features

    • Hot-update of the local WireGuard IP during config changes.
    • Automatic SSH rebind and firewall DNAT refresh when the node IP changes.
    • Local IP state is refreshed to reflect address changes.
  • Bug Fixes

    • Config updates now continue if an IP update fails; failures are logged rather than aborting.
  • Tests

    • Added tests for IP-update scenarios, SSH restart lifecycle, and firewall/DNAT interactions.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 27, 2025

Walkthrough

Introduces idempotent self-IP hot-update logic on the WireGuard interface with firewall and SSH rebinding steps; integrates into the config update path and adds tests and server Restart support. Update failures are logged and do not abort config application.

Changes

Cohort / File(s) Summary
Engine: IP update + firewall handling
client/internal/engine.go
Adds updateSelfPeerIP(oldAddr, newAddr string) error and updateFirewallForAddressChange(oldIP netip.Addr) error; replaces direct address check in updateConfig with helper call that updates WG address, refreshes firewall/SSH bindings, and mutates Engine.WgAddr on success; failures are logged but do not stop config processing.
Engine tests
client/internal/engine_test.go
Adds TestEngine_UpdateSelfPeerIP with subtests for no-op on identical addresses, successful update, interface update failure, SSH restart on IP change, and behavior when SSH server is nil; introduces mocks for SSH server and firewall manager and extends existing SSH consistency test.
SSH server interface
client/internal/engine_ssh.go
Adds Restart(ctx context.Context, newAddr netip.AddrPort) error to the sshServer interface.
SSH server implementation
client/ssh/server/server.go
Implements Restart(ctx context.Context, newAddr netip.AddrPort) error on Server which Stop()s then Start()s the server to rebind to a new address, wrapping Stop errors.
SSH server tests
client/ssh/server/server_test.go
Adds TestSSHServer_Restart and TestSSHServer_RestartNotRunning to validate restart behavior (rebinding to a new port) and restart-from-not-running semantics.

Sequence Diagram(s)

sequenceDiagram
    participant updateConfig as updateConfig
    participant updateSelf as updateSelfPeerIP
    participant wgIface as wgInterface
    participant fwMgr as firewallManager
    participant sshSrv as sshServer
    participant engine as Engine

    updateConfig->>updateSelf: updateSelfPeerIP(oldAddr, newAddr)
    alt oldAddr == newAddr
        updateSelf-->>updateConfig: return nil (no-op)
    else oldAddr != newAddr
        updateSelf->>wgIface: UpdateAddr(newAddr)
        alt wg update success
            wgIface-->>updateSelf: nil
            updateSelf->>engine: set Engine.WgAddr = newAddr
            updateSelf->>fwMgr: updateFirewallForAddressChange(oldIP)
            fwMgr->>sshSrv: if SSH active -> Restart(newAddr:port) (DNAT adjust)
            fwMgr-->>updateSelf: firewall updated
            updateSelf-->>updateConfig: return nil
        else wg update failure
            wgIface-->>updateSelf: error
            updateSelf-->>updateConfig: return error (logged, continue)
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Inspect updateSelfPeerIP for correct idempotence and atomicity of state mutation (Engine.WgAddr) only on successful interface update.
  • Verify firewall DNAT removal/addition logic and ordering relative to SSH Stop/Start to avoid transient port collisions or dropped rules.
  • Review error handling: ensure failures are logged and do not leave partial state (e.g., WG addr mutated without firewall update).
  • Review new SSH Server.Restart behavior for proper wrapping of Stop errors and start semantics when server was not running.
  • Check tests' mock expectations match real interfaces and cover both success and failure paths.

Poem

🐇 I nudged the IP, gave it a nudge and a spin,
WireGuard hopped, the firewall sang, ssh began again.
If bumps appear, I log then carry on—
A rabbit's patch: rebinding till the dawn. ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately captures the main feature: automatic WireGuard interface reconfiguration when peer IP changes, which is the core objective of the changeset.
Description check ✅ Passed The description is comprehensive, covering changes made, improvements, fixes, implementation notes, and includes proper checklist selections. All major sections from the template are addressed.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cdafea7 and a13fab9.

📒 Files selected for processing (5)
  • client/internal/engine.go (2 hunks)
  • client/internal/engine_ssh.go (1 hunks)
  • client/internal/engine_test.go (3 hunks)
  • client/ssh/server/server.go (1 hunks)
  • client/ssh/server/server_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
client/ssh/server/server.go (1)
shared/management/status/error.go (1)
  • Errorf (70-75)
client/internal/engine.go (2)
client/iface/wgaddr/address.go (1)
  • Address (9-12)
client/firewall/manager/protocol.go (1)
  • ProtocolTCP (9-9)
🔇 Additional comments (8)
client/internal/engine_ssh.go (1)

19-24: LGTM!

The Restart method addition to the sshServer interface is well-designed, following the existing pattern of Start with context and address parameters. This cleanly supports the IP hot-update feature without requiring a full stop/start cycle from the caller.

client/ssh/server/server.go (1)

281-291: LGTM!

The Restart implementation is clean and idempotent. Since Stop() returns nil for an already-stopped server, calling Restart on a non-running server correctly starts it on the new address. The error wrapping provides good context for debugging.

client/ssh/server/server_test.go (1)

443-521: LGTM!

The tests provide good coverage for the Restart functionality:

  • TestSSHServer_Restart validates the complete lifecycle: start → restart on new address → stop.
  • TestSSHServer_RestartNotRunning confirms the idempotent behavior when restarting a never-started server.

Both tests properly verify the server status after each operation.

client/internal/engine.go (3)

906-913: LGTM!

The integration of updateSelfPeerIP into updateConfig is well-placed. Logging the error but continuing with the rest of the config update is appropriate—this ensures partial failures don't block other configuration changes while still providing visibility into issues.


932-963: Well-structured idempotent IP update logic.

The implementation correctly:

  1. Performs an idempotency check before any work.
  2. Captures the old IP before calling UpdateAddr to ensure firewall rules can reference it.
  3. Updates the engine config only after a successful interface update.
  4. Handles firewall changes gracefully with logged warnings rather than hard failures.

965-1007: I'm unable to proceed with shell-based verification due to repository cloning issues. However, I can still analyze the review comment based on the code snippet provided and general software engineering principles.

Let me provide the rewritten review comment based on what can be reasonably assessed from the provided context:


Verify whether DNS server and other subsystems need explicit address updates or can read the address dynamically.

The SSH server restart and DNAT rule updates are correctly implemented. However, subsystems that cached the WG address at initialization—such as the DNS server—may not automatically pick up IP changes through wgInterface.Address() calls during operation.

The concern centers on:

  • SSH server's SetNetworkValidation is set once in startSSHServer but not re-invoked during Restart()
  • DNS server may similarly cache the local IP at startup

Recommended verification steps:

  1. Check if DNS server reads the address dynamically when needed or caches it at initialization
  2. Verify whether SetNetworkValidation on the SSH server needs to be called after restart
  3. Audit other subsystems that may depend on the local IP address to ensure they either read it dynamically or are explicitly updated

If subsystems read the address dynamically, no changes are needed. If they cache the address, explicit update calls may be required.

client/internal/engine_test.go (2)

408-572: LGTM!

Comprehensive test coverage for updateSelfPeerIP:

  • Idempotency when addresses match.
  • Successful update with config mutation verification.
  • Error propagation when interface update fails (config unchanged).
  • SSH server restart with DNAT rule verification.
  • Graceful handling when SSH server is nil.

The tests effectively validate both the happy path and edge cases.


574-703: LGTM on mock implementations.

The mockSSHServer and mockFirewallManager implementations correctly satisfy their respective interfaces with configurable function hooks for test verification. The stub methods in mockFirewallManager appropriately return nil/empty values for unused interface methods.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
client/internal/engine.go (1)

906-913: Hot self-IP update helper looks correct; consider guarding future reuse and iOS hook

The new updateSelfPeerIP flow is clean: it’s idempotent, updates the wg iface first, and only mutates config.WgAddr on success; updateConfig correctly logs and proceeds on failure so other config updates aren’t blocked. The only thing I’d consider is:

  • Adding a brief comment or defensive nil-check inside updateSelfPeerIP to make it safer if it’s ever called from elsewhere without the existing wgInterface != nil guarantee.
  • On iOS, we currently set NetworkChangeListener.SetInterfaceIP(e.config.WgAddr) only at create time; if management later changes the self IP, we might also want to call that here so the mobile network listener observes the new address.

Also applies to: 932-955

client/internal/engine_test.go (1)

405-461: Good, focused coverage of self-IP update behavior

The three subtests nicely pin down the helper’s contract: idempotent no-op, successful interface+config update, and failure path with preserved config. This is sufficient for the new helper; if you later change updateConfig behavior around error handling, a small integration-style test there could complement these, but not required for this PR.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aca0398 and cdafea7.

📒 Files selected for processing (2)
  • client/internal/engine.go (2 hunks)
  • client/internal/engine_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
client/internal/engine_test.go (1)
client/internal/engine.go (2)
  • Engine (138-211)
  • EngineConfig (81-135)
client/internal/engine.go (1)
client/iface/wgaddr/address.go (1)
  • Address (9-12)

@lixmal
Copy link
Collaborator

lixmal commented Nov 27, 2025

Hi @fahrishih

Unfortunately that is not enough. The IP is also used in various subsystems (DNS, Firewall, ...) that would also require updating

@fahrishih
Copy link
Contributor Author

hi @lixmal , which firewall are you referring to? if by dns you mean the netbird dns, i tested and it was able to resolve to the new machine's IP

@lixmal
Copy link
Collaborator

lixmal commented Nov 27, 2025

I mean the firewall implementations (userspace, nftables, iptables) and the DNS server. Pretty much everything that calls Address() on the wg interface is affected, especially if it was only called once on setup

@fahrishih
Copy link
Contributor Author

sure @lixmal , but anyways those you mentioned would still not work if the network range is changed from the console, until the user runs netbird down and netbird up again.

at least on this MR, what I aim is to avoid running that command on all the peers deployed and keep the connectivity going by this autoupdate.

@sonarqubecloud
Copy link

@fahrishih
Copy link
Contributor Author

fahrishih commented Nov 27, 2025

hi @lixmal , i added a new commit to automatically update the nft

When the management server changes a peer's IP address, the SSH server now properly restarts to bind to the new IP.

Changes:

  • Update DNAT rule to match the new IP
  • Add Restart method to SSH server for rebinding to new address
  • Update updateFirewallForAddressChange to restart SSH server when IP changes
  • Add test coverage for SSH restart on IP change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants