
Conversation

@Jerome228

Pull Request

Description

For a single Control Plane setup, add --apiserver-advertise-address with the local/private IP address to the init command. Otherwise, an incorrect IP address could be used to initialise the cluster: the NAT IP address is picked when the NAT interface holds the default route of the control plane server. In a local setup, workers are not able to join the cluster using the NAT IP address.
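For illustration, a minimal sketch of the resulting init invocation, assuming the control plane's Host-Only address is 192.168.56.10 (an example address, not something the role hard-codes):

$ sudo kubeadm init --apiserver-advertise-address=192.168.56.10

Without the flag, kubeadm advertises the IP of the interface that holds the default route, which is the NAT interface in this setup.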

Does this fix a reported issue?

  • Yes, Issue Number: #[issue number]
  • No

Type of Change

  • Bugfix
  • Feature
  • Documentation Update
  • Other

How Has This Been Tested?

To reproduce the bug (using VirtualBox Debian 13 VMs): one VM for the control plane and one or two for workers. All of them have three network interfaces: lo, eth0 for NAT, and eth1 for Host-Only.

  • Test without this change: the cluster is initialized with the NAT IP address; the workers cannot connect to the control plane, and the join command fails from the workers.
  • Test with this change: the private IP address of the control plane is used to initialize the cluster. Any worker that can reach the control plane over the private network (Host-Only in this case) can join the cluster (a sketch of the routing table behind this follows the list).
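For context, this is roughly what the routing table looks like on such a VM. The addresses are VirtualBox defaults (10.0.2.x for NAT, 192.168.56.x for Host-Only) and may differ in other setups:

$ ip route
default via 10.0.2.2 dev eth0 proto dhcp
10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15
192.168.56.0/24 dev eth1 proto kernel scope link src 192.168.56.10

Because eth0 holds the default route, its 10.0.2.15 address is what kubeadm picks by default, and that NAT address is not reachable from the other VMs.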

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

@Muthukumar-Subramaniam
Owner

Muthukumar-Subramaniam commented Nov 15, 2025

Hi @Jerome228 ,

Thanks for raising this enhancement and for providing such clear testing details — really appreciate the effort and the scenario you highlighted.

I want to point out one caveat: using ansible_host directly for the advertise address can cause issues in setups where the control plane host is defined in the inventory as a hostname or FQDN resolved via DNS, rather than as an IP address.

For example:

[muthuks@fedora inst-k8s-ansible]$ cat host-control-plane 
test-k8s-cp1.lab.local
[muthuks@fedora inst-k8s-ansible]$ 
[muthuks@fedora inst-k8s-ansible]$ host test-k8s-cp1.lab.local
test-k8s-cp1.lab.local has address 10.10.20.7
[muthuks@fedora inst-k8s-ansible]$ 

If ansible_host is a hostname like the above, kubeadm will fail during bootstrap if we set --apiserver-advertise-address={{ ansible_host }}, since the flag only accepts an IP address.

To validate the safest approach for all users, could you please test using:

--apiserver-advertise-address={{ ansible_default_ipv4.address }}

This variable always contains the primary interface IP detected by Ansible, making it independent of whether users specify an FQDN or an IP in their inventory for the control plane node.
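A quick way to check what this variable resolves to on a given node (output trimmed; hostname, inventory file, and address taken from the example above):

$ ansible all -i host-control-plane -m setup -a 'filter=ansible_default_ipv4'
test-k8s-cp1.lab.local | SUCCESS => {
    "ansible_facts": {
        "ansible_default_ipv4": {
            "address": "10.10.20.7",
            ...
        }
    }
}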

For context — I haven’t come across this issue in my environment because my custom automated KVM-based lab uses a single NIC per node, and the network stack is tightly managed with a well-defined primary IP. Even on multi-NIC setups, as long as the primary management IP is configured correctly, the control plane will not face issues when using:

--apiserver-advertise-address={{ ansible_default_ipv4.address }}

If this works in your setup as well, please update the PR with this change.
If it does not, ensure that the control plane’s network stack is configured so that the primary IP is properly and persistently set.

Let me know the outcome so we can move forward with the safest implementation for everyone.

Muthukumar-Subramaniam added the labels bug and question on Nov 15, 2025
@Jerome228
Author

Hello @Muthukumar-Subramaniam ,
Thanks for the quick reply.

You are right, using ansible_host is not the best way. On my setup, it worked because the inventory looks like:

master ansible_host=192....

But ansible_default_ipv4.address is not the best way to handle this either (see here for an example). Using it on my setup gave the same issue I faced before, since the default route on my VMs goes through the NAT interface.

I suggest --apiserver-advertise-address={{ ansible_facts['env'].SSH_CONNECTION.split(' ')[2] }}. This yields the IP address on the target VM that the Ansible controller used to reach it.
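For reference, SSH_CONNECTION contains four space-separated fields: client IP, client port, server IP, server port. Index [2] therefore picks the server-side (target) IP. The addresses below are illustrative, from a Host-Only network:

$ ssh debian@192.168.56.10 'echo $SSH_CONNECTION'
192.168.56.1 51374 192.168.56.10 22

Here {{ ansible_facts['env'].SSH_CONNECTION.split(' ')[2] }} evaluates to 192.168.56.10, the control plane address the Ansible controller actually used to connect.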

Jerome228 marked this pull request as draft on November 16, 2025
@Jerome228
Author

#2 submitted. Closing this one.
If --apiserver-advertise-address={{ ansible_facts['env'].SSH_CONNECTION.split(' ')[2] }} is not safe enough, maybe a simple warning/requirement in the documentation is needed to notify users about this?
For example: make sure that the control plane's IP address on the default-route interface (as shown by the ip route command on the control plane) is reachable from all workers; a rough check is sketched below.
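A rough sketch of that check, assuming standard iproute2 and netcat tooling (the 1.1.1.1 probe target and port 6443, the kube-apiserver default, are only illustrative):

# On the control plane: show which source IP the default route uses
$ ip -4 route get 1.1.1.1
1.1.1.1 via 10.0.2.2 dev eth0 src 10.0.2.15 uid 1000

# On each worker: confirm the API server port on that IP is reachable
$ nc -zv 10.0.2.15 6443

If the last command fails, the default-route IP cannot work as the advertise address.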
