docs(networking): document publishing.proxyProtocol + ouroboros hairpin-NAT fix#527
docs(networking): document publishing.proxyProtocol + ouroboros hairpin-NAT fix#527
Conversation
✅ Deploy Preview for cozystack ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAdds a new documentation page explaining PROXY-protocol support and the ouroboros hairpin-NAT fix, and updates the Platform Package reference to document new publishing fields related to PROXY handling. ChangesPROXY-Protocol & Platform Fields Documentation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces documentation for the PROXY-protocol and the hairpin-NAT fix using the ouroboros component in Cozystack. It explains the technical challenges of intra-cluster traffic when PROXY-protocol is enabled and provides detailed instructions for host and tenant-level configurations, including cleanup recipes. Feedback was provided to address a technical inaccuracy regarding how to verify PROXY-protocol headers and to improve the reliability of the cleanup script's regex matching.
| The flag does not configure the L4 load balancer in front of ingress-nginx for you — that lives outside the cluster (cloud LB, F5, MetalLB+haproxy, hcloud-cloud-controller-manager, …). The upstream LB **must** already be injecting PROXY-protocol v1 headers before the flag flips on. Verify with: | ||
|
|
||
| ```bash | ||
| nc -v <lb-public-ip> 443 <<<"" | ||
| ``` | ||
|
|
||
| and look for the leading `PROXY TCP4 …\r\n` frame in the response. Without that, every external request to ingress-nginx breaks the moment the flag flips on. |
There was a problem hiding this comment.
The verification step using nc to look for a PROXY header in the response is likely incorrect for standard PROXY protocol implementations. The PROXY protocol header is sent by the proxy (the Load Balancer) to the upstream server (ingress-nginx), not back to the client. An external client connecting to the LB would not see this header in the response. To verify that the LB is correctly injecting the header, it is recommended to check the ingress-nginx controller logs for successful PROXY protocol header parsing or use a tool like tcpdump on the ingress nodes to see the incoming traffic from the LB.
| existing=$(kubectl --namespace kube-system get configmap coredns \ | ||
| --output jsonpath='{.data.Corefile}') | ||
| cleaned=$(printf '%s\n' "$existing" \ | ||
| | sed '/# === BEGIN ouroboros/,/# === END ouroboros ===/d') |
There was a problem hiding this comment.
The sed command might fail to match the end marker if it doesn't exactly match the string used by the controller. Line 31 indicates the markers include (do not edit by hand). The regex /# === END ouroboros ===/ will not match a line like # === END ouroboros (do not edit by hand) === because of the trailing === in the regex. It's safer to use a more flexible match that covers both cases.
| | sed '/# === BEGIN ouroboros/,/# === END ouroboros ===/d') | |
| | sed '/# === BEGIN ouroboros/,/# === END ouroboros/d') |
bf08d0c to
7fbc317
Compare
db10a0a to
412a315
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
content/en/docs/next/networking/hairpin-proxy-protocol.md (1)
62-62: ⚡ Quick winConsider breaking up the 200+ word sentence for better readability.
The technical content about the cozystack-coredns wrapper is accurate and important, but the single-sentence structure (spanning from "cozystack ships" through the test assertion) makes it difficult to parse. Consider restructuring into 3-4 shorter sentences: (1) wrapper overview with import directive, (2) Helm three-way merge behavior with
notExists: data, (3) why this matters for tenant vs host deployment modes, (4) the test assertion safeguard.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@content/en/docs/next/networking/hairpin-proxy-protocol.md` at line 62, The long single sentence should be split into 3–4 shorter sentences to improve readability: (1) describe the wrapper overview and mention the Corefile import directive and the coredns-custom ConfigMap mounting, (2) explain Helm three-way merge behavior and the fact the wrapper renders the ConfigMap with no data: field so apiserver-side ouroboros.adds survive chart upgrades (reference ouroboros and helm.sh/resource-policy: keep), (3) note the tenant vs host deployment difference (reference coredns-import and coredns modes and that host CoreDNS is Talos-managed), and (4) call out the test invariant in packages/system/coredns/tests/coredns_custom_test.yaml asserting notExists: data. Keep sentences short and sequential so each point is clear.content/en/docs/next/operations/configuration/platform-package.md (1)
66-66: 💤 Low valueConsider restructuring the long
publishing.exposuredescription for better readability.The description is comprehensive but exceeds 150 words in a single table cell, making it challenging to scan. Consider breaking the deprecation timeline and requirements into a bulleted sub-list or shorter sentences. The technical content and KEP-5707 warning are valuable and accurate.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@content/en/docs/next/operations/configuration/platform-package.md` at line 66, The table cell for the publishing.exposure description is too long and should be broken into clearer, shorter pieces: split the single long sentence about exposure modes, deprecation (KEP-5707) and requirements into 2–3 short sentences and move the operational requirements into a compact bullet/sub-list (e.g., mode behavior, deprecation timeline, and prerequisites for loadBalancer such as Cilium L2/BGP and at least one publishing.externalIPs entry) so the cell reads as a concise summary followed by a short list of requirements; keep the identifiers publishing.exposure, publishing.externalIPs, loadBalancer, externalIPs and KEP-5707 intact for clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@content/en/docs/next/networking/hairpin-proxy-protocol.md`:
- Line 62: The long single sentence should be split into 3–4 shorter sentences
to improve readability: (1) describe the wrapper overview and mention the
Corefile import directive and the coredns-custom ConfigMap mounting, (2) explain
Helm three-way merge behavior and the fact the wrapper renders the ConfigMap
with no data: field so apiserver-side ouroboros.adds survive chart upgrades
(reference ouroboros and helm.sh/resource-policy: keep), (3) note the tenant vs
host deployment difference (reference coredns-import and coredns modes and that
host CoreDNS is Talos-managed), and (4) call out the test invariant in
packages/system/coredns/tests/coredns_custom_test.yaml asserting notExists:
data. Keep sentences short and sequential so each point is clear.
In `@content/en/docs/next/operations/configuration/platform-package.md`:
- Line 66: The table cell for the publishing.exposure description is too long
and should be broken into clearer, shorter pieces: split the single long
sentence about exposure modes, deprecation (KEP-5707) and requirements into 2–3
short sentences and move the operational requirements into a compact
bullet/sub-list (e.g., mode behavior, deprecation timeline, and prerequisites
for loadBalancer such as Cilium L2/BGP and at least one publishing.externalIPs
entry) so the cell reads as a concise summary followed by a short list of
requirements; keep the identifiers publishing.exposure, publishing.externalIPs,
loadBalancer, externalIPs and KEP-5707 intact for clarity.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 073192ff-dc56-47b4-835e-8bb2a1816d48
📒 Files selected for processing (2)
content/en/docs/next/networking/hairpin-proxy-protocol.mdcontent/en/docs/next/operations/configuration/platform-package.md
New page under content/en/docs/next/networking/ explaining how to turn on PROXY-protocol at the cozystack host ingress-nginx and the bundled ouroboros component that fixes the resulting hairpin-NAT problem (intra-cluster traffic to the cluster's own public hostnames arriving at ingress-nginx without the PROXY header it now requires). Covers: - The hairpin-NAT failure mode: kube-proxy / Cilium-KPR short-circuits the LB IP for in-cluster lookups, the connection bypasses the upstream LB and arrives at ingress-nginx without a PROXY header, ingress-nginx closes the connection. - Why KEP-1860 ipMode=Proxy is not a fit for the cozystack default stack (Cilium + kubeProxyReplacement drops the LB frontend entirely when ipMode != VIP, breaking Service ingress for L2/BGP-announced IPs). - ouroboros itself (a Go reimplementation of compumike/hairpin-proxy) and the host vs tenant mode split: coredns mode mutates the Talos-managed Corefile in place on the host, coredns-import mode writes plugin-only snippets into a separate coredns-custom ConfigMap that the cozystack-coredns Corefile imports. - The single platform flag publishing.proxyProtocol: true (host scope), the per-tenant addons.ouroboros knob, and how PROXY-protocol gets wired onto tenant ingress-nginx via valuesOverride (unlike the host flag, the tenant addon does not auto-wire it because tenants commonly have different upstream-LB setups). - The upstream-LB precondition: the L4 LB in front of ingress-nginx must already be injecting PROXY-v1 headers before the flag flips on. Two practical verification recipes (tcpdump on the ingress-nginx node before flipping, or ingress-nginx logs after the LB-side change) — `nc` from a laptop will never see a frame because PROXY-protocol travels LB-to-backend, not back to clients. - The Talos re-render flap window: a machine-config push wipes the BEGIN/END markers in the live host Corefile and the controller re-applies on next reconcile, leaving a brief intra-cluster DNS failure window during the roll. - The disable hazard on both layers (no cleanup hook for coredns or coredns-import mode in the upstream chart), the lookup-based acknowledge gates that catch the obvious wrong path, the known hole when an operator manually deletes the Package or HelmRelease before flipping the flag back, and the JSON-patch cleanup recipes (host: jq-built merge-patch on the live Corefile that preserves labels, annotations, and Talos-managed metadata; tenant: null the ouroboros.override key in coredns-custom). - Air-gapped operators must mirror two extra non-cozystack registry locations: oci://ghcr.io/lexfrei/charts/ouroboros and ghcr.io/lexfrei/ouroboros — this package is intentionally not mirrored under ghcr.io/cozystack/*. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
412a315 to
52d75b4
Compare
What this PR does
Adds
content/en/docs/next/networking/hairpin-proxy-protocol.mdto document the newpublishing.proxyProtocolflag + the bundled ouroboros hairpin-NAT fix that lands in cozystack/cozystack#2561.The page covers:
ipMode=Proxyis not a fit for the cozystack default stack (Cilium +kubeProxyReplacement: truedrops the LB frontend entirely whenipMode != VIP).compumike/hairpin-proxy— seelexfrei/ouroboros) and the host-vs-tenant mode split (corednsmutates the Talos-managed Corefile in place;coredns-importwrites plugin-only snippets into a separatecoredns-customConfigMap that the cozystack-coredns chart imports).publishing.proxyProtocol: true(host scope), the per-tenantaddons.ouroboros.enabledknob, and how PROXY-protocol gets wired onto tenant ingress-nginx viavaluesOverride.external-dnsmode), the lookup-based acknowledge gates that catch the obvious wrong path, the known hole when an operator manually deletes the Package / HelmRelease before flipping the flag back, and the JSON-patch cleanup recipes (host + tenant).lexfrei/ouroboros, not mirrored underghcr.io/cozystack/*.Pairs with cozystack/cozystack#2561.
Release note
Summary by CodeRabbit