feat(helm): add TLS termination for Envoy Gateway ingress #2015

Draft
zhaohuabing wants to merge 1 commit into NVIDIA:main from zhaohuabing:feat/eg-ingress-tls-termination

Conversation

zhaohuabing (Contributor) commented Jun 26, 2026

Summary

The chart's optional Gateway API ingress only rendered a plaintext HTTP listener, so the OpenShell gateway could not be exposed over TLS. This adds an HTTPS listener option that terminates TLS at the Envoy Gateway and forwards plaintext gRPC to the gateway pod, with guardrails and docs for the supported configuration.

Related Issue

Close #2017

Changes

  • templates/gateway.yaml — renders an HTTPS listener with tls.mode: Terminate and certificateRefs when grpcRoute.gateway.listener.protocol=HTTPS; the default HTTP listener is unchanged. Two fail guards: empty certificateRefs, and HTTPS without server.disableTls=true (the chart does not render a BackendTLSPolicy for re-encryption, so the backend hop must be plaintext).
  • values.yaml — adds grpcRoute.gateway.listener.tls.certificateRefs and clarifies protocol/port usage.
  • ci/values-gateway-tls.yaml — new CI overlay exercising the HTTPS branch in lint/render.
  • docs/kubernetes/ingress.mdx — documents HTTPS termination setup, and clarifies that Envoy Gateway only terminates TLS (no OIDC SecurityPolicy, which is browser-only); client identity uses OIDC bearer tokens, with the client-credentials grant for headless agents.
  • .agents/skills/debug-openshell-cluster/SKILL.md — adds HTTPS-ingress troubleshooting rows (plaintext-backend mismatch, unauthenticated-after-connect).
  • README.md — regenerated chart values table.
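
As a rough illustration of the values described above, a minimal overlay that enables the HTTPS branch might look like the sketch below. Only the keys named in this PR are shown; any other settings needed to enable the Gateway API ingress are omitted. The Secret name reuses the existing openshell-server-tls Secret mentioned in the values.yaml comments, and the Gateway API `name` field on the reference is an assumption about how the chart passes entries through.

```yaml
# Sketch only: key paths come from this PR's description; the
# certificateRefs entry shape is assumed to follow the Gateway API.
server:
  # Required by the second fail guard: Envoy Gateway forwards
  # plaintext gRPC, so the gateway pod must listen without TLS.
  disableTls: true

grpcRoute:
  gateway:
    listener:
      protocol: HTTPS
      tls:
        certificateRefs:
          # Secret must live in the Gateway's namespace, and its SANs
          # must include the external hostname.
          - name: openshell-server-tls
```

Leaving `certificateRefs` empty, or leaving `server.disableTls` unset while `protocol` is HTTPS, is expected to fail the render through the two guards.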

Testing

  • helm lint (defaults + all CI variants) and helm template verified: default renders the unchanged HTTP listener; HTTPS renders Terminate + certificateRefs; both fail guards fire as expected.

  • mise run markdown:lint:md, mise run license:check, and mise run helm:docs:check pass.

  • codex exec review --uncommitted run to convergence — no remaining findings.

  • mise run pre-commit: the relevant checks (helm lint, markdown, license, helm-docs) pass; rust:lint fails on a pre-existing local environment issue (missing z3.h system header) that is unrelated to these Helm/docs-only changes.

  • Unit tests added/updated — N/A (Helm template + docs only)

  • E2E tests added/updated (if applicable) — not run locally (requires a live k3d + Envoy Gateway cluster); recommend running the Gateway API e2e path in CI
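
For context on the helm template check above, the HTTPS render is expected to produce a Gateway API listener along these lines. This is a hedged sketch rather than the chart's actual output: the resource name, the gatewayClassName, the listener name, and port 443 are hypothetical placeholders, while `protocol: HTTPS`, `tls.mode: Terminate`, and `certificateRefs` are the fields this PR describes.

```yaml
# Sketch of a rendered Gateway; metadata.name, gatewayClassName,
# listener name, and port are hypothetical placeholders.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: openshell-gateway
spec:
  gatewayClassName: eg
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      tls:
        # TLS ends at Envoy Gateway; the backend hop is plaintext gRPC.
        mode: Terminate
        certificateRefs:
          - kind: Secret
            name: openshell-server-tls
```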

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable) — no architecture/ change needed; user-facing docs (docs/kubernetes/ingress.mdx) and the chart README/skill updated per AGENTS.md

The chart's optional Gateway API ingress only rendered a plaintext HTTP
listener, so the gateway could not be exposed over TLS. Add an HTTPS
listener option that terminates TLS at the Envoy Gateway and forwards
plaintext gRPC to the gateway pod.

- gateway.yaml renders an HTTPS listener with `tls.mode: Terminate` and
  `certificateRefs` when `grpcRoute.gateway.listener.protocol=HTTPS`,
  keeping the default HTTP listener unchanged. Guards fail the render when
  `certificateRefs` is empty or `server.disableTls` is not true (the chart
  does not render a BackendTLSPolicy for re-encryption).
- values.yaml adds `grpcRoute.gateway.listener.tls.certificateRefs`.
- ci/values-gateway-tls.yaml exercises the HTTPS branch in lint/render.
- docs/kubernetes/ingress.mdx documents HTTPS setup and clarifies that
  Envoy Gateway only terminates TLS (no OIDC SecurityPolicy); client
  identity uses OIDC bearer tokens, with the client-credentials grant for
  headless agents.
- debug-openshell-cluster skill gains HTTPS-ingress troubleshooting rows.
- Regenerated the chart README values table.

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

copy-pr-bot (bot) commented Jun 26, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

zhaohuabing marked this pull request as draft June 26, 2026 07:30
TaylorMutch self-assigned this Jun 26, 2026

TaylorMutch (Collaborator) left a comment

Couple small comments, but I think this is looking good so far.

{{- fail "grpcRoute.gateway.listener.tls.certificateRefs is required when grpcRoute.gateway.listener.protocol is HTTPS" }}
{{- end }}
{{- if not .Values.server.disableTls }}
{{- fail "grpcRoute.gateway.listener.protocol=HTTPS terminates TLS at Envoy Gateway, which forwards plaintext gRPC to the gateway pod; set server.disableTls=true so the pod listens plaintext (this chart does not render a BackendTLSPolicy for re-encryption to a TLS backend)" }}


Could we document a way to enable this? Possibly as a follow-up; it would be great to support end-to-end TLS, but I think this PR is a good first step.

# in the Gateway's namespace. May reference a cert-manager-issued Secret
# or the existing openshell-server-tls Secret (its SANs must include the
# external hostname).
certificateRefs: []


If a certificate is provided by e.g. cert-manager, would that be supported automatically?

TaylorMutch (Collaborator) commented

Ah, just realized this is in draft still. Sorry!

Development

Successfully merging this pull request may close these issues.

Support TLS termination for the Envoy Gateway ingress

2 participants