Open
Labels
area/inference-extension: Related to the Gateway API Inference Extension
documentation: Improvements or additions to documentation
Description
As a user, I want to know how to use NGINX Gateway Fabric (NGF) with the Gateway API Inference Extension, so I can route traffic intelligently to my AI workloads in Kubernetes.
Acceptance Criteria:
- Add a user guide on how to route traffic to AI workloads using NGF
- Should cover how to install the Gateway API Inference Extension CRDs and how to deploy NGF with the feature flag enabled
- Should cover how to deploy an InferencePool and its Endpoint Picker (EPP), and how to configure an HTTPRoute to reference the InferencePool
- Explain how to secure traffic between the NGINX pod and the EPP using cert-manager (mentioning that by default, we create self-signed certs)
- Show examples of model name redirects and traffic splitting
- Link to the Gateway API Inference Extension docs where it makes sense (for example, those docs may better describe the InferenceObjective CRD and how a user should handle it)
Status
🆕 New