Commit 59f0742

enable istio as a provider + configuring destinationRule
Signed-off-by: greg pereira <[email protected]>
1 parent 58ac08d commit 59f0742

3 files changed: 46 additions and 3 deletions

config/charts/inferencepool/README.md

Lines changed: 3 additions & 3 deletions
@@ -16,7 +16,7 @@ To install via the latest published chart in staging (--version v0 indicates la
 ```txt
 $ helm install vllm-llama3-8b-instruct \
   --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
-  --set provider.name=[none|gke] \
+  --set provider.name=[none|gke|istio] \
   oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool --version v0
 ```

@@ -75,7 +75,7 @@ Use `--set inferencePool.modelServerType=triton-tensorrt-llm` to install for Tri
 $ helm install triton-llama3-8b-instruct \
   --set inferencePool.modelServers.matchLabels.app=triton-llama3-8b-instruct \
   --set inferencePool.modelServerType=triton-tensorrt-llm \
-  --set provider.name=[none|gke] \
+  --set provider.name=[none|gke|istio] \
   oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool --version v0
 ```

@@ -124,7 +124,7 @@ The following table list the configurable parameters of the chart.
 | `inferenceExtension.extraContainerPorts` | List of additional container ports to expose. Defaults to `[]`. |
 | `inferenceExtension.extraServicePorts` | List of additional service ports to expose. Defaults to `[]`. |
 | `inferenceExtension.logVerbosity` | Logging verbosity level for the endpoint picker. Defaults to `"3"`. |
-| `provider.name` | Name of the Inference Gateway implementation being used. Possible values: `gke`. Defaults to `none`. |
+| `provider.name` | Name of the Inference Gateway implementation being used. Possible values: [`gke`, `none`, `istio`]. Defaults to `none`. |
 | `inferenceExtension.enableLeaderElection` | Enable leader election for high availability. When enabled, only one EPP pod (the leader) will be ready to serve traffic. It is recommended to set `inferenceExtension.replicas` to a value greater than 1 when this is set to `true`. Defaults to `false`. |

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+{{- if eq .Values.provider.name "istio" }}
+---
+{{- if .Values.istio.destinationRule.enabled }}
+apiVersion: networking.istio.io/v1beta1
+kind: DestinationRule
+metadata:
+  name: {{ include "gateway-api-inference-extension.name" . }}
+spec:
+  host: {{ .Values.istio.destinationRule.host | default (printf "%s.%s.svc.cluster.local" (include "gateway-api-inference-extension.name" .) .Release.Namespace) }}
+  {{- if .Values.istio.destinationRule.trafficPolicy }}
+  trafficPolicy:
+    {{- toYaml .Values.istio.destinationRule.trafficPolicy | nindent 4 }}
+  {{- end }}
+  {{- with .Values.istio.destinationRule.subsets }}
+  subsets:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+  {{- with .Values.istio.destinationRule.exportTo }}
+  exportTo:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+  {{- with .Values.istio.destinationRule.workloadSelector }}
+  workloadSelector:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+{{- end }}
+{{- end }}
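
For orientation, here is a rough sketch of what the template above might render with `provider.name=istio` and the chart's default `istio.destinationRule` values. The optional `trafficPolicy`, `subsets`, `exportTo`, and `workloadSelector` blocks are skipped because empty maps and lists evaluate as false in Helm's `if` and `with`, and the empty `host` string falls through to the computed default. The name `example-pool` and the `default` namespace are assumptions for illustration only; this commit does not show what the `gateway-api-inference-extension.name` helper resolves to.

```yaml
# Hypothetical rendered output. "example-pool" and "default" are assumed
# values for the name helper and the release namespace.
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: example-pool
spec:
  host: example-pool.default.svc.cluster.local
```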

config/charts/inferencepool/values.yaml

Lines changed: 16 additions & 0 deletions
@@ -45,10 +45,26 @@ inferencePool:
   # matchLabels:
   # app: vllm-llama3-8b-instruct
 
+# Options: ["gke", "istio", "none"]
 provider:
   name: none
 
 gke:
   monitoringSecret:
     name: inference-gateway-sa-metrics-reader-secret
     namespace: default
+
+istio:
+  destinationRule:
+    enabled: true
+    # Provide a way to override the default calculated host
+    host: ""
+    # Optional: Apply a mesh-wide traffic policy
+    trafficPolicy: {}
+    # Optional: Define subsets for versioned routing (e.g., by labels)
+    subsets: []
+    # Optional: Control which namespaces can access this DestinationRule
+    exportTo: []
+    # Optional: Apply only to specific workloads (via selector labels)
+    workloadSelector: {}
+
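
As a minimal sketch of how these new values could be overridden, the file below selects the Istio provider and fills in two of the optional fields. The file name `istio-values.yaml` and the specific `trafficPolicy` and `exportTo` settings are illustrative assumptions, not part of this commit; whether they suit a given mesh depends on how the endpoint picker service is exposed.

```yaml
# istio-values.yaml (hypothetical file name; settings are illustrative only)
provider:
  name: istio

istio:
  destinationRule:
    enabled: true
    # Leave empty to use the computed <name>.<namespace>.svc.cluster.local host
    host: ""
    # Rendered under spec.trafficPolicy of the DestinationRule
    trafficPolicy:
      tls:
        mode: SIMPLE
    # "." limits visibility of the rule to the release namespace
    exportTo:
      - "."
```

A file like this could be passed to the `helm install` commands from the README with `-f istio-values.yaml` in place of the individual `--set` flags for the provider.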
