kondratyevd
Collaborator

One can now set the following parameters in values.yaml:

envoy:
  dynamic_routing:
    enabled: true
  lua_filter:
    enabled: true
    lua_config: "cfg/envoy-filter-dynamic.lua"

In envoy-filter-dynamic.lua, we add logic that injects an HTTP header whose value is the address to which Envoy will route the request.

  • For inference requests, we extract the model name from the request body and redirect the request to a model-specific load balancer.
  • For model repository index requests, we redirect the request to a repository index aggregator.
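
The routing decision described above can be sketched outside of Envoy. The Python below is only an illustration, not the Lua filter from this PR. It assumes requests arrive as gRPC calls to Triton's GRPCInferenceService, that the body starts with the standard 5-byte gRPC frame prefix, and that model_name (field 1 of ModelInferRequest) is serialized first. The header name, the aggregator address, and the "<model>-lb:8001" naming scheme are hypothetical placeholders.

```python
from typing import Optional, Tuple

# Hypothetical names for illustration; none of these come from the PR.
ROUTE_HEADER = "x-route-to"
INDEX_AGGREGATOR = "index-aggregator:8001"

# gRPC method paths, assuming Triton's GRPCInferenceService is the backend.
INFER_PATH = "/inference.GRPCInferenceService/ModelInfer"
INDEX_PATH = "/inference.GRPCInferenceService/RepositoryIndex"


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a protobuf varint at pos; return (value, next position)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def extract_model_name(body: bytes) -> Optional[str]:
    """Read model_name from a gRPC-framed ModelInferRequest.

    Assumes an uncompressed frame (1 flag byte + 4 length bytes) and that
    field 1 (tag byte 0x0A) is the first field in the message.
    """
    if len(body) < 5 or body[0] != 0:
        return None
    msg = body[5:]
    if len(msg) < 2 or msg[0] != 0x0A:
        return None
    try:
        length, pos = read_varint(msg, 1)
    except IndexError:
        return None
    name = msg[pos:pos + length]
    if length == 0 or len(name) != length:
        return None
    return name.decode("utf-8", errors="replace")


def route_target(path: str, body: bytes) -> Optional[str]:
    """Return the value to place in ROUTE_HEADER, or None to leave it unset."""
    if path == INDEX_PATH:
        return INDEX_AGGREGATOR
    if path == INFER_PATH:
        model = extract_model_name(body)
        if model:
            # Hypothetical naming scheme for a model-specific load balancer.
            return f"{model}-lb:8001"
    return None
```

In this sketch, a request that matches neither path gets no routing header, so whatever default route Envoy is configured with would still apply.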

@kondratyevd
Collaborator Author

This repo has a prototype Python framework for dynamically loading and unloading models on different Triton servers: https://github.com/kondratyevd/supersonic-model-loader
