File: docs/my-website/docs/proxy/load_balancing.md
Lines changed: 31 additions & 38 deletions
@@ -13,6 +13,23 @@ For more details on routing strategies / params, see [Routing](../routing.md)
:::
## How Load Balancing Works
LiteLLM automatically distributes requests across multiple deployments of the same model using its built-in router. The proxy routes traffic to optimize performance and reliability.

The `simple-shuffle` routing strategy is used by default.
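To make the default behavior concrete, here is a minimal sketch of random request distribution in plain Python. It is an illustration of the idea only, not LiteLLM's implementation: the deployment names, `pick_deployment`, and `simulate` are all hypothetical, and the sketch assumes every deployment is equally likely to be chosen.

```python
import random
from collections import Counter

# Hypothetical deployment names for a single model group (illustrative only).
DEPLOYMENTS = ["deployment-a", "deployment-b", "deployment-c"]


def pick_deployment(deployments, rng):
    """Pick one deployment uniformly at random (assumed behavior for this sketch)."""
    return rng.choice(deployments)


def simulate(n_requests, seed=0):
    """Route n_requests and count how many land on each deployment."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    return Counter(pick_deployment(DEPLOYMENTS, rng) for _ in range(n_requests))


if __name__ == "__main__":
    # Over many requests, the counts should come out roughly even.
    print(simulate(3000))
```

Because each request is routed independently, no shared state about in-flight requests or usage is needed, which is one plausible reason a random strategy suits general-purpose traffic.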
### Routing Strategies
| Strategy | Description | When to Use |
|----------|-------------|-------------|
| **simple-shuffle** (recommended) | Randomly distributes requests | General purpose, good for even load distribution |
| **least-busy** | Routes to the deployment with the fewest active requests | High-concurrency scenarios |
| **usage-based-routing** (bad for performance) | Routes to the deployment with the lowest current usage (RPM/TPM) | When you want to respect rate limits evenly |
| **latency-based-routing** | Routes to the fastest-responding deployment | Latency-critical applications |
| **cost-based-routing** | Routes to the lowest-cost deployment | Cost-sensitive applications |
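The non-random strategies in the table each amount to picking the deployment that minimizes some metric. The sketch below shows that idea under stated assumptions: `DeploymentStats`, its fields, the sample numbers, and `choose` are hypothetical names invented for illustration and do not reflect how LiteLLM tracks or stores these metrics.

```python
from dataclasses import dataclass


@dataclass
class DeploymentStats:
    """Hypothetical per-deployment metrics (field names are assumptions)."""
    name: str
    active_requests: int
    rpm_used: int
    avg_latency_s: float
    cost_per_1k_tokens: float


# Each strategy name from the table maps to the metric it minimizes.
METRIC_FOR_STRATEGY = {
    "least-busy": lambda d: d.active_requests,
    "usage-based-routing": lambda d: d.rpm_used,
    "latency-based-routing": lambda d: d.avg_latency_s,
    "cost-based-routing": lambda d: d.cost_per_1k_tokens,
}


def choose(deployments, strategy):
    """Return the deployment with the lowest value of the strategy's metric."""
    if strategy not in METRIC_FOR_STRATEGY:
        raise ValueError(f"unknown strategy: {strategy}")
    return min(deployments, key=METRIC_FOR_STRATEGY[strategy])


# Made-up sample data, chosen so each strategy prefers a different deployment.
SAMPLE = [
    DeploymentStats("deployment-a", active_requests=2, rpm_used=90, avg_latency_s=0.8, cost_per_1k_tokens=0.03),
    DeploymentStats("deployment-b", active_requests=7, rpm_used=10, avg_latency_s=1.2, cost_per_1k_tokens=0.02),
    DeploymentStats("deployment-c", active_requests=5, rpm_used=40, avg_latency_s=0.4, cost_per_1k_tokens=0.01),
]

if __name__ == "__main__":
    for strategy in METRIC_FOR_STRATEGY:
        print(strategy, "->", choose(SAMPLE, strategy).name)
```

Unlike random selection, every one of these strategies needs up-to-date metrics for all deployments before it can route a request, which is a cost worth weighing when choosing a strategy.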