When setting concurrency limits, you should choose appropriate values based on your specific workload patterns. This section provides guidance on how to calibrate these limits effectively.
### Using Prometheus metrics and logs for calibration
Prometheus metrics provide quantitative insights into usage patterns and the impact of each type of RPC on Gitaly node resources. Several key metrics are particularly valuable for this analysis:

- Resource consumption metrics per RPC. Gitaly offloads most heavy operations to `git` processes, so the command it shells out to is usually the Git binary.
  Gitaly exposes metrics collected from those commands as logs and Prometheus metrics:
  - `gitaly_command_cpu_seconds_total` - Sum of CPU time spent by shelling out, with labels for `grpc_service`, `grpc_method`, `cmd`, and `subcmd`.
  - `gitaly_command_real_seconds_total` - Sum of real time spent by shelling out, with similar labels.
- Recent limiting metrics per RPC:
  - `gitaly_concurrency_limiting_in_progress` - Number of concurrent requests being processed.
  - `gitaly_concurrency_limiting_queued` - Number of requests for an RPC for a given repository in a waiting state.
  - `gitaly_concurrency_limiting_acquiring_seconds` - Duration a request waits because of concurrency limits before processing.

These metrics provide a high-level view of resource utilization at a given point in time. The `gitaly_command_cpu_seconds_total` metric is particularly effective for identifying specific RPCs that consume substantial CPU resources. Additional metrics are available for more detailed analysis as described in [Monitoring Gitaly](monitoring.md).
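
For example, a query like the following (illustrative; adjust the rate window and aggregation labels to your environment) surfaces the RPCs whose spawned Git commands consume the most CPU:

```promql
# Top 10 RPCs by CPU consumed in spawned Git commands over the last 5 minutes
topk(10, sum by (grpc_service, grpc_method) (rate(gitaly_command_cpu_seconds_total[5m])))
```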
While metrics capture overall resource usage patterns, they typically do not provide per-repository breakdowns. Therefore, logs serve as a complementary data source. To analyze logs:
1. Filter logs by identified high-impact RPCs.
1. Aggregate filtered logs by repository or project.
1. Visualize aggregated results on a time-series graph.

This combined approach of using both metrics and logs provides comprehensive visibility into both system-wide resource usage and repository-specific patterns. Analysis tools such as Kibana or similar log aggregation platforms can facilitate this process.
### Adjusting limits
If you find that your initial limits are not effective, you might need to adjust them. With adaptive limiting, precise limits are less critical because the system automatically adjusts them based on resource usage.

Remember that concurrency limits are scoped by repository. A limit of 30 means allowing at most 30 simultaneous in-flight requests per repository. If the limit is reached, requests are queued and rejected only if the queue is full or the maximum waiting time is reached.
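
As an illustration of how these settings fit together, the following sketch limits one RPC to 30 in-flight requests per repository with a bounded queue. Key placement and values are illustrative; see [RPC concurrency](#limit-rpc-concurrency) for the authoritative configuration format.

```ruby
# Illustrative sketch: a per-repository limit of 30 for one RPC with a bounded queue.
gitaly['configuration'] = {
  concurrency: [
    {
      rpc: "/gitaly.SmartHTTPService/PostUploadPackWithSidechannel",
      max_per_repo: 30,       # At most 30 simultaneous in-flight requests per repository
      max_queue_wait: "60s",  # Queued requests are rejected after waiting this long
      max_queue_size: 300     # Requests beyond this queue depth are rejected
    }
  ]
}
```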
## Adaptive concurrency limiting
The adaptive limiter calibrates the limits every 30 seconds. When the Gitaly node is under resource pressure, the limits are reduced quickly. Otherwise, the limits increase by one until reaching the upper bound. For more information about the technical implementation of this system, refer to [the related design document](https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/gitaly_adaptive_concurrency_limit/).
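
As a rough illustration of that behavior (a sketch, not Gitaly's actual implementation), the calibration step acts like an additive-increase, multiplicative-decrease loop bounded by the configured limits:

```ruby
# Illustrative sketch only: mirrors the behavior described above, where limits
# back off exponentially under resource pressure (for example, 60 -> 30 -> 15 -> 10)
# and otherwise grow by one toward the upper bound.
def calibrate(current_limit, min_limit:, max_limit:, under_pressure:)
  if under_pressure
    [current_limit / 2, min_limit].max  # multiplicative decrease, floored at min_limit
  else
    [current_limit + 1, max_limit].min  # additive increase, capped at max_limit
  end
end
```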
Adaptive limiting is enabled for each RPC or pack-objects cache individually. However, limits are calibrated at the same time. Adaptive limiting has the following configuration options:

- `adaptive` sets whether the adaptiveness is enabled. If set, the `max_per_repo` value (RPC concurrency) or `max_concurrency` value (pack-objects concurrency) is ignored in favor of the following options.
- `max_limit` is the maximum concurrency limit. Gitaly increases the current limit until it reaches this number. This should be a generous value that the system can fully support under typical conditions.
- `min_limit` is the minimum concurrency limit of the configured RPC. When the host machine has a resource problem, Gitaly quickly reduces the limit until it reaches this value. Setting `min_limit` to 0 could completely shut down processing, which is typically undesirable.
- `initial_limit` is the concurrency limit to use when Gitaly starts. It provides a reasonable starting point between `min_limit` and `max_limit`.

### Enable adaptiveness for RPC concurrency
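
A minimal sketch of an adaptive RPC concurrency entry follows. Key placement and values are illustrative and assume the `concurrency` entry format used for static RPC limits:

```ruby
# Illustrative sketch: enable adaptive limiting for one RPC. With adaptive: true,
# the static max_per_repo value is ignored and the limit moves between min_limit
# and max_limit, starting at initial_limit.
gitaly['configuration'] = {
  concurrency: [
    {
      rpc: "/gitaly.SmartHTTPService/PostUploadPackWithSidechannel",
      max_queue_wait: "60s",
      max_queue_size: 300,
      adaptive: true,
      min_limit: 10,
      initial_limit: 40,
      max_limit: 60
    }
  ]
}
```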
For more information, see [RPC concurrency](#limit-rpc-concurrency).
### Enable adaptiveness for pack-objects concurrency
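
A minimal sketch of adaptive pack-objects limiting follows. The `pack_objects_limiting` key placement and the values are illustrative and assume the format used for the static pack-objects limit:

```ruby
# Illustrative sketch: enable adaptive limiting for the pack-objects cache.
# With adaptive: true, the static max_concurrency value is ignored and the per-IP
# limit moves between min_limit and max_limit, starting at initial_limit.
gitaly['configuration'] = {
  pack_objects_limiting: {
    adaptive: true,
    min_limit: 10,
    initial_limit: 20,
    max_limit: 40
  }
}
```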
For more information, see [pack-objects concurrency](#limit-pack-objects-concurrency).
### Calibrating adaptive concurrency limits
Adaptive concurrency limiting takes a different approach from the static limits that otherwise protect Gitaly resources. Rather than relying on static thresholds that may be either too restrictive or too permissive, adaptive limiting responds to actual resource conditions in real time.

This approach eliminates the need to find "perfect" threshold values through extensive calibration as described in [Calibrating concurrency limits](#calibrating-concurrency-limits). During failure scenarios, the adaptive limiter reduces limits exponentially (for example, 60 → 30 → 15 → 10) and then automatically recovers by incrementally raising limits when the system stabilizes.

When calibrating adaptive limits, you can prioritize flexibility over precision.
#### RPC categories and configuration examples
Expensive Gitaly RPCs, which should be protected, can be categorized into two general types:
- Pure Git data operations.
- Time-sensitive RPCs.

Each type has distinct characteristics that influence how concurrency limits should be configured. The following examples illustrate the reasoning behind limit configuration. They can also be used as a starting point.
##### Pure Git data operations
These RPCs involve Git pull, push, and fetch operations, and possess the following characteristics:
- Long-running processes.
- Significant resource utilization.
- Computationally expensive.
- Not time-sensitive. Additional latency is generally acceptable.

RPCs in `SmartHTTPService` and `SSHService` fall into the pure Git data operations category. A configuration example:
```ruby
{
  rpc: "/gitaly.SmartHTTPService/PostUploadPackWithSidechannel", # or `/gitaly.SSHService/SSHUploadPackWithSidechannel`
  adaptive: true,
  min_limit: 10,      # Minimum concurrency to maintain even under extreme load
  initial_limit: 40,  # Starting concurrency when the service initializes
  max_limit: 60,      # Maximum concurrency under ideal conditions
  max_queue_wait: "60s",
  max_queue_size: 300
}
```
##### Time-sensitive RPCs
These RPCs serve GitLab itself and other clients with different characteristics:
- Typically part of online HTTP requests or Sidekiq background jobs.
- Shorter latency profiles.
- Generally less resource-intensive.

For these RPCs, the timeout configuration in GitLab should inform the `max_queue_wait` parameter. For instance, `get_tree_entries` typically has a medium timeout of 30 seconds in GitLab:
```ruby
{
  rpc: "/gitaly.CommitService/GetTreeEntries",
  adaptive: true,
  min_limit: 5,       # Minimum throughput maintained under resource pressure
  initial_limit: 10,  # Initial concurrency setting
  max_limit: 20,      # Maximum concurrency under optimal conditions
  max_queue_size: 50,
  max_queue_wait: "30s"
}
```
### Monitoring adaptive limiting
To observe how adaptive limits are behaving in production environments, refer to the monitoring tools and metrics described in [Monitoring Gitaly](monitoring.md).
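
For example, a query like the following (the label names are assumptions based on the command metrics listed earlier; verify them against your Prometheus instance) shows how many requests are waiting behind a concurrency limit for each RPC:

```promql
# Requests currently queued behind a concurrency limit, per RPC
sum by (grpc_service, grpc_method) (gitaly_concurrency_limiting_queued)
```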