batched-parallel mode has performance issues because it uses a thread pool. Even with rate limiting, it still has performance limitations in its current form.
There are some ideas for how to fix this:
Add a prestep that partitions messages by fingerprint and passes them to run_task_with_multiprocessing
- similar to sentry/src/sentry/spans/consumers/process/factory.py (lines 315 to 324 in 7d723dc)
- BatchStep processes a batch of messages and produces a list of batches, where each sublist contains messages with the same fingerprint
- Unbatch then emits each of those sublists individually, so that the multiprocessing step receives a batch of messages per fingerprint rather than individual messages
- Order of steps: first a parallel step to deserialize, then batch, then process
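
The batch, partition, and unbatch flow described above could look roughly like the sketch below. It is plain Python and does not use the real consumer pipeline API; `Message`, `partition_by_fingerprint`, `unbatch`, and `process_group` are hypothetical names made up for illustration, and the worker only counts messages.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Message:
    # Hypothetical stand-in for a deserialized message.
    fingerprint: str
    payload: str


def partition_by_fingerprint(batch: Iterable[Message]) -> list[list[Message]]:
    """Turn one batch into sublists that each share a single fingerprint."""
    groups: dict[str, list[Message]] = defaultdict(list)
    for message in batch:
        groups[message.fingerprint].append(message)
    # Dicts preserve insertion order, so groups come out in first-seen order.
    return list(groups.values())


def unbatch(groups: Iterable[list[Message]]) -> Iterator[list[Message]]:
    """Emit each per-fingerprint sublist as its own unit of work."""
    for group in groups:
        yield group


def process_group(group: list[Message]) -> tuple[str, int]:
    """Placeholder for the work the multiprocessing step would do per group."""
    return group[0].fingerprint, len(group)


batch = [Message("a", "1"), Message("b", "2"), Message("a", "3")]
results = [process_group(g) for g in unbatch(partition_by_fingerprint(batch))]
print(results)  # [('a', 2), ('b', 1)]
```

With this shape, each worker receives all messages for one fingerprint at once, so per-fingerprint work does not have to be coordinated across processes.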
Adding a timer to
could be useful too to gain insight into performance
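
As a rough idea of what such a timer might look like, here is a generic sketch using only the standard library. The `timer` helper, the `record` callback, and the `process_batch` label are assumptions for illustration, not the metrics API the codebase actually uses.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def timer(name: str, record: Callable[[str, float], None]) -> Iterator[None]:
    """Measure wall-clock time of the wrapped block and report it via record()."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Report the duration even if the wrapped block raises.
        record(name, time.perf_counter() - start)


# Hypothetical usage: store durations in a dict instead of a metrics backend.
timings: dict[str, float] = {}
with timer("process_batch", timings.__setitem__):
    sum(range(1000))
```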