Poor performance of DefaultLongTaskTimer#takeSnapshot #3993
Comments
Thank you for the issue! Do you have benchmarks that show the bottleneck, or is this a theoretical issue?
Could you please elaborate on this? Which method is affected, and where can the race condition arise?
Thank you for the analysis. The DefaultLongTaskTimer's takeSnapshot implementation currently does not scale well with many active tasks. Out of curiosity, what order of magnitude of active tasks are you expecting at a time? 1,000? 1,000,000?
DefaultLongTaskTimer#takeSnapshot
I created pull request #4001
If you may sacrifice |
DefaultLongTaskTimer#takeSnapshot
Is this planned to be taken up anytime soon? We're experiencing similar behavior when we let our application run for a long time. We're using Spring Boot and Micrometer with the default configuration, and a Prometheus server scrapes the metrics once a minute. We notice that the CPU usage of the system goes up continuously. If we increase the scraping frequency, the CPU usage goes up faster. First the Micrometer endpoint starts to respond more slowly, and eventually the entire application is affected.
@fdulger We had the same issue in our system. The problem started after we updated Spring, which now registers its own DefaultMeterObservationHandler. We already had one handler registered, so this caused a conflict.
Fyi, since Also, I would really look into why some of your |
Poor performance of MicrometerCollector.collect(), which uses DefaultLongTaskTimer. DefaultLongTaskTimer contains a lot of calls to ConcurrentLinkedDeque.size(), which has O(n) complexity, during histogram creation.
Micrometer version: 1.10.3
It might be worth saving the size of the deque, or a full snapshot, at the beginning of the DefaultLongTaskTimer.takeSnapshot() execution. Also, the method may contain data races between histogram creation and updates of activeTasks.
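To illustrate the suggestion, here is a minimal sketch of the "snapshot once" idea. It is not Micrometer's actual code: the class ActiveTaskSnapshotSketch, its field, and its methods are hypothetical, and it assumes the active tasks can be represented by their start times in nanoseconds. The deque is copied to an array in a single traversal, so the element count comes from the array length rather than from repeated ConcurrentLinkedDeque.size() calls, and every bucket is computed from the same copy.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical stand-in for the timer internals; not Micrometer's real implementation.
final class ActiveTaskSnapshotSketch {

    // Assumed representation: start time (ns) of every currently active task.
    private final ConcurrentLinkedDeque<Long> activeTaskStartNanos = new ConcurrentLinkedDeque<>();

    void start(long startNanos) {
        activeTaskStartNanos.addLast(startNanos);
    }

    // Returns, for each bucket bound (sorted ascending), the number of active
    // tasks whose current duration is less than or equal to that bound.
    long[] cumulativeCounts(long nowNanos, long[] sortedBucketBoundsNanos) {
        // Single traversal of the deque. Tasks started or stopped afterwards
        // do not change this copy, so all buckets see the same set of tasks.
        Long[] starts = activeTaskStartNanos.toArray(new Long[0]);
        int n = starts.length; // constant time, unlike ConcurrentLinkedDeque.size()

        long[] durations = new long[n];
        for (int i = 0; i < n; i++) {
            durations[i] = nowNanos - starts[i];
        }
        Arrays.sort(durations);

        // One merge pass over the sorted durations and the sorted bounds.
        long[] counts = new long[sortedBucketBoundsNanos.length];
        int seen = 0;
        for (int b = 0; b < counts.length; b++) {
            while (seen < n && durations[seen] <= sortedBucketBoundsNanos[b]) {
                seen++;
            }
            counts[b] = seen;
        }
        return counts;
    }
}
```

With this shape the cost per snapshot is one copy plus a sort, instead of one full deque walk per size() call. The copy is only weakly consistent with concurrent updates, which seems acceptable for a metrics snapshot, but that is a judgment call for the maintainers.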