-
Notifications
You must be signed in to change notification settings - Fork 6.1k
8369449: Spec: introduce new JVMTI function GetTotalGCCpuTime #27879
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Welcome back sspitsyn! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making this happen! I think it looks good from my point of view, I have just one question, is it safe to skip the check for os::is_thread_cpu_time_supported
? One might ask why CPUTimeUsage
does not handle that, but since these methods are also used by internal GC functionality this was intentionally omitted for performance reasons (and some GCs like G1 won't run if thread CPU time is not supported so that call is not always needed). I'm no JVMTI expert so this might be fine, like how its fine for G1 to omit calling it, but just wanted to ask.
Thank you for looking at it and comment! Yes, I was thinking about this check but have not came to any conclusion yet. Yes, I was thinking if this check would be simpler to add to the |
Would it be possible to expand a bit on this, specifically how it might be used, and whether it might be used with other JVMTI functions (there aren't functions to get the process CPU time for example). As a general point, the JVMTI spec doesn't have much support for monitoring GC. It has GC start/end events that date from the original spec when collectors were all STW and it hasn't evolved since to model more modern collectors. I'm sure CPU time spend on GC is important to many profilers but I'm also wondering if JVMTI is the right API for modern profilers to be using. JVMTI is suited to debuggers and other tooling but it's less clear that it is relevant for profiling now. (n passing, I see GetAvailableProcessors is in the Timers section of the spec, is that the right place for this?) |
Certainly, thanks for asking. Researchers in GC are using the GC start/end events (https://dl.acm.org/doi/10.1145/3669940.3707217, https://ieeexplore.ieee.org/document/9804613, https://dl.acm.org/doi/10.1145/3764118, https://dl.acm.org/doi/10.1145/3652024.3665510 etc.) to understand various costs pertaining to GC. I believe one USP of using a JVMTI agent is that it does not require modification of the benchmarking code and allows usage of powerful features made available by framework such as libpfm. So these JVMTI hooks are used to read CPU performance counters to get some estimations of various metrics, be it CPU time, cache-misses etc. However this imposes severe limitations especially when it comes GCs with concurrent parts. This patch will expand the capabilities for these users using JVMTI agents to profile applications.
I think there is no need for JVMTI to export such functionality as it should be trivial for a C/C++ application to get process CPU time through other means. However getting GC CPU time (outside the scope of stop-the-world) would be less trivial without this patch.
I believe JVMTI is still relevant for profiling as it is being used in the GC research community. On a general note, I think exposing GC CPU time at various APIs will serve a different scope of users. At this level we support users that uses advanced profiling techniques with external libraries. Hope this helps. |
src/hotspot/share/prims/jvmtiH.xsl
Outdated
JVMTI_VERSION_11 = 0x300B0000, | ||
JVMTI_VERSION_19 = 0x30130000, | ||
JVMTI_VERSION_21 = 0x30150000, | ||
JVMTI_VERSION_25 = 0x30170000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be JVMTI_VERSION_26
if this is targeted for version 26?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the hex value represents JDK 23 not 25.
if (os::is_thread_cpu_time_supported()) { | ||
jc.can_get_current_thread_cpu_time = 1; | ||
jc.can_get_thread_cpu_time = 1; | ||
jc.can_get_gc_cpu_time = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering, would a user trying to call GetTotalGCCpuTime
if can_get_gc_cpu_time
is not successfully set to 1 be undefined behavior? The specs say "To possess a capability, the agent must add the capability (https://docs.oracle.com/en/java/javase/25/docs/specs/jvmti.html#capability). If yes maybe we can discard the extra call to os::is_thread_cpu_time_supported
in JvmtiEnv::GetTotalGCCpuTime
? That seems to align with the pattern to not have that check in the other methods as you pointed out.
</errors> | ||
</function> | ||
|
||
<function id="GetGCCpuTimerInfo" phase="any" callbacksafe="safe" num="157" since="26"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be called GetTotalGCCpuTimerInfo
?
There may be two issues with the patch as is: Calling If |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few issues with the spec and implementation, but they are easily addressed.
I'm not sure this functionality is really JVMTI worthy but if Jonas thinks this is useful for GC profiling then I will take hs word for it.
This is an unsigned value. If tested or printed as a jlong (signed value) | ||
it may appear to be a negative number. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to contradict the description that it could be relative to a value in the future and hence negative - thus it can't be "unsigned". But then I don't see how it can be as described - for this timer to be useful it must be tracking nanoseconds of CPU time consumed by GC since CPU time tracking commenced, sometime after VM startup. This has to be similar to how thread CPU time is defined.
src/hotspot/share/prims/jvmtiH.xsl
Outdated
JVMTI_VERSION_11 = 0x300B0000, | ||
JVMTI_VERSION_19 = 0x30130000, | ||
JVMTI_VERSION_21 = 0x30150000, | ||
JVMTI_VERSION_25 = 0x30170000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the hex value represents JDK 23 not 25.
src/hotspot/share/prims/jvmtiEnv.cpp
Outdated
JvmtiEnv::GetTotalGCCpuTime(jlong* nanos_ptr) { | ||
{ | ||
MutexLocker hl(Heap_lock); | ||
if (!os::is_thread_cpu_time_supported() || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it isn't supported then you can't have the capability and so won't reach here.
src/hotspot/share/prims/jvmtiEnv.cpp
Outdated
Universe::heap()->is_shutting_down()) { | ||
*nanos_ptr = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems wrong to me and violates the timer-info spec of this timer not jumping backwards. I think you have to cache the last returned value for this function and if you cannot calculate an updated value because of VM shutdown, then that previous value should be returned.
/cc hotspot-gc |
@albertnetymk |
Right, there will be a deadlock if |
Yes. I asked Jonas about it when we started our conversation and I rely on his justification which is in the enhancement description.
Good catch, thanks. Fixed now.
Good catch, thanks. Fixed now.
Good comment. In fact, I wanted to move this to the capability check. But there is already check for
I agree. I've already noticed this issue and was thinking where and how to fix it. Thank you for the suggestion. It can be fixed this way but I feel it is better to fix on the GC side. Will discuss it with Jonas. |
Alan, thank you for looking at this, the comments and good questions. I agree, we need to look at some level of completeness here. It includes the process CPU time and potentially more functionality to support modern GC's (I hope to get more insight from the GC team here). One question I've got is about all these timers and a possibility to reuse some of them. Is it right to have new timer for each CPU metric? We need some kind of overall design for all this. As I've posted earlier in reply to David for this enhancement, I rely on Jonas's justification which is in the enhancement description.
Yes, support for modern collectors is on the table for some time. Now seems to be a good time to make some steps in this direction.
I hope, Jonas has answered this.
Yes, it seems this function is a little bit out of this section scope. But I do not see a better section for this, and it does not deserve its own section yet. :) |
Right thinking, thanks. In fact, I had a plan to move this check to the capability side after the week ends. :)
I was already thinking on this for some time. I wanted to generalize this timer a little bit for a case if we ever decide to add more GC related functions. I'm not sure that all timers have to be strictly bound to the
It is nice you have caught this early. It would be nice to sort out this issue, and it is better to fix it on the GC side. I'd suggest to file a bug to separate this issue.
I was thinking if we could limit these versions to the LTS releases in the future but on another thought it probably makes sense to add JVMTI_VERSION_26. |
The GarbageCollectionStart and GarbageCollectionFinish events are specified to sent when the VM is "stopped" (VM agnostic wording). The only JVMTI functions specified to be allowed are the raw monitor and the env local storage functions. |
I've no doubt that it is useful but it also reasonable to ask if JVMTI is the right API for monitoring GC in 2025. This is a neglected area, the existing events date from 20 years ago, and pre-date concurrent collectors and other advancements. Is this the start of a revival of JVMTI for profiling tools? If we could start again what features would a GC monitor interface have to help troubleshooting, performance monitoring, and research? Would we confident modelling a VM and GC agnostic interface or would there be aspects that are very GC specific? |
Thank you, Alan. There was some offline Slack conversation with Ron and Jonas. Conclusion is the JMX support should be enough for this. So, I've closed the enhancement as WNF and this PR as well. |
With JDK-8359110 a framework to measure GC CPU time was introduced.
It will be exposed in JMX as
MemoryMXBean.getTotalGcCpuTime()
. There is also interest to get the same performance data from JVMTI.The following API's are being added with this enhancement:
Introduce:
can_get_gc_cpu_time
jvmtiError GetGCCpuTimerInfo(jvmtiEnv* env, jvmtiTimerInfo* info_ptr)
jvmtiError GetTotalGCCpuTime(jvmtiEnv* env, jlong* nanos_ptr)
CSR: 8370159: Spec: introduce new JVMTI function GetTotalGCCpuTime
Testing:
Progress
Issues
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27879/head:pull/27879
$ git checkout pull/27879
Update a local copy of the PR:
$ git checkout pull/27879
$ git pull https://git.openjdk.org/jdk.git pull/27879/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 27879
View PR using the GUI difftool:
$ git pr show -t 27879
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27879.diff
Using Webrev
Link to Webrev Comment