sspitsyn
Contributor

@sspitsyn sspitsyn commented Oct 17, 2025

With JDK-8359110 a framework to measure GC CPU time was introduced.
It will be exposed in JMX as MemoryMXBean.getTotalGcCpuTime(). There is also interest to get the same performance data from JVMTI.
The following APIs are being added with this enhancement:

Introduce:

  • new capability: can_get_gc_cpu_time
  • new JVMTI functions:
    • jvmtiError GetGCCpuTimerInfo(jvmtiEnv* env, jvmtiTimerInfo* info_ptr)
    • jvmtiError GetTotalGCCpuTime(jvmtiEnv* env, jlong* nanos_ptr)

CSR: 8370159: Spec: introduce new JVMTI function GetTotalGCCpuTime

Testing:

  • TBD: Mach5 tiers 1-6

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change requires CSR request JDK-8370159 to be approved

Issues

  • JDK-8369449: Spec: introduce new JVMTI function GetTotalGCCpuTime (Enhancement - P4) ⚠️ Issue is not open.
  • JDK-8370159: Spec: introduce new JVMTI function GetTotalGCCpuTime (CSR)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27879/head:pull/27879
$ git checkout pull/27879

Update a local copy of the PR:
$ git checkout pull/27879
$ git pull https://git.openjdk.org/jdk.git pull/27879/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 27879

View PR using the GUI difftool:
$ git pr show -t 27879

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27879.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Oct 17, 2025

👋 Welcome back sspitsyn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Oct 17, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk

openjdk bot commented Oct 17, 2025

@sspitsyn The following labels will be automatically applied to this pull request:

  • hotspot
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 17, 2025
@mlbridge

mlbridge bot commented Oct 17, 2025

Webrevs

Contributor

@JonasNorlinder JonasNorlinder left a comment

Thanks for making this happen! I think it looks good from my point of view. I have just one question: is it safe to skip the check for os::is_thread_cpu_time_supported? One might ask why CPUTimeUsage does not handle that, but since these methods are also used by internal GC functionality, this was intentionally omitted for performance reasons (and some GCs like G1 won't run if thread CPU time is not supported, so that call is not always needed). I'm no JVMTI expert so this might be fine, like how it's fine for G1 to omit calling it, but I just wanted to ask.

@sspitsyn
Contributor Author

sspitsyn commented Oct 18, 2025

Thanks for making this happen! I think it looks good from my point of view. I have just one question: is it safe to skip the check for os::is_thread_cpu_time_supported? One might ask why CPUTimeUsage does not handle that, but since these methods are also used by internal GC functionality, this was intentionally omitted for performance reasons (and some GCs like G1 won't run if thread CPU time is not supported, so that call is not always needed). I'm no JVMTI expert so this might be fine, like how it's fine for G1 to omit calling it, but I just wanted to ask.

Thank you for looking at it and commenting! Yes, I was thinking about this check but have not come to any conclusion yet. I was wondering whether this check would be simpler to add to CPUTimeUsage, and I understand your point about performance reasons. On the other hand, it is possible for CPUTimeUsage::GC::total() to just always return -1 on platforms where os::is_thread_cpu_time_supported() == false. While it is better to have it implemented in some common place, I'm not very religious about it and could add it to the JVMTI functions. One problem with that is consistency: there is no such check in the existing JVMTI functions like GetThreadCpuTime(), GetCurrentThreadCpuTime(), GetThreadCpuTimerInfo() and GetCurrentThreadCpuTimerInfo().
However, I've just added this check to GetTotalGCCpuTime(). Please let me know your opinion.

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label Oct 18, 2025
@AlanBateman
Contributor

There is also interest to get the same performance data from JVMTI.

Would it be possible to expand a bit on this, specifically how it might be used, and whether it might be used with other JVMTI functions (there aren't functions to get the process CPU time for example).

As a general point, the JVMTI spec doesn't have much support for monitoring GC. It has GC start/end events that date from the original spec when collectors were all STW and it hasn't evolved since to model more modern collectors.

I'm sure CPU time spent on GC is important to many profilers but I'm also wondering if JVMTI is the right API for modern profilers to be using. JVMTI is suited to debuggers and other tooling but it's less clear that it is relevant for profiling now.

(In passing, I see GetAvailableProcessors is in the Timers section of the spec, is that the right place for this?)

@JonasNorlinder
Contributor

Would it be possible to expand a bit on this, specifically how it might be used, and whether it might be used with other JVMTI functions (there aren't functions to get the process CPU time for example).

Certainly, thanks for asking. Researchers in GC are using the GC start/end events (https://dl.acm.org/doi/10.1145/3669940.3707217, https://ieeexplore.ieee.org/document/9804613, https://dl.acm.org/doi/10.1145/3764118, https://dl.acm.org/doi/10.1145/3652024.3665510 etc.) to understand various costs pertaining to GC. I believe one USP of using a JVMTI agent is that it does not require modification of the benchmarking code and allows usage of powerful features made available by frameworks such as libpfm.

So these JVMTI hooks are used to read CPU performance counters to get estimates of various metrics, be it CPU time, cache misses etc. However, this imposes severe limitations, especially when it comes to GCs with concurrent parts. This patch will expand the capabilities for these users using JVMTI agents to profile applications.

there aren't functions to get the process CPU time for example

I think there is no need for JVMTI to export such functionality as it should be trivial for a C/C++ application to get process CPU time through other means. However getting GC CPU time (outside the scope of stop-the-world) would be less trivial without this patch.
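
As a rough illustration of the "other means" mentioned above, a native agent on a POSIX system could read process CPU time directly from the OS. This is a minimal sketch, assuming a POSIX platform that provides CLOCK_PROCESS_CPUTIME_ID; the function name process_cpu_time_nanos is made up for the example and is not part of JVMTI or HotSpot.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>

// Returns the CPU time consumed so far by the whole process, in nanoseconds,
// or -1 if the clock cannot be read. POSIX-only (assumption of this sketch).
std::int64_t process_cpu_time_nanos() {
  timespec ts{};
  if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) != 0) {
    return -1;
  }
  // Sanity check on the value handed back by the OS.
  assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
  return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}
```

This covers the whole process only; it says nothing about how much of that time was spent in GC threads, which is the gap the patch is aimed at.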

JVMTI is suited to debuggers and other tooling but it's less clear that it is relevant for profiling now.

I believe JVMTI is still relevant for profiling as it is being used in the GC research community.

On a general note, I think exposing GC CPU time through various APIs serves different groups of users. At this level we support users who use advanced profiling techniques with external libraries.

Hope this helps.

JVMTI_VERSION_11 = 0x300B0000,
JVMTI_VERSION_19 = 0x30130000,
JVMTI_VERSION_21 = 0x30150000,
JVMTI_VERSION_25 = 0x30170000,
Contributor

Should this be JVMTI_VERSION_26 if this is targeted for version 26?

Member

Also the hex value represents JDK 23 not 25.
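
The constants quoted above follow the JVMTI version layout, where the interface type sits in the top bits and the major version occupies bits 16 to 27. The sketch below recomputes the constants from a major version number; the names kInterfaceJvmti, kMaskMajor, kShiftMajor, jvmti_version_for and major_of are local to this example and only mirror the masks and shifts defined in jvmti.h.

```cpp
#include <cstdint>

// Local copies of the layout (assumed to mirror the jvmti.h version masks).
constexpr std::uint32_t kInterfaceJvmti = 0x30000000u;
constexpr std::uint32_t kMaskMajor = 0x0FFF0000u;
constexpr int kShiftMajor = 16;

// Builds a version constant for a given major version (minor and micro are 0).
constexpr std::uint32_t jvmti_version_for(std::uint32_t major) {
  return kInterfaceJvmti | ((major << kShiftMajor) & kMaskMajor);
}

// Extracts the major version from a version constant.
constexpr std::uint32_t major_of(std::uint32_t version) {
  return (version & kMaskMajor) >> kShiftMajor;
}

static_assert(jvmti_version_for(11) == 0x300B0000u, "matches JVMTI_VERSION_11");
static_assert(jvmti_version_for(21) == 0x30150000u, "matches JVMTI_VERSION_21");
static_assert(major_of(0x30170000u) == 23, "0x17 encodes 23, not 25");
static_assert(jvmti_version_for(25) == 0x30190000u, "25 encodes as 0x19");
```

Under this layout, a constant for version 26 would carry 0x1A in the major field.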

if (os::is_thread_cpu_time_supported()) {
jc.can_get_current_thread_cpu_time = 1;
jc.can_get_thread_cpu_time = 1;
jc.can_get_gc_cpu_time = 1;
Contributor

I'm wondering, would it be undefined behavior for a user to call GetTotalGCCpuTime if can_get_gc_cpu_time is not successfully set to 1? The spec says "To possess a capability, the agent must add the capability" (https://docs.oracle.com/en/java/javase/25/docs/specs/jvmti.html#capability). If yes, maybe we can discard the extra call to os::is_thread_cpu_time_supported in JvmtiEnv::GetTotalGCCpuTime? That seems to align with the pattern of not having that check in the other methods, as you pointed out.
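
The pattern being discussed can be modelled without the real JVMTI headers. In JVMTI, a function whose required capability has not been added is expected to fail with JVMTI_ERROR_MUST_POSSESS_CAPABILITY rather than run, so the platform check only needs to live where the capability is granted. The sketch below is a hypothetical stand-in: MockEnv, MockError and their members are invented for illustration and are not the jvmti.h types or the code in this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; these are not jvmti.h types.
enum class MockError { None, NotAvailable, MustPossessCapability };

class MockEnv {
 public:
  explicit MockEnv(bool thread_cpu_time_supported)
      : thread_cpu_time_supported_(thread_cpu_time_supported) {}

  // The platform check happens once, when the capability is requested.
  MockError add_gc_cpu_time_capability() {
    if (!thread_cpu_time_supported_) {
      return MockError::NotAvailable;
    }
    can_get_gc_cpu_time_ = true;
    return MockError::None;
  }

  // No platform check here: without the capability the call is rejected.
  MockError get_total_gc_cpu_time(std::int64_t* nanos_ptr) const {
    assert(nanos_ptr != nullptr);
    if (!can_get_gc_cpu_time_) {
      return MockError::MustPossessCapability;
    }
    *nanos_ptr = gc_cpu_nanos_;
    return MockError::None;
  }

 private:
  bool thread_cpu_time_supported_;
  bool can_get_gc_cpu_time_ = false;
  std::int64_t gc_cpu_nanos_ = 0;  // placeholder value for the sketch
};
```

In this model an unsupported platform can never reach the body of get_total_gc_cpu_time, which is the argument for dropping the second check.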

</errors>
</function>

<function id="GetGCCpuTimerInfo" phase="any" callbacksafe="safe" num="157" since="26">
Contributor

Should this be called GetTotalGCCpuTimerInfo?

@JonasNorlinder
Contributor

There may be two issues with the patch as is:

Calling GetTotalGCCpuTime in Agent_OnLoad can cause a crash (since CollectedHeap::gc_threads_do does not protect against races on VM startup/shutdown).

If GetTotalGCCpuTime is invoked in a callback for GC start/end, this will cause a deadlock as the Heap_lock is already held. The MutexLocker hl(Heap_lock) pattern was introduced to avoid races that could happen from internal usage in G1 of CPUTimeUsage::GC::total() during shutdown. I could be recalling this wrong, but I think the use of Heap_lock (which evidently has uses in other places) is an optimization to avoid having to create a new mutex for the shutdown variable. I could be wrong, but it may be possible to resolve this deadlock by introducing a new mutex used only for syncing on the state of Universe::_is_shutting_down. I will ask @walulyai for his thoughts.
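
The dedicated-mutex idea can be sketched with standard C++ primitives. This is a hypothetical model, not HotSpot code: heap_lock stands in for Heap_lock, shutdown_lock for the proposed new mutex, and every name in GcCpuTimeModel is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical model of the proposal; none of these names exist in HotSpot.
struct GcCpuTimeModel {
  std::mutex heap_lock;      // stands in for Heap_lock
  std::mutex shutdown_lock;  // the proposed dedicated mutex
  bool is_shutting_down = false;
  std::int64_t gc_cpu_nanos = 0;

  void begin_shutdown() {
    std::lock_guard<std::mutex> guard(shutdown_lock);
    assert(!is_shutting_down);  // shutdown begins once in this model
    is_shutting_down = true;
  }

  // Only the dedicated lock is taken, so callers may already hold heap_lock.
  // Returns -1 once shutdown has started, mirroring the patch under review.
  std::int64_t total_gc_cpu_time() {
    std::lock_guard<std::mutex> guard(shutdown_lock);
    return is_shutting_down ? -1 : gc_cpu_nanos;
  }
};

// Models a GC start/end callback in which the heap lock is already held.
std::int64_t query_from_gc_callback(GcCpuTimeModel& model) {
  std::lock_guard<std::mutex> heap_guard(model.heap_lock);
  // Would self-deadlock if total_gc_cpu_time() also acquired heap_lock.
  return model.total_gc_cpu_time();
}
```

The sketch only shows the locking shape. Whether a separate mutex is acceptable inside the VM is a question for the GC side.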

Member

@dholmes-ora dholmes-ora left a comment

I have a few issues with the spec and implementation, but they are easily addressed.

I'm not sure this functionality is really JVMTI worthy but if Jonas thinks this is useful for GC profiling then I will take his word for it.

Comment on lines +11287 to +11288
This is an unsigned value. If tested or printed as a jlong (signed value)
it may appear to be a negative number.
Member

This seems to contradict the description that it could be relative to a value in the future and hence negative - thus it can't be "unsigned". But then I don't see how it can be as described - for this timer to be useful it must be tracking nanoseconds of CPU time consumed by GC since CPU time tracking commenced, sometime after VM startup. This has to be similar to how thread CPU time is defined.
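
The signed-versus-unsigned point can be made concrete. A jlong is a signed 64-bit integer, so an unsigned quantity above INT64_MAX stored in one reads back as negative. The helper names as_unsigned and looks_negative_as_jlong below are invented for this sketch.

```cpp
#include <cstdint>

// jlong is a signed 64-bit integer; this reads the same bits as unsigned.
constexpr std::uint64_t as_unsigned(std::int64_t value) {
  return static_cast<std::uint64_t>(value);
}

// True when an unsigned value would look negative if tested as a jlong.
constexpr bool looks_negative_as_jlong(std::uint64_t value) {
  return value > static_cast<std::uint64_t>(INT64_MAX);
}

static_assert(as_unsigned(-1) == UINT64_MAX, "all bits set reads as -1 when signed");
static_assert(looks_negative_as_jlong(UINT64_MAX), "largest unsigned value looks negative");
static_assert(!looks_negative_as_jlong(1000), "small values are unaffected");
```

A value that is genuinely allowed to be negative cannot also be documented as unsigned, which is the contradiction raised in the comment above.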

JvmtiEnv::GetTotalGCCpuTime(jlong* nanos_ptr) {
{
MutexLocker hl(Heap_lock);
if (!os::is_thread_cpu_time_supported() ||
Member

If it isn't supported then you can't have the capability and so won't reach here.

Comment on lines 3803 to 3804
Universe::heap()->is_shutting_down()) {
*nanos_ptr = -1;
Member

This seems wrong to me and violates the timer-info spec of this timer not jumping backwards. I think you have to cache the last returned value for this function and if you cannot calculate an updated value because of VM shutdown, then that previous value should be returned.
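
One way to picture the caching suggestion is a small wrapper that remembers the largest value it has handed out. This is a minimal sketch under stated assumptions, not the fix that was planned for the PR: MonotonicGcCpuTime and report are hypothetical names, and a negative input models "could not compute a value because the VM is shutting down".

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of HotSpot: never reports a smaller value
// than one it has already reported.
class MonotonicGcCpuTime {
 public:
  // `fresh` is the newly computed total, or negative if it is unavailable.
  std::int64_t report(std::int64_t fresh) {
    std::int64_t prev = last_.load(std::memory_order_relaxed);
    while (fresh > prev &&
           !last_.compare_exchange_weak(prev, fresh, std::memory_order_relaxed)) {
      // On failure `prev` holds the current cached value; retry if still smaller.
    }
    return fresh > prev ? fresh : prev;
  }

 private:
  std::atomic<std::int64_t> last_{0};
};

// Usage: an unavailable reading repeats the previous value.
inline void monotonic_gc_cpu_time_example() {
  MonotonicGcCpuTime timer;
  assert(timer.report(100) == 100);
  assert(timer.report(-1) == 100);
}
```

The compare-and-swap loop keeps the cache correct when several threads query at once, at the cost of one atomic per call.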

@albertnetymk
Member

/cc hotspot-gc

@openjdk

openjdk bot commented Oct 20, 2025

@albertnetymk
The hotspot-gc label was successfully added.

@walulyai
Member

There may be two issues with the patch as is:

Calling GetTotalGCCpuTime in Agent_OnLoad can cause a crash (since CollectedHeap::gc_threads_do does not protect against races on VM startup/shutdown).

If GetTotalGCCpuTime is invoked in a callback for GC start/end, this will cause a deadlock as the Heap_lock is already held. The MutexLocker hl(Heap_lock) pattern was introduced to avoid races that could happen from internal usage in G1 of CPUTimeUsage::GC::total() during shutdown. I could be recalling this wrong, but I think the use of Heap_lock (which evidently has uses in other places) is an optimization to avoid having to create a new mutex for the shutdown variable. I could be wrong, but it may be possible to resolve this deadlock by introducing a new mutex used only for syncing on the state of Universe::_is_shutting_down. I will ask @walulyai for his thoughts.

Right, there will be a deadlock if GetTotalGCCpuTime is called in the callbacks for events GarbageCollectionStart, GarbageCollectionFinish

@sspitsyn
Contributor Author

sspitsyn commented Oct 20, 2025

I'm not sure this functionality is really JVMTI worthy but if Jonas thinks this is useful for GC profiling then I will take his word for it.

Yes. I asked Jonas about it when we started our conversation and I rely on his justification which is in the enhancement description.

This seems to contradict the description that it could be relative to a value in the future and hence negative - thus it can't be "unsigned". But then I don't see how it can be as described - for this timer to be useful it must be tracking nanoseconds of CPU time consumed by GC since CPU time tracking commenced, sometime after VM startup. This has to be similar to how thread CPU time is defined.

Good catch, thanks. Fixed now.

Also the hex value represents JDK 23 not 25.

Good catch, thanks. Fixed now.

If it isn't supported then you can't have the capability and so won't reach here.

Good comment. In fact, I wanted to move this to the capability check. But there is already a check for os::is_thread_cpu_time_supported() there, as you noted. So I removed the unneeded check you pointed out.

Universe::heap()->is_shutting_down()) {
*nanos_ptr = -1;

This seems wrong to me and violates the timer-info spec of this timer not jumping backwards. I think you have to cache the last returned value for this function and if you cannot calculate an updated value because of VM shutdown, then that previous value should be returned.

I agree. I've already noticed this issue and was thinking about where and how to fix it. Thank you for the suggestion. It can be fixed this way but I feel it is better to fix it on the GC side. Will discuss it with Jonas.

@sspitsyn
Contributor Author

Would it be possible to expand a bit on this, specifically how it might be used, and whether it might be used with other JVMTI functions (there aren't functions to get the process CPU time for example).

Alan, thank you for looking at this, for the comments and the good questions. I agree, we need to look at some level of completeness here. It includes the process CPU time and potentially more functionality to support modern GCs (I hope to get more insight from the GC team here). One question I've got is about all these timers and the possibility of reusing some of them. Is it right to have a new timer for each CPU metric? We need some kind of overall design for all this. As I've posted earlier in reply to David for this enhancement, I rely on Jonas's justification which is in the enhancement description.

As a general point, the JVMTI spec doesn't have much support for monitoring GC. It has GC start/end events that date from the original spec when collectors were all STW and it hasn't evolved since to model more modern collectors.

Yes, support for modern collectors has been on the table for some time. Now seems to be a good time to take some steps in this direction.

I'm sure CPU time spent on GC is important to many profilers but I'm also wondering if JVMTI is the right API for modern profilers to be using. JVMTI is suited to debuggers and other tooling but it's less clear that it is relevant for profiling now.

I hope Jonas has answered this.

(in passing, I see GetAvailableProcessors is in the Timers section of the spec, is that the right place for this?)

Yes, it seems this function is a little bit out of this section scope. But I do not see a better section for this, and it does not deserve its own section yet. :)

@sspitsyn
Contributor Author

sspitsyn commented Oct 20, 2025

I'm wondering, would it be undefined behavior for a user to call GetTotalGCCpuTime if can_get_gc_cpu_time is not successfully set to 1? The spec says "To possess a capability, the agent must add the capability" (https://docs.oracle.com/en/java/javase/25/docs/specs/jvmti.html#capability). If yes, maybe we can discard the extra call to os::is_thread_cpu_time_supported in JvmtiEnv::GetTotalGCCpuTime? That seems to align with the pattern of not having that check in the other methods, as you pointed out.

Right thinking, thanks. In fact, I had a plan to move this check to the capability side after the week ends. :)

Should this be called GetTotalGCCpuTimerInfo?

I have been thinking about this for some time. I wanted to generalize this timer a little bit in case we ever decide to add more GC-related functions. I'm not sure that all timers have to be strictly bound to the CpuTime function; they could potentially be reused for some other cases.

There may be two issues with the patch as is:
Calling GetTotalGCCpuTime in Agent_OnLoad can cause a crash (since CollectedHeap::gc_threads_do does not protect against races on VM startup/shutdown).

If GetTotalGCCpuTime is invoked in a callback for GC start/end, this will cause a deadlock as the Heap_lock is already held. The MutexLocker hl(Heap_lock) pattern was introduced to avoid races that could happen from internal usage in G1 of CPUTimeUsage::GC::total() during shutdown. I could be recalling this wrong, but I think the use of Heap_lock (which evidently has uses in other places) is an optimization to avoid having to create a new mutex for the shutdown variable. I could be wrong, but it may be possible to resolve this deadlock by introducing a new mutex used only for syncing on the state of Universe::_is_shutting_down. I will ask @walulyai for his thoughts.

It is nice that you caught this early. It would be good to sort out this issue, and it is better to fix it on the GC side. I'd suggest filing a separate bug for it.

Should this be JVMTI_VERSION_26 if this is targeted for version 26?

I was wondering whether we could limit these versions to the LTS releases in the future, but on second thought it probably makes sense to add JVMTI_VERSION_26.

@AlanBateman
Contributor

Right, there will be a deadlock if GetTotalGCCpuTime is called in the callbacks for events GarbageCollectionStart, GarbageCollectionFinish

The GarbageCollectionStart and GarbageCollectionFinish events are specified to be sent when the VM is "stopped" (VM agnostic wording). The only JVMTI functions specified to be allowed are the raw monitor and the env local storage functions.

@AlanBateman
Contributor

Certainly, thanks for asking. Researchers in GC are using the GC start/end events (https://dl.acm.org/doi/10.1145/3669940.3707217, https://ieeexplore.ieee.org/document/9804613, https://dl.acm.org/doi/10.1145/3764118, https://dl.acm.org/doi/10.1145/3652024.3665510 etc.) to understand various costs pertaining to GC. I believe one USP of using a JVMTI agent is that it does not require modification of the benchmarking code and allows usage of powerful features made available by frameworks such as libpfm.

So these JVMTI hooks are used to read CPU performance counters to get estimates of various metrics, be it CPU time, cache misses etc. However, this imposes severe limitations, especially when it comes to GCs with concurrent parts. This patch will expand the capabilities for these users using JVMTI agents to profile applications.

I've no doubt that it is useful but it is also reasonable to ask if JVMTI is the right API for monitoring GC in 2025. This is a neglected area, the existing events date from 20 years ago, and pre-date concurrent collectors and other advancements. Is this the start of a revival of JVMTI for profiling tools? If we could start again, what features would a GC monitor interface have to help troubleshooting, performance monitoring, and research? Would we be confident modelling a VM and GC agnostic interface or would there be aspects that are very GC specific?

@sspitsyn
Contributor Author

I've no doubt that it is useful but it is also reasonable to ask if JVMTI is the right API for monitoring GC in 2025. This is a neglected area, the existing events date from 20 years ago, and pre-date concurrent collectors and other advancements. Is this the start of a revival of JVMTI for profiling tools? If we could start again, what features would a GC monitor interface have to help troubleshooting, performance monitoring, and research? Would we be confident modelling a VM and GC agnostic interface or would there be aspects that are very GC specific?

Thank you, Alan. There was an offline Slack conversation with Ron and Jonas. The conclusion is that the JMX support should be enough for this. So I've closed the enhancement as WNF, and this PR as well.

@sspitsyn sspitsyn closed this Oct 21, 2025
@sspitsyn sspitsyn deleted the f2 branch October 21, 2025 19:05

Labels

  • csr: Pull request needs approved CSR before integration
  • hotspot
  • hotspot-gc
  • rfr: Pull request is ready for review
  • serviceability
