Feat/poc continuous profiling #4556

Draft: lbloder wants to merge 21 commits into main

@lbloder (Collaborator) commented Jul 15, 2025

📜 Description

💡 Motivation and Context

Initial implementation of #2635

💚 How did you test it?

📝 Checklist

  • I added tests to verify the changes.
  • No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled.
  • I updated the docs if needed.
  • I updated the wizard if needed.
  • Review from the native team if needed.
  • No breaking change or entry added to the changelog.
  • No breaking change for hybrid SDKs or communicated to hybrid SDKs.

🔮 Next steps

  • [ ] Investigate why TRACE lifecycle profiles do not show up in Sentry
  • [ ] Add external option for profile lifecycle

lbloder added 21 commits April 25, 2025 12:14
…sed on jfr converter bundled with asyncprofiler
… use existing SentryStackFrame instead of JfrFrame,
…t in SentrySpan to work around scientific notation of double, use wall clock profiling
# Conflicts:
#	sentry/build.gradle.kts
#	sentry/src/test/java/io/sentry/ExternalOptionsTest.kt
#	sentry/src/test/java/io/sentry/JsonSerializerTest.kt
#	sentry/src/test/java/io/sentry/SentryClientTest.kt
#	sentry/src/test/java/io/sentry/SentryOptionsTest.kt
Contributor

Fails
🚫 Please consider adding a changelog entry for the next release.
Messages
📖 Do not forget to update Sentry-docs with your feature once the pull request gets approved.

Instructions and example for changelog

Please add an entry to CHANGELOG.md to the "Unreleased" section. Make sure the entry includes this PR's number.

Example:

## Unreleased

- Feat/poc continuous profiling ([#4556](https://github.com/getsentry/sentry-java/pull/4556))

If none of the above apply, you can opt out of this check by adding #skip-changelog to the PR description.

Generated by 🚫 dangerJS against 03a20dd

Contributor

Performance metrics 🚀

              Plain      With Sentry  Diff
Startup time  447.81 ms  457.62 ms    9.80 ms
Size          1.58 MiB   2.09 MiB     520.14 KiB

Collaborator Author

Will be deleted in follow-up PR

Collaborator Author

Could be renamed to SentryProfileSample or just SentrySample

Collaborator Author

@lbloder lbloder Jul 22, 2025

No longer used; will be deleted in a follow-up PR. SentryStackFrame is used instead.

return new JavaContinuousProfiler(
logger,
profilingTracesDirPath,
10, // default profilingTracesHz

Collaborator Author

Fixed in follow-up PR to use profilingTracesHz instead of hardcoded 10

Comment on lines +124 to +133
long divisor = jfr.ticksPerSec / 1000_000_000L;
long myTimeStamp =
jfr.chunkStartNanos + ((event.time - jfr.chunkStartTicks) / divisor);

JfrSample sample = new JfrSample();
Instant instant = Instant.ofEpochSecond(0, myTimeStamp);
double timestampDouble =
instant.getEpochSecond() + instant.getNano() / 1_000_000_000.0;

sample.timestamp = timestampDouble;

Collaborator Author

Revisiting this in follow-up PR, I think the way the timestamp is calculated is not correct
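
One possible cause, offered as an assumption and not as the fix applied in the follow-up PR: `jfr.ticksPerSec / 1000_000_000L` is integer division. If the tick frequency is below 1 GHz the divisor becomes 0 and the later division throws; if it is not an exact multiple of 1 GHz (for example 2.5 GHz) the divisor is truncated and the timestamp drifts. A minimal sketch that scales ticks to nanoseconds without that intermediate divisor follows. The class and method names are hypothetical and are not part of the PR.

```java
final class JfrTimestamps {
  private static final long NANOS_PER_SECOND = 1_000_000_000L;

  private JfrTimestamps() {}

  /**
   * Converts a JFR tick value to nanoseconds since the epoch.
   * Splits the elapsed ticks into whole seconds and a remainder so the
   * multiplication stays within long range for tick rates up to roughly 9 GHz.
   */
  static long ticksToEpochNanos(
      long eventTicks, long chunkStartTicks, long chunkStartNanos, long ticksPerSec) {
    if (ticksPerSec <= 0) {
      throw new IllegalArgumentException("ticksPerSec must be positive: " + ticksPerSec);
    }
    long elapsedTicks = eventTicks - chunkStartTicks;
    long wholeSeconds = elapsedTicks / ticksPerSec;
    long remainderTicks = elapsedTicks % ticksPerSec;
    long elapsedNanos =
        wholeSeconds * NANOS_PER_SECOND + remainderTicks * NANOS_PER_SECOND / ticksPerSec;
    return chunkStartNanos + elapsedNanos;
  }

  /** Converts epoch nanoseconds to fractional epoch seconds without allocating an Instant. */
  static double epochNanosToSeconds(long epochNanos) {
    return (epochNanos / NANOS_PER_SECOND)
        + (epochNanos % NANOS_PER_SECOND) / (double) NANOS_PER_SECOND;
  }
}
```

With a 1 MHz tick rate the original expression would divide by zero, while `ticksToEpochNanos(1500, 500, 1000, 1_000_000)` yields 1000 ticks of 1 µs each added to the chunk start.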

@adinauer
Member

@sentry review

Comment on lines +39 to +40
events.add(event);
System.out.println(event);

The System.out.println(event) statement should be removed or replaced with proper logging. Debug output should use the injected logger with appropriate log levels.

Comment on lines +26 to +29
import io.sentry.util.SentryRandom;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;

The hardcoded profiling frequency of 10 in the constructor call should use the provided profilingTracesHz parameter instead of ignoring it.

Comment on lines +108 to +130
} else {
frame.setInApp(
new SentryStackTraceFactory(Sentry.getGlobalScope().getOptions())
.isInApp(sanitizedClassName));
}

frame.setLineno((element.getLineNumber() != 0) ? element.getLineNumber() : null);
frame.setFilename(classNameWithLambdas);

if (sentryProfile.frames != null) {
sentryProfile.frames.add(frame);
}
stack.add(currentFrame);
currentFrame++;
}

long divisor = jfr.ticksPerSec / 1000_000_000L;
long myTimeStamp =
jfr.chunkStartNanos + ((event.time - jfr.chunkStartTicks) / divisor);

JfrSample sample = new JfrSample();
Instant instant = Instant.ofEpochSecond(0, myTimeStamp);
double timestampDouble =

The conversion loop has performance concerns. Creating new SentryStackTraceFactory and Instant objects for each frame/sample is inefficient. Consider caching the SentryStackTraceFactory and using more efficient timestamp conversion.
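
A minimal sketch of the caching idea, assuming the in-app decision depends only on the class name. `InAppCache` is a hypothetical helper, and the `Predicate<String>` stands in for a single shared `SentryStackTraceFactory` created once outside the loop; neither is part of the PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

final class InAppCache {
  // Hypothetical stand-in for one shared factory's isInApp check.
  private final Predicate<String> inAppCheck;
  // Remembers the result per class name so each name is evaluated once.
  private final Map<String, Boolean> cache = new HashMap<>();

  InAppCache(Predicate<String> inAppCheck) {
    this.inAppCheck = inAppCheck;
  }

  /** Returns the cached in-app decision, computing it on first use. */
  boolean isInApp(String className) {
    return cache.computeIfAbsent(className, inAppCheck::test);
  }
}
```

The sketch is not thread-safe, which should be acceptable if one converter instance processes one file on one thread; a `ConcurrentHashMap` would be the usual substitute otherwise.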

Comment on lines +297 to +310

if (profileChunk.getPlatform().equals("java")) {
final IProfileConverter profileConverter =
ProfilingServiceLoader.loadProfileConverter();
if (profileConverter != null) {
try {
final SentryProfile profile =
profileConverter.convertFromFile(traceFile.toPath());
profileChunk.setSentryProfile(profile);
} catch (IOException e) {
throw new SentryEnvelopeException("Profile conversion failed");
}
}
} else {

The profile conversion logic should have proper error handling and fallback. If conversion fails, the current implementation throws an exception, but it might be better to log the error and continue with a degraded experience.
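
The log-and-continue approach could look roughly like the sketch below. `Converter` is a hypothetical stand-in for the PR's `IProfileConverter`, and `java.util.logging` replaces the SDK's own logger so the example stays self-contained. Whether a chunk without a converted profile is still worth sending is a decision for the PR author.

```java
import java.io.IOException;
import java.util.Optional;
import java.util.logging.Level;
import java.util.logging.Logger;

final class SafeConversion {
  /** Hypothetical stand-in for the PR's IProfileConverter. */
  interface Converter<T> {
    T convert(String path) throws IOException;
  }

  private SafeConversion() {}

  /**
   * Runs the conversion and returns an empty Optional on failure, logging the
   * cause (including the exception) so the caller can continue without a profile.
   */
  static <T> Optional<T> convertOrEmpty(Converter<T> converter, String path, Logger logger) {
    try {
      return Optional.ofNullable(converter.convert(path));
    } catch (IOException e) {
      logger.log(Level.WARNING, "Profile conversion failed for " + path, e);
      return Optional.empty();
    }
  }
}
```

Unlike `throw new SentryEnvelopeException("Profile conversion failed")` in the hunk above, this keeps the original `IOException` attached to the log entry.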

Comment on lines +36 to +43
return NoOpContinuousProfiler.getInstance();
} catch (Throwable t) {
logger.log(
SentryLevel.ERROR,
"Failed to load continuous profiler provider, using NoOpContinuousProfiler",
t);
return NoOpContinuousProfiler.getInstance();
}

The service loader methods silently catch all Throwable and either return null or a no-op implementation. This could hide important configuration or setup errors. Consider logging the errors at appropriate levels and being more specific about which exceptions to catch.
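
As a sketch of catching more narrowly: `java.util.ServiceLoader` reports a broken provider configuration as `ServiceConfigurationError`, so that case can be logged as an error while a missing provider is logged quietly. `ProviderLoading` and `ProfilerProvider` are hypothetical names. Note that the hunk above does log the caught `Throwable` at ERROR level, so this only illustrates separating the two cases.

```java
import java.util.Iterator;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;
import java.util.logging.Level;
import java.util.logging.Logger;

final class ProviderLoading {
  /** Hypothetical provider interface standing in for the PR's provider type. */
  interface ProfilerProvider {
    String name();
  }

  private ProviderLoading() {}

  /** Returns the first registered provider, or the fallback if none loads. */
  static <T> T loadFirstOrDefault(Class<T> service, T fallback, Logger logger) {
    try {
      Iterator<T> providers = ServiceLoader.load(service).iterator();
      if (providers.hasNext()) {
        return providers.next();
      }
      // No provider on the classpath: expected when the optional module is absent.
      logger.log(Level.FINE, "No provider found for " + service.getName());
      return fallback;
    } catch (ServiceConfigurationError e) {
      // A provider is registered but could not be instantiated: a real setup error.
      logger.log(Level.SEVERE, "Provider for " + service.getName() + " is misconfigured", e);
      return fallback;
    }
  }
}
```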

Comment on lines +190 to +210
startProfileChunkTimestamp = new SentryNanotimeDate();
}
filename = profilingTracesDirPath + File.separator + SentryUUID.generateSentryId() + ".jfr";
String startData = null;
try {
final String profilingIntervalMicros =
String.format("%dus", (int) SECONDS.toMicros(1) / profilingTracesHz);
final String command =
String.format(
"start,jfr,event=wall,interval=%s,file=%s", profilingIntervalMicros, filename);
System.out.println(command);
startData = profiler.execute(command);
} catch (Exception e) {
logger.log(SentryLevel.ERROR, "Failed to start profiling: ", e);
}
// check if profiling started
if (startData == null) {
return;
}

isRunning = true;

The profiler initialization and AsyncProfiler command construction should have more robust error handling. The current implementation may not handle all edge cases properly.
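
Two concrete edge cases in the hunk above are `profilingTracesHz` being 0 (integer division throws) and a rate above 1,000,000 Hz (the interval truncates to `0us`). A minimal sketch that validates inputs before building the command follows. `ProfilerCommands` is a hypothetical helper; the command template is copied from the hunk, and rejecting commas in the path assumes that async-profiler splits its argument string on commas.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

final class ProfilerCommands {
  private ProfilerCommands() {}

  /** Builds the start command, rejecting inputs that would produce a malformed one. */
  static String buildStartCommand(int profilingTracesHz, String filename) {
    if (profilingTracesHz <= 0) {
      throw new IllegalArgumentException("profilingTracesHz must be positive: " + profilingTracesHz);
    }
    // Assumption: a comma inside the path would be read as an argument separator.
    if (filename == null || filename.isEmpty() || filename.indexOf(',') >= 0) {
      throw new IllegalArgumentException("Unsupported profile file name: " + filename);
    }
    // Clamp to at least 1 microsecond so very high rates do not yield "0us".
    long intervalMicros = Math.max(1L, TimeUnit.SECONDS.toMicros(1) / profilingTracesHz);
    return String.format(
        Locale.ROOT, "start,jfr,event=wall,interval=%dus,file=%s", intervalMicros, filename);
  }
}
```

For example, `buildStartCommand(100, "/tmp/a.jfr")` produces an interval of `10000us`.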

2 participants