Implement SCA Reachability runtime detection: report vulnerable classes and callsites via telemetry #11352
jandro996 wants to merge 36 commits into
Adds a new SCA Reachability subsystem that reports which vulnerable library classes were actually loaded at runtime, reducing false positives from static dependency scanning. Gated on DD_APPSEC_SCA_ENABLED.
Key components:
- Gradle task downloads GHSA enrichments from sca-reachability-database and generates sca_cves.json, bundled in the agent jar at build time
- ClassFileTransformer (observation-only) detects when vulnerable classes are loaded, resolves JAR versions via pom.properties, and checks semver ranges using ComparableVersion (Maven semantics)
- ScaReachabilityCollector bridges the transformer and telemetry without circular dependencies, following the WafMetricCollector pattern
- ScaReachabilityPeriodicAction reports hits on each app-dependencies-loaded heartbeat by adding reachability metadata to existing dependency entries
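The range check mentioned above can be pictured with a minimal sketch. It assumes ranges are comma-separated clauses such as "< 2.6.7.3" or ">= 2.7.0, < 2.7.9.5" and, to stay dependency-free, compares purely numeric dotted versions. The PR itself relies on Maven's ComparableVersion, which also handles qualifiers; VersionRangeSketch and its methods are hypothetical names used only for illustration.

```java
final class VersionRangeSketch {
  // Returns true when `version` satisfies every comma-separated clause of `range`.
  static boolean matches(String range, String version) {
    for (String clause : range.split(",")) {
      String c = clause.trim();
      int i = 0;
      while (i < c.length() && "<>=".indexOf(c.charAt(i)) >= 0) {
        i++;
      }
      String op = c.substring(0, i);
      int cmp = compare(version, c.substring(i).trim());
      boolean ok;
      switch (op) {
        case "<":  ok = cmp < 0;  break;
        case "<=": ok = cmp <= 0; break;
        case ">":  ok = cmp > 0;  break;
        case ">=": ok = cmp >= 0; break;
        case "=":  ok = cmp == 0; break;
        default: throw new IllegalArgumentException("Unsupported clause: " + clause);
      }
      if (!ok) {
        return false;
      }
    }
    return true;
  }

  // Simplified stand-in for ComparableVersion: numeric segments only, missing segments count as 0.
  static int compare(String a, String b) {
    String[] x = a.split("\\.");
    String[] y = b.split("\\.");
    for (int i = 0; i < Math.max(x.length, y.length); i++) {
      int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
      int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
      if (xi != yi) {
        return Integer.compare(xi, yi);
      }
    }
    return 0;
  }
}
```

With this sketch, jackson-databind 2.6.0 falls inside "< 2.6.7.3", while 2.6.7.3 itself does not.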
…n task The Gradle task now writes to src/main/resources/ and runs only when -PrefreshSca is passed or the file is absent, so CI builds never need network access to the private sca-reachability-database repo.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e607887e99
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
On Java 9+, the system classloader (jdk.internal.loader.ClassLoaders$AppClassLoader) no longer extends URLClassLoader, so the URLClassLoader chain walk misses all main classpath entries. Add a fallback that reads java.class.path to cover this case, deduplicating with a HashSet<URL> to avoid scanning the same JAR twice.
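A minimal sketch of the fallback described above, assuming the classpath string is passed in explicitly (in the agent it would come from the java.class.path system property). ClasspathUrls and collect are hypothetical names, and a LinkedHashSet is used here instead of a plain HashSet only to keep the scan order stable.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

final class ClasspathUrls {
  // Collects JAR/directory URLs from the URLClassLoader chain, then from the classpath string.
  static Set<URL> collect(ClassLoader loader, String classPath) {
    Set<URL> urls = new LinkedHashSet<>();
    // Step 1: walk the loader chain; on Java 9+ the system loader is not a URLClassLoader.
    for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
      if (cl instanceof URLClassLoader) {
        Collections.addAll(urls, ((URLClassLoader) cl).getURLs());
      }
    }
    // Step 2: fallback, parse the classpath string; the set drops entries already seen.
    if (classPath != null && !classPath.isEmpty()) {
      for (String entry : classPath.split(File.pathSeparator)) {
        if (entry.isEmpty()) {
          continue;
        }
        try {
          urls.add(new File(entry).toURI().toURL());
        } catch (MalformedURLException ignored) {
          // Skip entries that cannot be expressed as a URL.
        }
      }
    }
    return urls;
  }
}
```

A caller would typically invoke it as ClasspathUrls.collect(ClassLoader.getSystemClassLoader(), System.getProperty("java.class.path")).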
…rivate Test verifies: (1) system classloader is not URLClassLoader on Java 9+, and (2) findArtifactVersionInClasspath finds artifacts via java.class.path fallback. Applies to Java 9 and all subsequent JDKs (permanent JDK design change).
When sca_cves.json contains symbols with method != null, the transformer injects a static callback at method entry using ASM. The callback fires the first time the method is called and reports via ScaReachabilityCallback (bootstrap classloader, accessible from any application class).
Key changes:
- ScaReachabilityCallback in agent-bootstrap: bootstrap-visible callback with runtime dedup (vulnId|artifact|methodName) and handler registration
- ScaReachabilityTransformer: injectMethodCallbacks() uses ByteBuddy ASM to inject INVOKESTATIC at the first line number of each target method; processClass() routes class-level vs method-level symbols separately
- ScaReachabilityHit: adds symbolName + line fields; existing constructor defaults to <clinit>/1 for class-level hits (backward compatible)
- ScaReachabilityPeriodicAction: buildMetadataValue() now uses hit.symbolName() and hit.line() instead of hardcoded values
- 6 tests: ASM injection, callback fires on method call only, dedup, multiple methods, safe method not reported, class-level unaffected
…rsion-unresolved
Two cases required deferred retransformation:
1. Classes already loaded at startup (before the transformer was registered): the bytecode callback cannot be injected without retransformClasses()
2. Classes where DependencyResolver returned empty deps at load time (version not yet resolvable): empty results are no longer cached, to allow retries
ScaReachabilityTransformer now stores Instrumentation and exposes performPendingRetransforms(), called on each telemetry heartbeat via a Runnable callback in ScaReachabilityCollector.periodicWorkCallback. Classes are queued via:
- pendingRetransform (Class<?> queue) from checkAlreadyLoadedClasses
- pendingRetransformNames (String set) from processClass on empty deps
retransformClasses() always starts from the ORIGINAL class file bytes, not from the previously-transformed bytes. A dedup check in injectCallbacks() that blocked re-injection on the second pass caused the callback to be removed (the class was returned to its original, un-instrumented state). The authoritative dedup for method-level hits is ScaReachabilityCallback.reported (bootstrap-side), which persists across retransformations regardless of how many times transform() is called on the same class. Also update .claude-invariants.md: retransformClasses is now used (for method-level only), the cache constraint clarified, and the dedup invariant documents the two-level approach (transformer for class-level, bootstrap for method-level).
…avadoc, add retransform tests
- performPendingRetransforms(): early return when instrumentation is null (unit test safety)
- ScaReachabilityCollector: encapsulate periodicWorkCallback as private with getter/setter
- ScaReachabilityTransformer class Javadoc: update dedup description from (vulnId,artifact) pair to (vulnId,artifact,symbolName) tuple; document two-level dedup strategy
- Add 3 tests for performPendingRetransforms(): no-op with null inst, retransformClasses called for pending queue, no-op when both queues empty
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 82ea8065d9
…esolution
P1: Replace StandardCharsets.UTF_8 with the "UTF-8" string in ScaCveDatabase.load(). java.nio.* is forbidden during premain (bootstrap_design_guidelines.md) because it can trigger premature provider initialization before the app configures the runtime.
P2: Add a classpath fallback in resolveVersionForArtifact() for entries where the vulnerable artifact is an aggregator/starter POM whose watched classes live in a transitive dependency JAR (e.g., spring-boot-starter-web watches @Controller, but @Controller is defined in spring-context.jar, not the starter). The new helper first checks the class's own JAR, then falls back to findArtifactVersionInClasspath with a hit cache (classpathArtifactCache). processPathA uses the same helper for consistency.
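The two-step lookup in P2 could look roughly like the sketch below. It assumes the class's own JAR dependencies arrive as an artifact-to-version map and that the classpath search is supplied as a function; VersionResolverSketch, resolve, and classpathLookup are hypothetical names, not the PR's actual signatures.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class VersionResolverSketch {
  // Caches only successful classpath lookups (a "hit cache"); misses are retried later.
  private final Map<String, String> classpathArtifactCache = new ConcurrentHashMap<>();
  // Stand-in for the classpath search; returns null when the artifact is not found.
  private final Function<String, String> classpathLookup;

  VersionResolverSketch(Function<String, String> classpathLookup) {
    this.classpathLookup = classpathLookup;
  }

  String resolve(String artifact, Map<String, String> classJarDeps) {
    // Step 1: the JAR that defines the watched class.
    String version = classJarDeps.get(artifact);
    if (version != null) {
      return version;
    }
    // Step 2: classpath fallback; computeIfAbsent records no mapping for a null result.
    return classpathArtifactCache.computeIfAbsent(artifact, classpathLookup);
  }
}
```

Because ConcurrentHashMap.computeIfAbsent stores nothing when the function returns null, an artifact that is not yet on the classpath is looked up again on the next call.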
…IfPresent helper
- Add CLASS_LEVEL_SYMBOL = "<clinit>" constant to avoid magic string repetition (it appeared 3 times in the same class; a typo would silently produce wrong symbol names)
- Extract reportClassLevelHitIfPresent(entry, version, internalClassName) helper to unify the identical class-level symbol matching loops in processPathA, processPathB, and processClass; all three now delegate to the single helper
Move CLASS_LEVEL_SYMBOL = "<clinit>" to ScaReachabilityHit (internal-api) as a public constant so both the transformer (appsec) and the telemetry payload builder share the canonical definition without cross-module string duplication. The convenience constructor now uses the constant as well. ScaReachabilityTransformer delegates to ScaReachabilityHit.CLASS_LEVEL_SYMBOL.
Fix misleading comment in processClass: "We enqueue via classBeingRedefined is null here" is replaced with an explanation that classBeingRedefined is null on first class load, preventing direct Class<?> queuing, so scheduleRetransformByName is used instead.
…lback
- ScaCveDatabase: move the "java.nio.* forbidden in premain" comment from the imports block to inline at the InputStreamReader construction site (comments in imports are unusual and smola flags verbose placement)
- ScaReachabilityTransformer.resolveVersionForArtifact: make package-private for testing; add 4 tests covering the two-step fallback:
  (1) version from classJarDeps directly
  (2) classpath fallback when classJarDeps is empty (transitive JAR case)
  (3) classpathArtifactCache hit on second call
  (4) null for absent artifact
- Remove empty visitCode() in MethodEntryInjector: the method only called super.visitCode() and its comment was misleading. The actual no-debug-info fallback injection is handled by ensureInjected() in the visitInsn/visitVarInsn/visitMethodInsn/visitFieldInsn overrides, not here
- Remove private CLASS_LEVEL_SYMBOL alias in ScaReachabilityTransformer: the constant is used in exactly one place (reportClassLevelHitIfPresent) and ScaReachabilityHit.CLASS_LEVEL_SYMBOL is self-documenting at that site; the alias added a private field with no benefit after the constant was moved to ScaReachabilityHit in a previous commit
@codex review
Codex Review: Didn't find any major issues. 👍
Per the RFC and Python implementation (dd-trace-py#17156), the telemetry payload path/symbol/line for method-level hits must report the APPLICATION FRAME that called the vulnerable method (the callsite), not the vulnerable method itself.
ScaReachabilityCallback.onMethodHit() now walks Thread.getStackTrace() to find the first non-agent, non-JDK frame after the vulnerable class:
  ScaReachabilityCallback.onMethodHit (skip - us)
  com.foo.VulnerableClass.method (skip - vulnerable class)
  com.myapp.UserService.processRequest (CALLSITE - report this)
The dotClassName/methodName params are still baked into the bytecode and used only for deduplication (vulnId|artifact|methodName key). The handler receives the callsite's class/method/line for telemetry.
Fallback: if no application frame is found (e.g. called from JDK internals), reports the vulnerable symbol itself so the backend knows it was reached.
Class-level hits (<clinit>) are unchanged: there is no callsite at class load time.
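The stack walk described above can be sketched as follows. The prefix list in isApplicationFrame is an assumption made for illustration (the PR's real filter is the agent's own stack-element predicate), and CallsiteFinder is a hypothetical name.

```java
final class CallsiteFinder {
  // Returns the first application frame found after the vulnerable class, or null if none exists.
  static StackTraceElement findCallsite(StackTraceElement[] stack, String vulnerableClass) {
    boolean seenVulnerable = false;
    for (StackTraceElement frame : stack) {
      String cls = frame.getClassName();
      if (cls.equals(vulnerableClass)) {
        seenVulnerable = true; // skip the vulnerable class itself
        continue;
      }
      if (seenVulnerable && isApplicationFrame(cls)) {
        return frame;
      }
    }
    return null; // caller falls back to reporting the vulnerable symbol
  }

  // Assumed prefixes for JDK and agent code; the real predicate may differ.
  static boolean isApplicationFrame(String cls) {
    return !(cls.startsWith("java.")
        || cls.startsWith("javax.")
        || cls.startsWith("jdk.")
        || cls.startsWith("sun.")
        || cls.startsWith("datadog.")
        || cls.startsWith("com.datadog."));
  }
}
```

In a live handler the input would be Thread.currentThread().getStackTrace(); passing the array in explicitly keeps the logic easy to unit test with hand-built frames.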
ScaReachabilityCallback (bootstrap) must stay minimal; complex logic does not belong there. Move findCallsite() to ScaReachabilitySystem, which has access to internal-api utilities. The handler runs synchronously, so the full call stack is still present:
  ScaReachabilitySystem handler
  ScaReachabilityCallback.onMethodHit
  <vulnerable method>
  <application callsite> ← reported
Uses the same class-prefix predicate as AbstractStackWalker.isNotDatadogTraceStackElement (package-private, so replicated inline) to skip agent/JDK frames, consistent with the IAST trie-based filtering infrastructure used elsewhere in the codebase.
…ltering Make isNotDatadogTraceStackElement public in AbstractStackWalker so SCA Reachability can use the existing predicate directly rather than duplicating the 3 class-prefix conditions inline.
… path
ScaReachabilitySystemCallsiteTest covers:
- findCallsite returns null when the vulnerable class is not on the stack (triggers fallback: handler reports the vulnerable symbol itself)
- findCallsite skips the vulnerable class frame and returns the first non-agent frame above it (using java.lang.Thread as a non-agent class guaranteed to be at the top of getStackTrace())
Note on the method-level integration test: TargetClass is in com.datadog.appsec.sca.* (agent namespace), so AbstractStackWalker filters it as agent code and findCallsite() returns null. The test now documents this fallback behaviour explicitly. In production the vulnerable class is always a 3rd-party library (e.g. com.fasterxml.jackson.*) and the happy path fires correctly, as verified by ScaReachabilitySystemCallsiteTest.
The stream().anyMatch() for detecting method-level symbols was computed for every entry unconditionally. It is only needed when version == null (deps not yet resolvable). Moving the check inside the version==null branch eliminates the stream allocation on the common path where the version resolves successfully.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6008ac9ca0
…cators JDK classes (e.g. java.sql.PreparedStatement, protectionDomain==null) are loaded by ANY app that uses JDBC, regardless of which driver is present. Using their presence to infer that a specific library (e.g. PostgreSQL) is "reachable" produces classpath-presence false positives, not runtime reachability signals. Entries that list JDK symbols (e.g. the PostgreSQL advisory) also include library-specific classes (e.g. org.postgresql.ds.PGSimpleDataSource) that Path A correctly detects when those classes are actually loaded. In checkAlreadyLoadedClasses(), classes with no code source (JDK/bootstrap) are now skipped silently. The invariants and KB are updated accordingly.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 750b3c3877
The key vulnId|artifact|methodName collapses hits for different classes in the same artifact that share a method name (e.g. ClassA.parse and ClassB.parse would both map to the same key, suppressing the second). Fix: add dotClassName to the key → vulnId|artifact|dotClassName|methodName so each class+method combination is tracked independently. Add regression test that verifies both hits are reported when the same method name exists in two different vulnerable classes of the same artifact.
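A minimal sketch of the widened dedup key, assuming a concurrent set of string keys holds the reported state. HitDedup and firstReport are hypothetical names; the real bootstrap callback may store its keys differently.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class HitDedup {
  // Keys already reported; survives for the lifetime of the instance.
  private final Set<String> reported = ConcurrentHashMap.newKeySet();

  // Returns true only the first time a given class+method combination is seen.
  boolean firstReport(String vulnId, String artifact, String dotClassName, String methodName) {
    String key = vulnId + '|' + artifact + '|' + dotClassName + '|' + methodName;
    return reported.add(key);
  }
}
```

With the class name in the key, ClassA.parse and ClassB.parse in the same artifact are tracked independently, while a repeated call to ClassA.parse is still suppressed.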
@codex review
I reviewed the PR context and the currently checked-out code for the flagged areas. I don't see any new blocking issues beyond what has already been discussed/resolved.
No code changes were made, so no commit or follow-up PR was created.
Replace stateless ScaReachabilityCollector (simple hit queue) with
ScaReachabilityDependencyRegistry (stateful per-dependency CVE tracking)
to comply with the RFC heartbeat specification:
1. When a class from a vulnerable version loads: registerCve() creates a
CVE entry with reached=[] and marks the dependency as pending, so the
next heartbeat reports metadata:[{"type":"reachability","value":"...reached:[]"}]
— signalling the backend that SCA is monitoring this CVE before any
symbol is called.
2. When a vulnerable method is called: recordHit() stores the first callsite
(RFC: single occurrence sufficient) and marks the dependency as pending.
3. On each heartbeat: ScaReachabilityPeriodicAction drains pending dependencies
and re-reports ALL CVEs for each dependency together (both with and without
callsites), then clears pending. Empty heartbeat otherwise.
Key invariant: whenever any CVE state changes, ALL CVEs for the same
dependency are re-reported together so the backend has a complete picture.
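The three steps above can be sketched as a small stateful registry. This is an illustration under assumptions, not the PR's ScaReachabilityDependencyRegistry: the class name, the string-based callsite, and the simplified "reached=" text format are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ReachabilityRegistrySketch {
  // dependency -> (CVE id -> first callsite, or null while reached is still empty)
  private final Map<String, Map<String, String>> cvesByDependency = new LinkedHashMap<>();
  // Dependencies whose CVE state changed since the last heartbeat.
  private final Set<String> pending = new LinkedHashSet<>();

  // Step 1: a class from a vulnerable version loaded; track the CVE with reached=[].
  synchronized void registerCve(String dependency, String vulnId) {
    Map<String, String> cves = cvesByDependency.computeIfAbsent(dependency, k -> new LinkedHashMap<>());
    if (!cves.containsKey(vulnId)) {
      cves.put(vulnId, null);
      pending.add(dependency);
    }
  }

  // Step 2: a vulnerable method was called; keep only the first callsite.
  synchronized void recordHit(String dependency, String vulnId, String callsite) {
    Map<String, String> cves = cvesByDependency.computeIfAbsent(dependency, k -> new LinkedHashMap<>());
    if (cves.get(vulnId) == null) {
      cves.put(vulnId, callsite);
      pending.add(dependency);
    }
  }

  // Step 3: on heartbeat, re-report ALL CVEs of each pending dependency, then clear pending.
  synchronized Map<String, List<String>> drainPending() {
    Map<String, List<String>> report = new LinkedHashMap<>();
    for (String dependency : pending) {
      List<String> entries = new ArrayList<>();
      for (Map.Entry<String, String> cve : cvesByDependency.get(dependency).entrySet()) {
        String reached = cve.getValue() == null ? "[]" : "[" + cve.getValue() + "]";
        entries.add(cve.getKey() + " reached=" + reached);
      }
      report.put(dependency, entries);
    }
    pending.clear();
    return report;
  }
}
```

Draining twice in a row yields an empty map the second time, which corresponds to the empty heartbeat case.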
Verifies the RFC stateful heartbeat model end-to-end:
- jackson-databind:2.6.0 (vulnerable, range < 2.6.7.3) appears in app-dependencies-loaded with metadata reachability entries
- GHSA identifier present in the metadata value
- reached[] contains a callsite after ObjectMapper is loaded at startup
Uses the existing springboot smoke test app (already has jackson-databind:2.6.0) with DD_APPSEC_SCA_ENABLED=true added to JVM args.
For the 7 jackson-databind CVE entries, adds method-level symbols for the deserialization entry points that actually trigger gadget chains when polymorphic typing is enabled with untrusted input:
- ObjectMapper.readValue: primary deserialization entry point
- ObjectMapper.readValues: multiple-value deserialization
- ObjectReader.readValue: reader-based deserialization variant
- ObjectReader.readValues: reader-based multiple values
Class-level symbols (method=null) are kept alongside the new method-level ones: class load detection signals the library is present; method detection signals the vulnerable code path was actually invoked.
26 method-level symbols added across 7 entries (ObjectMapper + ObjectReader × readValue + readValues × 7 GHSA entries).
…r-asl
Adds method-level detection for 24 entries across 4 libraries where the
deserialization/injection entry point is 100% certain:
XStream (17 entries): fromXML — THE entry point for all XStream CVEs;
triggers gadget chains when deserializing untrusted XML
log4j-core (4 entries): info, error, warn, debug, trace, fatal, log —
Log4Shell (GHSA-jfh8-c2jp-5v3q) triggers JNDI lookup when log messages
contain ${jndi:...} patterns; any Logger method is an entry point
snakeyaml (1 entry): load, loadAll — unsafe YAML deserialization;
instantiates arbitrary Java classes from untrusted YAML input
jackson-mapper-asl (1 entry): readValue, readValues — same deserialization
pattern as jackson-databind, applies to the legacy 1.x mapper
56 method-level symbols added. Class-level symbols (method=null) are kept
alongside the new method-level ones for dual detection coverage.
- ScaReachabilitySmokeTest: fix find() to look for the entry with reachability metadata, since the same dep appears twice (once from DependencyService without metadata, once from SCA with CVE data)
- TelemetryRequestBody.writeDependency(): write metadata:[] even when the list is empty; null means SCA disabled, empty list means SCA active but no CVEs detected yet (RFC: all deps get metadata:[] at startup)
- sca_cves.json: remove the class-level symbol from snakeyaml. Spring Boot loads Yaml at startup, causing registerCve+recordHit to fire in the same request and preventing the reached:[] heartbeat from being observed
- ScaReachabilityPeriodicActionTest: add rfcFullHeartbeatFlow test covering Heartbeats #2-#6 from the RFC spec
- TelemetryRequestBodyDependencyMetadataTest: update to reflect that metadata:[] is written (not suppressed) when the list is empty
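The null-versus-empty distinction for metadata can be illustrated with a small serializer sketch. The "name" and "version" field names and the hand-rolled JSON writing are assumptions for illustration; DependencyJsonSketch is a hypothetical class, not TelemetryRequestBody.

```java
import java.util.List;

final class DependencyJsonSketch {
  // null metadata: field omitted (SCA disabled); empty list: "metadata":[] (SCA active, no CVEs yet).
  static String toJson(String name, String version, List<String> reachabilityMetadata) {
    StringBuilder sb = new StringBuilder();
    sb.append("{\"name\":\"").append(escape(name))
        .append("\",\"version\":\"").append(escape(version)).append('"');
    if (reachabilityMetadata != null) {
      sb.append(",\"metadata\":[");
      for (int i = 0; i < reachabilityMetadata.size(); i++) {
        if (i > 0) {
          sb.append(',');
        }
        // Each value is itself a JSON document, embedded as an escaped string.
        sb.append("{\"type\":\"reachability\",\"value\":\"")
            .append(escape(reachabilityMetadata.get(i)))
            .append("\"}");
      }
      sb.append(']');
    }
    return sb.append('}').toString();
  }

  // Minimal escaping for backslashes and double quotes only.
  private static String escape(String s) {
    return s.replace("\\", "\\\\").replace("\"", "\\\"");
  }
}
```

Passing null omits the metadata field entirely, while an empty list still emits "metadata":[] so the backend can tell that SCA is active for that dependency.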
What Does This Do
Build — symbol database
- generateScaCvesJson in appsec/build.gradle downloads GHSA enrichments from DataDog/sca-reachability-database and bundles sca_cves.json in the agent JAR. The committed copy is used in CI (no network access required at build time).
- GhsaEnrichmentParser (buildSrc Kotlin) converts the GHSA format to the internal format: one record per Maven artifact, with symbols carrying method: null for class-level or the method name for method-level.
- sca_cves.json contains 49 entries covering jackson-databind, xstream, log4j, snakeyaml, jackson-mapper-asl, Spring, PostgreSQL, Tomcat, and others.
Runtime — detection
- ScaCveDatabase indexes sca_cves.json by JVM internal class name at startup for O(1) lookup.
- VersionRangeParser evaluates GHSA version range strings ("< 2.6.7.3", ">= 2.7.0, < 2.7.9.5", "= 9.5.0") using ComparableVersion (Maven 3.9.9 semantics, handles 4-part versions and qualifiers).
- ScaReachabilityTransformer implements ClassFileTransformer:
  - Class-level symbols (method: null): observation-only, registers the CVE when the class loads, always returns null.
  - Version resolution falls back to a classpath search when the watched class lives in a transitive dependency JAR (e.g. spring-boot-starter-web watches @Controller, which lives in spring-context.jar).
  - JDK classes (protectionDomain == null) are skipped: they are loaded regardless of which library is present and are not reliable reachability indicators.
- ScaReachabilitySystem is the subsystem entry point, called from Agent.java via reflection (same pattern as AppSecSystem/IastSystem), gated on DD_APPSEC_SCA_ENABLED.
Runtime — callsite capture (method-level)
- ScaReachabilitySystem walks the call stack using AbstractStackWalker.isNotDatadogTraceStackElement to find the first application frame above the vulnerable method.
- path/symbol/line in the telemetry payload identify the application frame that invoked the vulnerable method, not the vulnerable method itself. Mirrors the Python tracer's behavior.
Telemetry — RFC stateful heartbeat model
When DD_APPSEC_SCA_ENABLED=true, ScaReachabilityPeriodicAction reports via app-dependencies-loaded on each heartbeat:
- Vulnerable class loaded (registerCve): dependency reported with metadata: [{"type":"reachability","value":"{\"id\":\"GHSA-...\",\"reached\":[]}"}], which signals the backend that SCA is monitoring this CVE before any vulnerable method is called.
- Vulnerable method called (recordHit): dependency re-reported with the callsite in reached.
Dependency is extended with an optional reachabilityMetadata: List<String>; TelemetryRequestBody.writeDependency() writes metadata: [] when the list is non-null but empty (SCA active, no CVEs for this dep) and omits the field when null (SCA disabled).
Bridge layer
- ScaReachabilityCallback (bootstrap classloader): minimal static callback invoked from injected bytecode, deduplicates by vulnId|artifact|dotClassName|methodName, dispatches to a registered handler.
- ScaReachabilityDependencyRegistry (internal-api): stateful registry implementing the RFC heartbeat model. periodicWorkCallback (private, accessed via getter) is wired by TelemetrySystem for periodic retransformation.
- ScaReachabilityHit (internal-api): immutable value class for one callsite hit.
- AbstractStackWalker.isNotDatadogTraceStackElement made public to be usable from the appsec module.
Configuration
- Environment variable: DD_APPSEC_SCA_ENABLED
- System property: dd.appsec.sca.enabled
- Default: null (disabled)
- Description: when true, enables SCA Reachability runtime detection. Already present in metadata/supported-configurations.json (version B).
Motivation
Implements APPSEC-62260 / RFC: SCA Runtime Reachability.
Static SCA scanning flags every version-matched library regardless of whether the vulnerable code path is actually executed. SCA Reachability reduces false positives by reporting which vulnerable classes and methods are loaded/called at runtime, with the exact application callsite.
RFC: https://docs.google.com/document/d/1xDw9iG6h41VCEgJGTqoJdruRaNS4pYgNifO6nhiizWA/edit
Additional Notes
Why class-level symbols exist alongside method-level: a class-level symbol fires when the vulnerable class is loaded, even if no specific method is called. This provides a weaker but guaranteed signal. Method-level symbols add the exact callsite. Both use the same registerCve + recordHit RFC flow.
JDK symbols in the CVE database (e.g. java.sql.PreparedStatement in the PostgreSQL advisory): these are skipped by the transformer because JDK classes are loaded by any JVM regardless of which driver is present, so using them as reachability proxies would produce false positives. Library-specific classes in the same entry (e.g. org.postgresql.ds.PGSimpleDataSource) are detected correctly when they load.
ScaReachabilityCallback stays minimal by design: it lives in the bootstrap classloader, visible from any application class. Stack walking and registry updates are delegated to the handler registered by ScaReachabilitySystem in the agent classloader, keeping the bootstrap footprint small.
Jira ticket: APPSEC-62260
Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.
Contributor Checklist
- type: and (comp: or inst:) labels in addition to any other useful labels
- close, fix, or any linking keywords when referencing an issue. Use solves instead, and assign the PR milestone to the issue
- DD_APPSEC_SCA_ENABLED already present in metadata/supported-configurations.json (version B); no new configuration keys introduced by this PR