Detect accidentally dropped errors in tests #245

DerGuteMoritz · 2025-09-19T13:15:51Z

Manifold has a debug mechanism which instruments deferreds so that a warning is logged when they are in an error state with no error handler attached at garbage collection time. Such errors are referred to as "dropped errors" and the instrumented deferreds are called "leak-aware".

The dropped error logging feature is enabled by default but for performance reasons, only every 1024th deferred is made leak-aware (see manifold.deferred/deferred). However, deferreds constructed via manifold.deferred/error-deferred are always leak-aware.

This patch introduces a new option manifold.debug/*leak-aware-deferred-rate* which allows changing the rate of leak-aware deferreds being instantiated, with default still being 1024. When this option is set to to 1 (= every deferred will be leak-aware), dropped errors can be detected more reliably e.g. during testing.

To that end, the patch also includes a test util for instrumenting all tests in a namespace so that they fail when a dropped error was detected. All test namespaces are now instrumented like this (except for :benchmark and :stress tests), which already uncovered a few latent dropped errors. These will be fixed in separate patches.

This was prompted by investigating the causes of clj-commons/aleph#766. One of these was found to originate in Manifold itself which will be addressed in a follow-up PR.

This PR is equivalent to #243 but the source branch lives in this here repository which allows us to create stacked PRs with the fixes for it.

DerGuteMoritz · 2025-09-19T15:05:51Z

One more patch coming up to fix the remaining dropped error in manifold.go-off but gotta step away for a while now.

KingMob · 2025-09-21T04:11:51Z

@DerGuteMoritz Sorry to say this, but I no longer remember Manifold well enough to review these, I think. Nor do I really have the time atm. I would suggest soliciting some helpers from the #manifold slack channel.

valerauko · 2025-09-23T11:00:30Z

src/manifold/debug.clj

+  [f handle-dropped-errors]
+  (assert (nil? dropped-errors) "with-dropped-error-detection may not be nested")
+  ;; Flush out any pending dropped errors from before
+  (System/gc)


I was told before that System/gc is more of a suggestion to the VM than an actual call

Did you observe this working reliably?

Indeed, there is no guarantee that System/gc actually triggers a GC, let alone whether it reclaims all garbage. Quoting from the docs

Calling the gc method suggests that the Java Virtual Machine expend effort toward recycling unused objects in order to make the memory they currently occupy available for reuse by the Java Virtual Machine. When control returns from the method call, the Java Virtual Machine has made a best effort to reclaim space from all unused objects. There is no guarantee that this effort will recycle any particular number of unused objects, reclaim any particular amount of space, or complete at any particular time, if at all, before the method returns or ever. There is also no guarantee that this effort will determine the change of reachability in any particular number of objects, or that any particular number of Reference objects will be cleared and enqueued.

However, it does work pretty well in practice, see for example the failing tests for this PR which uses OpenJDK 8. Locally, I've also successfully used it with OpenJDK 21. So yeah, it's not deterministic but still quite useful as evidenced by the issues it uncovered 🙂

DerGuteMoritz · 2025-09-23T12:35:39Z

#249 is the aforementioned patch for manifold.go-off (it wasn't auto-linked here because it's based on #248).

Manifold has a debug mechanism which instruments deferreds so that a warning is logged when they are in an error state with no error handler attached at garbage collection time. Such errors are referred to as "dropped errors" and the instrumented deferreds are called "leak-aware". The dropped error logging feature is enabled by default but for performance reasons, only every 1024th deferred is made leak-aware (see `manifold.deferred/deferred`). However, deferreds constructed via `manifold.deferred/error-deferred` are always leak-aware. This patch introduces a new option `manifold.debug/*leak-aware-deferred-rate*` which allows changing the rate of leak-aware deferreds being instantiated, with default still being 1024. When this option is set to to 1 (= every deferred will be leak-aware), dropped errors can be detected more reliably e.g. during testing. To that end, the patch also includes a test util for instrumenting all tests in a namespace so that they fail when a dropped error was detected. All test namespaces are now instrumented like this (except for `:benchmark` and `:stress` tests), which already uncovered a few latent dropped errors. These will be fixed in separate patches.

Also, make test-alt a bit more reliable and make the exception messages distinct so that a dropped error can easily be matched up with the deferred.

Involving `d/future` makes these tests needlessly flaky. By decoupling the `d/alt` invocation from dereferencing its result, we can deterministically control the order of events.

The result deferred `d` may end up in an error state when `(f)` throws an exception or returns an error deferred. Now since the cancellation callback was only attached via `d/chain` for side-effect (i.e. the resulting deferred is not returned), the dropped error detection would be triggered in this case. The fix is simply to also attach an error handler with `d/on-realized`. Furthermore, putting the result deferred in an error state would not have cancelled the timer because there was no error handler attached to it. This is now also fixed and the tests are extended to cover the cancellation API, too. The latter uncovered an oversight in the mock clock implementation where the `in` method didn't return a cancellation function, resulting in an NPE. This is now also fixed.

This was a mistake, this metadata flag is supposed to go on the test *var* only.

When putting the result deferred `d` in an error state, deferred returned by `chain` would be detected as a dropped error because it's only used to attach a side-effecting callback for cancelling the timeout. The fix then is to use `on-realized` instead to attach a handler for both cases. As a consequence, timeouts are now also properly cancelled when the result deferred is put in an error state, freeing up resources in a more timely fashion. This fixes one of the causes of clj-commons/aleph#766, as well as 2 of the 3 dropped errors in `manifold.go-off-test`. Also, add dedicated tests for `d/timeout!`.

When `in` is closed and `out` is also closed by the time the chained handler runs, closing `s'` would fail because it's `::none`. This non-deterministic case was occasionally observed with `manifold.stream-test/test-cleanup` failing due to a leaked error deferred like so: WARNING: unconsumed deferred in error state, make sure you're using `catch`. java.lang.ClassCastException: clojure.lang.Keyword cannot be cast to manifold.stream.core.IEventStream at manifold.stream$concat$this__3189__auto____14958$fn__14959$fn__14960.invoke(stream.clj:819) at manifold.deferred$eval3018$chain_SINGLEQUOTE____3039.invoke(deferred.clj:904) at manifold.stream$concat$this__3189__auto____14958$fn__14959.invoke(stream.clj:816) at manifold.stream$concat$this__3189__auto____14958.invoke(stream.clj:815) at clojure.lang.AFn.applyToHelper(AFn.java:154) at clojure.lang.AFn.applyTo(AFn.java:144) at clojure.core$apply.invokeStatic(core.clj:669) at clojure.core$apply.invoke(core.clj:662) at manifold.stream$concat$this__3189__auto____14958$fn__14984.invoke(stream.clj:815)

Timeout tests are changed so that every timeout exception has a uniquely identifiable message which makes it easier to match up the dropped error warning with the originating timeout.

This saves a bit of overhead relative to `d/chain` + `d/catch`.

DerGuteMoritz requested a review from KingMob as a code owner September 19, 2025 13:15

This was referenced Sep 19, 2025

Detect accidentally dropped errors in tests #243

Closed

Fix dropped errors in tests #246

Merged

Fix dropped errors in manifold.time/in #247

Merged

Fix dropped errors in manifold.deferred/timeout! #248

Merged

valerauko reviewed Sep 23, 2025

View reviewed changes

KingMob removed their request for review September 25, 2025 06:19

valerauko approved these changes Oct 4, 2025

View reviewed changes

DerGuteMoritz mentioned this pull request Oct 13, 2025

Fix closing bug in manifold.stream/concat #250

Merged

DerGuteMoritz added 5 commits October 15, 2025 16:08

Fix dropped errors in tests

08e4738

Also, make test-alt a bit more reliable and make the exception messages distinct so that a dropped error can easily be matched up with the deferred.

Make test-alt fully deterministic

5c32b3e

Involving `d/future` makes these tests needlessly flaky. By decoupling the `d/alt` invocation from dereferencing its result, we can deterministically control the order of events.

Don't check for :ignore-dropped-errors meta on test function

cff7428

This was a mistake, this metadata flag is supposed to go on the test *var* only.

DerGuteMoritz force-pushed the detect-dropped-errors-in-tests branch from d31871a to cff7428 Compare October 15, 2025 14:08

DerGuteMoritz mentioned this pull request Oct 15, 2025

Fix dropped errors in CompletionStage's *EitherAsync methods #252

Open

DerGuteMoritz added 4 commits October 16, 2025 09:57

Fix dropped errors in manifold.go-off

6ab6ad0

Timeout tests are changed so that every timeout exception has a uniquely identifiable message which makes it easier to match up the dropped error warning with the originating timeout.

Use d/on-realized in manifold.go-off/take!

84c1c37

This saves a bit of overhead relative to `d/chain` + `d/catch`.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Detect accidentally dropped errors in tests #245

Detect accidentally dropped errors in tests #245

Uh oh!

DerGuteMoritz commented Sep 19, 2025

Uh oh!

DerGuteMoritz commented Sep 19, 2025

Uh oh!

KingMob commented Sep 21, 2025

Uh oh!

valerauko Sep 23, 2025

Uh oh!

DerGuteMoritz Sep 23, 2025

Uh oh!

DerGuteMoritz commented Sep 23, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Detect accidentally dropped errors in tests #245

Are you sure you want to change the base?

Detect accidentally dropped errors in tests #245

Uh oh!

Conversation

DerGuteMoritz commented Sep 19, 2025

Uh oh!

DerGuteMoritz commented Sep 19, 2025

Uh oh!

KingMob commented Sep 21, 2025

Uh oh!

valerauko Sep 23, 2025

Choose a reason for hiding this comment

Uh oh!

DerGuteMoritz Sep 23, 2025

Choose a reason for hiding this comment

Uh oh!

DerGuteMoritz commented Sep 23, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants