[coverage] Fix remaining ~0.1% flakiness #2102

Merged
merged 8 commits into from
Jun 4, 2025

Conversation

liamappelbe
Contributor

@liamappelbe liamappelbe commented May 27, 2025

dart-lang/test#2494 has landed and has eliminated almost all of the flakiness from coverage collection, but I'm still seeing ~0.1% flakiness.

This last bit of flakiness is hard to debug. The error is the same as before (trying to resume the main isolate after the VM service has already shut down), but the event sequence that leads to that point looks correct. There's no reason the service should have shut down at that point.

onStart: main (this is identified as the main isolate)
onPause: main
onStart: test_suite:file:///build/work/bd7788ee6ad0...
onPause: test_suite:file:///build/work/bd7788ee6ad0...
collectIsolate: test_suite:file:///build/work/bd7788ee6ad0...
    done collecting: test_suite:file:///build/work/bd7788ee6ad0...
Resuming: test_suite:file:///build/work/bd7788ee6ad0...
collectIsolate: main
onExit: test_suite:file:///build/work/bd7788ee6ad0...
    done collecting: main
Resuming main: main
Failed to resume main: resume: (-32000) Service connection disposed

resume: (-32000) Service connection disposed
package:vm_service/src/vm_service.dart 268:34                         new _OutstandingRequest
package:vm_service/src/vm_service.dart 1950:25                        VmService._call.<fn>
package:vm_service/src/vm_service.dart 1962:8                         VmService._call
package:vm_service/src/vm_service.dart 1651:7                         VmService.resume
package:coverage/src/isolate_paused_listener.dart 68:24               IsolatePausedListener.waitUntilAllExited

There's always the option of catching the RPC error and ignoring it. I was hesitant to do this earlier because I didn't want to hide legitimate errors that could lead to missing coverage. But the main isolate's coverage is being collected successfully, so I think it's safe enough to ignore the error at this point.

With this fix in place, I've run 100k tests without any flakes.
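
For readers following along, here is a minimal sketch of the "catch the RPC error and ignore it" approach described above. It is not the code from this PR: `FakeRpcError`, `ResumeFn`, and `resumeMainIgnoringRpcError` are hypothetical names invented for illustration, and the error class only mimics a code-plus-message shape rather than using the real `vm_service` types. It assumes coverage for the main isolate has already been collected before the resume call is made.

```dart
import 'dart:async';

/// Hypothetical stand-in for an RPC failure. This is NOT the vm_service
/// RPCError class; it only mimics a "code + message" shape for this sketch.
class FakeRpcError implements Exception {
  FakeRpcError(this.code, this.message);
  final int code;
  final String message;

  @override
  String toString() => '($code) $message';
}

/// Hypothetical signature for "resume this isolate".
typedef ResumeFn = Future<void> Function(String isolateId);

/// Resumes the main isolate, logging and swallowing an RPC failure.
///
/// Returns true if the resume call succeeded and false if the error was
/// ignored. Any other kind of error still propagates to the caller.
Future<bool> resumeMainIgnoringRpcError(
  ResumeFn resume,
  String isolateId,
) async {
  try {
    await resume(isolateId);
    return true;
  } on FakeRpcError catch (e) {
    // Assumption: coverage for this isolate was collected before this call,
    // so a failed resume cannot drop coverage data.
    print('Failed to resume main: $e');
    return false;
  }
}

Future<void> main() async {
  // Simulates a service connection that has already been disposed.
  Future<void> disposedService(String id) async =>
      throw FakeRpcError(-32000, 'Service connection disposed');

  final resumed = await resumeMainIgnoringRpcError(disposedService, 'main');
  print('resumed: $resumed');
}
```

Only the one error type is caught, so unrelated failures still surface as before.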

@liamappelbe liamappelbe requested a review from a team as a code owner May 27, 2025 00:08
@liamappelbe liamappelbe removed the request for review from a team May 27, 2025 00:08
@liamappelbe liamappelbe requested a review from bkonyi May 27, 2025 00:08
github-actions bot commented May 27, 2025

Package publishing

| Package | Version | Status | Publish tag (post-merge) |
| --- | --- | --- | --- |
| package:bazel_worker | 1.1.3 | already published at pub.dev | |
| package:benchmark_harness | 2.4.0-wip | WIP (no publish necessary) | |
| package:boolean_selector | 2.1.2 | already published at pub.dev | |
| package:browser_launcher | 1.1.3 | already published at pub.dev | |
| package:cli_config | 0.2.1-wip | WIP (no publish necessary) | |
| package:cli_util | 0.4.2 | already published at pub.dev | |
| package:clock | 1.1.3-wip | WIP (no publish necessary) | |
| package:code_builder | 4.10.2-wip | WIP (no publish necessary) | |
| package:coverage | 1.14.1 | ready to publish | coverage-v1.14.1 |
| package:csslib | 1.0.2 | already published at pub.dev | |
| package:extension_discovery | 2.1.0 | already published at pub.dev | |
| package:file | 7.0.2-wip | WIP (no publish necessary) | |
| package:file_testing | 3.1.0-wip | WIP (no publish necessary) | |
| package:glob | 2.1.3 | already published at pub.dev | |
| package:graphs | 2.3.3-wip | WIP (no publish necessary) | |
| package:html | 0.15.6 | already published at pub.dev | |
| package:io | 1.1.0-wip | WIP (no publish necessary) | |
| package:json_rpc_2 | 4.0.0 | already published at pub.dev | |
| package:markdown | 7.3.1-wip | WIP (no publish necessary) | |
| package:mime | 2.0.0 | already published at pub.dev | |
| package:oauth2 | 2.0.4-wip | WIP (no publish necessary) | |
| package:package_config | 2.3.0-wip | WIP (no publish necessary) | |
| package:pool | 1.5.2-wip | WIP (no publish necessary) | |
| package:process | 5.0.4 | already published at pub.dev | |
| package:pub_semver | 2.2.0 | already published at pub.dev | |
| package:pubspec_parse | 1.5.0 | already published at pub.dev | |
| package:source_map_stack_trace | 2.1.3-wip | WIP (no publish necessary) | |
| package:source_maps | 0.10.14-wip | WIP (no publish necessary) | |
| package:source_span | 1.10.1 | already published at pub.dev | |
| package:sse | 4.1.8 | already published at pub.dev | |
| package:stack_trace | 1.12.1 | already published at pub.dev | |
| package:stream_channel | 2.1.4 | already published at pub.dev | |
| package:stream_transform | 2.1.2-wip | WIP (no publish necessary) | |
| package:string_scanner | 1.4.1 | already published at pub.dev | |
| package:term_glyph | 1.2.3-wip | WIP (no publish necessary) | |
| package:test_reflective_loader | 0.3.0 | ready to publish | test_reflective_loader-v0.3.0 |
| package:timing | 1.0.2 | already published at pub.dev | |
| package:unified_analytics | 8.0.1 | already published at pub.dev | |
| package:watcher | 1.1.1 | already published at pub.dev | |
| package:yaml | 3.1.3 | already published at pub.dev | |
| package:yaml_edit | 2.2.2 | already published at pub.dev | |

Documentation at https://github.com/dart-lang/ecosystem/wiki/Publishing-automation.

github-actions bot commented May 27, 2025

PR Health

Breaking changes ⚠️
| Package | Change | Current Version | New Version | Needed Version | Looking good? |
| --- | --- | --- | --- | --- | --- |
| coverage | Breaking | 1.14.0 | 1.14.1 | 2.0.0 (Got "1.14.1" expected >= "2.0.0" (breaking changes)) | ⚠️ |

This check can be disabled by tagging the PR with skip-breaking-check.

Changelog Entry ✔️
Package Changed Files

Changes to files need to be accounted for in their respective changelogs.

Coverage ✔️
File Coverage
pkgs/coverage/lib/src/isolate_paused_listener.dart 💚 98 % ⬆️ 0 %

This check for test coverage is informational (issues shown here will not fail the PR).

API leaks ⚠️

The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.

Package Leaked API symbols
coverage _CoverageInfo

This check can be disabled by tagging the PR with skip-leaking-check.

License Headers ✔️
// Copyright (c) 2025, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
Files
no missing headers

All source files should start with a license header.

Unrelated files missing license headers
Files
pkgs/bazel_worker/benchmark/benchmark.dart
pkgs/bazel_worker/example/client.dart
pkgs/bazel_worker/example/worker.dart
pkgs/benchmark_harness/integration_test/perf_benchmark_test.dart
pkgs/boolean_selector/example/example.dart
pkgs/clock/lib/clock.dart
pkgs/clock/lib/src/clock.dart
pkgs/clock/lib/src/default.dart
pkgs/clock/lib/src/stopwatch.dart
pkgs/clock/lib/src/utils.dart
pkgs/clock/test/clock_test.dart
pkgs/clock/test/default_test.dart
pkgs/clock/test/stopwatch_test.dart
pkgs/clock/test/utils.dart
pkgs/coverage/lib/src/coverage_options.dart
pkgs/html/example/main.dart
pkgs/html/lib/dom.dart
pkgs/html/lib/dom_parsing.dart
pkgs/html/lib/html_escape.dart
pkgs/html/lib/parser.dart
pkgs/html/lib/src/constants.dart
pkgs/html/lib/src/encoding_parser.dart
pkgs/html/lib/src/html_input_stream.dart
pkgs/html/lib/src/list_proxy.dart
pkgs/html/lib/src/query_selector.dart
pkgs/html/lib/src/token.dart
pkgs/html/lib/src/tokenizer.dart
pkgs/html/lib/src/treebuilder.dart
pkgs/html/lib/src/utils.dart
pkgs/html/test/dom_test.dart
pkgs/html/test/parser_feature_test.dart
pkgs/html/test/parser_test.dart
pkgs/html/test/query_selector_test.dart
pkgs/html/test/selectors/level1_baseline_test.dart
pkgs/html/test/selectors/level1_lib.dart
pkgs/html/test/selectors/selectors.dart
pkgs/html/test/support.dart
pkgs/html/test/tokenizer_test.dart
pkgs/html/test/trie_test.dart
pkgs/html/tool/generate_trie.dart
pkgs/pubspec_parse/test/git_uri_test.dart
pkgs/stack_trace/example/example.dart
pkgs/watcher/test/custom_watcher_factory_test.dart
pkgs/yaml_edit/example/example.dart

coveralls commented May 27, 2025

Pull Request Test Coverage Report for Build 15264956178

Details

  • 2 of 2 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.009%) to 93.801%

Totals Coverage Status
Change from base Build 15232453930: 0.009%
Covered Lines: 696
Relevant Lines: 742

💛 - Coveralls


// TODO: Not sure how to write this test, since the RPCError is thrown in
// an async handler, and not propagated to this expectation.
// expect(() => endTest(), throwsA(isA<RPCError>()));
Contributor Author

@liamappelbe liamappelbe May 27, 2025

@bkonyi Any suggestions about how to write this test? This expectation doesn't work. The endTest() future actually completes successfully. The RPCError is printed to the console as an unhandled exception.

00:00 +22 -1: IsolatePausedListener throw when resuming other isolate is not ignored [E]                                  
  resume: (-32000) other
  test/isolate_paused_listener_test.dart 533:11             main.<fn>.<fn>.<fn>
  package:mockito/src/mock.dart 186:47                      Mock.noSuchMethod
  test/collect_coverage_mock_test.mocks.dart 1642:14        MockVmService.resume
  package:coverage/src/isolate_paused_listener.dart 108:26  IsolatePausedListener._onPause
  ===== asynchronous gap ===========================
  package:coverage/src/isolate_paused_listener.dart 177:7   listenToIsolateLifecycleEvents.onPause
  ===== asynchronous gap ===========================
  package:coverage/src/isolate_paused_listener.dart 200:16  listenToIsolateLifecycleEvents.<fn>
  ===== asynchronous gap ===========================
  package:coverage/src/isolate_paused_listener.dart 277:7   IsolateEventBuffer.add
  
  resume: (-32000) other
  test/isolate_paused_listener_test.dart 533:11             main.<fn>.<fn>.<fn>
  package:mockito/src/mock.dart 186:47                      Mock.noSuchMethod
  test/collect_coverage_mock_test.mocks.dart 1642:14        MockVmService.resume
  package:coverage/src/isolate_paused_listener.dart 108:26  IsolatePausedListener._onPause
  ===== asynchronous gap ===========================
  package:coverage/src/isolate_paused_listener.dart 177:7   listenToIsolateLifecycleEvents.onPause
  ===== asynchronous gap ===========================
  package:coverage/src/isolate_paused_listener.dart 200:16  listenToIsolateLifecycleEvents.<fn>
  ===== asynchronous gap ===========================
  package:coverage/src/isolate_paused_listener.dart 277:7   IsolateEventBuffer.add
  
  Expected: throws <Instance of 'RPCError'>
    Actual: <Closure: () => Future<void>>
     Which: returned a Future that emitted <null>

Contributor

Have you tried using expectLater? I don't think expect will await the future returned by the closure; it just registers a listener.

Contributor Author

expectLater doesn't work either. The problem is that the Future returned by endTest/waitUntilAllExited does complete successfully. The exception doesn't propagate through this future. The stack trace terminates at IsolateEventBuffer.add, which is invoked by VmService.onIsolateEvent.listen.

I thought you might have had to solve this problem in VmService's event listener tests. I feel like there's probably some solution involving Zones, but I can't figure out how to make it work.

Contributor

I think you'd need to use a Zone if you don't have an error handler set up in isolate_paused_listener.dart. I noticed you don't actually catch anything there, so it makes sense that the error is propagating through to the root zone handler.

I guess the question is, what do you expect to happen in this case? Should this exception be unhandled or just ignored?

Contributor Author

When running the tool normally, this error should be reported (with the normal stack trace etc), not ignored. Leaving it as an unhandled exception does that. But for this test I want to catch the error and verify it is what I expect. I guess I'll have another go at using a Zone.

Contributor Author

The solution was to wrap the test setup in a Zone, rather than wrapping the test itself.
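
A minimal sketch of that idea, using only `dart:async`, might look like the following. It is not the test from this PR: `FakeRpcError` and `collectHandlerErrors` are hypothetical names, and a plain `StreamController` stands in for the VM service event stream. The sketch relies on the fact that stream callbacks run in the zone that was current when `listen` was called, so registering the listener inside `runZonedGuarded` routes errors from the async handler to that zone's error callback, even though the event is added from outside it.

```dart
import 'dart:async';

/// Hypothetical stand-in for an RPC failure; not the real vm_service class.
class FakeRpcError implements Exception {
  FakeRpcError(this.message);
  final String message;

  @override
  String toString() => 'FakeRpcError: $message';
}

/// Registers a failing async listener inside a guarded zone, fires one event
/// from outside that zone, and returns the errors the zone captured.
Future<List<Object>> collectHandlerErrors() async {
  final events = StreamController<String>();
  final caught = <Object>[];

  // "Test setup": the listener is registered inside the guarded zone, so
  // errors escaping its async handler go to the callback below instead of
  // the root zone.
  runZonedGuarded(() {
    events.stream.listen((isolateName) async {
      throw FakeRpcError('resume failed for $isolateName');
    });
  }, (error, stack) => caught.add(error));

  // "Test body": runs outside the guarded zone and never sees the error
  // through any future it awaits.
  events.add('other');
  await events.close();

  // Let pending microtasks run so the handler's error has been reported.
  await Future<void>.delayed(Duration.zero);
  return caught;
}

Future<void> main() async {
  final errors = await collectHandlerErrors();
  print('captured ${errors.length} error(s): $errors');
}
```

Wrapping only the code that adds the event would not help here, because the handler's zone is fixed at the time `listen` is called.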

Contributor

> When running the tool normally, this error should be reported (with the normal stack trace etc), not ignored.

Why do we want this error reported? Can it not be safely ignored?

Contributor Author

This infra is all about managing the lifecycle of isolates and the VM under test, and the sequencing is pretty complex. This error indicates a bug in that sequencing that could lead to missing coverage (i.e. failing to collect coverage for an isolate group). If we report the error, it gives me a chance to debug it (or gives users a chance to file a bug). If we ignore the error, the failure mode is that the coverage report will occasionally be incomplete, which is much harder to notice and debug.

If I get a report of this error and I'm able to repro it, but it's not clear what's going on and there's no way of fixing the error, and it's rare and harmless, then I'll start ignoring this error. That's what happened for the main isolate error case.


@@ -11,7 +11,7 @@ import 'package:test_process/test_process.dart';

final String testAppPath = p.join('test', 'test_files', 'test_app.dart');

-const Duration timeout = Duration(seconds: 30);
+const Duration timeout = Duration(seconds: 60);
Contributor

Is the timeout increase necessary?

Contributor Author

Not directly related to this PR, but I've seen a few flakes of the integration tests on GitHub CI due to timeouts.

@liamappelbe liamappelbe merged commit 700a3c4 into main Jun 4, 2025
18 checks passed
@liamappelbe liamappelbe deleted the cov_rpc_error branch June 4, 2025 04:27
copybara-service bot pushed a commit to dart-lang/sdk that referenced this pull request Jun 5, 2025
Revisions updated by `dart tools/rev_sdk_deps.dart`.

ai (https://github.com/dart-lang/ai/compare/6a71aeb..1d9d60c):
  1d9d60c  2025-06-05  Jacob MacDonald  handle relative paths under roots without trailing slashes (dart-lang/ai#152)

ecosystem (https://github.com/dart-lang/ecosystem/compare/8cebaf0..64aac3a):
  64aac3a  2025-06-03  Daco Harkes  [health] Bump dart_apitool (dart-lang/ecosystem#360)

i18n (https://github.com/dart-lang/i18n/compare/e44af54..43214dd):
  43214dde  2025-06-04  Moritz  Upgrade to new native_assets (dart-lang/i18n#964)

protobuf (https://github.com/dart-lang/protobuf/compare/c69077d..32d53da):
  32d53da  2025-06-05  Devon Carew  Refactor the test goldens so they have standard file name extensions (google/protobuf.dart#1014)

test (https://github.com/dart-lang/test/compare/e2ddae9..0793a2b):
  0793a2b3  2025-06-05  Agam Agarwal  Add isSorted and related matchers (dart-lang/test#2490)

tools (https://github.com/dart-lang/tools/compare/04c6849..e84cbd9):
  e84cbd9e  2025-06-04  Christophe Coevoet  [source_span] Add a test covering the highlighting of non-contiguous spans (dart-lang/tools#1666)
  700a3c4d  2025-06-04  Liam Appelbe  [coverage] Fix remaining ~0.1% flakiness (dart-lang/tools#2102)

webdev (https://github.com/dart-lang/webdev/compare/64492b2..55941b0):
  55941b0c  2025-06-05  Kevin Moore  [dwds] DRY up MD5/etag logic (dart-lang/webdev#2625)
  ab7c4d68  2025-06-05  dependabot[bot]  Bump the github-actions group across 1 directory with 3 updates (dart-lang/webdev#2593)
  f149e43f  2025-06-05  Nate Biggs  Update e2e_test expectation to look for more consistent message. (dart-lang/webdev#2630)
  80b1686b  2025-06-03  Nicholas Shahan  Fix e2e_test to work with current DDC output (dart-lang/webdev#2626)

Change-Id: I4802d7c4a7e39ba238f2f4ce0748dfa6dba9d8c4
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/433240
Reviewed-by: Konstantin Shcheglov <[email protected]>
Commit-Queue: Devon Carew <[email protected]>