| 1 | +- Feature Name: `libtest-json` |
| 2 | +- Start Date: 2024-01-18 |
| 3 | +- Pre-RFC: [Internals](https://internals.rust-lang.org/t/path-for-stabilizing-libtests-json-output/20163) |
| 4 | +- eRFC PR: [rust-lang/rfcs#3558](https://github.com/rust-lang/rfcs/pull/3558) |
| 5 | +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) |
| 6 | + |
| 7 | +# Summary |
| 8 | +[summary]: #summary |
| 9 | + |
| 10 | +This eRFC lays out a path for [stabilizing programmatic output for libtest](https://github.com/rust-lang/rust/issues/49359). |
| 11 | + |
| 12 | +# Motivation |
| 13 | +[motivation]: #motivation |
| 14 | + |
| 15 | +[libtest](https://github.com/rust-lang/rust/tree/master/library/test) |
| 16 | +is the test harness used by default for tests in Cargo projects.
| 17 | +It provides the CLI that Cargo calls into, and it enumerates and runs the tests discovered in each test binary.
| 18 | +It ships with rustup and has the same compatibility guarantees as the standard library. |
| 19 | + |
| 20 | +Before Rust 1.70, anyone could pass `--format json` despite it being unstable.
| 21 | +When this was fixed to require nightly,
| 22 | +the response showed [how much people have come to rely on programmatic output](https://www.reddit.com/r/rust/comments/13xqhbm/announcing_rust_1700/jmji422/).
| 23 | + |
| 24 | +Cargo could also use programmatic test output to address long-standing issues with its user interactions, including:
| 25 | +- [Wanting to run test binaries in parallel](https://github.com/rust-lang/cargo/issues/5609), like `cargo nextest` |
| 26 | +- [Lack of summary across all binaries](https://github.com/rust-lang/cargo/issues/4324) |
| 27 | +- [Noisy test output](https://github.com/rust-lang/cargo/issues/2832) (see also [#5089](https://github.com/rust-lang/cargo/issues/5089)) |
| 28 | +- [Confusing command-line interactions](https://github.com/rust-lang/cargo/issues/1983) (see also [#8903](https://github.com/rust-lang/cargo/issues/8903), [#10392](https://github.com/rust-lang/cargo/issues/10392)) |
| 29 | +- [Poor messaging when a filter doesn't match](https://github.com/rust-lang/cargo/issues/6151) |
| 30 | +- [Smarter test execution order](https://github.com/rust-lang/cargo/issues/6266) (see also [#8685](https://github.com/rust-lang/cargo/issues/8685), [#10673](https://github.com/rust-lang/cargo/issues/10673)) |
| 31 | +- [JUnit output is incorrect when running multiple test binaries](https://github.com/rust-lang/rust/issues/85563) |
| 32 | +- [Lack of failure when test binaries exit unexpectedly](https://github.com/rust-lang/rust/issues/87323) |
| 33 | + |
| 34 | +Most of that involves shifting responsibilities from the test harness to the test runner, which has the side effects of:
| 35 | +- Allowing more powerful experiments with custom test runners (e.g. [`cargo nextest`](https://crates.io/crates/cargo-nextest)) as they'll have more information to operate on |
| 36 | +- Lowering the barrier for custom test harnesses (like [`libtest-mimic`](https://crates.io/crates/libtest-mimic)) as UI responsibilities are shifted to the test runner (`cargo test`) |
| 37 | + |
| 38 | +# Guide-level explanation |
| 39 | +[guide-level-explanation]: #guide-level-explanation |
| 40 | + |
| 41 | +The intended outcomes of this experiment are: |
| 42 | +- Updates to libtest's unstable output |
| 43 | +- A stabilization request to [T-libs-api](https://www.rust-lang.org/governance/teams/library#Library%20API%20team) using the process of their choosing |
| 44 | + |
| 45 | +Additional outcomes we hope for are: |
| 46 | +- A change proposal for [T-cargo](https://www.rust-lang.org/governance/teams/dev-tools#Cargo%20team) for `cargo test` and `cargo bench` to provide their own UX on top of the programmatic output |
| 47 | +- A change proposal for [T-cargo](https://www.rust-lang.org/governance/teams/dev-tools#Cargo%20team) to allow users of custom test harnesses to opt-in to the new UX using programmatic output |
| 48 | + |
| 49 | +While having a plan for evolution takes some burden off of the format, |
| 50 | +we should still do some due diligence in ensuring the format works well for our intended uses. |
| 51 | +Our rough plan for vetting a proposal is: |
| 52 | +1. Create an experimental test harness where each `--format <mode>` is a skin over a common internal `serde` structure, emulating what the relationship between `libtest` and `cargo` will be like, on a smaller scale for faster iteration
| 53 | +2. Transition libtest to this proposed interface |
| 54 | +3. Add experimental support for cargo to interact with test binaries through the unstable programmatic output |
| 55 | +4. Create a stabilization report for programmatic output for T-libs-api and a cargo RFC for custom test harnesses to opt into this new protocol |
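
Step 1 of the plan above can be pictured with a minimal sketch, assuming a hypothetical event type: every `--format <mode>` becomes a pure rendering function over the same internal structure. The names `Event`, `Outcome`, `render_pretty`, and `render_json` are illustrative assumptions, and the JSON field names are not libtest's actual or proposed schema. The JSON here is hand-written to keep the sketch dependency-free; the plan itself calls for `serde`.

```rust
/// Hypothetical internal event structure shared by every format mode.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    SuiteStarted { test_count: usize },
    TestFinished { name: String, outcome: Outcome },
    SuiteFinished { passed: usize, failed: usize, ignored: usize },
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Outcome {
    Ok,
    Failed,
    Ignored,
}

impl Outcome {
    fn as_str(self) -> &'static str {
        match self {
            Outcome::Ok => "ok",
            Outcome::Failed => "failed",
            Outcome::Ignored => "ignored",
        }
    }
}

/// Human-readable skin over the common structure.
fn render_pretty(event: &Event) -> String {
    match event {
        Event::SuiteStarted { test_count } => format!("running {test_count} tests"),
        Event::TestFinished { name, outcome } => {
            format!("test {name} ... {}", outcome.as_str())
        }
        Event::SuiteFinished { passed, failed, ignored } => {
            format!("test result: {passed} passed; {failed} failed; {ignored} ignored")
        }
    }
}

/// Minimal string escaping (quote, backslash, newline); not a full JSON encoder.
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out
}

/// Machine-readable skin: one JSON object per event (illustrative field names).
fn render_json(event: &Event) -> String {
    match event {
        Event::SuiteStarted { test_count } => {
            format!(r#"{{"type":"suite","event":"started","test_count":{test_count}}}"#)
        }
        Event::TestFinished { name, outcome } => format!(
            r#"{{"type":"test","name":"{}","event":"{}"}}"#,
            escape(name),
            outcome.as_str()
        ),
        Event::SuiteFinished { passed, failed, ignored } => format!(
            r#"{{"type":"suite","event":"finished","passed":{passed},"failed":{failed},"ignored":{ignored}}}"#
        ),
    }
}

fn main() {
    let events = vec![
        Event::SuiteStarted { test_count: 2 },
        Event::TestFinished { name: "math::adds".to_string(), outcome: Outcome::Ok },
        Event::TestFinished { name: "math::divides".to_string(), outcome: Outcome::Failed },
        Event::SuiteFinished { passed: 1, failed: 1, ignored: 0 },
    ];
    // The same events drive both skins; only the renderer differs.
    for event in &events {
        println!("{}", render_pretty(event));
    }
    for event in &events {
        println!("{}", render_json(event));
    }
}
```

Because the harness only produces events, a test runner could consume the JSON skin and rebuild the human-readable skin on its own side, which is the relationship the experiment is meant to exercise.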
| 56 | + |
| 57 | +The experimental test harness is expected to have functional parity with `libtest`, including:
| 58 | +- Ignored tests |
| 59 | +- Parallel running of tests |
| 60 | +- Benches being both a bench and a test |
| 61 | +- Test discovery |
| 62 | + |
| 63 | +We should evaluate the design against the capabilities of test runners from different ecosystems to ensure the format is extensible enough for what people may do with custom test harnesses or `cargo test`, including:
| 64 | +- Ability to implement different format modes on top |
| 65 | + - Both test running and `--list` mode |
| 66 | +- Ability to run test harnesses in parallel |
| 67 | +- [Tests with multiple failures](https://docs.rs/googletest/0.10.0/googletest/prelude/macro.expect_that.html) |
| 68 | +- Bench support |
| 69 | +- Static and dynamic [parameterized tests / test fixtures](https://crates.io/crates/rstest) |
| 70 | +- Static and [dynamic test skipping](https://doc.crates.io/contrib/tests/writing.html#cargo_test-attribute) |
| 71 | +- [Test markers](https://docs.pytest.org/en/7.4.x/example/markers.html#mark-examples) |
| 72 | +- doctests |
| 73 | +- Test location (for IDEs) |
| 74 | +- Collect metrics related to tests |
| 75 | + - Elapsed time |
| 76 | + - Temp dir sizes |
| 77 | + - RNG seed |
| 78 | + |
| 79 | +**Warning:** This doesn't mean they'll all be supported in the initial stabilization, just that we feel confident the format will be able to support them.
| 80 | + |
| 81 | +We also need to evaluate how we'll support evolving the format. |
| 82 | +An important consideration is the compile-time burden we put on custom
| 83 | +test harnesses, as that will be an important factor in people's willingness to
| 84 | +pull them in, given that `libtest` comes pre-built today.
| 85 | + |
| 86 | +Custom test harnesses are important for this discussion because:
| 87 | +- Many already exist today, directly or shoe-horned on top of `libtest`, like |
| 88 | + - [libtest-mimic](https://crates.io/crates/libtest-mimic) |
| 89 | + - [criterion](https://crates.io/crates/criterion) |
| 90 | + - [divan](https://crates.io/crates/divan) |
| 91 | + - [cargo-test-support](https://doc.rust-lang.org/nightly/nightly-rustc/cargo_test_support/index.html) |
| 92 | + - [rstest](https://crates.io/crates/rstest) |
| 93 | + - [trybuild](https://crates.io/crates/trybuild) |
| 94 | +- The compatibility guarantees around libtest mean that development of new ideas is easier through custom test harnesses |
| 95 | + |
| 96 | +# Reference-level explanation |
| 97 | +[reference-level-explanation]: #reference-level-explanation |
| 98 | + |
| 99 | +## Resources |
| 100 | + |
| 101 | +Comments made on libtest's format:
| 102 | +- [Format is complex](https://github.com/rust-lang/rust/issues/49359#issuecomment-467994590) (see also [1](https://github.com/rust-lang/rust/issues/49359#issuecomment-1531369119)) |
| 103 | +- [Benches need love](https://github.com/rust-lang/rust/issues/49359#issuecomment-467994590) |
| 104 | +- [Type field is overloaded](https://github.com/rust-lang/rust/issues/49359#issuecomment-467994590) |
| 105 | +- [Suite/child relationship is missing](https://github.com/rust-lang/rust/issues/49359) |
| 106 | +- [Lack of suite name makes it hard to use programmatic output from Cargo](https://github.com/rust-lang/rust/issues/49359#issuecomment-533154674) (see also [1](https://github.com/rust-lang/rust/issues/49359#issuecomment-699691296)) |
| 107 | +- [Format is underspecified](https://github.com/rust-lang/rust/issues/49359#issuecomment-706566635) |
| 108 | +- ~~[Lacks ignored reason](https://github.com/rust-lang/rust/issues/49359#issuecomment-715877950)~~ ([resolved?](https://github.com/rust-lang/rust/issues/49359#issuecomment-1531369119)) |
| 109 | +- [Lack of `rendered` field](https://github.com/rust-lang/rust/issues/49359#issuecomment-1531369119) |
| 110 | + |
| 111 | +# Drawbacks |
| 112 | +[drawbacks]: #drawbacks |
| 113 | + |
| 114 | +# Rationale and alternatives |
| 115 | +[rationale-and-alternatives]: #rationale-and-alternatives |
| 116 | + |
| 117 | +See also |
| 118 | +- https://internals.rust-lang.org/t/alternate-libtest-output-format/6121 |
| 119 | +- https://internals.rust-lang.org/t/past-present-and-future-for-rust-testing/6354 |
| 120 | + |
| 121 | +# Prior art |
| 122 | +[prior-art]: #prior-art |
| 123 | + |
| 124 | +Existing formats:
| 125 | +- JUnit
| 126 | +- [subunit](https://github.com/testing-cabal/subunit) |
| 127 | +- [TAP](https://testanything.org/) |
| 128 | + |
| 129 | +# Unresolved questions |
| 130 | +[unresolved-questions]: #unresolved-questions |
| 131 | + |
| 132 | +# Future possibilities |
| 133 | +[future-possibilities]: #future-possibilities |
| 134 | + |
| 135 | +## Improve custom test harness experience |
| 136 | + |
| 137 | +With less of a burden placed on custom test harnesses,
| 138 | +we can more easily explore what is needed to make them a first-class experience.
| 139 | + |
| 140 | +See |
| 141 | +- [eRFC 2318: Custom Test Frameworks](https://rust-lang.github.io/rfcs/2318-custom-test-frameworks.html) |
| 142 | +- [Blog Post: Iterating on Test](https://epage.github.io/blog/2023/06/iterating-on-test/) |