fix: align benchmarks with rebuilt :dev (noodles) RustQC image #30

Open
ewels wants to merge 1 commit into main from worktree-noodles-migration

@ewels ewels commented Jun 16, 2026

Member

Warning

This PR was generated by an automated agent (Claude Code) and has not been reviewed by a human yet. Please review carefully before merging.

Why

Issue seqeralabs/RustQC#113 (switch rust-htslib → noodles) merged on 2026-06-16, and ghcr.io/seqeralabs/rustqc:dev was rebuilt from main HEAD minutes later (now rustqc 0.2.1, commit 2556c9d, no libhts linkage — confirmed pure-Rust I/O).

That newer build changed two CLI surfaces, which broke every RUSTQC_RNA run repo-wide. This is the current red CI ("ERROR ~ mapping values are not allowed here" across all conda/docker/singularity shards):

  1. --stranded now takes string values (unstranded/forward/reverse) instead of numeric 0/1/2 → invalid value '0' for '--stranded'.
  2. rustqc --version now prints a second line (Binary: <arch> | CPU: ...), so the old sed 's/rustqc //' wrote multi-line, colon-containing text into versions.yml → invalid YAML → every versions-parsing test fails.
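
The second failure can be reproduced without the real binary. The sketch below is illustrative only: `fake_rustqc_version` is a hypothetical stand-in for `rustqc --version`, and the text of its second line is a placeholder (only the `Binary: ... | CPU: ...` shape comes from this PR). It compares the old `sed` extraction with the `head -n1 | cut -d' ' -f2` form used in the fix.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `rustqc --version` on the rebuilt :dev image.
# The second line's contents are placeholders, not real output.
fake_rustqc_version() {
    printf 'rustqc 0.2.1\nBinary: example-arch | CPU: example\n'
}

# Old extraction: strips the "rustqc " prefix but keeps every line,
# so the colon-containing second line would also land in versions.yml.
old_version=$(fake_rustqc_version | sed 's/rustqc //')

# New extraction: keep only the first line, then its second field.
new_version=$(fake_rustqc_version | head -n1 | cut -d' ' -f2)

printf 'old:\n%s\n' "$old_version"
printf 'new: %s\n' "$new_version"
```

With this stand-in, the old pipeline yields two lines while the new one yields the bare version string, which is what a single-line `versions.yml` value needs.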

What changed

  • modules/local/rustqc_rna.nf + rustqc_rna_profile.nf: pass string strandedness; extract only the semver for versions.yml (… | head -n1 | cut -d' ' -f2).
  • Regenerated 4 stale regression snapshots against the current build (baselines were frozen at the March v0.1.x build):
    | snapshot | nature of change | verdict |
    | --- | --- | --- |
    | dupradar | trailing whitespace removed; numeric values identical | cosmetic |
    | featurecounts | biotype counts now match upstream (protein_coding 39736→42331); old baseline disagreed with upstream | correctness ✅ |
    | tin | per-transcript value drift; identical 986-gene set; re-run confirmed deterministic | benign build drift |
    | junction_saturation | one value max(3,3,0) → max(2,3,0); deterministic on re-run | benign build drift |
  • README: added samtools + tin to the comparison table; documented that :dev tracks main incl. noodles; noted RustQC no longer needs htslib/cmake to build (only g++ + libfontconfig1-dev, MSRV 1.89); added samtools to the layout + snapshot-update docs.
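
For the strandedness change in the first bullet, a minimal sketch of the translation might look like the following. `strandedness_to_string` is a hypothetical helper, not code from the modules. The 0/1/2 → unstranded/forward/reverse pairing is an assumption, based on the order in which the values are listed above and on the featureCounts `-s` convention.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a legacy numeric strandedness code to the string
# form accepted by --stranded in the rebuilt build. The pairing of numbers
# to strings is an assumption (see the lead-in), not taken from the modules.
strandedness_to_string() {
    case "$1" in
        0|unstranded) echo "unstranded" ;;
        1|forward)    echo "forward" ;;
        2|reverse)    echo "reverse" ;;
        *)
            echo "unknown strandedness: $1" >&2
            return 1
            ;;
    esac
}

# Show the flag that would be passed for each legacy code.
for code in 0 1 2; do
    printf '%s -> --stranded %s\n' "$code" "$(strandedness_to_string "$code")"
done
```

Accepting the string forms as well as the numeric ones keeps the helper idempotent, so callers that already pass strings are unaffected.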

Validation (local, Docker, small dataset)

All 16 crosscheck-vs-upstream tests pass without snapshot changes (the task's required outcome — RustQC still matches every upstream tool). Full local results:

  • ✅ upstream: 16/16 · ✅ crosscheck: 16/16 · ✅ regression: 15/15 (11 unchanged + 4 regenerated) · ✅ pipeline smoke: 4/4

nf-test work dirs were routed to an external drive; no committed config changes for that.

Out of scope / pre-existing (NOT introduced here)

  • nf-core template lint (28 failures: missing .github/CONTRIBUTING.md, email_template.html, LICENSE/workflow mismatches, JSON syntax in adaptivecard.json/slackreport.json) — pre-existing template drift, unrelated to noodles.
  • tests/default.nf.test has no committed snapshot on main and CI runs --ci (fails on missing snapshot). Its baseline is environment-sensitive (Linux-generated, PDF plots excluded) and is being addressed on the bigwig branch (add/rustqc-bigwig-nf-test); generating it on macOS/ARM here would itself break CI, so it is intentionally left out. The versions.yml fix here does clear the mapping values error for it.
  • No versioned RustQC release tag exists for noodles yet, so --rustqc_image stays at :dev (documented).

The ghcr.io/seqeralabs/rustqc:dev image was rebuilt from main HEAD right
after the noodles migration merged (closes seqeralabs/RustQC#113). The
newer build (rustqc 0.2.1) changed two CLI surfaces that broke every
RUSTQC_RNA run repo-wide (this is the current CI failure):

- `--stranded` now takes string values (unstranded/forward/reverse)
  instead of numeric 0/1/2 -> "invalid value '0' for '--stranded'".
- `rustqc --version` now prints a second line ("Binary: <arch> ...")
  so the old `sed 's/rustqc //'` produced invalid YAML in versions.yml
  -> "mapping values are not allowed here", failing every test that
  parses versions (pipeline + nf-core software-versions subworkflow).

Changes:
- rustqc_rna.nf / rustqc_rna_profile.nf: pass string strandedness and
  extract only the semver for versions.yml.
- Regenerate 4 stale regression snapshots against the current build:
  dupradar (trailing-whitespace only; values identical), featurecounts
  (biotype counts now MATCH upstream: protein_coding 39736->42331),
  tin + junction_saturation (deterministic value drift since the March
  v0.1.x snapshot build; re-run confirmed reproducible). All 16
  crosscheck-vs-upstream tests pass unchanged.
- README: add samtools + tin to the comparison table, note :dev tracks
  main incl. noodles, and that RustQC no longer needs htslib/cmake to
  build (g++ + libfontconfig1-dev only, MSRV 1.89); add samtools to the
  layout + snapshot-update docs.

Verified locally (Docker, small dataset): all upstream (16), crosscheck
(16), regression (15) and pipeline smoke (4) nf-tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>