MerXen

Pre-processing, segmentation, and comparative analysis of paired MERSCOPE and Xenium spatial transcriptomics datasets.

What it does

MerXen takes paired spatial transcriptomics datasets (one MERSCOPE, one Xenium per tissue section pair) and runs a standardised pipeline:

SpatialData build — Builds platform-specific SpatialData zarrs from raw MERSCOPE and Xenium output folders
Cell segmentation — Cellpose-SAM image-based segmentation followed by ProSeg transcript-based refinement
Section alignment — Optionally registers paired adjacent sections to a Xenium reference coordinate system with Spateo
Comparative analysis — QC metrics, gene-level comparison, visualisation, first-pass Scanpy/Squidpy clustering, and optional local MapMyCells cell type assignment across platforms

The workflow is orchestrated by Nextflow to process multiple sample pairs with logging and reproducibility.

Documentation

Full documentation lives in docs/. Start with docs/index.md.

Usage: Getting started · Samplesheet format · Running the pipeline · Configuration · Outputs
Pipeline stages: SpatialData build · Segmentation · Enrichment · QC · Alignment · Comparison · Visualization · Squidpy clustering · MapMyCells
Developer reference: Pipeline architecture · Python API · CLI reference · Development workflow

Repository layout

MerXen/
├── workflows/                  # Nextflow pipeline
│   ├── main.nf                 # DSL2 entry point
│   ├── nextflow.config         # Parameters, executor, per-process resources
│   ├── samplesheet.example.csv # Template samplesheet
│   └── modules/                # One .nf module per pipeline stage
├── src/merxen/                 # Installable Python package
│   ├── config.py               # Pydantic configs (pipeline contract)
│   ├── cli/                    # Click entry points (one per stage)
│   ├── io/                     # Samplesheet, SpatialData builders, image/transcript I/O
│   ├── segmentation/           # Cellpose tiling + ProSeg subprocess
│   ├── enrichment/             # Shape layers + per-shape gene tables
│   ├── qc/                     # Per-dataset and cross-platform metrics
│   ├── visualization/          # Plotting
│   ├── analysis/               # Scanpy/Squidpy downstream analyses
│   └── alignment/              # Optional Spateo cross-section registration
├── tests/                      # pytest suite, mirrors src/merxen/
├── docs/                       # Project documentation (start at docs/index.md)
├── notebooks/                  # Exploratory notebooks only
├── pyproject.toml              # Dependencies, merxen entry point, tool config
├── environment.yml             # Conda env (Python 3.12 + pip)
├── requirements.lock           # Pinned dependency tree
├── .env.example                # Required environment variables template
├── Agents.md                   # Project standards (must-read for contributors)
└── CLAUDE.md                   # Short overview + pointer to Agents.md

Setup

# Create conda environment
conda env create -f environment.yml
conda activate merxen

# Optional: enable Spateo-based section alignment
pip install spateo-release==1.1.1
pip install "anndata>=0.12.10"

# Install pre-commit hooks
pre-commit install
pre-commit install --hook-type pre-push

Required environment variables

Copy .env.example to .env and fill in values:

cp .env.example .env

See .env.example for required variables (ProSeg binary path, output root, etc.) and docs/configuration.md for the full environment + Nextflow parameter reference.

Running the pipeline

# Run via Nextflow with a samplesheet
nextflow run workflows/main.nf --samplesheet samples.csv --outdir ./results --proseg_binary /path/to/proseg

A template samplesheet is provided at workflows/samplesheet.example.csv. Copy and edit this file with your dataset-specific paths before running the workflow.

cp workflows/samplesheet.example.csv workflows/samplesheet.csv

The samplesheet points at raw platform folders with optional reusable SpatialData cache paths (merscope_dir, merscope_spatialdata_path, xenium_dir, xenium_spatialdata_path, plus per-platform channel, z-range, and voxel-layer settings). The full schema, validation rules, and worked examples are documented in docs/samplesheet.md. For Nextflow invocation options — resuming, stage-range runs, force rebuild, parameter overrides, cluster execution — see docs/running-the-pipeline.md.

Running tests

# All tests (excluding slow)
pytest

# Including integration tests
pytest -m "not slow"

# Full suite
pytest --run-slow

Development

# Lint and format
ruff check . --fix
ruff format .

# Type check
mypy src/

# Regenerate lockfile after changing dependencies
uv pip compile pyproject.toml --extra dev -o requirements.lock

Project standards (layout, dependencies, naming, type hints, docstrings, git workflow, commit message prefixes) are defined in Agents.md. Day-to-day development mechanics — testing, pre-commit hooks, CI, debugging, adding a new pipeline stage — are in docs/development.md.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MerXen

What it does

Documentation

Repository layout

Setup

Required environment variables

Running the pipeline

Running tests

Development

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 80 Commits
.github/workflows		.github/workflows
docs		docs
notebooks		notebooks
src/merxen		src/merxen
tests		tests
workflows		workflows
.env.example		.env.example
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
Agents.md		Agents.md
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
environment.yml		environment.yml
pyproject.toml		pyproject.toml
requirements.lock		requirements.lock
rerun_visualization_only.config		rerun_visualization_only.config
ssd2_work.config		ssd2_work.config

Folders and files

Latest commit

History

Repository files navigation

MerXen

What it does

Documentation

Repository layout

Setup

Required environment variables

Running the pipeline

Running tests

Development

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages