16 Oct 12:13

DriesSchaumont

18dfe17

openpipelines.bio v3.0.0 Latest

BREAKING CHANGES

transfer/publish: remove component after deprecating it in 2.1.0 (PR #1019).
Removed split_h5mu_train_test component (PR #1020).
tar_extract has been deprecated and will be removed in openpipeline 4.0 (PR #1014). Use vsh://toolbox/bgzip instead.
compress_h5mu: rename compression argument to output_compression (PR #1017, PR #1018).
delimit_fraction: remove unused layer argument (PR #1018).
download_file has been deprecated and will be removed in openpipeline 3.0 (PR #1015).
scarches: Loading of legacy models no longer assumes the model to be based on SCANVI. An argument (reference_class) was added which needs to be set in this case (PR #1035).
convert/from_h5mu_to_seurat has been deprecated and will be removed in openpipeline 4.0. Use convert/from_h5mu_or_h5ad_to_seurat instead (PR #1046).

NEW FUNCTIONALITY

liana: enabled jobs to be run in parallel and added two new arguments: consensus_opts, de_method (PR #1039)
from_h5mu_or_h5ad_to_seurat: converts an h5ad file or a single modality from an h5mu file to a seurat object (PR #1046).

EXPERIMENTAL

Warning: These experimental features are subject to change in future releases.

Added from_h5mu_or_h5ad_to_tiledb component (PR #1034).
Added differential_expression/create_pseudobulk: Generation of pseudobulk samples from single-cell transcriptomics data, to create bulk-like expression profiles suitable for differential expression analysis with methods designed for bulk differential expression analysis (PR #1042).
Added annotate/singler: Cell type annotation using SingleR (PR #1051).
Added tiledb/move_mudata_obsm_to_tiledb (PR #1065).
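
To illustrate the idea behind differential_expression/create_pseudobulk, here is a minimal sketch that sums raw counts per sample and cell type with pandas. It is not the component's implementation: the column names (sample_id, cell_type) and the choice to aggregate by summing are assumptions made for this example.

```python
import pandas as pd

# Toy single-cell count matrix: one row per cell, one column per gene.
counts = pd.DataFrame(
    {"GeneA": [1, 0, 3, 2], "GeneB": [0, 2, 1, 4]},
    index=["cell1", "cell2", "cell3", "cell4"],
)

# Hypothetical per-cell metadata; these column names are illustrative only.
obs = pd.DataFrame(
    {
        "sample_id": ["s1", "s1", "s2", "s2"],
        "cell_type": ["T", "T", "T", "T"],
    },
    index=counts.index,
)

# Sum the counts of all cells sharing the same (sample, cell type) pair,
# which yields one bulk-like expression profile per group.
pseudobulk = counts.groupby([obs["sample_id"], obs["cell_type"]]).sum()
print(pseudobulk)
```

Each row of the result can then be treated as one bulk-like sample by a downstream differential expression method.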

MAJOR CHANGES

mapping/cellranger_*: Upgrade CellRanger to v9.0 (PR #992 and #1006).
leiden: bump base container to 3.13 (PR #1030).
scanvi, scarches, scvi and totalvi: bump scvi-tools to 1.3.1 and base image to nvcr.io/nvidia/pytorch:25.05-py3 (PR #1035).
lianapy: update liana to 1.5.0 (PR #1039)

MINOR CHANGES

velocyto: pin base container to python:3.10-slim-bookworm (PR #1063).
mapping/cellranger_multi: The output from Cell Ranger is now displayed as Cell Ranger is running (PR #1045).
Remove workflows directory (PR #993). The workflows which were at one point in this directory were all deprecated and moved to src/workflows.
Move output file compression argument for AnnData and MuData files to a base config file (src/base/h5_compression_argument.yaml) (PR #1017).
Add missing descriptions to components and arguments (PR #1018).
Add scope to component and workflow configurations (see https://viash.io/reference/config/scope.html) (PR #1013 and #1032).
workflows/multiomics/process_samples: Add optional --skip_scrublet_doublet_detection flag to bypass Scrublet doublet detection. Scrublet doublet detection runs by default and can now be optionally disabled (PR #1049).
Nextflow runner: use resourceLimits directive in the labels config to set a global limit on the memory (PR #1060).

BUG FIXES

cellranger_multi: Fix error when running Cell Ranger without any computational resources specified (PR #1056)
Bump viash to 0.9.4. This adds support for nextflow versions starting major version 25.01 and fixes an issue where an integer being passed to an argument with type: double resulted in an error (PR #1016).
Fix running neighbors_leiden_umap workflow with -stub enabled (PR #1026).
Add missing CUDA enabled jaxlib to components that use scvi-tools (scanvi, scarches, scvi and totalvi) (PR #1028)
leiden: fix issue where the logging system was shut down prematurely after the calculations were done (PR #1030)
Added missing gpu label to scarches component (PR #1027).
conversion/from_cellranger_multi_to_h5mu: fix conversion to MuData for experiments that combine probe barcodes with other feature barcodes (e.g. Antibody Capture and CRISPR Guide Capture) (PR #1062).

07 May 07:55

DriesSchaumont

2.1.2

26c477d

openpipelines.bio v2.1.2

DOCUMENTATION

Update README (PR #1024, backported from #1012).

25 Apr 07:17

DriesSchaumont

2.1.1

d1e8e1c

openpipelines.bio v2.1.1

BUG FIXES

Add support for nextflow versions starting major version 25.01 (PR #1009).
Fix an issue where an integer being passed to an argument with type: double resulted in an error (PR #1009).

25 Apr 07:52

DriesSchaumont

2.0.1

346e690

openpipelines.bio v2.0.1

BUG FIXES

Add support for nextflow versions starting major version 25.01 (PR #1010).
Fix an issue where an integer being passed to an argument with type: double caused an error (PR #1010).

24 Apr 13:59

DriesSchaumont

1.0.5

b0549ad

OpenPipelines.bio v1.0.5

BUG FIXES

Add support for nextflow versions starting major version 25.01 (PR #1008).
Fix an issue where an integer being passed to an argument with type: double caused an error (PR #1008).

12 Apr 07:10

rcannood

2.1.0

b79935a

OpenPipelines.bio v2.1.0

BREAKING CHANGES

Deprecation of metadata/duplicate_obs and metadata/duplicate_var components (PR #952).
Deprecation of workflows/annotation/scgpt_integration_knn component (PR #952).
annotate/scanvi: Remove scarches functionality from this component, as it is already covered in integrate/scarches (PR #986).

NEW FUNCTIONALITY

dataflow/concatenate_h5mu: add modality parameter (PR #977).
filter_with_scrublet: add expected_doublet_rate, stdev_doublet_rate, n_neighbors and sim_doublet_ratio arguments (PR #974).
feature_annotation/align_query_reference: Added a component to align a query and reference dataset (PR #948, #958, #972).
workflows/qc/qc workflow: Added ribosomal gene detection (PR #961).
workflows/rna/rna_singlesample, workflows/multiomics/process_samples workflows: Added ribosomal gene detection (PR #968).
scanvi: enable CUDA acceleration (PR #969).
workflows/annotation/scvi_knn workflow: Cell-type annotation based on scVI integration followed by KNN label transfer (PR #954).
convert/from_h5ad_to_seurat: Add component to convert from h5ad to Seurat (PR #980).
workflows/annotation/scanvi_scarches workflow: Cell-type annotation based on scANVI integration and annotation with scArches for reference mapping (PR #898).
integrate/scarches: Implemented functionality to align the query dataset with the model registry and extend functionality to predict labels for scANVI models (PR #898).
workflows/annotation/harmony_knn workflow: Cell-type annotation based on harmony integration with KNN label transfer (PR #836).
from_cellranger_multi_to_h5mu: add support for custom modality (PR #982).
integrate/scvi: Enable passing any .var field for gene name information instead of .var index, using the --var_gene_names parameter (PR #986).

MAJOR CHANGES

Several components: when a component processes a single modality, only that modality is read into memory (PR #944)
The transfer/publish component is deprecated and will be removed in a future major release (PR #941).

MINOR CHANGES

Bump viash to 0.9.3 (PR #995).
Several workflows: refactor neighbors, leiden and UMAP in a separate subworkflow (PR #942 and PR #949).
grep_annotation_column and subset_obsp: Fix compatibility for SciPy (PR #945).
popv: Pin numpy<2 after new release of scvi-tools (PR #946).
Various components (scgpt and annotate): Add resource labels (PR #947, PR #950).
feature_annotation/highly_variable_features_scanpy: Enable calculation of HVG on a subset of genes (PR #957, PR #959).
integrate/scvi, integrate/totalvi and integrate/scarches: update base image to nvcr.io/nvidia/pytorch:24.12-py3, pin scvi-tools version to 1.1.5, unpin jax and jaxlib version (PR #970).
annotate/celltypist: Enable passing any layer with log normalized counts, enforce checking whether counts are log normalized (PR #971).
process_10xh5/filter_10xh5: update container base to ubuntu 24.04 (PR #983).

BUG FIXES

Fix -stub runs (PR #1000).
cluster/leiden: Fix an issue where insufficient shared memory (size of /dev/shm) causes the processing to hang.
utils/subset_vars: Convert .var column used for subsetting of dtype "boolean" to dtype "bool" when it doesn't contain NaN values (PR #959).
resources_test_scripts/annotation_test_data.sh: Add a layer to the annotation reference dataset with log normalized counts (PR #960).
annotate/celltypist: Fix missing values in annotation column caused by index misalignment (PR #976).
workflows/annotation/scgpt_annotation and workflows/integrate/scgpt_leiden: Parameterization of HVG flavor with default method cell_ranger instead of seurat_v3 (PR #979).
dataflow/merge: Resolved an issue where merging two MuData objects with overlapping var or obs columns sometimes resulted in an unsupported nullable dtype (PR #990), for instance when merging pd.IntegerDtype and pd.FloatDtype. These columns are now correctly cast to their native numpy dtypes before writing.
workflows/annotation/harmony_knn: Only process RNA modality in the workflow (PR #988).
Documentation CI: Fix building the documentation using CI (PR #1003).
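
The dataflow/merge fix listed above concerns pandas nullable extension dtypes. As a rough illustration (not the component's actual code), the sketch below casts nullable integer and float columns to native numpy dtypes. The helper name to_numpy_dtypes is hypothetical, and falling back to float64 when values are missing is an assumption of this example.

```python
import pandas as pd


def to_numpy_dtypes(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: cast nullable integer/float columns to numpy dtypes."""
    result = df.copy()
    for name in df.columns:
        column = df[name]
        is_nullable = pd.api.types.is_extension_array_dtype(column.dtype)
        is_int = pd.api.types.is_integer_dtype(column.dtype)
        is_float = pd.api.types.is_float_dtype(column.dtype)
        if not (is_nullable and (is_int or is_float)):
            continue  # leave native numpy and non-numeric columns untouched
        if is_float or column.isna().any():
            # Missing values become NaN, which only a float column can hold.
            result[name] = column.astype("float64")
        else:
            result[name] = column.astype("int64")
    return result


merged = pd.DataFrame(
    {
        "n_counts": pd.array([10, 20], dtype="Int64"),
        "score": pd.array([0.5, None], dtype="Float64"),
        "n_genes": pd.array([3, None], dtype="Int64"),
    }
)
print(to_numpy_dtypes(merged).dtypes)
```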

04 Apr 12:57

DriesSchaumont

2.1.0-rc.2

49d7059

OpenPipelines.bio v2.1.0-rc.2 Pre-release

BUG FIXES

Fix -stub runs (PR #1000).

03 Apr 07:01

DriesSchaumont

2.1.0-rc.1

21c4fcc

OpenPipelines.bio v2.1.0-rc.1 Pre-release

BREAKING CHANGES

Deprecation of metadata/duplicate_obs and metadata/duplicate_var components (PR #952).
Deprecation of workflows/annotation/scgpt_integration_knn component (PR #952).
annotate/scanvi: Remove scarches functionality from this component, as it is already covered in integrate/scarches (PR #986).

NEW FUNCTIONALITY

dataflow/concatenate_h5mu: add modality parameter (PR #977).
filter_with_scrublet: add expected_doublet_rate, stdev_doublet_rate, n_neighbors and sim_doublet_ratio arguments (PR #974).
feature_annotation/align_query_reference: Added a component to align a query and reference dataset (PR #948, #958, #972).
workflows/qc/qc workflow: Added ribosomal gene detection (PR #961).
workflows/rna/rna_singlesample, workflows/multiomics/process_samples workflows: Added ribosomal gene detection (PR #968).
scanvi: enable CUDA acceleration (PR #969).
workflows/annotation/scvi_knn workflow: Cell-type annotation based on scVI integration followed by KNN label transfer (PR #954).
convert/from_h5ad_to_seurat: Add component to convert from h5ad to Seurat (PR #980).
workflows/annotation/scanvi_scarches workflow: Cell-type annotation based on scANVI integration and annotation with scArches for reference mapping (PR #898).
integrate/scarches: Implemented functionality to align the query dataset with the model registry and extend functionality to predict labels for scANVI models (PR #898).
workflows/annotation/harmony_knn workflow: Cell-type annotation based on harmony integration with KNN label transfer (PR #836).
from_cellranger_multi_to_h5mu: add support for custom modality (PR #982).
integrate/scvi: Enable passing any .var field for gene name information instead of .var index, using the --var_gene_names parameter (PR #986).

MAJOR CHANGES

Several components: when a component processes a single modality, only that modality is read into memory (PR #944)
The transfer/publish component is deprecated and will be removed in a future major release (PR #941).

MINOR CHANGES

Bump viash to 0.9.3 (PR #995).
Several workflows: refactor neighbors, leiden and UMAP in a separate subworkflow (PR #942 and PR #949).
grep_annotation_column and subset_obsp: Fix compatibility for SciPy (PR #945).
popv: Pin numpy<2 after new release of scvi-tools (PR #946).
Various components (scgpt and annotate): Add resource labels (PR #947, PR #950).
feature_annotation/highly_variable_features_scanpy: Enable calculation of HVG on a subset of genes (PR #957, PR #959).
integrate/scvi, integrate/totalvi and integrate/scarches: update base image to nvcr.io/nvidia/pytorch:24.12-py3, pin scvi-tools version to 1.1.5, unpin jax and jaxlib version (PR #970).
annotate/celltypist: Enable passing any layer with log normalized counts, enforce checking whether counts are log normalized (PR #971).
process_10xh5/filter_10xh5: update container base to ubuntu 24.04 (PR #983).

BUG FIXES

cluster/leiden: Fix an issue where insufficient shared memory (size of /dev/shm) causes the processing to hang.
utils/subset_vars: Convert .var column used for subsetting of dtype "boolean" to dtype "bool" when it doesn't contain NaN values (PR #959).
resources_test_scripts/annotation_test_data.sh: Add a layer to the annotation reference dataset with log normalized counts (PR #960).
annotate/celltypist: Fix missing values in annotation column caused by index misalignment (PR #976).
workflows/annotation/scgpt_annotation and workflows/integrate/scgpt_leiden: Parameterization of HVG flavor with default method cell_ranger instead of seurat_v3 (PR #979).
dataflow/merge: Resolved an issue where merging two MuData objects with overlapping var or obs columns sometimes resulted in an unsupported nullable dtype (e.g. merging pd.IntegerDtype and pd.FloatDtype). These columns are now correctly cast to their native numpy dtypes before writing (PR #990).
workflows/annotation/harmony_knn: Only process RNA modality in the workflow (PR #988).
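
For the utils/subset_vars fix listed above, the following sketch shows the kind of conversion described: a nullable "boolean" column is cast to numpy "bool" only when it holds no missing values. This is an illustration under assumptions, not the component's code; the column name filter_with_hvg is hypothetical.

```python
import pandas as pd

# Toy .var table with a nullable boolean mask column (hypothetical name).
var = pd.DataFrame(
    {"filter_with_hvg": pd.array([True, False, True], dtype="boolean")},
    index=["GeneA", "GeneB", "GeneC"],
)

column = var["filter_with_hvg"]
# Cast only when nothing is missing: pd.NA has no counterpart in a numpy bool array.
if isinstance(column.dtype, pd.BooleanDtype) and not column.isna().any():
    column = column.astype("bool")

# Use the plain boolean mask to subset the variables.
subset = var.loc[column.to_numpy()]
print(column.dtype)        # bool
print(list(subset.index))  # ['GeneA', 'GeneC']
```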

17 Dec 13:24

DriesSchaumont

1.0.4

18cc6c8

OpenPipelines.bio v1.0.4

BUG FIXES

scvi_leiden workflow: fix the input layer argument of the workflow not being passed to the scVI component (PR #939, backported from PR #936 and PR #938).

17 Dec 13:28

DriesSchaumont

2.0.0

60028cb

OpenPipelines.bio v2.0.0

BREAKING CHANGES

velocity/scvelo: update scvelo to 0.3.3, which also removes support for using loom input files. The component now uses a MuData object as input. Several arguments were added to support selecting different inputs from the MuData file: counts_layer, modality, layer_spliced, layer_unspliced, layer_ambiguous. An output_h5mu argument has been added (PR #932).
src/annotate/onclass and src/annotate/celltypist: Input parameters for gene name layers of input datasets have been updated to --input_var_gene_names and --reference_var_gene_names (PR #919).
Several components under src/scgpt (cross_check_genes, tokenize_pad, binning) now process the input (query) datasets differently. Instead of subsetting datasets based on genes in the model vocabulary and/or highly variable genes, these components require an input .var column with a boolean mask specifying this information. The results are written back to the original input data, preserving the dataset structure (PR #832).
query/cellxgene_census: The default output layer has been changed from .layers["counts"] to .X to be more aligned with the standard OpenPipelines format (PR #933).
Use argument --output_layer_counts counts to revert the behaviour to the previous default.
Added cell multiplexing support to the from_cellranger_multi_to_h5mu component and the cellranger_multi workflow. For the from_cellranger_multi_to_h5mu component, the output argument now requires a value containing a wildcard character *, which will be replaced by the sample ID to form the final output file names. Additionally, a sample_csv argument has been added to the from_cellranger_multi_to_h5mu component, which describes the sample name per output file. No change is required for the output_h5mu argument of the cellranger_multi workflow: the workflow will emit multiple events in case of a multiplexed run, one for each sample. The IDs of the events (and the default output file names) are set by --sample_ids (in case of cell multiplexing) or, as before, by the user-provided id for the input (PR #803 and PR #902).
demux/bcl_convert: update BCL convert from 3.10 to 4.2 (PR #774).
demux/cellranger_mkfastq, mapping/cellranger_count, mapping/cellranger_multi and reference/build_cellranger_reference: update cellranger to 8.0.1 (PR #774 and PR #811).
Removed --disable_library_compatibility_check in favour of --check_library_compatibility to the mapping/cellranger_multi component and the ingestion/cellranger_multi workflow (PR #818).
lianapy: bumped version to 1.3.0 (PR #827 and PR #862). Additionally, groupby is now a required argument.
concat: this component was deprecated and has now been removed, use concatenate_h5mu instead (PR #796).
The workflows folder in the root of the project no longer contains symbolic links to the build workflows in target. Using any workflow that was previously linked in this directory will now result in an error indicating the location of the workflow to be used instead (PR #796).
XGBoost: bump version to 2.0.3 (PR #646).
Several components: update anndata to 0.11.1 and mudata to 0.3.1 (PR #645 and PR #901), and scanpy to 1.10.4 (PR #901).
filter/filter_with_hvg: this component was deprecated and has now been removed. Use feature_annotation/highly_variable_features_scanpy instead (PR #843).
dataflow/concat: this component was deprecated and has now been removed. Use dataflow/concatenate_h5mu instead (PR #857).
convert/from_h5mu_to_seurat: bump seurat to latest version (PR #850).
workflows/ingestion/bd_rhapsody: Upgrade BD Rhapsody 1.x to 2.x, thereby changing the interface of the workflow (PR #846).
mapping/bd_rhapsody: Upgrade BD Rhapsody 1.x to 2.x, thereby changing the interface of the workflow (PR #846).
reference/make_bdrhap_reference: Upgrade BD Rhapsody 1.x to 2.x, thereby changing the interface of the workflow (PR #846).
reference/build_star_reference: Rename mapping/star_build_reference to reference/build_star_reference (PR #846).
reference/cellranger_mkgtf: Rename reference/mkgtf to reference/cellranger_mkgtf (PR #846).
labels_transfer/xgboost: Align interface with new annotation workflow
- Store label probabilities instead of uncertainties
- Take .h5mu format as an input instead of .h5ad
reference/build_cellranger_arc_reference: a default value of "output" is now specified for the argument --genome, in line with the reference/build_cellranger_reference component. Additionally, providing a value for --organism is no longer required and its default value of Homo Sapiens has been removed (PR #864).
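
The * wildcard in the output argument of from_cellranger_multi_to_h5mu, described in the cell multiplexing entry above, can be pictured as a simple substitution per sample. The sketch below is an illustration only: the helper resolve_output_paths, the template and the sample IDs are hypothetical and do not come from the component's code.

```python
def resolve_output_paths(output_template: str, sample_ids: list[str]) -> dict[str, str]:
    """Hypothetical helper: map each sample ID to its output file name."""
    if "*" not in output_template:
        raise ValueError("The output template must contain a '*' wildcard.")
    # Replace the wildcard with the sample ID to form one file name per sample.
    return {
        sample_id: output_template.replace("*", sample_id)
        for sample_id in sample_ids
    }


paths = resolve_output_paths("output/*.h5mu", ["sample_1", "sample_2"])
print(paths)
# {'sample_1': 'output/sample_1.h5mu', 'sample_2': 'output/sample_2.h5mu'}
```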

NEW FUNCTIONALITY

Important

Workflows from the workflows/annotation and workflows/integration/scgpt_leiden namespaces, plus their newly implemented dependencies, are not yet considered to be part of the stable public API. Their functionality and interface may be subject to change.

velocyto_to_h5mu: now writes counts to .X (PR #932)
qc/calculate_atac_qc_metrics: new component for calculating ATAC QC metrics (PR #868).
workflows/annotation/scgpt_integration_knn workflow: Cell-type annotation based on scGPT integration with KNN label transfer (PR #875).
CI: Use params.resources_test in test workflows in order to point to an alternative location (e.g. a cache) (PR #889).
Added demux/cellranger_atac_mkfastq component: demultiplex raw sequencing data for ATAC experiments (PR #726).
process_samples, process_batches and rna_multisample workflows: added functionality to scale the log-normalized gene expression data to unit variance and zero mean. The scaled data will be output to a different layer and the representation with reduced dimensions will be created and stored in addition to the non-scaled data (PR #733).
transform/scaling: add --input_layer and --output_layer arguments (PR #733).
CI: added checking of mudata contents for multiple workflows (PR #783).
Added multiple arguments to the cellranger_multi workflow in order to maintain feature parity with the mapping/cellranger_multi component (PR #803).
convert/from_cellranger_to_h5mu: add support for antigen analysis.
Added reference/build_cellranger_reference component: build reference file compatible with ATAC and ATAC+GEX experiments (PR #726).
demux/bcl_convert: add support for no lane splitting (PR #804).
reference/cellranger_mkgtf component: Added cellranger mkgtf as a standalone component (PR #771).
scgpt/cross_check_genes component: Added a gene-model cross check component for scGPT (PR #758).
scgpt/embedding component: Added scGPT embedding component (PR #761).
scgpt/tokenize_pad component: Added scGPT padding and tokenization component (PR #754).
scgpt/binning component: Added a scGPT pre-processing binning component (PR #765).
workflows/integration/scgpt_leiden workflow with scGPT integration followed by Leiden clustering (PR #794).
scgpt/cell_type_annotation component: Added scGPT cell type annotation component (PR #798).
resources_test_scripts/scGPT.sh: Added script to include scGPT test resources (PR #800).
transform/clr component: Added the option to set the axis along which to apply CLR. Possible to override on workflow level as well (PR #767).
annotate/celltypist component: Added a CellTypist annotation component (PR #825).
dataflow/split_h5mu component: Added a component to split a single h5mu file into multiple h5mu files based on the values of an .obs column (PR #824).
workflows/test_workflows/ingestion components & workflows/ingestion: Added standalone components for integration testing of ingestion workflows (PR #801).
workflows/ingestion/make_reference: Add additional arguments passed through to the STAR and BD Rhapsody reference components (PR #846).
annotate/random_forest_annotation component: Added a random forest cell type annotation component (PR #848).
dataflow/concatenate_h5mu: data from .uns, both originating from the global and per-modality slots, is now retained in the final concatenated output object. Additionally, added the uns_merge_mode argument in order to tune the behavior when conflicting keys are detected across samples (PR #859).
dimred/densmap component: Added a densMAP dimensionality reduction component (PR #748).
annotate/scanvi component: Added a component to annotate cells using scANVI (PR #833).
transform/bpcells_regress_out component: Added a component to regress out effects of confounding variables in the count matrix using BPCells (PR #863).
transform/regress_out: Allow providing 'input' and 'output' layers for scanpy regress_out functionality (PR #863).
workflows/ingestion/make_reference: add possibility to build CellRanger ARC references. Added --motifs_file, --non_nuclear_contigs and --output_cellranger_arc arguments (PR #864).
Test resources (reference_gencodev41_chr1): switch reference genome for CellRanger to ARC variant (PR #864).
Added transform/tfidf component: normalize ATAC data with TF-IDF (PR #870).
Added dimred/lsi component (PR #552).
metadata/duplicate_obs component: Added a component to make a copy from one .obs field or index to another .obs field within...
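
As an illustration of the scaling to zero mean and unit variance mentioned in the entry on the process_samples, process_batches and rna_multisample workflows above, here is a minimal NumPy sketch. It is not the workflow's implementation: the layer names, the use of the sample standard deviation (ddof=1) and the handling of constant genes are assumptions of this example.

```python
import numpy as np

# Toy log-normalized expression matrix: rows are cells, columns are genes.
log_normalized = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 8.0]])

# Per-gene mean and standard deviation (ddof=1 is an assumption here).
mean = log_normalized.mean(axis=0)
std = log_normalized.std(axis=0, ddof=1)
std[std == 0] = 1.0  # avoid dividing by zero for genes without variation

scaled = (log_normalized - mean) / std

# Keep the scaled data next to the non-scaled data (hypothetical layer names).
layers = {"log_normalized": log_normalized, "scaled": scaled}
print(layers["scaled"])
```

Storing the result under a separate key mirrors the entry's point that the scaled data is written in addition to, not instead of, the non-scaled data.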

Releases: openpipelines-bio/openpipeline
