Release OpenPipelines.bio v2.1.0 · openpipelines-bio/openpipeline

BREAKING CHANGES

Deprecation of metadata/duplicate_obs and metadata/duplicate_var components (PR #952).
Deprecation of workflows/annotation/scgpt_integration_knn component (PR #952).
annotate/scanvi: Remove scarches functionality from this component, as it is already covered in integrate/scarches (PR #986).

NEW FUNCTIONALITY

dataflow/concatenate_h5mu: add modality parameter (PR #977).
filter_with_scrublet: add expected_doublet_rate, stdev_doublet_rate, n_neighbors and sim_doublet_ratio arguments (PR #974).
feature_annotation/align_query_reference: Added a component to align a query and reference dataset (PR #948, #958, #972).
workflows/qc/qc workflow: Added ribosomal gene detection (PR #961).
workflows/rna/rna_singlesample, workflows/multiomics/process_samples workflows: Added ribosomal gene detection (PR #968).
scanvi: enable CUDA acceleration (PR #969).
workflows/annotation/scvi_knn workflow: Cell-type annotation based on scVI integration followed by KNN label transfer (PR #954).
convert/from_h5ad_to_seurat: Add component to convert from h5ad to Seurat (PR #980).
workflows/annotation/scanvi_scarches workflow: Cell-type annotation based on scANVI integration and annotation with scArches for reference mapping (PR #898).
integrate/scarches: Implemented functionality to align the query dataset with the model registry and extend functionality to predict labels for scANVI models (PR #898).
workflows/annotation/harmony_knn workflow: Cell-type annotation based on Harmony integration followed by KNN label transfer (PR #836).
from_cellranger_multi_to_h5mu: add support for custom modality (PR #982).
integrate/scvi: Enable passing any .var field for gene name information instead of .var index, using the --var_gene_names parameter (PR #986).

MAJOR CHANGES

Several components: when a component processes a single modality, only that modality is read into memory (PR #944).
The transfer/publish component is deprecated and will be removed in a future major release (PR #941).

MINOR CHANGES

Bump viash to 0.9.3 (PR #995).
Several workflows: refactor neighbors, leiden and UMAP into a separate subworkflow (PR #942 and PR #949).
grep_annotation_column and subset_obsp: Fix compatibility with SciPy (PR #945).
popv: Pin numpy<2 after new release of scvi-tools (PR #946).
Various components (scgpt and annotate): Add resource labels (PR #947, PR #950).
feature_annotation/highly_variable_features_scanpy: Enable calculation of HVG on a subset of genes (PR #957, PR #959).
integrate/scvi, integrate/totalvi and integrate/scarches: update base image to nvcr.io/nvidia/pytorch:24.12-py3, pin scvi-tools version to 1.1.5, unpin jax and jaxlib version (PR #970).
annotate/celltypist: Enable passing any layer with log normalized counts, enforce checking whether counts are log normalized (PR #971).
process_10xh5/filter_10xh5: update container base to ubuntu 24.04 (PR #983).

BUG FIXES

Fix -stub runs (PR #1000).
cluster/leiden: Fix an issue where insufficient shared memory (size of /dev/shm) causes the processing to hang.
utils/subset_vars: Convert the .var column used for subsetting from dtype "boolean" to dtype "bool" when it contains no NaN values (PR #959).
resources_test_scripts/annotation_test_data.sh: Add a layer to the annotation reference dataset with log normalized counts (PR #960).
annotate/celltypist: Fix missing values in annotation column caused by index misalignment (PR #976).
workflows/annotation/scgpt_annotation and workflows/integrate/scgpt_leiden: The HVG flavor is now a parameter, with cell_ranger as the default method instead of seurat_v3 (PR #979).
dataflow/merge: Resolved an issue where merging two MuData objects with overlapping var or obs columns sometimes resulted in an unsupported nullable dtype, for instance when merging pd.IntegerDtype and pd.FloatDtype columns. These columns are now cast to their native numpy dtypes before writing (PR #990).
workflows/annotation/harmony_knn: Only process RNA modality in the workflow (PR #988).
Documentation CI: Fix building the documentation using CI (PR #1003).
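
The nullable-dtype fixes listed above for utils/subset_vars and dataflow/merge both come down to casting pandas extension dtypes back to plain numpy dtypes. The snippet below is a minimal sketch of that idea, not the code used by either component. The helper name `to_native_dtype` and the example column names are hypothetical, and promoting nullable integers with missing values to float64 is an assumption made for illustration.

```python
import numpy as np
import pandas as pd


def to_native_dtype(column: pd.Series) -> pd.Series:
    """Cast a pandas nullable column to a numpy-backed dtype where possible.

    Hypothetical helper for illustration only.
    """
    dtype = column.dtype
    # Columns that are already numpy-backed are returned unchanged.
    if not isinstance(dtype, pd.api.extensions.ExtensionDtype):
        return column
    has_missing = bool(column.isna().any())
    if isinstance(dtype, pd.BooleanDtype):
        # "boolean" can only become "bool" when no value is missing.
        return column if has_missing else column.astype(bool)
    if pd.api.types.is_integer_dtype(dtype):
        # Assumption: integers with missing values are promoted to float64,
        # because numpy integers cannot represent NaN.
        return column.astype("float64") if has_missing else column.astype("int64")
    if pd.api.types.is_float_dtype(dtype):
        # pd.NA becomes NaN in the numpy float column.
        return column.astype("float64")
    # Other extension dtypes (categorical, string, ...) are left as they are.
    return column


# Example annotation table using nullable dtypes, as can appear after a merge.
obs = pd.DataFrame(
    {
        "is_filtered": pd.array([True, False, True], dtype="boolean"),
        "n_counts": pd.array([10, None, 30], dtype="Int64"),
        "score": pd.array([0.5, 1.5, None], dtype="Float64"),
    }
)

converted = pd.DataFrame({name: to_native_dtype(obs[name]) for name in obs.columns})
print(converted.dtypes)
```

Casting column by column keeps the decision local to each dtype, so a column that still contains missing booleans is left as "boolean" rather than being silently coerced.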