Releases: openpipelines-bio/openpipeline
OpenPipelines.bio v1.0.0-rc3
BREAKING CHANGES
- Docker image names now use `/` instead of `_` between the name of the component and the namespace (PR #712).
BUG FIXES
- `rna_singlesample`: fixed a bug where selecting the column for the filtering with mitochondrial fractions
  using `obs_name_mitochondrial_fraction` was done with the wrong column name, causing a `ValueError` (PR #743).
- Fix publishing in `process_samples` and `process_batches` (PR #759).
NEW FUNCTIONALITY
- `dimred/tsne` component: Added a tSNE dimensionality reduction component (PR #742).
OpenPipelines.bio v1.0.0-rc2
BUG FIXES
- Cellranger multi: Fix using a relative input path for `--vdj_inner_enrichment_primers` (PR #717).
- `dataflow/split_modalities`: remove unused `compression` argument. Use `output_compression` instead (PR #714).
- `metadata/grep_annotation_column`: fix calculating fraction when an input observation has no counts, which caused
  the result to be out of bounds.
- Fix `--output` argument not working for several workflows (PR #740).
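The `metadata/grep_annotation_column` fix above concerns a fraction computed per observation. The following is a minimal sketch of guarding such a calculation against observations without counts. The function name `matched_fraction` and the choice to report `0.0` for empty observations are assumptions made for illustration; this is not the component's actual implementation.

```python
def matched_fraction(matched_counts, total_counts):
    """Return one fraction per observation, each within [0, 1].

    Hypothetical helper: `matched_counts` holds the counts of the matched
    features and `total_counts` the total counts, one entry per observation.
    """
    fractions = []
    for matched, total in zip(matched_counts, total_counts):
        if total == 0:
            # Assumption: an observation without any counts gets fraction 0.0
            # instead of an undefined division result.
            fractions.append(0.0)
        else:
            fractions.append(matched / total)
    return fractions


# The second observation has no counts at all.
fractions = matched_fraction([2, 0, 5], [4, 0, 5])
```

With this guard, every returned value stays within the interval from 0 to 1, including the value for the observation that has no counts.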
MINOR CHANGES
- `metadata/grep_annotation_column`: Added more logging output (PR #697).
- `metadata/add_id` and `metadata/grep_annotation_column`: Bump python to 3.11 (PR #697).
- Bump viash to 0.8.5 (PR #697).
- `dataflow/split_modalities`: add more logging output and bump python to 3.12 (PR #714).
- `correction/cellbender`: Update nextflow resource labels from `singlecpu` and `lowmem` to `midcpu` and `midmem` (PR #736).
OpenPipelines.bio v1.0.0rc1
BREAKING CHANGES
- Change separator for arguments with multiple inputs from `:` to `;` (PR #700 and #707). Now, all arguments with `multiple: true` will use `;` as the separator.
  This change was made to be able to deal with file paths that contain `:`, e.g. `s3://my-bucket/my:file.txt`. Furthermore, the `;` separator will become
  the default separator for all arguments with `multiple: true` in Viash >= 0.9.0.
- This project now uses viash version 0.8.4 to build components and workflows. Changes related to this version update should
  be mostly backwards compatible with respect to the results and execution of the pipelines. From a development perspective,
  drastic updates have been made to the development workflow.
Development related changes:
- Bump viash version to 0.8.4 (PR #598, PR #638 and #706) in the project configuration.
- All pipelines no longer use the anonymous workflow. Instead, these workflows were given
  a name which was added to the viash config as the entrypoint to the pipeline (PR #598).
- Removed the `workflows` folder and moved its contents to new locations:
  - The `resources_test_scripts` folder now resides in the root of the project (PR #605).
  - All workflows have been moved to the `src/workflows` folder (PR #605).
    This implies that workflows must now be built using `viash (ns) build`, just like with components.
  - Adjust GitHub Actions to account for new workflow paths (PR #605).
- In order to be backwards compatible, the `workflows` folder now contains symbolic
  links to the built workflows in `target`. This is not a problem when using the repository for pipeline
  execution. However, if a developer wishes to contribute to the project, symlink support should be enabled
  in git using `git config core.symlinks true`. Alternatively, use
  `git clone -c core.symlinks=true git@github.com:openpipelines-bio/openpipeline.git` when cloning the
  repository. This avoids the symlinks being resolved (PR #628).
- With PR #668, the workflows have been renamed. This does not hamper the backwards compatibility
  of the symlinks described above, because they still use the original location,
  which includes the original name.
  * `multiomics/rna_singlesample` has been renamed to `rna/process_single_sample`,
  * `multiomics/rna_multisample` has been renamed to `rna/rna_multisample`,
  * `multiomics/prot_multisample` became `prot/prot_multisample`,
  * `multiomics/prot_singlesample` became `prot/prot_singlesample`,
  * `multiomics/full_pipeline` was moved to `multiomics/process_samples`,
  * `multiomics/multisample` has been renamed to `multiomics/process_batches`,
  * `multiomics/integration/initialize_integration` changed to `multiomics/dimensionality_reduction`,
  * finally, all workflows at `multiomics/integration/*` were moved to `integration/*`.
- Removed the `workflows/utils` folder. Functionality that was provided by the `DataflowHelper`
  and `WorkflowHelper` is now being provided by viash when the workflow is being built (PR #605).
End-user facing changes:
- The `concat` component has been deprecated and will be removed in a future release.
  Its functionality has been copied to the `concatenate_h5mu` component because the name is in
  conflict with the `concat` operator from nextflow (PR #598).
- `prot_singlesample`, `rna_singlesample`, `prot_multisample` and `rna_multisample`: QC statistics
  are now only calculated once where needed. This means that the mitochondrial gene detection is
  performed in the `rna_singlesample` pipeline and the other count based statistics are calculated
  during the `prot_multisample` and `rna_multisample` pipelines. In both cases, the `qc` pipeline
  is being used, but only parts of that workflow are activated by parametrization. Previously
  the count based statistics were calculated in both the `singlesample` and `multisample` pipelines,
  with the results from the multisample pipelines overwriting the previous results. What is breaking here
  is that the qc statistics are not being added to the results of the singlesample workflows.
  This is not an issue when using the `full_pipeline` because in this case the singlesample and
  multisample workflows are executed in tandem. If you wish to execute the singlesample workflows
  separately and still include count based statistics, please run the `qc` pipeline
  on the result of the singlesample workflow (PR #604).
- `filter/filter_with_hvg` has been renamed to `feature_annotation/highly_variable_features_scanpy`, along with the following changes (PR #667):
  - `--do_filter` was removed.
  - `--n_top_genes` has been renamed to `--n_top_features`.
- `full_pipeline`, `multisample` and `rna_multisample`: Renamed arguments (PR #667).
  - `--filter_with_hvg_var_output` became `--highly_variable_features_var_output`.
  - `--filter_with_hvg_obs_batch_key` became `--highly_variable_features_obs_batch_key`.
- `rna_multisample`: Renamed arguments (PR #667).
  - `--filter_with_hvg_n_top_genes` became `--highly_variable_features_n_top_features`.
  - `--filter_with_hvg_flavor` became `--highly_variable_features_flavor`.
- Renamed `obsm_metrics` to `uns_metrics` for the `cellranger_mapping` workflow because the cellranger metrics are stored in `.uns` and not `.obsm` (PR #610).
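The separator change listed at the top of these breaking changes can be illustrated with plain string splitting. The sketch below is not how Viash parses arguments; the helper `split_multiple` and the second path `local/data.h5mu` are made up for illustration. It only shows why `:` is ambiguous for values such as `s3://my-bucket/my:file.txt`, while `;` is not.

```python
def split_multiple(value, sep=";"):
    """Split a multi-value argument string into its non-empty parts.

    Hypothetical helper for illustration only; not part of Viash.
    """
    return [part for part in value.split(sep) if part]


# Two input paths joined with the new ";" separator.
paths = "s3://my-bucket/my:file.txt;local/data.h5mu"

# Splitting on ":" cuts the S3 URL into pieces and never separates the two paths.
old_style = split_multiple(paths, sep=":")

# Splitting on ";" recovers both paths intact.
new_style = split_multiple(paths, sep=";")
```

Here `new_style` holds the two original paths, whereas `old_style` holds three fragments, none of which is a usable path.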
MAJOR CHANGES
- `mapping/cellranger_mkfastq`: update from cellranger `6.0.2` to `7.0.1` (PR #675).
NEW FUNCTIONALITY
- `multisample` pipeline: This workflow now works when provided multiple unimodal files or multiple multimodal files, in addition to the previously supported single multimodal file (PR #606). The modalities are processed independently from each other:
  - As before, a single multimodal file is split into several unimodal MuData objects, each modality being stored in a file.
  - (New) When multiple unimodal files are provided, they can be used as is.
  - (New) Mosaic input (i.e. multiple uni- or multimodal files) is split into unimodal files.
    Providing the same modality twice is not supported however, meaning the modalities should be unique.
    For example, using `input: ["data1.h5mu", "data2.h5mu"]` with `data1.h5mu` providing data for `rna` and `atac`
    and `data2.h5mu` for `rna` and `prot` will not work, because the `rna` modality is present in both input files.
- `multisample` workflow: throw an error when argument values for the merge component or the `initialize_integration` workflow differ between the inputs (PR #606).
- Added a `split_modalities` workflow in order to split a multimodal MuData file into several unimodal MuData files. Its behavior is identical to the `split_modalities` component, but it also provides functionality to make sure everything works when nextflow's `-stub` option is enabled (PR #606).
- All workflows now use `dependencies` to handle includes from other workflows (PR #606).
- `qc/calculate_qc_metrics`: allow setting the output column names and disabling the calculation of several metrics (PR #644).
- `rna_multisample`, `prot_multisample` and `qc` workflows: allow setting the output column names and disabling the calculation of several metrics (PR #606).
- `cluster/leiden`: Allow calculating multiple resolutions in parallel (PR #645).
- `rna_multisample` workflow: added `--modality` argument (PR #607).
- `multisample` workflow: in addition to using multimodal files as input, this workflow now also accepts a list of files. The list of files must be the unimodal equivalents of a split multimodal file. The modalities in the list must be unique and after processing the modalities will be merged into multimodal files (PR #606).
- Added `filter/intersect_obs` component which removes observations that are not shared between modalities (PR #589).
- Re-enable `convert/from_h5mu_to_seurat` component (PR #616).
- Added the `gdo_singlesample` pipeline with basic count filtering (PR #672).
- `process_samples` pipeline: the `--rna_layer`, `--prot_layer` and `--gdo_layer` arguments can now be used to specify an alternative layer to `.X` where the raw data are stored. To enable this feature, the following changes were required:
  - Added `transform/move_layer` component.
  - `filter/filter_with_scrublet`: added `--layer` argument.
  - `transform/clr`: added `--input_layer` argument.
  - `metadata/grep_annotation_column`: added `--input_layer` argument.
  - `rna/rna_singlesample`, `rna/rna_multisample`, `prot/prot_singlesample` and `prot/prot_multisample`: add `--layer` argument.
  - `process_batches`: Added `rna_layer` and `prot_layer` arguments.
- Enable dataset functionality for nf-tower (PR #701).
- Added `annotate/score_genes` and `annotate/score_genes_cell_cycle` to calculate scanpy gene scores (PR #703).
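The uniqueness requirement for modalities across `multisample` inputs, described in the first entry of this section, can be sketched as a simple check. The function `find_duplicate_modalities` and the dictionary layout are assumptions for illustration; the workflow's real validation may differ, and reading the modality names from actual `.h5mu` files is left out.

```python
from collections import Counter


def find_duplicate_modalities(modalities_per_file):
    """Return the sorted modality names that occur in more than one input file.

    Hypothetical helper: `modalities_per_file` maps a file name to the list
    of modality names that file provides.
    """
    counts = Counter(
        modality
        for modalities in modalities_per_file.values()
        for modality in set(modalities)  # count each modality once per file
    )
    return sorted(name for name, n in counts.items() if n > 1)


# The example from the release notes: `rna` is present in both files.
inputs = {
    "data1.h5mu": ["rna", "atac"],
    "data2.h5mu": ["rna", "prot"],
}
duplicates = find_duplicate_modalities(inputs)
```

For the example input, `duplicates` contains only `rna`, which is the situation the release notes describe as unsupported. An empty result would mean that every modality is provided by exactly one file.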
MINOR CHANGES
- Refactored `rna_multisample` (PR #607), `cellranger_multi` (PR #609), `cellranger_mapping` (PR #610) and other (PR #606) pipelines to use `fromState` and `toState` functionality.
- `metadata/add_id`: add more runtime logging (PR #663).
- `cluster/leiden`: Bump python to 3.11 and leidenalg to 0.10.0 (PR #645).
- `mapping/htseq_count_to_h5mu` and `multi_star`: update polars and gtfparse (PR #642).
- Pin `from_h5mu_to_seurat` to Seurat version 4 (PR #630).
- `velocity/scvelo`: bump scvelo to 0.3.1 and python to 3.10 (PR #640).
- Updated the Viash YAML schemas to the latest version of ...
OpenPipelines.bio v0.11.4
BUG FIXES
- `move_obsm_to_obs`: fix setting output columns when they already exist (PR #690).
OpenPipelines.bio v0.12.6
BUG FIXES
- `move_obsm_to_obs`: fix setting output columns when they already exist (PR #690).
OpenPipelines.bio v0.12.5
BUG FIXES
- `qc/calculate_qc_metrics`: Resolved an issue where statistics based on the input columns selected with `--var_qc_metrics` were incorrect when these input columns were encoded in `pd.BooleanDtype()` (PR #685).
OpenPipelines.bio 0.11.3
BUG FIXES
- `qc/calculate_qc_metrics`: Resolved an issue where statistics based on the input columns selected with `--var_qc_metrics` were incorrect when these input columns were encoded in `pd.BooleanDtype()` (PR #685).
OpenPipelines.bio v0.10.2
BUG FIXES
- `transform/log1p`: fix `--input_layer` argument not functioning (PR #678).
OpenPipelines.bio v0.12.4
BUG FIXES
- `transform/log1p`: fix `--input_layer` argument not functioning (PR #678).
OpenPipelines.bio v0.11.2
BUG FIXES
- `transform/log1p`: fix `--input_layer` argument not functioning (PR #678).