From 3e8f484606dcb8c4196ef2bb53548261ca3387c3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Micha=C5=82=20Szczepanik?=
Date: Wed, 10 Jan 2024 14:28:41 +0100
Subject: [PATCH] Apply further suggestions from code review

Apart from small formatting tweaks, this commit adds the following
changes to the dataladification workflow description:

- clarify dataset creation: the workflow creates dataset
  representations that can be cloned, not checked-out datasets
- link to the singularity container description before using
  singularity run for the first time
- add missing argument placeholders
- mention that the catalog needs to be served

Suggested-by: Adina Wagner
---
 docs/source/user/datalad-access.rst   |  3 ++-
 docs/source/user/datalad-generate.rst | 37 +++++++++++++++++----------
 2 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/docs/source/user/datalad-access.rst b/docs/source/user/datalad-access.rst
index c9ded59..e571047 100644
--- a/docs/source/user/datalad-access.rst
+++ b/docs/source/user/datalad-access.rst
@@ -31,7 +31,7 @@ The pattern for the URL is::
 
     'datalad-annex::?type=external&externaltype=uncurl&encryption=none&url=<base-url>/<study-id>/<visit-id>_{{annex_key}}'
 
-Given the exemplary values above, the pattern would expand to
+Given the exemplary values above, the pattern would expand to:
 
 .. code-block::
 
@@ -40,6 +40,7 @@ Given the exemplary values above, the pattern would expand to
 
 A full ``datalad clone`` command could then look like this:
 
 .. code-block::
+
     datalad clone 'datalad-annex::?type=external&externaltype=uncurl&encryption=none&url=file:///tmp/local_dicom_store/dl-Z03/P000624_{{annex_key}}' my_clone
 
diff --git a/docs/source/user/datalad-generate.rst b/docs/source/user/datalad-generate.rst
index 8c83f8a..a56661d 100644
--- a/docs/source/user/datalad-generate.rst
+++ b/docs/source/user/datalad-generate.rst
@@ -37,21 +37,27 @@ Download the visit tarball, keeping the same relative path:
 
 .. code-block:: bash
 
-    datalad download "https://data.inm-icf.de/<study-id>/<visit-id>_dicom.tar local_dicom_store/<study-id>/<visit-id>_dicom.tar"
+    datalad download "https://data.inm-icf.de/<study-id>/<visit-id>_dicom.tar local_dicom_store/<study-id>/<visit-id>_dicom.tar"
+
+The local copy of the tarball is required to index its contents. It
+can be removed afterwards -- datasets will use the ICF store as the
+content source.
 
 Using ``datalad download`` for downloading the file has the benefit
 of using DataLad's credential management. If this is the first time you
 use DataLad to access the project directory, you will be asked to
 provide your ICF credentials. See :ref:`dl-credentials` for details.
 
-For the following examples, the *absolute path* to the local DICOM
-store will be represented by ``$STORE_DIR``:
+For the following steps, the ICF utility scripts packaged as a
+Singularity container will be used, and executed with ``singularity
+run`` (see :ref:`container` for download and usage details). The
+*absolute path* to the local DICOM store will be represented by
+``$STORE_DIR``:
 
 .. code-block:: bash
 
     export STORE_DIR=$PWD/local_dicom_store
 
-
 Deposit visit metadata alongside tarball
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -70,20 +76,21 @@ within the tarball (relative path, MD5 checksum, size, and a small
 subset of DICOM headers describing acquisition type), and the latter
 describes the tarball itself.
 
-Deposit dataset alongside tarball
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Deposit dataset representation alongside tarball
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-DataLad dataset is created based on the metadata extracted in the
-previous step. Additionally, you need to provide the base URL of the
-ICF store, ``<store-url>`` (this base URL should not contain study
-or visit ID). The URL, combined with respective IDs, will be
-registered in the dataset as the source of the DICOM tarball, and used
-for retrieval by dataset clones.
+The next step is to create a lightweight, clone-able representation of
+a dataset in the local dataset store. This step relies on the metadata
+extracted with the previous command. Additionally, the base URL of the
+ICF store needs to be provided (here represented by ``<store-url>``,
+this base URL should not contain study or visit ID). The URL,
+combined with respective IDs, will be registered in the dataset as the
+source of the DICOM tarball, and used for retrieval by dataset clones.
 
 .. code-block:: bash
 
     singularity run -B $STORE_DIR icf.sif deposit_visit_dataset \
-        --store-dir $STORE_DIR --store-url <store-url>
+        --store-dir $STORE_DIR --store-url <store-url> --id <study-id> <visit-id>
 
 This will produce two files, ``<visit-id>_XDLA--refs`` and
 ``<visit-id>_XDLA--repo-export`` (text file and zip archive
@@ -109,6 +116,10 @@ and place it in the ``catalog`` folder in the study directory.
 
     singularity run -B $STORE_DIR icf.sif catalogify_studyvisit_from_meta \
         --store-dir $STORE_DIR --id <study-id> <visit-id>
 
+This catalog needs to be subsequently served; a simple (possibly
+local) http server is enough. See the generated README file in the
+``catalog`` folder for details.
+
 Remove the tarball
 ^^^^^^^^^^^^^^^^^^
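
For a quick local preview of the catalog that the last hunk says must
be served, any static file server works. A minimal sketch using
Python's built-in HTTP server; the ``$STORE_DIR/<study-id>/catalog``
location and the port are assumptions based on the store layout
described above, not details confirmed by the patch (the generated
README remains the authoritative reference):

.. code-block:: bash

    # Serve the generated catalog over plain HTTP for local browsing.
    # The catalog path and port are illustrative assumptions.
    cd $STORE_DIR/<study-id>/catalog
    python3 -m http.server 8080

The catalog would then be browsable at ``http://localhost:8080``.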