Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Utility conversion tutorial #12

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
85 changes: 85 additions & 0 deletions tutorials/utility_conversion_on_idc_collection/tutorial.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
Welcome to this blog post on the simplicity of the Mhub API through Docker for elegant and reproducible medical imaging processing.

Today, we're diving into the C4KC-KITS dataset—a curated collection of CT scans centered on kidney tumors. Each study comes packed with DICOM-formatted images, detailed metadata, segmentation masks, and clinical annotations. This rich repository not only accelerates computer-aided diagnosis research but also fuels the development and benchmarking of cutting-edge machine learning models in cancer imaging.

The magic begins with downloading your dataset using the Imaging Data Commons API. Simply execute:

```bash
idc download c4kc_kits
```

This command retrieves all the necessary DICOM files. If you also need metadata, remember you can fetch it from TCIA:
https://www.cancerimagingarchive.net/collection/c4kc-kits

Once downloaded, your data will look similar to this:

```bash
suraj@R2-Z9:/mnt/data1/datasets/RadiomicsHub/C4KC_KiTS/c4kc_kits$ ls c4kc_kits
KiTS-00000 KiTS-00018 KiTS-00036 KiTS-00054 KiTS-00072 KiTS-00090 KiTS-00108 KiTS-00126 KiTS-00144 KiTS-00162 KiTS-00180 KiTS-00198
KiTS-00001 KiTS-00019 KiTS-00037 KiTS-00055 KiTS-00073 KiTS-00091 KiTS-00109 KiTS-00127 KiTS-00145 KiTS-00163 KiTS-00181 KiTS-00199
...
```

Transitioning from DICOM to DL-friendly formats like NIfTI or NRRD is traditionally a challenging task, but this is where the elegance of Mhub truly shines. The Mhub API, exposed via Docker, provides a simplified, robust, and elegant solution to convert, organize, and process your imaging data with minimal fuss.

Start by pulling the latest Mhub base container:

```bash
docker pull mhubai/base:latest
```

Next, set up your processing configuration with a YAML file that directs the Mhub API precisely how to handle your data:

```yaml
general:
data_base_dir: /app/data
version: 1.0
description: Sort DICOM files into a structured directory

execute:
- DicomImporter
- DsegExtractor
- NiftiConverter
- DataOrganizer

modules:
DicomImporter:
source_dir: input_data
import_dir: sorted_data
sort_data: true
merge: true
meta:
mod: '%Modality'
desc: '%SeriesDescription'

DataOrganizer:
targets:
- nifti:mod=ct-->[i:sid]/ct.nii.gz # Organize CT images by series ID
- nifti:mod=seg-->[i:sid]/[d:roi]_seg.nii.gz # Prefix segmentation files with ROI information
```

The Mhub API comes to life with a single Docker command, making use of volume mappings to direct your input and output data paths. Execute:

```bash
docker run -v /mnt/data1/datasets/RadiomicsHub/C4KC_KiTS/c4kc_kits:/app/data/input_data \
-v /mnt/data1/datasets/RadiomicsHub/C4KC_KiTS/c4kc_kits_processed:/app/data/output_data \
-v /home/suraj/Repositories/config:/app/utility/config -it mhubai/base:latest --utility
```

This command leverages the full power of the Mhub API to process your DICOM data effortlessly. After successful execution, you’ll find your output directory structured neatly:

```bash
....
....
├── 1.3.6.1.4.1.14519.5.2.1.6919.4624.927655669179864696894393596289
│ ├── ct.nii.gz
│ ├── Kidney_seg.nii.gz
│ └── Renal Tumor_seg.nii.gz
├── 1.3.6.1.4.1.14519.5.2.1.6919.4624.933941010667753097000370321589
│ ├── ct.nii.gz
│ ├── Kidney_seg.nii.gz
│ └── Renal Tumor_seg.nii.gz
...
```

Thanks to the elegant design of the Mhub API, powered by Docker, converting DICOM images and extracting segmentation masks is now a streamlined, hassle-free experience. With a few commands and a simple config file, you’re all set to focus on your research rather than the nitty-gritty of data conversion. Embrace the simplicity of Mhub and let Docker handle the heavy lifting for you!