Add notebook describing conversion with supplemental metadata #75

Open
wants to merge 4 commits into
base: master
251 changes: 251 additions & 0 deletions notebooks/advanced_topics/Convert_with_metadata.ipynb
@@ -0,0 +1,251 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "022cf89b",
"metadata": {},
"source": [
"## Adding metadata when converting to DICOM\n",
"\n",
"\n",
"When converting data to DICOM (as described in the conversion tools notebook), DICOM tags can be inserted during conversion, so that experimental metadata is included in the output DICOM dataset. This eliminates the need for a post-processing step to add metadata, which speeds up the total time to create a complete DICOM dataset.\n",
"\n",
"See also relevant [Bio-Formats documentation](https://bio-formats.readthedocs.io/en/stable/users/comlinetools/conversion.html#cmdoption-bfconvert-extra-metadata)"
]
},
{
"cell_type": "markdown",
"id": "ef608704",
"metadata": {},
"source": [
"### Recap of required packages"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a55547d9",
"metadata": {},
"outputs": [],
"source": [
"# Required for downloading data from IDC\n",
"!pip install idc-index\n",
"\n",
"# Install bfconvert via bftools\n",
"!wget https://downloads.openmicroscopy.org/bio-formats/7.3.1/artifacts/bftools.zip\n",
"!unzip bftools.zip"
]
},
{
"cell_type": "markdown",
"id": "ed29c909",
"metadata": {},
"source": [
"### Download SVS input data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "44288420",
"metadata": {},
"outputs": [],
"source": [
"# Download sample data from OpenSlide\n",
"!wget https://openslide.cs.cmu.edu/download/openslide-testdata/Aperio/CMU-1-Small-Region.svs"
]
},
{
"cell_type": "markdown",
"id": "104fa6a4",
"metadata": {},
"source": [
"### Write supplemental metadata file\n",
"\n",
"DICOM tags to be written are provided as a JSON file.\n",
"\n",
"The structure of the JSON file is based on that used by [dcmqi](https://github.com/QIICR/dcmqi/tree/master/doc/examples), but with several additions.\n",
"\n",
"Additional technical discussion of how to represent DICOM tags in JSON is available [here](https://github.com/ome/bioformats/pull/4016).\n",
"\n",
"#### Basic tag structure\n",
"\n",
"Each DICOM tag is a single JSON object, e.g.:\n",
"\n",
"```\n",
"{\n",
" \"BodyPartExamined\": {\n",
" \"Value\": \"BRAIN\",\n",
" \"VR\": \"CS\",\n",
" \"Tag\": \"(0018,0015)\"\n",
" }\n",
"}\n",
"```\n",
"\n",
"The object's name (`BodyPartExamined`) should be the name of the tag in the DICOM dictionary, with spaces removed.\n",
"There is only 1 required key/value pair:\n",
"\n",
"- `Value` (here, `BRAIN`), which is the tag's value\n",
"\n",
"There are also 3 optional key/value pairs:\n",
"\n",
"- `Tag` (here, `(0018,0015)`, which is the tag corresponding to the object name in the DICOM dictionary. If not defined, this will be looked up automatically.\n",
"- `VR` (here `CS`), which is the value representation to use when writing the tag. If not defined, the default VR will be looked up in the DICOM dictionary.\n",
"- `ResolutionStrategy`, which defines what to do with this tag if it was defined multiple times. Valid values are `IGNORE`, `APPEND`, and `REPLACE`. `APPEND` is the default if the `VR` is `SQ` (a sequence), or `REPLACE` for all other VRs.\n",
"\n",
"\n",
"#### Writing values for different VRs\n",
"\n",
"The `Value` is interpreted according to the VR that was either defined or looked up in the dictionary.\n",
"\n",
"For VRs representing a string of characters (e.g. `SH`), the `Value` is used directly. It is not necessary to ensure that `Value` contains an even number of characters. If needed, Bio-Formats' DICOM writer will pad the string to the correct width.\n",
"\n",
"For VRs representing a numeric type (e.g. `US`), the `Value` is parsed and then saved to DICOM as the correct type (e.g. uint16 for `US`). When a value multiplicity greater than 1 (i.e. an array of values) is needed, the values should be separated by a comma:\n",
"\n",
"\n",
"```\n",
"{\n",
" \"ReferencedFrameNumber\": {\n",
" \"Value\": \"1,3,5,9\",\n",
" \"VR\": \"IS\",\n",
" \"Tag\": \"(0008,1160)\"\n",
" }\n",
"}\n",
"```\n",
"\n",
"#### Handling duplicate or conflicting tags\n",
"\n",
"In the first example above, tag `(0018,0015)` (`BodyPartExamined`) would always be set to `BRAIN`. In this example though:\n",
"\n",
"```\n",
"{\n",
" \"BodyPartExamined\": {\n",
" \"Value\": \"BRAIN\",\n",
" \"VR\": \"CS\",\n",
" \"Tag\": \"(0018,0015)\",\n",
" \"ResolutionStrategy\": \"IGNORE\"\n",
" }\n",
"}\n",
"```\n",
"\n",
"tag `(0018,0015)` (`BodyPartExamined`) would only be set to `BRAIN` if the tag wasn't previously defined.\n",
"\n",
"`ResolutionStrategy` is particularly useful when trying to alter metadata that Bio-Formats' DICOM writer already writes. For example, Bio-Formats will automatically write an `OpticalPathSequence` with the appropriate number of channels, but may have missing wavelengths or other metadata. To fully replace the default `OpticalPathSequence`, the entire sequence can be defined with a `ResolutionStrategy` of `REPLACE`:\n",
"\n",
"```\n",
" \"OpticalPathSequence\": {\n",
" \"VR\": \"SQ\",\n",
" \"Tag\": \"(0048,0105)\",\n",
" \"Sequence\": {\n",
" \"IlluminationTypeCodeSequence\": {\n",
" \"VR\": \"SQ\",\n",
" \"Tag\": \"(0022,0016)\",\n",
" \"Sequence\": {\n",
" \"CodeValue\": {\n",
" \"VR\": \"SH\",\n",
" \"Tag\": \"(0008,0100)\",\n",
" \"Value\": \"111743\"\n",
" },\n",
" \"CodingSchemeDesignator\": {\n",
" \"VR\": \"SH\",\n",
" \"Tag\": \"(0008,0102)\",\n",
" \"Value\": \"DCM\"\n",
" },\n",
" \"CodeMeaning\": {\n",
" \"VR\": \"LO\",\n",
" \"Tag\": \"(0008,0104)\",\n",
" \"Value\": \"Epifluorescence illumination\"\n",
" }\n",
" }\n",
" },\n",
" \"IlluminationWaveLength\": {\n",
" \"VR\": \"FL\",\n",
" \"Tag\": \"(0022,0055)\",\n",
" \"Value\": \"488.0\"\n",
" },\n",
" \"OpticalPathIdentifier\": {\n",
" \"VR\": \"SH\",\n",
" \"Tag\": \"(0048,0106)\",\n",
" \"Value\": \"1\"\n",
" },\n",
" \"OpticalPathDescription\": {\n",
" \"VR\": \"ST\",\n",
" \"Tag\": \"(0048,0107)\",\n",
" \"Value\": \"replacement channel\"\n",
" }\n",
" },\n",
" \"ResolutionStrategy\": \"REPLACE\"\n",
" }\n",
" ```"
]
},
{
"cell_type": "markdown",
"id": "b4953b13",
"metadata": {},
"source": [
"### Convert SVS to DICOM with supplemental metadata"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "217d21df",
"metadata": {},
"outputs": [],
"source": [
"# save one of the JSON examples to a file\n",
"# edit this as needed, or paste a different example from above\n",
"json = '''{\n",
" \"BodyPartExamined\": {\n",
" \"Value\": \"BRAIN\",\n",
" \"VR\": \"CS\",\n",
" \"Tag\": \"(0018,0015)\"\n",
" }\n",
"}'''\n",
"with open('supplemental-metadata.json', 'w') as f:\n",
" f.write(json)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "724609cf",
"metadata": {},
"outputs": [],
"source": [
"!cat supplemental-metadata.json\n",
"!./bftools/bfconvert -noflat -precompressed CMU-1-Small-Region.svs CMU-1.dcm -extra-metadata supplemental-metadata.json"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3e98476c",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
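
A possible follow-up for the notebook, sketched here rather than taken from the PR: instead of pasting a JSON string by hand, the supplemental metadata file could be built from a Python dict and serialized with the standard `json` module. The helper `make_tag` and the constant `VALID_STRATEGIES` are hypothetical names and are not part of Bio-Formats. Only the key names (`Value`, `VR`, `Tag`, `ResolutionStrategy`), the three strategy values, and the comma-separated form for multiple values come from the notebook text.

```python
import json

# Strategy names as listed in the notebook's markdown cell.
VALID_STRATEGIES = {"IGNORE", "APPEND", "REPLACE"}


def make_tag(value, vr=None, tag=None, resolution_strategy=None):
    """Build one tag object (hypothetical helper, not a Bio-Formats API).

    Lists and tuples are joined with commas, the form the notebook
    describes for tags with a value multiplicity greater than 1.
    """
    if isinstance(value, (list, tuple)):
        value = ",".join(str(v) for v in value)
    entry = {"Value": str(value)}  # "Value" is the only required key
    if vr is not None:
        entry["VR"] = vr
    if tag is not None:
        entry["Tag"] = tag
    if resolution_strategy is not None:
        if resolution_strategy not in VALID_STRATEGIES:
            raise ValueError(f"unknown ResolutionStrategy: {resolution_strategy}")
        entry["ResolutionStrategy"] = resolution_strategy
    return entry


# Two of the examples from the notebook, expressed as Python data.
metadata = {
    "BodyPartExamined": make_tag(
        "BRAIN", vr="CS", tag="(0018,0015)", resolution_strategy="IGNORE"
    ),
    "ReferencedFrameNumber": make_tag([1, 3, 5, 9], vr="IS", tag="(0008,1160)"),
}

# json.dump guarantees the file is syntactically valid JSON.
with open("supplemental-metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)
```

The resulting `supplemental-metadata.json` could then be passed to `bfconvert` with `-extra-metadata`, as in the conversion cell. Building the file from a dict mainly guards against JSON syntax slips such as a trailing comma; it does not check that the tags themselves are valid DICOM.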