Commit aebb6e7

fedorov authored and gitbook-bot committed
GITBOOK-401: change request with no subject merged in GitBook
1 parent 74e5cf6 commit aebb6e7

3 files changed, +58 -39 lines changed

SUMMARY.md

+2-2
@@ -30,9 +30,9 @@
 ## Data
 
 * [Introduction](data/introduction.md)
+* [Data model](data/data-model.md)
 * [Data versioning](data/data-versioning.md)
 * [Organization of data](data/organization-of-data/README.md)
-* [IDC data model](data/organization-of-data/idc-data-model.md)
 * [Files and metadata](data/organization-of-data/files-and-metadata.md)
 * [Resolving CRDC Globally Unique Identifiers (GUIDs)](data/organization-of-data/guids-and-uuids.md)
 * [Clinical data](data/organization-of-data/clinical.md)
@@ -49,7 +49,7 @@
 ## DICOM
 
 * [Introduction to DICOM](dicom/introduction.md)
-* [Data model](dicom/data-model.md)
+* [DICOM data model](dicom/data-model.md)
 * [Original objects](dicom/original-vs-derived-objects.md)
 * [Derived objects](dicom/derived-objects/README.md)
 * [DICOM Segmentations](dicom/derived-objects/dicom-segmentations.md)

data/data-model.md

+56
@@ -0,0 +1,56 @@
# Data model

IDC relies on the DICOM data model for organizing images and image-derived data. At the same time, IDC includes certain attributes and data types that are outside of the DICOM data model. The _Entity-Relationship (E-R) diagram_ and examples below give a simplified view of the IDC data model. The notation used in the diagram is explained on [this page](https://mermaid.js.org/syntax/entityRelationshipDiagram.html) of the Mermaid documentation.

```mermaid
erDiagram
    COLLECTION ||--o{ CASE : contains
    CASE ||--o{ STUDY : contains
    STUDY ||--o{ SERIES : contains
    SERIES ||--o{ INSTANCE : contains
    ANALYSIS_RESULT ||--o{ SERIES : adds
    ANALYSIS_RESULT }o--o{ COLLECTION : spans
    CASE |o--o| CLINICAL_DATA : "may have"
    PROGRAM ||--o{ COLLECTION : contains

    PROGRAM {
        string program PK
    }

    COLLECTION {
        string collection_id
        string source_doi
    }
    CASE {
        string PatientID
    }
    STUDY {
        string StudyInstanceUID
    }
    SERIES {
        string SeriesInstanceUID
    }
    INSTANCE {
        string SOPInstanceUID
    }
    ANALYSIS_RESULT {
        string analysis_result_id
        string source_doi
    }
    CLINICAL_DATA {
        string CaseID
    }
```

IDC content is organized into **Collections**: groups of DICOM files that were collected through a specific research activity.

Collections are organized into **Programs**, which group related collections or collections that were contributed under the same funding initiative or consortium. For example, the TCGA program contains TCGA-GBM, TCGA-BRCA and other collections. You will see Collections nested under Programs in the upper left section of the [IDC Portal](https://portal.imaging.datacommons.cancer.gov/explore/). You will also see the list of collections that meet the filter criteria in the top table on the right-hand side of the portal interface.
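
The Program-to-Collection relationship can be sketched in a few lines of Python. This is only an illustration: the record layout is an assumption, and `EXAMPLE_PROGRAM` and `example_collection` are hypothetical names. Only the TCGA entries come from the text above.

```python
# Hypothetical (program, collection) records. Only the TCGA entries are taken
# from the text; the rest is made up for illustration.
collections = [
    {"program": "TCGA", "collection_id": "TCGA-GBM"},
    {"program": "TCGA", "collection_id": "TCGA-BRCA"},
    {"program": "EXAMPLE_PROGRAM", "collection_id": "example_collection"},
]


def group_by_program(collections):
    """Return a dict mapping each program to the list of its collections."""
    by_program = {}
    for c in collections:
        by_program.setdefault(c["program"], []).append(c["collection_id"])
    return by_program


by_program = group_by_program(collections)
for program, collection_ids in by_program.items():
    print(program, collection_ids)
```

Each collection belongs to exactly one program in this sketch, which matches the `PROGRAM ||--o{ COLLECTION` relationship in the diagram.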

Individual DICOM files included in a collection contain attributes that organize content according to the DICOM data model (see [data-model.md](../dicom/data-model.md "mention")).

Each collection contains data for one or more **cases** (patients). Data for an individual patient is organized in DICOM **studies**, each of which groups the images corresponding to a single imaging exam/encounter collected in a given session. Studies are composed of DICOM **series**, which in turn consist of DICOM **instances**. Each DICOM instance corresponds to a single file on disk. As an example, in radiology imaging, individual instances would correspond to image slices in multi-slice acquisitions, and in digital pathology you will see a separate file/instance for each resolution layer of the image pyramid. When using the IDC Portal, you will never encounter individual instances; you will only see them if you download data to your computer.
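
The nesting described above can be illustrated with a minimal Python sketch that groups flat, per-file records into a collection, case, study, series hierarchy. The attribute names follow the E-R diagram; the record layout and all identifier values are hypothetical and are not taken from IDC.

```python
# Hypothetical flat records, one per DICOM instance (file on disk).
# All identifier values are made up for illustration.
records = [
    {"collection_id": "collection_a", "PatientID": "case_1",
     "StudyInstanceUID": "1.2.3", "SeriesInstanceUID": "1.2.3.1",
     "SOPInstanceUID": "1.2.3.1.1"},
    {"collection_id": "collection_a", "PatientID": "case_1",
     "StudyInstanceUID": "1.2.3", "SeriesInstanceUID": "1.2.3.1",
     "SOPInstanceUID": "1.2.3.1.2"},
    {"collection_id": "collection_a", "PatientID": "case_1",
     "StudyInstanceUID": "1.2.3", "SeriesInstanceUID": "1.2.3.2",
     "SOPInstanceUID": "1.2.3.2.1"},
    {"collection_id": "collection_a", "PatientID": "case_2",
     "StudyInstanceUID": "1.2.4", "SeriesInstanceUID": "1.2.4.1",
     "SOPInstanceUID": "1.2.4.1.1"},
]


def build_hierarchy(records):
    """Nest instances as collection -> case -> study -> series -> [instances]."""
    tree = {}
    for r in records:
        series = (
            tree.setdefault(r["collection_id"], {})
            .setdefault(r["PatientID"], {})
            .setdefault(r["StudyInstanceUID"], {})
            .setdefault(r["SeriesInstanceUID"], [])
        )
        series.append(r["SOPInstanceUID"])
    return tree


hierarchy = build_hierarchy(records)
for patient_id, studies in hierarchy["collection_a"].items():
    n_series = sum(len(series) for series in studies.values())
    print(patient_id, "studies:", len(studies), "series:", n_series)
```

Walking this structure from the top mirrors what the portal shows down to the series level; the innermost lists correspond to the individual files you would only see after downloading.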

**Analysis results collections** are an important concept in IDC. They contain analysis results that were not contributed as part of any specific collection. Such analysis results might be contributed by investigators unrelated to those who submitted the analyzed images, and may span images across multiple collections.
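
As a rough sketch of the "adds" and "spans" relationships in the diagram, the Python snippet below finds which collections each analysis result touches. It assumes, purely for illustration, that every series record carries an `analysis_result_id` that is `None` for series contributed with the original collection. The record layout and all identifiers are hypothetical.

```python
# Hypothetical series-level records. "analysis_result_id" is None for series
# that were contributed as part of the collection itself (an assumption made
# for this sketch, not a statement about how IDC stores the data).
series_records = [
    {"SeriesInstanceUID": "1.2.3.1", "collection_id": "collection_a",
     "analysis_result_id": None},
    {"SeriesInstanceUID": "1.2.3.9", "collection_id": "collection_a",
     "analysis_result_id": "result_x"},
    {"SeriesInstanceUID": "1.2.5.9", "collection_id": "collection_b",
     "analysis_result_id": "result_x"},
    {"SeriesInstanceUID": "1.2.6.9", "collection_id": "collection_b",
     "analysis_result_id": "result_y"},
]


def collections_spanned(series_records):
    """Map each analysis result to the set of collections its series belong to."""
    spanned = {}
    for s in series_records:
        result_id = s["analysis_result_id"]
        if result_id is not None:
            spanned.setdefault(result_id, set()).add(s["collection_id"])
    return spanned


spanned = collections_spanned(series_records)
for result_id, collection_ids in sorted(spanned.items()):
    print(result_id, sorted(collection_ids))
```

Here `result_x` adds series to two collections while `result_y` adds to one, which is the many-to-many `ANALYSIS_RESULT }o--o{ COLLECTION` relationship in the diagram.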

data/organization-of-data/idc-data-model.md

-37
This file was deleted.
