Commit 91e970c

Upload final-result outlook
1 parent 234c207 commit 91e970c

5 files changed, +28 -0 lines changed
@@ -0,0 +1,28 @@
## Example of the final output of the cancer data analysis project

Below are several findings you should be able to uncover by working through the data analysis workflow of this workshop. These findings are only a starting point: many more analyses and visualizations can be created.

### Distribution of disease types

<img src="../assets/distribution-of-disease-types.png" alt="distribution-of-disease-types.png" width="800">

The most common disease type is Adenomas and Adenocarcinomas, followed by Epithelial Neoplasms, NOS. Understanding the distribution of disease types helps identify the most common and the rarest cancers in the dataset, which is crucial for allocating resources and prioritizing research.
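A ranking like the one plotted above can be produced with a simple frequency count. The sketch below is illustrative only: the `disease_types` list is made-up toy data, not the workshop dataset, and it uses Python's standard `collections.Counter` rather than whatever tooling the workshop notebook relies on.

```python
from collections import Counter

# Toy stand-in for the disease-type column of the dataset.
# The values and counts are hypothetical, chosen only to show the ranking step.
disease_types = [
    "Adenomas and Adenocarcinomas",
    "Epithelial Neoplasms, NOS",
    "Adenomas and Adenocarcinomas",
    "Gliomas",
    "Adenomas and Adenocarcinomas",
    "Epithelial Neoplasms, NOS",
]

# Count each category, then rank from most to least common.
counts = Counter(disease_types)
ranked = counts.most_common()

for name, n in ranked:
    print(f"{name}: {n}")
```

The head of `ranked` gives the most common categories and the tail gives the rarest, which is the same information the bar chart conveys visually.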

### Gender demographic

<img src="../assets/demographic-gender.png" alt="demographic-gender.png" width="600">

Gender information was available for all but 9 samples and showed a slight bias toward females.

This bias can be explained in part by the large number of breast and gynecological (GYN) cancer samples in the dataset: gynecological cancers occur only in females, and breast cancer occurs overwhelmingly in females. A dataset with many of these samples will naturally have more female than male participants.

<img src="../assets/primary-site-vs-gender.png" alt="primary-site-vs-gender.png" width="800">
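The breakdown behind a primary-site-versus-gender chart is a cross-tabulation. The following is a minimal sketch under stated assumptions: the `samples` records are invented toy data, and the variable names (`pair_counts`, `female_share`) are hypothetical, not taken from the workshop code.

```python
from collections import Counter

# Toy (primary_site, gender) records; hypothetical values, not the real data.
samples = [
    ("Breast", "female"),
    ("Breast", "female"),
    ("Ovary", "female"),
    ("Lung", "male"),
    ("Lung", "female"),
    ("Colon", "male"),
    ("Breast", None),  # gender not reported
]

# Set aside samples with missing gender, then count each (site, gender) pair.
known = [(site, g) for site, g in samples if g is not None]
missing = len(samples) - len(known)
pair_counts = Counter(known)

# Overall gender totals, and the share of female samples per primary site.
gender_totals = Counter(g for _, g in known)
site_totals = Counter(site for site, _ in known)
female_share = {
    site: pair_counts[(site, "female")] / total
    for site, total in site_totals.items()
}

print(f"missing gender: {missing}")
print(dict(gender_totals))
print(female_share)
```

Sites whose female share is at or near 1.0 are the ones that pull the overall gender balance toward females.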

### Age distribution

<img src="../assets/age-distribution.png" alt="age-distribution.png" width="600">

The study "High-Throughput Genomic Profiling of Adult Solid Tumors" used patient samples that were collected as part of routine clinical care and submitted to Foundation Medicine for genomic profiling. As the graph above shows, patient ages are close to normally distributed, reflecting the population of patients who seek genomic profiling or are referred for it by their healthcare providers.

So, although the data were not collected through random sampling, the dataset is reasonably diverse, with nearly equal representation across genders and an approximately normal age distribution.
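A histogram that looks bell-shaped can be backed up with a quick numeric check. The sketch below is not the workshop's method and not a formal normality test: it only compares the mean with the median and computes the skewness of a made-up `ages` list, where values near zero suggest a symmetric distribution.

```python
import statistics

# Toy ages (hypothetical), deliberately symmetric around 60.
ages = [38, 45, 50, 54, 57, 60, 60, 63, 66, 70, 75, 82]

mean = statistics.fmean(ages)
median = statistics.median(ages)
sd = statistics.pstdev(ages)  # population standard deviation

# Population skewness: the average cubed z-score.
# Values near 0 indicate a roughly symmetric distribution.
skewness = sum(((a - mean) / sd) ** 3 for a in ages) / len(ages)

print(f"mean={mean:.1f} median={median:.1f} skewness={skewness:.3f}")
```

A mean close to the median together with a skewness near zero is consistent with, but does not prove, an approximately normal distribution.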

assets/age-distribution.png (21.2 KB)

assets/demographic-gender.png (18.8 KB)

60.2 KB

assets/primary-site-vs-gender.png (115 KB)