
Commit c76e624

Add some explanations about GSEA
1 parent 90fb7b1 commit c76e624

2 files changed

+35
-2
lines changed

docs/pages/blog/gsea.mdx

+30-2
@@ -29,14 +29,42 @@ in which case the gene set is correlated with the phenotypic class distinction.
2929
Of course we will perform as many independent tests as we have gene sets to test.
3030
A [multiple testing correction](https://www.firalis.com/products/fimics-cardiac-ruo-kit-panel) should then be considered.
3131

32-
### The Methods
32+
### The Method
3333

34-
### To go further
34+
**Step 1 : Compute an enrichment score (ES)**<br/>
35+
This score reflects the degree to which the gene set S is overrepresented at the extremes, top (ES > 0) or bottom (ES < 0), of the ranked list L. It is calculated by walking down the list L,
36+
increasing a running sum when we encounter a gene in S and decreasing it when we encounter a gene not in S. Finally, the maximum deviation from zero encountered during the random walk is kept as the ES. <br/><br/>
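The running-sum walk described in Step 1 can be sketched in a few lines of Python. This is a simplified, unweighted illustration: every gene in S adds the same increment, whereas the official GSEA software weights each hit by its ranking metric. The function name and the toy gene identifiers are made up for the example.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (illustrative simplification)."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("the gene set must overlap the list without covering it")

    running, es = 0.0, 0.0
    for is_hit in hits:
        # Step up on a gene in S, step down on a gene not in S.
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        # Keep the largest deviation from zero seen so far, with its sign.
        if abs(running) > abs(es):
            es = running
    return es


# Hypothetical ranked list, most up-regulated gene first.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})     # set sits at the top
es_bottom = enrichment_score(ranked, {"g5", "g6"})  # set sits at the bottom
```

A set concentrated at the top of the list gives a positive ES, and a set concentrated at the bottom gives a negative ES.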
37+
**Step 2 : Estimation of significance level of ES**<br/>
38+
The nominal P-value estimates the statistical significance of the ES by using an empirical phenotype-based permutation test that
39+
preserves the correlation structure of the gene expression data. Phenotype labels are permuted and the ES is recomputed to generate a null distribution for the ES. The empirical nominal P-value of the observed ES is then calculated relative to
40+
this null distribution. Permuting class labels preserves gene-gene correlations and thus provides a more biologically reasonable assessment of significance than would be obtained by permuting genes.<br/><br/>
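As a rough sketch of the phenotype permutation described in Step 2, the Python example below shuffles the class labels, re-ranks the genes, and recomputes the ES each time. Several parts are assumptions made for illustration: the ranking metric is a plain difference of class means (not one of GSEA's own metrics), the ES is the unweighted variant, and the expression values, function names, and gene identifiers are hypothetical.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (illustrative simplification)."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    running, es = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(es):
            es = running
    return es


def rank_genes(expr, labels):
    """Rank genes by difference of class means (assumed metric, class 1 minus class 0)."""
    def mean_diff(values):
        a = [v for v, lab in zip(values, labels) if lab == 1]
        b = [v for v, lab in zip(values, labels) if lab == 0]
        return sum(a) / len(a) - sum(b) / len(b)
    return sorted(expr, key=lambda gene: mean_diff(expr[gene]), reverse=True)


def nominal_p_value(expr, labels, gene_set, n_perm=200, seed=0):
    """Empirical P-value of the observed ES against a label-permutation null."""
    rng = random.Random(seed)
    observed = enrichment_score(rank_genes(expr, labels), gene_set)
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute phenotype labels, not genes
        null.append(enrichment_score(rank_genes(expr, shuffled), gene_set))
    # Compare only against null scores with the same sign as the observed ES.
    same_sign = [es for es in null if (es >= 0) == (observed >= 0)]
    extreme = [es for es in same_sign if abs(es) >= abs(observed)]
    p = len(extreme) / len(same_sign) if same_sign else float("nan")
    return observed, p, null


# Hypothetical expression values: one list of six samples per gene.
expr = {
    "g1": [5.0, 5.0, 5.0, 2.0, 2.0, 2.0],
    "g2": [4.0, 4.0, 4.0, 2.0, 2.0, 2.0],
    "g3": [1.1, 1.1, 1.1, 1.0, 1.0, 1.0],
    "g4": [1.0, 1.0, 1.0, 1.1, 1.1, 1.1],
    "g5": [2.0, 2.0, 2.0, 4.0, 4.0, 4.0],
    "g6": [2.0, 2.0, 2.0, 5.0, 5.0, 5.0],
}
labels = [1, 1, 1, 0, 0, 0]
observed, p, null = nominal_p_value(expr, labels, {"g1", "g2"})
```

Because whole samples are relabelled, each gene keeps its values alongside those of every other gene, which is how the gene-gene correlations survive the permutation.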
41+
**Step 3 : Adjustment for Multiple Hypothesis Testing**<br/>
42+
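The enrichment score of each gene set is normalized to account for the size of the set, yielding a normalized enrichment score (NES), and a false discovery rate (FDR) is then calculated for each NES.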
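The two operations of Step 3 can be sketched in Python as follows. The normalization divides the observed ES by the mean of the null scores that share its sign. For the correction, this sketch swaps in the generic Benjamini-Hochberg procedure applied to nominal P-values; this is not GSEA's own FDR, which is estimated from the null distributions of the NES. Function names and numbers are hypothetical.

```python
def normalized_es(observed_es, null_es):
    """Divide the observed ES by the mean magnitude of same-sign null scores."""
    same_sign = [abs(es) for es in null_es if (es >= 0) == (observed_es >= 0)]
    if not same_sign or sum(same_sign) == 0:
        raise ValueError("no usable null scores with the same sign as the ES")
    return observed_es / (sum(same_sign) / len(same_sign))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values (stand-in for GSEA's own FDR)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical values for one gene set and for four tested gene sets.
nes = normalized_es(0.8, [0.4, -0.5, 0.4])
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

The more gene sets are tested, the larger the multiplier applied to each P-value, which is why restricting the analysis to relevant gene sets makes the correction less severe.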
43+
44+
<p className="popacitydanger" >
45+
<div style={{ textAlign: 'center' }}>
46+
<strong>It is useful to keep in mind that</strong>
47+
</div><br />
48+
- The official GSEA software will not complain if you provide it with raw gene expression data. However, your results will be totally incorrect.
49+
- You can perform a pre-ranked GSEA, which can be very helpful for performing gene set enrichment analysis on data that do not conform to the typical GSEA scenario. For example, it can be used when the ranking metric choices provided by GSEA are not appropriate for the data, or when a ranked list of genomic features deviates from traditional gene expression data (e.g., GWAS results, ChIP-seq, etc.). Also, if you lack computing power and have access to a pre-ranked list, this solution can be your best option.
50+
- Clearly define the question you are trying to address and choose the appropriate ranking metric.
51+
- Gene set curation can be useful as a pre-processing step. Indeed, you do not need to test gene sets you are not interested in. Doing so increases your need for computing resources and adds noise to the multiple hypothesis testing adjustment.
52+
</p>
53+
54+
### To go further with theory
3555

3656
There exists a variant of GSEA called FGSEA for <u>F</u>ast <u>G</u>ene <u>S</u>et <u>E</u>nrichment <u>A</u>nalysis.<br/>
3757

3858
Another common approach to perform pathway analysis is the [Gene Ontology Enrichment analysis](https://geneontology.org/docs/go-enrichment-analysis/).
3959

60+
## Available programs for practice
61+
- [Official Broad Institute tools](https://www.gsea-msigdb.org/gsea/downloads.jsp)
62+
- **WEB-based GEne SeT AnaLysis Toolkit**
63+
[GUI](https://www.webgestalt.org/)
64+
[R package](https://cran.r-project.org/web/packages/WebGestaltR/index.html)
65+
- [Fast Gene Set Enrichment Analysis](https://bioconductor.org/packages/release/bioc/html/fgsea.html) (Pre-ranked only)
66+
67+
4068
[^1]: Timothy E. Sweeney, Winston A. Haynes, Francesco Vallania, John P. Ioannidis
4169
and Purvesh Khatri. (2017). *Methods to increase reproducibility in differential gene expression via meta-analysis*. **Nucleic Acids Research**, Volume 45(Issue 1), Page Range. [DOI](https://doi.org/10.1093/nar/gkw797)
4270
[^2]: Steven N Goodman, Daniele Fanelli, John P A Ioannidis. (2016). *What does research reproducibility mean?*. **Sci Transl Med**, 8(341),12. [DOI](https://doi.org/10.1126/scitranslmed.aaf5027)

docs/styles.css

+5
@@ -50,6 +50,11 @@
5050
padding: 10px;
5151
}
5252

53+
.popacitydanger {
54+
background: #b81d1d;
55+
padding: 10px;
56+
}
57+
5358
.two-column-layout {
5459
display: flex; /* Use flexbox for a simple two-column layout */
5560
justify-content: space-between;
