Skip to content

Commit d516456

Browse files
committed
added PCoA toturial for R+Python method
1 parent 5f360c2 commit d516456

File tree

5 files changed

+212
-3
lines changed

5 files changed

+212
-3
lines changed

docs/beta_diversity_analysis.md

Lines changed: 31 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -191,6 +191,34 @@ $multi_variable_pcoa_plot.py --abundance_table mvpp_mpa_species_relab.tsv --meta
191191

192192
![Multiple variable PCoA plot](../images/mvpp_pcoa.png)
193193

194-
As optional ouputs, `multi_variable_pcoa_plot.py` also generates non-adjustment PERMANOVA test (e.g. [mvpp_permanova.tsv](../example_data/mvpp_permanova.tsv)) and coordinates of PC1 and PC2 (e.g. [mvpp_coordinates.tsv](../example_data/mvpp_coordinates.tsv)) which can be used in visualization in other ways we will discuss shortly below.
195-
196-
## A method mixing R and Python
194+
As optional ouputs, `get_palette(palette = "default", k)` also generates non-adjustment PERMANOVA test (e.g. [mvpp_permanova.tsv](../example_data/mvpp_permanova.tsv)) and coordinates of PC1 and PC2 (e.g. [mvpp_coordinates.tsv](../example_data/mvpp_coordinates.tsv)) which can be used in visualization in other ways we will discuss shortly below.
195+
196+
## A method mixing R and Python
197+
198+
If you wish to enhance the aesthecity of PCoA plot using R but with the pre-calculated coordinates from [multi_variable_pcoa_plot.py](../scripts/multi_variable_pcoa_plot.py) with the flag `--df_opt`, you are suggested to use our R function `pcoa_sideplot()` on coordinate table from `multi_variable_pcoa_plot.py` with arguments:
199+
* `coordinate_df`: the coordinate table generated from python script multi_variable_pcoa_plot.py --df_opt, for example [coordinate_table.tsv](../example_data/coordinate_table.tsv).
200+
* `variable`: specify the variable you want to inspect on PCoA.
201+
* `color_palettes`: give a named vector to pair color palettes with variable group names. default: [ggpubr default palette]
202+
* `coordinate_1`: specify the column header of the 1st coordinate. default: [*PC1*]
203+
* `coordinate_2`: specify the column header of the 2nd coordinate. default: [*PC2*]
204+
* `marker_size`: specify the marker size of the PCoA plot. default: [3]
205+
* `font_size`: specify the font size of PCoA labels and tick labels. default: [20]
206+
* `font_style`: specify the font style of PCoA labels and tick labels. default: [*Arial*]
207+
208+
For example:
209+
210+
```{r}
211+
>source(file = "path_to_the_package/KunDH-2023-CRM-MSM_metagenomics/scripts/functions/beta_diversity_funcs.R")
212+
>coordinate_df <- data.frame(read.csv("path_to_the_package/KunDH-2023-CRM-MSM_metagenomics/example_data/coordinate_table.tsv",
213+
header = TRUE,
214+
sep = "\t"))
215+
>pcoa_sideplot(coordinate_df = coordinate,
216+
coordinate_1 = "PC1",
217+
coordinate_2 = "PC2",
218+
variable = "sexual_orientation",
219+
color_palettes = c("Non-MSM" = "#888888", "MSM" = "#eb2525"))
220+
```
221+
222+
[PCoA Side plot](../images/pcoa_w_sideplot.png)
223+
224+
**Note:** The figure displayed above had been edited using [inkscape](https://inkscape.org/) on the base of the crude output in order to enhance the readability and aesthetic sense.

example_data/coordinate_table.tsv

Lines changed: 125 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,125 @@
1+
PC1 PC2 sexual_orientation
2+
-0.2628051359128226 -0.15605007858766573 MSM
3+
-0.29031551685241075 -0.2306897666476053 MSM
4+
-0.29138693804365257 -0.22179363971017124 MSM
5+
-0.32090547078081705 -0.2563014078473214 MSM
6+
-0.1948358881417443 0.17939078682251117 MSM
7+
-0.23625645807979842 0.20103790161237714 MSM
8+
-0.3193954850100104 -0.231531305577392 MSM
9+
-0.32497975005023233 -0.12286046544442687 MSM
10+
-0.21403276798456888 0.1597789506798806 MSM
11+
-0.2198990812155916 0.18649282765995961 MSM
12+
0.06365888677175648 0.06778275169584252 MSM
13+
-0.077356170706838 -0.19385953698160924 Non-MSM
14+
-0.30800296014436335 -0.10403393040001498 MSM
15+
0.13554557110942725 0.2825494016619917 MSM
16+
-0.06744547088978758 0.2798333588365733 MSM
17+
-0.13914220852177575 0.04639761443455348 MSM
18+
-0.285068046067444 -0.04668281457390372 MSM
19+
-0.3094660286485821 -0.2231431963414331 MSM
20+
-0.22141228804746133 -0.0669143229983127 MSM
21+
-0.17519072924879783 -0.19549526430463365 MSM
22+
0.37243377344879547 -0.004938847701093791 Non-MSM
23+
-0.168899331145202 0.31448129015497645 Non-MSM
24+
-0.20704651572822963 0.01935676754118731 Non-MSM
25+
0.09962412472581383 -0.05954836155096865 Non-MSM
26+
-0.13769457780524466 -0.000646348202553923 MSM
27+
0.23337142086116236 0.10700421418215182 Non-MSM
28+
-0.13390077085490965 -0.11786180771596756 Non-MSM
29+
-0.10748700301527406 0.35286795889519224 Non-MSM
30+
-0.3143329483543445 -0.010426730685141739 MSM
31+
-0.31761576313682177 -0.22242901200710127 MSM
32+
-0.31409460863 -0.26905032498012255 Non-MSM
33+
0.02905069375032977 0.2892589057683008 MSM
34+
0.03621156681354462 0.3321279458816928 MSM
35+
0.43065553120847716 -0.09724234758363742 Non-MSM
36+
0.33100242738444474 0.054094101189757764 Non-MSM
37+
0.10933860444600554 -0.22517778332196398 Non-MSM
38+
0.34606308825340554 0.02848018400734919 Non-MSM
39+
0.3369415090034524 -0.05788677494864451 Non-MSM
40+
0.13014996661823097 0.09360548773818418 MSM
41+
0.33425913531034 0.08251047149315524 MSM
42+
-0.29370231455039514 -0.11448324441243958 MSM
43+
-0.24413737064159316 0.15199023515154578 MSM
44+
0.013485057562827743 0.3421231896153416 MSM
45+
-0.04083369785916474 0.30884583042396757 MSM
46+
0.3862027412562518 -0.04386920797429145 MSM
47+
0.1684365641450756 -0.0870274991147484 MSM
48+
0.31109698127805974 0.02516007670727285 Non-MSM
49+
-0.3401975641881859 -0.20967559300728847 MSM
50+
0.3594123194755097 -0.06470248473155016 MSM
51+
0.3986153456052948 -0.1267373077533661 MSM
52+
0.046255536309669554 0.2810082298244458 MSM
53+
-0.02258147301819479 0.3397344191491147 MSM
54+
-0.29639708096604866 -0.06780598224971333 MSM
55+
0.41597931247338377 -0.14900412362477705 MSM
56+
0.18945216414472432 -0.042447380912132335 MSM
57+
-0.25926773186152946 -0.10493481227119635 MSM
58+
-0.04981884353189541 0.3007904360475624 MSM
59+
-0.31639001367135455 -0.035693039669966495 MSM
60+
0.020637769912962012 0.410766326082602 MSM
61+
-0.2508379630983494 0.1512718946439016 MSM
62+
-0.24866756841445947 -0.011776886185518524 MSM
63+
-0.06062248295929493 0.09984074835114227 MSM
64+
-0.14757195167974466 0.04339673580911644 MSM
65+
-0.219381683170428 0.1551440838729017 MSM
66+
0.03492119860006388 -0.017895308778923424 MSM
67+
-0.252409319941444 0.09247713253932459 MSM
68+
-0.0022636163274699487 0.428637192155102 MSM
69+
0.38822833478614555 -0.13612452166841413 MSM
70+
0.44245103659096474 -0.1199519770522526 MSM
71+
0.38479091848409924 -0.15957994126105673 Non-MSM
72+
0.09345439824515817 -0.2546946277371098 MSM
73+
0.4357367201331367 -0.15375985445597007 Non-MSM
74+
0.042134034448124934 0.3253140941844303 MSM
75+
0.43355550913789936 -0.09489197383147234 MSM
76+
0.4670348497729969 -0.15508790200823316 Non-MSM
77+
0.28633308170873556 -0.051094516485459894 Non-MSM
78+
0.022472017377469258 0.2973332193558033 MSM
79+
0.000290920583021966 -0.3049162316639377 Non-MSM
80+
0.35569909828860063 -0.08418795503581224 MSM
81+
0.42132318212487585 -0.09455376061977355 Non-MSM
82+
0.44578984482419354 -0.14457469038112186 MSM
83+
0.2155796686771958 0.08431100213709204 MSM
84+
0.38070630831272706 -0.06548253255198717 Non-MSM
85+
0.2940114746557116 -0.13465333871593427 MSM
86+
0.40668882908433607 -0.14660763825334322 Non-MSM
87+
0.30280178593804397 -0.0367574847161416 Non-MSM
88+
-0.1442266836782678 -0.07531906579551187 MSM
89+
0.13054961923390396 -0.12386766385809217 Non-MSM
90+
0.38664654306938556 -0.10737000121129817 MSM
91+
0.28818136797042965 -0.10342621350551943 MSM
92+
-0.18591002513659655 -0.13964718807334883 MSM
93+
0.30622508939841 0.051103617912361586 Non-MSM
94+
0.4176342761841033 -0.07057117984723121 MSM
95+
-0.1928803703916965 0.013418491488453817 MSM
96+
-0.27100483343878357 0.044174894014619805 MSM
97+
-0.26314792775266876 -0.2542565535953482 MSM
98+
-0.30829425760574897 -0.1351193831255704 MSM
99+
-0.04466710668496929 0.28292430945610686 MSM
100+
0.27908158191717863 0.0788783508452007 Non-MSM
101+
0.0044515314467062465 0.3392047985031592 MSM
102+
-0.026422512280681335 0.12096616678903391 MSM
103+
-0.043102002784257507 0.3406518597548492 MSM
104+
-0.15292430978445407 0.1639421993179468 Non-MSM
105+
0.16681913824786065 -0.14798783506901328 Non-MSM
106+
-0.11158327302506973 -0.19139012239104805 MSM
107+
-0.31257938302228905 -0.20233748204134944 MSM
108+
-0.2792189151690751 -0.11335201544092842 MSM
109+
-0.08712395811297115 0.088059174613106 MSM
110+
-0.23774540954803167 -0.2445616696085774 MSM
111+
-0.3386641479500712 -0.19165627291431533 MSM
112+
0.3069504767103925 -0.12391018739777997 Non-MSM
113+
0.013058989422652635 0.3954154942285945 MSM
114+
-0.31233837914395446 -0.02954054164281813 MSM
115+
-0.2525680088163061 -0.12674559579161007 MSM
116+
-0.2509552816139222 -0.02279529935983045 MSM
117+
-0.12751697865714787 -0.06105509623037198 MSM
118+
-0.20573466329607487 0.06845533551832138 MSM
119+
-0.22698730908320192 -0.10228253624636663 MSM
120+
0.052820329663948735 -0.07400718577184878 MSM
121+
-0.2660466971246142 -0.1624019716416814 MSM
122+
0.20762354944056213 -0.058787765264924394 Non-MSM
123+
0.05644452616271447 -0.006582799412051108 MSM
124+
0.0408868410498254 0.26834889836329806 MSM
125+
-0.16356614056338756 0.02576815336069973 MSM

images/pcoa_w_sideplot.png

850 KB
Loading

scripts/alpha_diversity_analysis.R

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -24,3 +24,13 @@ est_permanova(mat = mat,
2424
nper = 999,
2525
to_rm = c("no_receptive_anal_intercourse"),
2626
by_method = "margin")
27+
28+
pcoa_df <- data.frame(read.csv("/Users/kunhuang/R_analysis_mirror/msm_analysis/manuscript_05012023/figure1/pcoa_df.tsv",
29+
header = TRUE,
30+
sep = "\t"))
31+
32+
length(unique(pcoa_df[, "sexual_orientation"]))
33+
pcoa_sideplot(coordinate_df = pcoa_df,
34+
coordinate_1 = "PC1",
35+
coordinate_2 = "PC2",
36+
variable = "sexual_orientation")

scripts/functions/beta_diversity_funcs.R

Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -73,4 +73,50 @@ est_permanova <- function(mat,
7373
est <- vegan::adonis2(eval(parse(text = str2)), data = md, permutations = nper, by = by_method)
7474
}
7575
est
76+
}
77+
78+
pcoa_sideplot <- function(coordinate_df,
79+
variable,
80+
color_palettes = ggpubr::get_palette(palette = "default", k = length(unique(coordinate_df[, variable]))),
81+
coordinate_1 = "PC1",
82+
coordinate_2 = "PC2",
83+
marker_size = 3,
84+
font_size = 20,
85+
font_style = "Arial"
86+
){
87+
# coordinate_df: the coordinate table generated from python script multi_variable_pcoa_plot.py --df_opt
88+
# variable: specify the variable you want to inspect on PCoA.
89+
# color_palettes: give a named vector to pair color palettes with variable group names. default: [ggpubr default palette]
90+
# coordinate_1: specify the column header of the 1st coordinate. default: [PC1]
91+
# coordinate_2: specify the column header of the 2nd coordinate. default: [PC2]
92+
# marker_size: specify the marker size of the PCoA plot. default: [3]
93+
# font_size: specify the font size of PCoA labels and tick labels. default: [20]
94+
# font_style: specify the font style of PCoA labels and tick labels. default: ["Arial"]
95+
main_plot <- ggplot2::ggplot(coordinate_df,
96+
ggplot2::aes(x = .data[[coordinate_1]], y = .data[[coordinate_2]], color = .data[[variable]])) +
97+
ggplot2::geom_point(size = marker_size) +
98+
ggplot2::theme_bw() +
99+
ggplot2::theme(text = ggplot2::element_text(size = font_size, family = font_style)) +
100+
ggpubr::color_palette(color_palettes)
101+
102+
xdens <- cowplot::axis_canvas(main_plot, axis = "x") +
103+
ggplot2::geom_density(data = coordinate_df,
104+
ggplot2::aes(x = .data[[coordinate_1]], fill = .data[[variable]]),
105+
alpha = 0.7,
106+
size = 0.2) +
107+
ggpubr::fill_palette(color_palettes)
108+
109+
ydens <- cowplot::axis_canvas(main_plot, axis = "y", coord_flip = TRUE) +
110+
ggplot2::geom_density(data = coordinate_df,
111+
ggplot2::aes(x = .data[[coordinate_2]], fill = .data[[variable]]),
112+
alpha = 0.7,
113+
size = 0.2) +
114+
ggplot2::coord_flip() +
115+
ggpubr::fill_palette(color_palettes)
116+
117+
plot1 <- cowplot::insert_xaxis_grob(main_plot, xdens, grid::unit(.2, "null"), position = "top")
118+
plot2 <- cowplot::insert_yaxis_grob(plot1, ydens, grid::unit(.2, "null"), position = "right")
119+
120+
comb_plot <- cowplot::ggdraw(plot2)
121+
comb_plot
76122
}

0 commit comments

Comments
 (0)