docs/beta_diversity_analysis.md: 65 additions, 0 deletions
@@ -128,4 +128,69 @@ Below, we are showcasing how to inspect the beta diversity of microbiomes from t
## Python-based method

#### Python packages required

* [scikit-bio >= 0.5.6](https://scikit.bio/)
* [pandas >= 1.3.5](https://pandas.pydata.org/)
* [numpy >= 1.23.5](https://numpy.org/)
* [matplotlib >= 3.5.0](https://matplotlib.org/)
* [seaborn >= 0.11.2](https://seaborn.pydata.org/)

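Before running the script, it can help to confirm that the installed packages meet the minimum versions listed above. The following is a minimal sketch using only the standard library; the helper names `version_tuple` and `check_requirements` are hypothetical and are not part of the repository, and the version comparison is deliberately simple (it ignores suffixes such as `rc1`).

```python
from importlib import metadata

# Minimum versions copied from the package list above.
REQUIRED = {
    "scikit-bio": "0.5.6",
    "pandas": "1.3.5",
    "numpy": "1.23.5",
    "matplotlib": "3.5.0",
    "seaborn": "0.11.2",
}


def version_tuple(version):
    """Turn '1.23.5' into (1, 23, 5), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first part with no leading digits
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(required=REQUIRED):
    """Return {package: status string} for every required package."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = version_tuple(installed) >= version_tuple(minimum)
        report[name] = f"{installed} ({'ok' if ok else 'older than ' + minimum})"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```
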
#### Beta diversity analysis with PCoA plotting integrating up to three variables

Here, we introduce a Python script, `multi_variable_pcoa_plot.py`, located in `path_to_the_package/KunDH-2023-CRM-MSM_metagenomics/scripts`. The script performs PCoA on microbial taxonomic or functional abundance data and can integrate up to three variables in a single plot.

Optional arguments:

* `-h`, `--help`: Show this help message and exit.
* `--abundance_table [ABUNDANCE_TABLE]`: Input the merged abundance table generated by MetaPhlAn.
* `--metadata [METADATA]`: Input a tab-delimited metadata file.
* `--transformation [TRANSFORMATION]`: Specify the transformation function applied to the data points in the original table. For an abundance table, you can choose `sqrt` or `log`. Default: `None`.
* `--metric [METRIC]`: Specify the metric used to calculate beta diversity when the input is an abundance table: `braycurtis`, `unweighted_unifrac`, `jaccard` or `weighted_unifrac`. Default: `braycurtis`.
* `--amplifier [AMPLIFIER]`: Specify the factor by which the original data points are multiplied. For example, `--amplifier 100` multiplies every original data point by 100. Default: 1.
* `--sample_column [SAMPLE_COLUMN]`: Specify the header of the column containing metagenome sample names in the metadata file.
* `--variable1 [VARIABLE1]`: Specify the header of the first variable in the metadata table you want to assess. This variable is represented by colors.
* `--variable2 [VARIABLE2]`: Specify the header of the second variable in the metadata table you want to assess. This variable is represented by marker shapes.
* `--variable3 [VARIABLE3]`: Specify the header of the third variable in the metadata table you want to assess. This variable is represented by marker sizes.
* `--marker_palette [MARKER_PALETTE]`: Input a tab-delimited mapping file where the 1st column contains group names and the 2nd column contains color codes. Default: `None` (automatic handling).
* `--marker_shapes [MARKER_SHAPES]`: Input a tab-delimited mapping file where the 1st column contains group names and the 2nd column contains marker shapes. Default: `None` (automatic handling).
* `--marker_sizes [MARKER_SIZES]`: Input a tab-delimited mapping file where values are group names and keys are marker sizes. Default: `None` (automatic handling).
* `--output_figure [OUTPUT_FIGURE]`: Specify the name of the output figure, for example `output_figure.svg`.
* `--test [TEST]`: Specify an output file for saving PERMANOVA test results, for example `project_name`.
* `--df_opt [DF_OPT]`: Specify the output file name for saving the coordinates (PC1 and PC2) of each sample, for example `project_name_coordinates.tsv`.
* `--font_style [FONT_STYLE]`: Specify the font style, composed of a font family and a font type separated by a comma. Default: `sans-serif,Arial`.

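To make the roles of `--transformation`, `--amplifier` and `--metric` more concrete, here is a minimal NumPy sketch of the general technique: scale and transform an abundance table, compute Bray-Curtis dissimilarities, and run a classical PCoA. It is not the script's actual implementation. The function names, the order of scaling and transformation, the use of `log(1 + x)` for the log option, and the toy abundance values are all assumptions made for illustration; the UniFrac metrics need a phylogenetic tree and are not covered.

```python
import numpy as np


def transform(table, transformation=None, amplifier=1.0):
    """Scale, then transform, an abundance table (rows = samples)."""
    scaled = np.asarray(table, dtype=float) * amplifier
    if transformation == "sqrt":
        return np.sqrt(scaled)
    if transformation == "log":
        # Assumption: log(1 + x), so zero abundances stay finite.
        return np.log1p(scaled)
    return scaled


def bray_curtis(table):
    """Pairwise Bray-Curtis dissimilarity between the rows of `table`."""
    n = table.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            total = table[i].sum() + table[j].sum()
            diff = np.abs(table[i] - table[j]).sum()
            dist[i, j] = dist[j, i] = diff / total if total > 0 else 0.0
    return dist


def pcoa(dist, n_axes=2):
    """Classical PCoA: double-centre -0.5 * D**2 and eigendecompose it."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    positive = np.clip(eigvals, 0.0, None)  # drop negative eigenvalues
    coords = eigvecs[:, :n_axes] * np.sqrt(positive[:n_axes])
    explained = positive[:n_axes] / positive.sum()
    return coords, explained


# Toy relative abundances (made up): 4 samples x 4 taxa.
abundance = np.array([
    [60.0, 30.0, 10.0, 0.0],
    [55.0, 35.0, 10.0, 0.0],
    [5.0, 10.0, 25.0, 60.0],
    [0.0, 15.0, 25.0, 60.0],
])
dist = bray_curtis(transform(abundance, "sqrt"))
coords, explained = pcoa(dist)
print(coords.round(3))     # PC1 and PC2 for each sample
print(explained.round(3))  # share of the positive eigenvalues per axis
```

One consequence of this sketch: Bray-Curtis is unchanged when every value is multiplied by the same factor, so in this reading `--amplifier` only matters together with a transformation that is not scale-invariant, such as the log. Whether the script behaves the same way is not confirmed by the help text.
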
To demonstrate the usage of `multi_variable_pcoa_plot.py`, we will draw a PCoA plot based on the [microbiome composition of samples](../example_data/mvpp_mpa_species_relab.tsv.bz2) from 11 populations grouped as *W (Westernization)*, *NW (Non-Westernization)*, *NWU (Non-Westernization (Urban))* and *MSM (Men-having-sex-with-men)*. Populations are assigned custom colors using a [color map file](../example_data/mvpp_color_map.tsv), and the *MSM* population is highlighted with a larger marker size using a [marker size map file](../example_data/mvpp_marker_size_map.tsv). The metadata for each sample is provided in a [metadata file](../example_data/mvpp_metadata.tsv).

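As an illustration of how the options above could be combined for this example, the sketch below assembles a command line from Python and prints it. The flags and input file names come from the help text and the example data links. The metadata column names (`sample_id`, `population`, `group`), the output figure name `mvpp_pcoa.svg`, and the assumption that the abundance table has been decompressed to `mvpp_mpa_species_relab.tsv` beforehand are hypothetical placeholders; check them against your own files.

```python
import shlex


def build_command(sample_column, color_variable, size_variable):
    """Assemble the argument list for multi_variable_pcoa_plot.py."""
    return [
        "multi_variable_pcoa_plot.py",
        # Assumption: the .bz2 example table was decompressed first.
        "--abundance_table", "mvpp_mpa_species_relab.tsv",
        "--metadata", "mvpp_metadata.tsv",
        "--sample_column", sample_column,
        "--variable1", color_variable,        # shown as colors
        "--marker_palette", "mvpp_color_map.tsv",
        "--variable3", size_variable,         # shown as marker sizes
        "--marker_sizes", "mvpp_marker_size_map.tsv",
        "--output_figure", "mvpp_pcoa.svg",   # hypothetical output name
        "--test", "mvpp_permanova.tsv",
        "--df_opt", "mvpp_coordinates.tsv",
    ]


# Hypothetical column headers; replace with those in your metadata file.
args = build_command("sample_id", "population", "group")
command = shlex.join(args)
print(command)

# To actually run it (not executed here):
# import subprocess
# subprocess.run(args, check=True)
```
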
As optional outputs, `multi_variable_pcoa_plot.py` also generates unadjusted PERMANOVA test results (e.g. [mvpp_permanova.tsv](../example_data/mvpp_permanova.tsv)) and the coordinates of PC1 and PC2 (e.g. [mvpp_coordinates.tsv](../example_data/mvpp_coordinates.tsv)), which can be used for other kinds of visualization that we discuss shortly below.
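
For readers unfamiliar with PERMANOVA, the following pure-Python sketch shows the general idea behind a one-factor test: a pseudo-F statistic computed from a distance matrix, with a p-value obtained by shuffling group labels. It is a didactic sketch, not the code the script runs; scikit-bio provides a `permanova` function in `skbio.stats.distance`, and whether the script relies on it is an assumption. The function names and the toy distance matrix are made up for illustration.

```python
import random


def pseudo_f(dist, groups):
    """One-factor PERMANOVA pseudo-F from a square distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for label in labels:
        members = [i for i, g in enumerate(groups) if g == label]
        pair_sum = sum(dist[i][j] ** 2
                       for k, i in enumerate(members)
                       for j in members[k + 1:])
        ss_within += pair_sum / len(members)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, permutations=999, seed=0):
    """Return (pseudo-F, permutation p-value) for the given grouping."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy distances (made up): samples 0-1 and 2-3 form two tight groups.
toy_dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.1],
    [0.9, 0.9, 0.1, 0.0],
]
f_stat, p_value = permanova(toy_dist, ["A", "A", "B", "B"])
print(f_stat, p_value)
```

With only four samples there are very few distinct label arrangements, so the p-value in this toy case cannot become small; real datasets with more samples per group give the permutation test more resolution.
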