Script description

Package requirement

PySEAT may conflict with some numpy versions. We recommend numpy 1.22.4 and pyseat 0.0.1.3.
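A quick way to see whether the current environment matches these versions is to query the installed package metadata. The sketch below uses only the standard library. The distribution names `numpy` and `pyseat` are taken from this README as written and are assumed to match the names used at install time; `check_versions` is a hypothetical helper, not part of this repository.

```python
from importlib import metadata

# Versions recommended in this README (distribution names are assumptions).
RECOMMENDED = {"numpy": "1.22.4", "pyseat": "0.0.1.3"}


def check_versions(recommended):
    """Return {package: (installed_or_None, recommended)} for every mismatch."""
    mismatches = {}
    for name, wanted in recommended.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (installed, wanted)
    return mismatches


# Report anything that differs from the recommendation.
for name, (installed, wanted) in check_versions(RECOMMENDED).items():
    print(f"{name}: installed={installed}, recommended={wanted}")
```

An empty report means both packages are installed at the recommended versions.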

To reproduce the results, first download data.zip and unzip it into the /data directory.
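If you prefer to unpack the archive from Python, a minimal sketch with the standard `zipfile` module could look like this. `unpack_data` is a hypothetical helper, and the internal layout of data.zip is not described here: if the archive already contains a top-level data folder, extract into the repository root instead.

```python
import zipfile
from pathlib import Path


def unpack_data(archive="data.zip", target="data"):
    """Extract every member of `archive` into `target`; return the member names."""
    target_dir = Path(target)
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
        return zf.namelist()


# Usage, run from the repository root:
#   unpack_data("data.zip", "data")
```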

If you want to start the analysis from the GCN, first run /script_procedure/step1_compute_distance to compute the distance matrix of the GCN. This may take some time. To save time, you can directly use sp_d.tsv in the /data directory, which was precomputed by step1_compute_distance.

The prior structure is generated by the script in /script_GCN_d3/GCN_tree. Run this script before the FMT, NSCLC, and Anti_analysis scripts, which depend on the prior structure.

input: following files (can be found at data/)

metadata.tsv (metadata from GutMeta; disease should be in the header for phenotype comparison)
abd.tsv (header: species, index: sample name)
read by abd_profile.input_profile()
GCN.tsv (header: KO, index: species)
read by GCN.input_GCN()
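To inspect these tables outside the repository's own readers, the layout described above (first row as header, first column as index) can be parsed with the standard library. This is only a sketch: `read_indexed_tsv` is a hypothetical helper, it assumes all data cells are numeric and that the header row has a cell above the index column, and it does not reproduce whatever `abd_profile.input_profile()` or `GCN.input_GCN()` do internally.

```python
import csv


def read_indexed_tsv(path):
    """Read a TSV whose first row is the header and first column is the row index.

    Returns (columns, table), where table maps row name -> {column: float}.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        columns = header[1:]  # the first header cell sits above the index column
        table = {}
        for row in reader:
            if not row:
                continue  # skip blank lines
            table[row[0]] = {col: float(val) for col, val in zip(columns, row[1:])}
    return columns, table


# Usage (paths relative to the repository root):
#   species, abundance = read_indexed_tsv("data/abd.tsv")  # rows: samples
#   kos, gcn = read_indexed_tsv("data/GCN.tsv")            # rows: species
```

With pandas installed, `pandas.read_csv(path, sep="\t", index_col=0)` gives an equivalent table.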

script of taxonomy abundance difference (script_abundance_check)

Check the abundance difference for each taxon (including NAFLD 16S OTUs).

script of completeness (script_completeness)

completeness
Compute the module completeness of each taxon (including NAFLD 16S OTUs).
test_diff
Test completeness enrichment based on the prior GCN structure. Use it after running the GCN_tree script in script_GCN_d3.
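As a rough illustration of what module completeness means, the sketch below scores a taxon by the fraction of a module's KOs that it carries. This is a simplified definition chosen for illustration (it ignores alternative KOs and optional steps within a module), and it may differ from what the completeness script computes. The KO identifiers, the toy GCN row, and `module_completeness` are all hypothetical.

```python
def module_completeness(taxon_kos, module_kos):
    """Fraction of a module's KOs carried by one taxon (simplified definition)."""
    module = set(module_kos)
    if not module:
        raise ValueError("module has no KOs")
    return len(module & set(taxon_kos)) / len(module)


# Toy GCN row for one species: KO -> copy number (made-up values).
gcn_row = {"K00001": 2, "K00002": 0, "K00003": 1}

# A KO counts as present when its copy number is positive.
present = [ko for ko, copies in gcn_row.items() if copies > 0]

# Hypothetical module made of four KOs, two of which the species carries.
score = module_completeness(present, ["K00001", "K00002", "K00003", "K00004"])
print(score)
```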

script of prior GCN structure (script_GCN_d3)

GCN_tree
Make prior GCN tree structure.
SE_diff / NFR_diff
Check the SE/nFR difference between the disease and healthy groups.
distribution_se
Plot the SE distribution for the disease and healthy groups.
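This README does not state which statistical test the SE_diff / NFR_diff scripts apply. As a dependency-free stand-in, the sketch below compares two groups of per-sample values with an exact two-sided permutation test on the difference of means. `permutation_test` and the toy values are hypothetical, and the exhaustive enumeration is only practical for small groups.

```python
from itertools import combinations
from statistics import mean


def permutation_test(group_a, group_b):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    hits = count = 0
    # Enumerate every way to relabel n_a of the pooled values as "group A".
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
        count += 1
    return hits / count


# Toy SE values for one cluster (made-up numbers).
disease = [1.0, 2.0, 3.0, 4.0]
healthy = [10.0, 11.0, 12.0, 13.0]
p_value = permutation_test(disease, healthy)
print(p_value)
```

When many clusters are tested, the resulting p-values would normally also need a multiple-testing correction.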

script of FMT (FMT)

GCN_tree result is required

Scripts for the analysis of two FMT datasets.

analysis_se
Multiple regression on SE value, days after FMT and fraction at each cluster/super-cluster.
analysis_nfr
Multiple regression on nFR, days after FMT and fraction at each cluster/super-cluster.
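For readers unfamiliar with the technique, a multiple regression of this kind can be sketched with an ordinary least-squares fit in numpy. The choice of SE as the response and of days after FMT and fraction as predictors is an assumption made for this illustration, as are the toy numbers and the helper `fit_multiple_regression`; the analysis scripts may set the model up differently.

```python
import numpy as np


def fit_multiple_regression(response, predictors):
    """Ordinary least squares; returns [intercept, coef_1, ..., coef_k]."""
    y = np.asarray(response, dtype=float)
    X = np.asarray(predictors, dtype=float)
    X = np.column_stack([np.ones(len(y)), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Toy data for one cluster (made-up numbers).
days = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
fraction = np.array([0.1, 0.4, 0.2, 0.9, 0.5])
se = 1.0 + 2.0 * days - 3.0 * fraction  # noise-free, so the fit is exact

coef = fit_multiple_regression(se, np.column_stack([days, fraction]))
print(coef)  # intercept, slope for days, slope for fraction
```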

script of Antibiotic treatment (script_Anti_analysis)

GCN_tree result is required

analysis_se/analysis_nfr
Check the SE/nFR difference between the control and exposed groups at each cluster/super-cluster.
analysis_se_exposed/analysis_nfr_exposed
Check the SE/nFR difference between the control group and the six participants in the exposed group who exhibited a bloom of the opportunistic pathogen Enterobacter cloacae complex at the E7 timepoint, at each cluster/super-cluster.
merge
Merge and plot the difference test results of nFR and SE for the control and exposed groups.
merge_exposed
Merge and plot the difference test results of nFR and SE for the six samples and the control group.
boxplot
Draw a boxplot of SE at each cluster/super-cluster.

script of lCFR procedure (script_procedure)

step0_NAFLD
An example of comparing keystone clusters of taxa on the NAFLD dataset.
step1_compute_distance
An example of computing taxa distance and KO distance from GCN.
step2_cluster_analysis
An example of analyzing keystone clusters and keystone taxa for metagenomic abundance profiles in cMD by constructing the posterior structure.
step3_count_support
An example of checking for valid keystone-taxon enterotypes that are supported by more than one network of size larger than 10.
utils

a. log_effect
An example of computing lCFR and showing distribution of lCFR values and CFR values without log and normalization.

b. nestedness_experiment
An example of testing the nestedness of lCFR against null experiments.

c. evaluation
An example of evaluating the features of the GCN.
draw_*
Scripts used to plot the results of the previous steps.

script of NSCLC (script_NSCLC)

SE
Test the difference in SE between the response and non-response groups at each cluster/super-cluster and compute the FR S score for each sample.
sig_SE
Test the difference in SE between the response and non-response groups at the SIG1/SIG2 clusters proposed in the original study and compute the S score for each sample.
distribution
Plot the SE distribution for the response and non-response groups.
combination
Compute the combined S score for each sample.
The R script used to produce the analysis in the original study is provided by https://github.com/valerioiebba/TOPOSCORE/tree/main.

plot keystone graph (script_keystone_graph)

GCN_tree result is required
abundance difference result is required

run.ipynb
Plot keystone result.

script of eigenspecies analysis (script_eigen_graph_preservation)

GCN_tree result is required

run.ipynb
Find eigenspecies and plot the results.

script to predict disease (script_predict)

GCN_tree result and SE values are required

CRC_recurrent_ROC.ipynb
Predict CRC.
IBD_ROC.ipynb
Predict IBD.
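The notebooks above report prediction performance as ROC curves. As a reminder of what the area under such a curve measures, here is a minimal, dependency-free sketch: the AUC equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as half. `roc_auc` and the toy labels and scores are hypothetical and are not taken from the notebooks.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # A correctly ordered pair counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = disease, 0 = control; scores are predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise loop is quadratic in the number of samples, which is fine for a sanity check but slow for large cohorts.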

deepomicslab/FR_Hierarchy_Gut
