Adding Taxonomic Ranks to Phyloseq? #118
kbdilliplaine asked this question in Q&A
Hello, I just found your package and am only now working my way through the available information. Is there a way to correct phyloseq ranks when importing qiime2 .qza files? I use the PR2 database, which has 9 taxonomic ranks starting at Domain, Supergroup, etc., so not only are the ranks incorrect, but I am missing the genus and species entirely. Thank you!
Answered by kbdilliplaine, Jul 19, 2023
Hi, I'm not sure what your specific aim is, but here are some suggestions and examples:
library(phyloseq)
library(microViz)
#> microViz version 0.10.10 - Copyright (C) 2023 David Barnett
#> ! Website: https://david-barnett.github.io/microViz
#> ✔ Useful? For citation details, run: `citation("microViz")`
#> ✖ Silence? `suppressPackageStartupMessages(library(microViz))`
# example phyloseq data
data("GlobalPatterns")
rank_names(GlobalPatterns)
#> [1] "Kingdom" "Phylum" "Class" "Order" "Family" "Genus" "Species"
colnames(GlobalPatterns@tax_table)
#> [1] "Kingdom" "Phylum" "Class" "Order" "Family" "Genus" "Species"
# example of setting new names
colnames(GlobalPatterns@tax_table) <- c("K.", "P.", "C.", "O.", "F.", "G.", "S.")
rank_names(GlobalPatterns)
#> [1] "K." "P." "C." "O." "F." "G." "S."
# example of modifying ranks with microViz tax_mutate
GP <- GlobalPatterns %>% tax_mutate(
  F_G = paste(F., G., sep = "_"), # combines 2 ranks
  F. = NULL, G. = NULL, S. = NULL # deletes these 3 ranks
)
rank_names(GP)
#> [1] "K." "P." "C." "O." "F_G"
GlobalPatterns@tax_table[30:40, 4:7]
#> Taxonomy Table: [11 taxa by 4 taxonomic ranks]:
#> O. F. G. S.
#> 185612 "Cenarchaeales" "Cenarchaeaceae" NA NA
#> 321018 "Cenarchaeales" "Cenarchaeaceae" NA NA
#> 549041 "Cenarchaeales" "Cenarchaeaceae" "Cenarchaeum" NA
#> 153762 "Cenarchaeales" "Cenarchaeaceae" "Cenarchaeum" NA
#> 155789 "Cenarchaeales" "Cenarchaeaceae" "Cenarchaeum" NA
#> 155495 "Cenarchaeales" "Cenarchaeaceae" "Cenarchaeum" "Cenarchaeumsymbiosum"
#> 1029 "Cenarchaeales" "Cenarchaeaceae" "Cenarchaeum" "Cenarchaeumsymbiosum"
#> 155526 "Cenarchaeales" "Cenarchaeaceae" "Nitrosopumilus" NA
#> 197473 "Cenarchaeales" "Cenarchaeaceae" "Nitrosopumilus" NA
#> 315545 "Cenarchaeales" "Cenarchaeaceae" "Nitrosopumilus" NA
#> 348549 "Cenarchaeales" "Cenarchaeaceae" "Nitrosopumilus" NA
GP@tax_table[30:40, 4:5]
#> Taxonomy Table: [11 taxa by 2 taxonomic ranks]:
#> O. F_G
#> 185612 "Cenarchaeales" "Cenarchaeaceae_NA"
#> 321018 "Cenarchaeales" "Cenarchaeaceae_NA"
#> 549041 "Cenarchaeales" "Cenarchaeaceae_Cenarchaeum"
#> 153762 "Cenarchaeales" "Cenarchaeaceae_Cenarchaeum"
#> 155789 "Cenarchaeales" "Cenarchaeaceae_Cenarchaeum"
#> 155495 "Cenarchaeales" "Cenarchaeaceae_Cenarchaeum"
#> 1029 "Cenarchaeales" "Cenarchaeaceae_Cenarchaeum"
#> 155526 "Cenarchaeales" "Cenarchaeaceae_Nitrosopumilus"
#> 197473 "Cenarchaeales" "Cenarchaeaceae_Nitrosopumilus"
#> 315545 "Cenarchaeales" "Cenarchaeaceae_Nitrosopumilus"
#> 348549 "Cenarchaeales" "Cenarchaeaceae_Nitrosopumilus"
Created on 2023-07-19 with reprex v2.0.2
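One detail visible in the output above: `paste()` converts a missing genus into the literal string "NA", which is why some taxa end up labelled "Cenarchaeaceae_NA". If that is not wanted, a small NA-aware combiner can be used instead. The following is only a sketch in base R on toy vectors; `combine_ranks` is a hypothetical helper name, not part of phyloseq or microViz. The same expression would presumably also work inside `tax_mutate()`, but that is untested here.

```r
# Toy family/genus vectors mirroring three rows of the output above
fam <- c("Cenarchaeaceae", "Cenarchaeaceae", "Cenarchaeaceae")
gen <- c(NA, "Cenarchaeum", "Nitrosopumilus")

# paste() coerces NA to the string "NA"
pasted <- paste(fam, gen, sep = "_")
pasted[1]
#> [1] "Cenarchaeaceae_NA"

# Hypothetical helper: when the lower rank is missing, keep only the
# upper rank instead of appending "_NA"
combine_ranks <- function(upper, lower, sep = "_") {
  ifelse(is.na(lower), upper, paste(upper, lower, sep = sep))
}

fg <- combine_ranks(fam, gen)
fg
#> [1] "Cenarchaeaceae"                "Cenarchaeaceae_Cenarchaeum"
#> [3] "Cenarchaeaceae_Nitrosopumilus"
```

Whether to keep the bare family name or set the combined rank to NA for unclassified genera depends on how the ranks are used downstream.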
Thank you! TLDR: `qiime2R` is the package responsible for handling the taxonomic ranks, and modifying the code there solves the problem.
I did some digging this morning to locate where the number and names of the taxonomic ranks come from during import from qiime2. `qiime2R::parse_taxonomy()` is responsible for handling this and is set by default to the 7 taxonomic levels we learned as kids.
I was able to edit the function using `trace("parse_taxonomy", edit=TRUE)` to change the names of the taxonomic ranks and add the two that were missing. See https://stackoverflow.com/questions/76717311/how-to-add-and-specify-additional-ranks-to-phyloseq-taxonomy for the full answer.
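For anyone who would rather not edit a package function interactively, the same idea can be sketched outside qiime2R: split the raw taxonomy strings into a character matrix with one column per rank, then hand that matrix to phyloseq. The code below is a minimal base R sketch, not qiime2R's implementation. `split_taxonomy` is a hypothetical helper, the rank names are assumed to match a 9-rank PR2 release (check them against your database version), and the lineages are made-up placeholders.

```r
# Assumed PR2 rank names; verify against the PR2 release you use
pr2_ranks <- c("Domain", "Supergroup", "Division", "Subdivision",
               "Class", "Order", "Family", "Genus", "Species")

# Hypothetical helper: split separator-delimited taxonomy strings into a
# character matrix (taxa in rows, ranks in columns).
# Lineages shorter than length(ranks) are padded with NA.
split_taxonomy <- function(taxon, ids, ranks = pr2_ranks, sep = ";") {
  parts <- strsplit(taxon, sep, fixed = TRUE)
  tax <- t(vapply(parts, function(p) {
    p <- trimws(p)
    p[!is.na(p) & p == ""] <- NA_character_
    length(p) <- length(ranks) # pads with NA, or drops extra fields
    p
  }, character(length(ranks))))
  dimnames(tax) <- list(ids, ranks)
  tax
}

# Placeholder lineages, not real PR2 entries
taxon <- c("d1;sg1;dv1;sd1;c1;o1;f1;g1;sp1",
           "d1; sg2; dv2")
tax <- split_taxonomy(taxon, ids = c("asv1", "asv2"))

colnames(tax)
tax["asv2", "Genus"] # NA, because the lineage stops at Division

# With phyloseq installed, the matrix can be converted directly:
# tt <- phyloseq::tax_table(tax)
```

Building the matrix explicitly keeps the rank names and their number under your control, at the cost of bypassing whatever extra cleaning the import function performs.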