git clone https://github.com/tmbuza/iMAP.git
# OR
curl -LOk https://github.com/tmbuza/iMAP/archive/master.zip
unzip master.zip
mv iMAP-master iMAP
rm -rf master.zip
# OR
wget --no-check-certificate https://github.com/tmbuza/iMAP/archive/master.zip
unzip master.zip
mv iMAP-master iMAP
rm -rf master.zip
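The three download routes above can be folded into a single fallback chain. The following is a minimal sketch, not part of iMAP: `pick_fetcher` and `fetch_plan` are hypothetical helper names, and the sketch only prints the commands it would run rather than executing them.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with iMAP).

# Report the first available tool, in the order used in the instructions above.
pick_fetcher() {
    for tool in git curl wget; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1   # none of the three tools was found
}

# Print (do not run) the command sequence for the given tool.
fetch_plan() {
    zip_url=https://github.com/tmbuza/iMAP/archive/master.zip
    unpack="unzip master.zip && mv iMAP-master iMAP && rm -f master.zip"
    case "$1" in
        git)  echo "git clone https://github.com/tmbuza/iMAP.git" ;;
        curl) echo "curl -LOk $zip_url && $unpack" ;;
        wget) echo "wget --no-check-certificate $zip_url && $unpack" ;;
        *)    echo "unknown tool: $1" >&2; return 1 ;;
    esac
}
```

For example, `fetch_plan "$(pick_fetcher)"` prints the plan for whichever tool is installed, which you can review before running it.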
This table provides information to help you place data in the correct folders. Use newer versions where available.
Using demo data
The following command copies the required data files from iMAP/resources/ and places them in their respective locations.
bash iMAP/code/00_allDemo_data.bash
Run the checkFiles command whenever you want to check for missing files. Add any missing files and check again until everything looks OK.
bash iMAP/code/00_checkFiles_driver.bash
- Rawdata: iMAP/data/raw/
- Metadata: iMAP/data/metadata/
- Mapping files: iMAP/data/metadata/
Re-run the checkFiles command every time you change the original data files. It is important to keep the format used by the demo data.
bash iMAP/code/00_checkFiles_driver.bash
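Before running checkFiles, a quick look at the data folders listed above can catch an empty or missing directory early. This is a minimal sketch under stated assumptions: `check_data_dirs` is a hypothetical helper, and it does not reproduce the logic of 00_checkFiles_driver.bash. It only tests that each folder exists and is not empty.

```shell
#!/bin/sh
# Hypothetical pre-check (an assumption, not the logic of
# 00_checkFiles_driver.bash): confirm that the data folders exist under
# the given iMAP root and contain at least one entry.
check_data_dirs() {
    root=$1
    status=0
    for sub in data/raw data/metadata; do
        dir="$root/$sub"
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            status=1
        elif [ -z "$(ls -A "$dir")" ]; then
            echo "empty: $dir"
            status=1
        else
            echo "ok: $dir"
        fi
    done
    return "$status"
}
```

Calling `check_data_dirs iMAP` from the directory that contains the iMAP folder prints one line per folder and returns a non-zero status if any folder is missing or empty.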
Users who want to change the default settings may do so using any text editor. Use this table to locate files with default parameters that may be altered.
Link: https://docs.docker.com/get-docker/.
- A Docker ID grants you access to Docker Hub repositories. All you need is an email address.
- Register for a Docker ID at https://docs.docker.com/docker-id/.
- If the commands below work, then you are all set.
docker login
docker info
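If you prefer a single yes/no answer to whether Docker is usable, the check above can be wrapped in a small function. This is a sketch only: `have_cmd` and `docker_ready` are hypothetical names, and it relies on `docker info` returning a non-zero exit status when the daemon cannot be reached. It does not test your Docker Hub login.

```shell
#!/bin/sh
# Hypothetical helpers (not part of iMAP).

# True if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether the docker client is installed and the daemon responds.
docker_ready() {
    if ! have_cmd docker; then
        echo "docker is not installed or not on PATH"
        return 1
    fi
    if ! docker info >/dev/null 2>&1; then
        echo "docker is installed but the daemon is not reachable"
        return 1
    fi
    echo "docker is ready"
}
```

Run `docker_ready` before pulling any image; if it reports a problem, revisit the installation link above.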
docker pull tmbuza/rpackages:v3.5.2
containerName=metadataprofile
docker run --rm --name=$containerName -it -v $(pwd)/iMAP:/imap --workdir=/imap tmbuza/rpackages:v3.5.2 /bin/bash
bash code/01_metadataProfiling_driver.bash
exit
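The same `docker run` pattern recurs for every stage of the pipeline, with only the container name and image changing. The sketch below prints that command for a given stage so the flags can be inspected before use. `imap_run_cmd` is a hypothetical helper, not something iMAP ships, and it prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical wrapper (not shipped with iMAP): print the docker run
# command for a given container name and image.
#   --rm       remove the container when the shell exits
#   -it        attach an interactive terminal
#   -v         mount the host iMAP folder at /imap inside the container
#   --workdir  start the shell in /imap so relative paths like code/... resolve
imap_run_cmd() {
    name=$1
    image=$2
    host_dir=${3:-$(pwd)/iMAP}   # optional third argument: host iMAP folder
    printf 'docker run --rm --name=%s -it -v %s:/imap --workdir=/imap %s /bin/bash\n' \
        "$name" "$host_dir" "$image"
}
```

For example, `imap_run_cmd metadataprofile tmbuza/rpackages:v3.5.2` prints the command for the metadata profiling stage, using the iMAP folder in the current directory.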
Progress report 1: Metadata profiling
Skip for now!
This chunk will hold an R script that generates Progress report 1: Metadata profiling
bash code/01_metadataProfiling_driver.bash (Being updated to run outside Docker container)
docker pull tmbuza/readqctools:v1.0.0
containerName=readpreprocess
docker run --rm --name=$containerName -it -v $(pwd)/iMAP:/imap --workdir=/imap tmbuza/readqctools:v1.0.0 /bin/bash
bash code/02_readPreprocess_driver.bash
exit
Make sure you have left the container by running the exit command above; this returns you to your normal command line. The HTML QC summary report (multiqc_report.html) is stored under the iMAP/results/multiqc/ folder. You can open the report in your preferred browser, or from the command line (on macOS), for example:
open iMAP/results/multiqc/qced/R1/multiqc_report.html
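The `open` command is specific to macOS. On most Linux desktops, `xdg-open` plays the same role. The sketch below picks one based on the output of `uname -s`. `opener_for` is a hypothetical helper, and the sketch prints the command instead of launching a browser.

```shell
#!/bin/sh
# Hypothetical helper: choose a command that opens a file with the default
# application, based on the kernel name reported by `uname -s`.
opener_for() {
    case "$1" in
        Darwin) echo "open" ;;
        Linux)  echo "xdg-open" ;;
        *)      return 1 ;;   # unknown platform: open the file manually
    esac
}

report=iMAP/results/multiqc/qced/R1/multiqc_report.html
if opener=$(opener_for "$(uname -s)"); then
    echo "$opener $report"    # printed, not executed, in this sketch
else
    echo "Open $report in a browser manually."
fi
```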
Progress report 2: Read Preprocessing
Skip for now!
This chunk will hold an R script that generates Progress report 2: Read Preprocessing
bash code/02_readPreprocess_driver.bash (Being updated to run outside Docker container)
- Requires a Mothur-formatted classifier.
- The default classifier is a recreated seed from the SILVA database.
- You can use different classifiers from other Mothur taxonomy outlines.
docker pull tmbuza/mothur:v1.43.0
containerName=mothurseqprocessing
docker run --rm --name=$containerName -it -v $(pwd)/iMAP:/imap --workdir=/imap tmbuza/mothur:v1.43.0 /bin/bash
The sequence processing and classification command performs the following steps:
- Assemble the forward and reverse reads, screen by length, and create representative sequences.
- Align representative sequences with reference alignments.
- Denoise to remove poor alignments.
- Remove chimeric sequences.
- Classify the sequences and do post-classification QC.
- Estimate the sequencing error rate.
bash ./code/03_imapClassifySEQ_driver.bash
You may see many warnings; it is safe to ignore them. The program is also set to remove all temporary files after processing the sequences. If no temporary files are found, you will see an error message that reads: rm: cannot remove '.temp': No such file or directory. You can ignore that message as well.
Method 1: Phylotype-based method (works for large and small datasets).
bash ./code/04_1_phylotype_driver.bash
Method 2: OTU-cluster method (works best for small datasets).
bash ./code/04_2_otucluster_driver.bash
Method 3: Phylogeny-based method (works best for small datasets).
bash ./code/04_3_phylogeny_driver.bash
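If you switch between the three methods often, a small lookup keeps the driver names in one place. This is a sketch only: `driver_for_method` and the short method names it accepts are hypothetical, chosen here for illustration. The script paths are the ones listed above.

```shell
#!/bin/sh
# Hypothetical dispatcher: map a short method name to the driver script
# listed above. The method names are illustrative, not defined by iMAP.
driver_for_method() {
    case "$1" in
        phylotype)  echo "./code/04_1_phylotype_driver.bash" ;;
        otucluster) echo "./code/04_2_otucluster_driver.bash" ;;
        phylogeny)  echo "./code/04_3_phylogeny_driver.bash" ;;
        *) echo "unknown method: $1" >&2; return 1 ;;
    esac
}
```

Inside the container, `bash "$(driver_for_method phylotype)"` would then run the phylotype-based driver.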
End of Mothur-based bioinformatics pipeline!
- Install the iMAP repo first; this automatically creates a directory named iMAP.
- Requires a QIIME2 trained classifier.
- You can use Naive Bayes (nb) classifiers trained on the Greengenes or SILVA database with 99% OTUs.
- Default: Greengenes 515-806 conservative fragments.
- You can train your own classifier using the q2-feature-classifier.
If you use another pretrained QIIME2-formatted classifier, you must replace the default file with the file containing your preferred classifier. It is safe to do so from outside the container. Path: iMAP/code/qiime2/gg-13-8-99-515-806-nb-classifier.qza.
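One cautious way to make that replacement is to keep a backup of the default classifier before copying your own over it. The sketch below assumes the pipeline reads the classifier from the fixed default path given above, so the custom file is copied under the default filename. `swap_classifier` is a hypothetical helper, not part of iMAP.

```shell
#!/bin/sh
# Hypothetical helper, assuming the pipeline reads the classifier from the
# fixed default path: back up the default once, then copy a custom
# classifier over it. Run on the host, outside the container.
swap_classifier() {
    custom=$1
    default=${2:-iMAP/code/qiime2/gg-13-8-99-515-806-nb-classifier.qza}
    if [ ! -f "$custom" ]; then
        echo "not found: $custom" >&2
        return 1
    fi
    # Back up the original only once so repeated swaps never overwrite it.
    if [ -f "$default" ] && [ ! -f "$default.bak" ]; then
        cp "$default" "$default.bak"
    fi
    cp "$custom" "$default"
}
```

For example, `swap_classifier my-classifier.qza` (a placeholder filename) installs your classifier and leaves the original next to it with a `.bak` suffix.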
We use a QIIME2 version 2020.6 Docker image. Credit goes to the QIIME2 team for developing the qiime2/core images. We renamed the image so that we could commit changes to it. Optionally, you can pull a different tag of the qiime2/core image directly from the QIIME2 Docker repository. Please note that a different tag may require a version-compatible trained OTU classifier.
docker pull tmbuza/qiime2core:v2020.6
containerName=qiime2classification
docker run --rm --name=$containerName -it -v $(pwd)/iMAP:/imap --workdir=/imap tmbuza/qiime2core:v2020.6 /bin/bash
bash code/qiime2/qiime2.bash
# While inside the container
cat LOG/qiime2logfile.txt
# While outside the container
cat iMAP/LOG/qiime2logfile.txt
# or locate the text file named qiime2logfile.txt in the iMAP/LOG folder and open it manually.
exit
- Output path: iMAP/data/qiime2/results/
- Use the client-side interface at https://view.qiime2.org/ to view the results (see image below).
- Simply drag and drop the QIIME 2 artifacts (.qza files) or the visualizations (.qzv files).
- For more help visit https://view.qiime2.org/about.
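To see which files are available for viewing, you can list the artifacts and visualizations in the output folder. A minimal sketch follows: `list_qiime2_outputs` is a hypothetical helper, and the default path is the output path given above.

```shell
#!/bin/sh
# Hypothetical helper: list QIIME 2 artifacts (.qza) and visualizations
# (.qzv) under a results folder, sorted by path.
list_qiime2_outputs() {
    dir=${1:-iMAP/data/qiime2/results}
    find "$dir" -type f \( -name '*.qza' -o -name '*.qzv' \) | sort
}
```

Run `list_qiime2_outputs` on the host after leaving the container, then drag any listed file into the viewer.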
End of QIIME2-based bioinformatics pipeline!
URLs | Description | Status |
---|---|---|
Manuscript | In BMC Bioinformatics | Software |
README | Guidelines | iMAP README |
Teresia M. Buza, Triza Tonui, Francesca Stomeo, Christian Tiambo, Robab Katani, Megan Schilling, Beatus Lyimo, Paul Gwakisa, Isabella M. Cattadori, Joram Buza and Vivek Kapur. iMAP: an integrated bioinformatics and visualization pipeline for microbiome data analysis. BMC Bioinformatics (2019) 20:374. Free Full Text.