Microbial Profiling Using iMAP

Download iMAP repository

git clone https://github.com/tmbuza/iMAP.git

# OR

curl -LOk https://github.com/tmbuza/iMAP/archive/master.zip
unzip master.zip
mv iMAP-master iMAP
rm -rf master.zip

# OR

wget --no-check-certificate https://github.com/tmbuza/iMAP/archive/master.zip 
unzip master.zip
mv iMAP-master iMAP
rm -rf master.zip
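
The three download routes above can be wrapped in one helper that uses whichever tool is installed. This is a minimal sketch, not part of iMAP: the function names pick_downloader and fetch_imap are hypothetical, and only the URLs come from the commands above.

```shell
# Hypothetical helper (not shipped with iMAP): report the first download
# tool found on PATH, in the same order as the options above.
pick_downloader() {
  local tool
  for tool in git curl wget; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      return 0
    fi
  done
  echo "none"
  return 1
}

# Download the repository into ./iMAP using the selected tool.
fetch_imap() {
  local zip_url="https://github.com/tmbuza/iMAP/archive/master.zip"
  case "$(pick_downloader)" in
    git)
      git clone https://github.com/tmbuza/iMAP.git ;;
    curl)
      curl -LOk "$zip_url" && unzip master.zip && mv iMAP-master iMAP && rm -f master.zip ;;
    wget)
      wget --no-check-certificate "$zip_url" && unzip master.zip && mv iMAP-master iMAP && rm -f master.zip ;;
    *)
      echo "Install git, curl, or wget first." >&2
      return 1 ;;
  esac
}

# Usage (performs the actual download):
# fetch_imap
```

Defining the functions downloads nothing; the download only starts when fetch_imap is called.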

Add data to designated folders

This table provides useful information to help you place data in the correct folders. Use newer versions if available.

Using demo data

The following command copies the required data files from iMAP/resources/ into their respective locations.

bash iMAP/code/00_allDemo_data.bash

Check missing folders or files

Run the checkFiles command every time you want to check for missing files. Add any missing files and check again until nothing is reported missing.

bash iMAP/code/00_checkFiles_driver.bash

What to replace

Rawdata: iMAP/data/raw/
Metadata: iMAP/data/metadata/
Mapping files: iMAP/data/metadata/

Re-run the checkFiles command every time you change the original data files. It is important to maintain the format used by the demo data.

bash iMAP/code/00_checkFiles_driver.bash
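
Before running the checkFiles driver, a quick look at the input folders listed above can catch an empty or missing directory early. The sketch below is an assumption-laden stand-in, not the iMAP checkFiles script: the function name check_input_dirs is hypothetical and it only inspects the two folders named in this section.

```shell
# Hypothetical pre-check (not the iMAP checkFiles script): report whether
# each input folder named above is present and non-empty under an iMAP root.
check_input_dirs() {
  local root="${1:-iMAP}"
  local status=0
  local dir
  for dir in data/raw data/metadata; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing: $root/$dir"
      status=1
    elif [ -z "$(ls -A "$root/$dir")" ]; then
      echo "empty: $root/$dir"
      status=1
    else
      echo "ok: $root/$dir"
    fi
  done
  return "$status"
}

# Usage, from the directory that contains iMAP/:
# check_input_dirs iMAP
```

The function returns a non-zero status when any folder is missing or empty, so it can gate a later step with `check_input_dirs iMAP && bash iMAP/code/00_checkFiles_driver.bash`.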

Changing default settings

Users who want to change the default settings may do so using any text editor. Use this table to locate files with default parameters that may be altered.

Using Docker Images

Install Docker Desktop

Link: https://docs.docker.com/get-docker/.

Set up Docker Account

Docker ID grants you access to Docker Hub repositories. All you need is an email address.
Register for a Docker ID at https://docs.docker.com/docker-id/.

Confirm the installation

If the commands below work, then you are all set.

docker login
docker info
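
When the commands above fail, it helps to know whether the Docker CLI is missing or the daemon is simply not running. The following is a small sketch of that check; the function name docker_status and its three status words are hypothetical.

```shell
# Report whether Docker is usable:
#   "missing"     - no docker CLI on PATH
#   "daemon-down" - CLI found, but `docker info` fails (daemon not reachable)
#   "ready"       - CLI found and the daemon answers
docker_status() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "missing"
  elif ! docker info >/dev/null 2>&1; then
    echo "daemon-down"
  else
    echo "ready"
  fi
}

# Usage:
# docker_status
```

A result of daemon-down usually means Docker Desktop is installed but has not been started.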

Metadata profiling

Download rpackage image

docker pull tmbuza/rpackages:v3.5.2

Create a container for bash CLI

containerName=metadataprofile
docker run --rm --name=$containerName -it -v $(pwd)/iMAP:/imap --workdir=/imap  tmbuza/rpackages:v3.5.2 /bin/bash

Start profiling metadata

bash code/01_metadataProfiling_driver.bash

Exit the container

exit
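
The create, run, and exit steps above can also be collapsed into a single non-interactive command. The wrapper below is a sketch under assumptions, not part of iMAP: the name run_in_container and the DRY_RUN switch are hypothetical, and it assumes the image accepts `bash <script>` as its command in the same way it accepts /bin/bash above.

```shell
# Hypothetical wrapper (not part of iMAP): run one driver script in a
# throwaway container instead of opening an interactive shell.
# Set DRY_RUN=1 to print the docker command without executing it.
run_in_container() {
  local image="$1"
  local script="$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "docker run --rm -v $(pwd)/iMAP:/imap --workdir=/imap $image bash $script"
  else
    docker run --rm -v "$(pwd)/iMAP:/imap" --workdir=/imap "$image" bash "$script"
  fi
}

# Usage, from the directory that contains iMAP/:
# run_in_container tmbuza/rpackages:v3.5.2 code/01_metadataProfiling_driver.bash
```

Because the container is started with --rm and no interactive shell, it is removed as soon as the script finishes and no exit step is needed.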

Progress report 1: Metadata profiling

Skip for now!
This chunk will hold an R script that generates Progress report 1: Metadata profiling
bash code/01_metadataProfiling_driver.bash (Being updated to run outside Docker container)

Read Quality Control

Download read QC image

docker pull tmbuza/readqctools:v1.0.0

Create a container for bash CLI

containerName=readpreprocess
docker run --rm --name=$containerName -it -v $(pwd)/iMAP:/imap --workdir=/imap tmbuza/readqctools:v1.0.0 /bin/bash

Start read preprocessing

bash code/02_readPreprocess_driver.bash

Exit the container

exit

View MultiQC report

Make sure you have exited the container by running the exit command above, which returns you to your normal CLI. The HTML QC summary report (multiqc_report.html) is stored under the iMAP/results/multiqc/ folder. You can open the report in your favorite browser, or from the CLI (the open command below is for macOS; on Linux, xdg-open is the usual equivalent):

open iMAP/results/multiqc/qced/R1/multiqc_report.html
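
Since the open command exists only on macOS, a small helper can pick the right opener per platform. This is a sketch with hypothetical function names (pick_opener, open_report); the default report path is the one shown above.

```shell
# Choose a file-opening command from the kernel name reported by `uname -s`.
# `open` is the macOS command; `xdg-open` is the usual Linux desktop equivalent.
pick_opener() {
  case "${1:-$(uname -s)}" in
    Darwin) echo "open" ;;
    Linux)  echo "xdg-open" ;;
    *)      echo "" ;;
  esac
}

# Open the MultiQC report if possible, otherwise print where to find it.
open_report() {
  local report="${1:-iMAP/results/multiqc/qced/R1/multiqc_report.html}"
  local opener
  opener=$(pick_opener)
  if [ -n "$opener" ] && [ -f "$report" ]; then
    "$opener" "$report"
  else
    echo "Open this file in a browser: $report"
  fi
}

# Usage, from the directory that contains iMAP/:
# open_report
```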

Progress report 2: Read Preprocessing

Skip for now!
This chunk will hold an R script that generates Progress report 2: Read Preprocessing
bash code/02_readPreprocess_driver.bash (Being updated to run outside Docker container)

Microbial Profiling

A: MOTHUR-BASED PIPELINE

Requires a Mothur-formatted classifier.
The default classifier is a recreated seed from the SILVA database.
You can use different classifiers from other Mothur taxonomy outlines.

Download Mothur images

docker pull tmbuza/mothur:v1.43.0

Create a container for bash CLI

containerName=mothurseqprocessing
docker run --rm --name=$containerName -it -v $(pwd)/iMAP:/imap --workdir=/imap tmbuza/mothur:v1.43.0 /bin/bash

Start sequence processing and classification

The sequence processing and classification command performs the following steps:

Assemble the forward and reverse reads, screen them by length, and create representative sequences.
Align the representative sequences with reference alignments.
Denoise to remove poor alignments.
Remove chimeric sequences.
Classify the sequences and run post-classification QC.
Estimate the sequencing error rate.

bash ./code/03_imapClassifySEQ_driver.bash

You may see many WARNINGS; it is safe to ignore them. The program is also set to remove all temporary files after processing the sequences. If no temporary files are found, you will see an error message that reads: rm: cannot remove '*.temp': No such file or directory. You can ignore that message as well.

Pick a method for OTU clustering and taxonomy assignment

Method 1: Phylotype-based method (works for large and small datasets).

bash ./code/04_1_phylotype_driver.bash

Method 2: OTU-cluster method (works best for small datasets).

bash ./code/04_2_otucluster_driver.bash

Method 3: Phylogeny-based method (works best for small datasets).

bash ./code/04_3_phylogeny_driver.bash
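
For scripted runs, the choice between the three methods can be reduced to a single argument. The sketch below maps a method name to the driver scripts listed above; the function name method_driver and the short method names are hypothetical.

```shell
# Map a short method name to the matching driver script listed above.
method_driver() {
  case "$1" in
    phylotype)  echo "./code/04_1_phylotype_driver.bash" ;;
    otucluster) echo "./code/04_2_otucluster_driver.bash" ;;
    phylogeny)  echo "./code/04_3_phylogeny_driver.bash" ;;
    *)
      echo "unknown method: $1 (use phylotype, otucluster, or phylogeny)" >&2
      return 1 ;;
  esac
}

# Usage inside the Mothur container, from /imap:
# bash "$(method_driver phylotype)"
```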

End of Mothur-based bioinformatics pipeline!

B: QIIME2-BASED PIPELINE

You must first download the iMAP repository, which automatically creates a directory named iMAP.
Requires a QIIME2 trained classifier.
You can use Naive Bayes (nb) classifiers trained on the Greengenes or SILVA database with 99% OTUs.
Default: Greengenes 515-806 conservative fragments.
You can train your own classifier using the q2-feature-classifier.

If you use another pretrained QIIME2-formatted classifier, you must replace the default file with the file containing your preferred classifier. It is safe to do so while outside the container. Path: iMAP/code/qiime2/gg-13-8-99-515-806-nb-classifier.qza.
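
One way to make that replacement reversible is to back up the default file before copying the new classifier over it. This sketch assumes the pipeline reads the classifier from the default path given above, so the new file is stored under the default file name; the function name swap_classifier is hypothetical.

```shell
# Hypothetical helper (not part of iMAP): install a different classifier
# under the default file name, keeping a one-time backup of the original.
# Assumes the pipeline reads the classifier from the default path.
swap_classifier() {
  local new_classifier="$1"
  local root="${2:-iMAP}"
  local default="$root/code/qiime2/gg-13-8-99-515-806-nb-classifier.qza"
  if [ ! -f "$new_classifier" ]; then
    echo "not found: $new_classifier" >&2
    return 1
  fi
  # Back up only once, so repeated swaps never overwrite the original copy.
  if [ -f "$default" ] && [ ! -f "$default.bak" ]; then
    cp "$default" "$default.bak"
  fi
  cp "$new_classifier" "$default"
}

# Usage, outside the container (the classifier file name is a placeholder):
# swap_classifier /path/to/my-classifier.qza iMAP
```

To restore the default, copy the .bak file back over the default path.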

Download QIIME2 images

We use a QIIME2 version 2020.6 Docker image. Credit goes to the QIIME2 team for developing the qiime2/core images. We renamed the image so that we could commit changes to it. Optionally, you can pull a different tag of the qiime2/core image directly from the QIIME2 Docker repository. Note that a different tag may require a version-compatible trained OTU classifier.

docker pull tmbuza/qiime2core:v2020.6

Create QIIME2 container

containerName=qiime2classification
docker run --rm --name=$containerName -it -v $(pwd)/iMAP:/imap --workdir=/imap tmbuza/qiime2core:v2020.6 /bin/bash

Start the analysis

bash code/qiime2/qiime2.bash

View Progress

# While inside the container
cat LOG/qiime2logfile.txt

# While outside the container
cat iMAP/LOG/qiime2logfile.txt

# or locate the text file named *qiime2logfile.txt* in the iMAP/LOG folder and open it manually.
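
For long runs, printing only the most recent log lines is often more practical than printing the whole file. The helper below is a sketch that works from either location described above; the function name show_qiime2_log is hypothetical.

```shell
# Print the last N lines of the QIIME2 log (default 20). The first path
# applies inside the container (/imap), the second outside it.
show_qiime2_log() {
  local lines="${1:-20}"
  local log
  for log in LOG/qiime2logfile.txt iMAP/LOG/qiime2logfile.txt; do
    if [ -f "$log" ]; then
      tail -n "$lines" "$log"
      return 0
    fi
  done
  echo "qiime2logfile.txt not found yet" >&2
  return 1
}

# Usage:
# show_qiime2_log 50
# To follow the log live from a second terminal outside the container:
# tail -f iMAP/LOG/qiime2logfile.txt
```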

Exit the QIIME2 container

exit

View QIIME 2 results

Output path: iMAP/data/qiime2/results/
Use the client-side interface at https://view.qiime2.org/ to view the results.
Simply drag and drop the QIIME 2 artifacts (.qza files) or the visualizations (.qzv files).
For more help visit https://view.qiime2.org/about.
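
To see which files are available for drag and drop, the output folder can be listed from the CLI. This is a minimal sketch; the function name list_qiime2_outputs is hypothetical and the default folder is the output path given above.

```shell
# List QIIME 2 artifacts (.qza) and visualizations (.qzv) under the
# results folder, sorted by path.
list_qiime2_outputs() {
  local dir="${1:-iMAP/data/qiime2/results}"
  find "$dir" -type f \( -name '*.qza' -o -name '*.qzv' \) | sort
}

# Usage, from the directory that contains iMAP/:
# list_qiime2_outputs
```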

End of QIIME2-based bioinformatics pipeline!

Post-Classification Analysis

In Process

Citation

Teresia M. Buza, Triza Tonui, Francesca Stomeo, Christian Tiambo, Robab Katani, Megan Schilling, Beatus Lyimo, Paul Gwakisa, Isabella M. Cattadori, Joram Buza and Vivek Kapur. iMAP: an integrated bioinformatics and visualization pipeline for microbiome data analysis. BMC Bioinformatics (2019) 20:374. Free Full Text.

URLs	Description	Status
Manuscript	In BMC Bioinformatics	Software
README	Guidelines	iMAP README
