This hands-on is based on materials from the Analyzing Next-Generation Sequencing (ANGUS) course.

Run Docker

  • run a simple Ubuntu-based container
# host
docker run ubuntu:14.04
  • list docker containers
# host
docker ps
docker ps -a
docker images
  • the container has been created, but had nothing to do, so it shut down

  • we can run the container interactively and get a shell in it (like ssh to a remote machine)

    -i keep STDIN open

    -t allocate pseudo-tty

# host
docker run -it ubuntu:14.04
  • use a second terminal window to list containers
# host
docker ps -a
  • exit with exit
  • if you run the same command again, a new Ubuntu base container will be created
  • make a new container, create a file and exit. Restart the container (docker start [container ID], docker attach [container ID]). Are your changes still there?
  • you have to delete containers by hand (docker rm [container ID]); they stack up very quickly
  • you can docker run with the --rm flag to delete the container once it exits
# host
docker run --rm ubuntu:14.04
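
Containers started without --rm stay around after they exit. A minimal sketch of cleaning them up in one go is shown below; the function name cleanup_exited is hypothetical, and it only uses docker ps (with the -a, -q and -f flags) and docker rm.

```shell
#!/bin/bash
# host
# Hypothetical helper: remove every container whose status is "exited".
cleanup_exited() {
    local ids
    # -a: include stopped containers, -q: print IDs only, -f: filter by status
    ids=$(docker ps -aq -f status=exited)
    if [ -n "$ids" ]; then
        # $ids is left unquoted on purpose so each ID becomes its own argument
        docker rm $ids
    else
        echo "no exited containers to remove"
    fi
}
```

After defining the function, run cleanup_exited on the host and check the result with docker ps -a.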

Building Docker images

# host
docker run -it ubuntu:14.04
  • install necessary dependencies (remember, you're already root)
# in the container
apt-get update && apt-get install -y g++ make git zlib1g-dev python
  • check out and build megahit
# in the container
git clone https://github.com/voutcn/megahit.git /home/megahit
cd /home/megahit && make
  • we don't want to do this again, so we save the container as an image (replace e82c6007f7a4 with your own container ID from docker ps -a)
# host
docker commit -m "build megahit" e82c6007f7a4 megahit
docker images
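
The container ID passed to docker commit differs on every machine. As a sketch, assuming the container you just built in is the most recently created one, the ID can be looked up with docker ps -lq instead of being copied by hand. The helper name commit_latest is hypothetical.

```shell
#!/bin/bash
# host
# Hypothetical helper: commit the most recently created container as an image.
# Usage: commit_latest IMAGE_NAME "commit message"
commit_latest() {
    local image=$1 message=$2 id
    # -l: only the latest created container (any state), -q: print its ID only
    id=$(docker ps -lq)
    if [ -z "$id" ]; then
        echo "no container found" >&2
        return 1
    fi
    docker commit -m "$message" "$id" "$image"
}
```

For example, commit_latest megahit "build megahit". If you started another container in the meantime (for instance in the second terminal), the latest container is not the one you want, so pass the ID explicitly to docker commit instead.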
  • we can now run it and use megahit
# host
docker run -it megahit

# in the container
/home/megahit/megahit
  • later we'll put it on Docker Hub so that no one ever has to do it again
  • how do we get the data for analysis to the container?

Getting data to the container

  • get data locally
# host
mkdir $HOME/data
cd $HOME/data
curl -O http://public.ged.msu.edu.s3.amazonaws.com/ecoli_ref-5m-trim.se.fq.gz
curl -O http://public.ged.msu.edu.s3.amazonaws.com/ecoli_ref-5m-trim.pe.fq.gz
  • run container and connect to local data directory
# host
docker run -v $HOME/data:/data -it megahit

# in the container
ls /data
  • let's run the assembly
# in the container
/home/megahit/megahit --12 /data/*.pe.fq.gz \
                      -r /data/*.se.fq.gz \
                      -o /data/ecoli -t 4
  • exit and look at analysis data
# in the container
exit

# host
ls $HOME/data
ls $HOME/data/ecoli
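
Because /data in the container is the host's $HOME/data, the assembly results can be inspected on the host. The sketch below counts the assembled sequences. It assumes megahit wrote its contigs to final.contigs.fa in the output directory, and count_contigs is a hypothetical helper name.

```shell
#!/bin/bash
# host
# Hypothetical helper: count the sequences in a FASTA file.
# Every FASTA record starts with a header line beginning with ">".
count_contigs() {
    grep -c '^>' "$1"
}
```

For example, count_contigs "$HOME/data/ecoli/final.contigs.fa". The container ran as root, so the output files on the host may be owned by root.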
  • we can run megahit command without entering the container like this (first do rm -rf [local ecoli dir])
# host
docker run -v $HOME/data:/data \
   -it megahit \
   sh -c '/home/megahit/megahit --12 /data/*.pe.fq.gz \
                     -r /data/*.se.fq.gz \
                     -o /data/ecoli -t 4'
  • we could also put the commands in a script, do-assemble.sh (created on the host or in the container), and run that instead; save it in $HOME/data so the container sees it as /data/do-assemble.sh
#! /bin/bash
rm -fr /data/ecoli
/home/megahit/megahit --12 /data/*.pe.fq.gz \
                      -r /data/*.se.fq.gz  \
                      -o /data/ecoli -t 4
# host
chmod +x do-assemble.sh
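
Once the script is executable, it can be passed to docker run as the command. This is a sketch under the assumption that the script lives at $HOME/data/do-assemble.sh, so that it appears as /data/do-assemble.sh in the container. The helper name run_assembly_script is hypothetical.

```shell
#!/bin/bash
# host
# Hypothetical helper: run do-assemble.sh inside a throwaway megahit container.
# Assumes the script was saved in the data directory and made executable.
run_assembly_script() {
    local data_dir=${1:-$HOME/data}
    if [ ! -x "$data_dir/do-assemble.sh" ]; then
        echo "missing or non-executable: $data_dir/do-assemble.sh" >&2
        return 1
    fi
    # --rm deletes the container on exit; -v exposes the host directory as /data
    docker run --rm -v "$data_dir:/data" megahit /data/do-assemble.sh
}
```

Calling run_assembly_script with no argument uses $HOME/data; a different data directory can be given as the first argument.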

Building with a Dockerfile

  • create a Dockerfile
# Dockerfile
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y g++ make git zlib1g-dev python
RUN git clone https://github.com/voutcn/megahit.git /home/megahit
RUN cd /home/megahit && make
CMD /data/do-assemble.sh
  • we will now build an image based on the Dockerfile
# host
docker build -t megahit .
  • and run a container
# host
docker run -v $HOME/data/:/data -it megahit
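
The CMD line in the Dockerfile is only the default command; anything placed after the image name in docker run replaces it. A small sketch follows, with run_in_megahit as a hypothetical helper name.

```shell
#!/bin/bash
# host
# Hypothetical helper: run an arbitrary command in the megahit image
# instead of the default CMD, with $HOME/data mounted at /data.
run_in_megahit() {
    # "$@" forwards all arguments unchanged as the container command
    docker run --rm -v "$HOME/data:/data" megahit "$@"
}
```

For example, run_in_megahit ls /data lists the mounted data directory without starting the assembly.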