Docker Learning Guideline by Loai
Docker is an open-source platform designed to automate the deployment, scaling, and management of applications. Docker's official website states that "it helps developers build, share, run, and verify applications anywhere - without tedious environment configuration or management".
There are several concepts that are crucial to grasp when it comes to learning Docker:
- Dockerfile: A text file that contains a series of instructions, dependencies, and commands to run on the command line to build and assemble an image (a minimal example follows this list).
- Image: A derivative of the assembled Dockerfile, read-only templates, and executable files used to create containers.
- Container: A standard, standalone, executable unit of software that packages up code and its dependencies and contains everything needed to run an application.
- Docker Engine: The main technology that enables containerization. It consists of Docker daemon, Docker client, and other components.
- Docker Hub: A cloud-based repository/registry that enables users to store, share, and distribute Docker images - similar to GitHub.
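To make the Dockerfile concept concrete, here is a minimal sketch for a Node.js application; the base image, exposed port, and entry file are assumptions and will differ per project (later sections of this guide build a richer version):

# A minimal sketch of a Dockerfile for a Node.js app (base image, port, and entry file are assumptions)
FROM node:20
# set the working directory inside the image
WORKDIR /app
# copy the dependency manifest first so Docker can cache the install layer
COPY package.json .
RUN npm install
# copy the rest of the source code
COPY . .
# document the port the app listens on
EXPOSE 4000
# default command when a container starts from this image
CMD [ "node", "index.js" ]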
This section lists the commands you may need in your daily practice. Each command below comes with a brief explanation you may refer to in order to grasp its purpose.
- docker build -t <image-name> .
Purpose: builds a Docker image from a Dockerfile located in your current directory (.).
  - docker build: tells Docker to create a new image from the instructions in the Dockerfile.
  - -t <image-name>: the -t flag stands for tag. It allows you to give a name, and optionally a version tag, to the image you're building.
- docker run --name <container-name> -d -p PORT:PORT <image-name>
Purpose: starts a new Docker container from the image you built with the previous command.
  - docker run: the run command creates and starts a new container from an image.
  - --name <container-name>: the --name flag gives your container a specific name, making it easier to reference later.
  - -v: mounts volumes or binds directories between your local machine and the container (how to use it is covered later in this guide).
  - -d: stands for detached mode, which means the container will run in the background. If you don't use it, the container runs in the foreground and blocks your terminal. Use this flag if you don't want Docker's logs taking over your terminal.
  - -p PORT:PORT: the -p flag maps a port on your machine to the container's internal port (machinePORT:containerPORT). For example, -p 4000:4000 maps port 4000 of your machine to port 4000 of the container, allowing you to access the app running inside the container via localhost:4000.
- docker ps
Purpose: lists all currently running containers.
  - docker ps -a: the -a flag lists all containers, including stopped ones.
- docker logs <container-name>
Purpose: views the logs of a running (or stopped) container. It helps with troubleshooting and understanding what's happening inside your containerized application.
  - -f: continuously streams the logs in real time. It keeps the logs open in your terminal, and you can see new entries as they are generated, for example: docker logs -f <container-name>
  - -t: includes timestamps with each log line, for example: docker logs -t <container-name>
  - --since & --until: fetch logs from a specific time range; --since shows logs newer than a given time and --until shows logs older than it. For example: docker logs --since "2023-09-22T00:00:00" <container-name> or docker logs --since "5m" <container-name> (logs from the last 5 minutes).
- docker exec -it <container-name> bash
Purpose: lets you interact with a running container by opening a Bash shell inside it.
  - docker exec: executes a command inside a running container.
  - -it: two flags combined: -i stands for interactive mode, keeping the session open, while -t allocates a pseudo-TTY, basically giving you a shell-like interface.
  - bash: the command that will run inside the container to open a Bash shell.
- docker image ls
Purpose: lists all the images available on your system, showing the images you have built or pulled along with details like the image ID, size, and creation time.
- docker rm <container-name> -f
Purpose: removes/deletes a container.
  - docker rm: removes a container.
  - -f: this flag forces the container to stop and removes it, even if it is still running. Without -f, Docker will not remove a running container, so you would have to stop it first with docker stop <container-name>.
ℹ️ You might need to look into Windows Configurations if you are following this guide using Windows OS.
- Build your image: docker build -t <image-name> .
- Run the container: docker run --name <container-name> -d -p 4000:4000 <image-name>
- Check running containers: docker ps
- Check the logs of a container (if needed): docker logs <container-name>
- Interact with the container (if needed): docker exec -it <container-name> bash
- Stop the container (optional): docker stop <container-name>
- Remove the container (if no longer needed): docker rm <container-name> -f
- List images (optional): docker image ls
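As a quick illustration, the same workflow with hypothetical names (my-node-app for the image, my-container for the container, port 4000) might look like this:

# hypothetical end-to-end run (image name, container name, and port are assumptions)
docker build -t my-node-app .                               # build the image from the local Dockerfile
docker run --name my-container -d -p 4000:4000 my-node-app  # start it in the background, mapping port 4000
docker ps                                                   # confirm the container is running
docker logs my-container                                    # inspect its output
docker exec -it my-container bash                           # open a shell inside it (leave with "exit")
docker stop my-container                                    # stop it
docker rm my-container -f                                   # remove it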
In Node.js, during the development stage, when you modify a file and want the server to restart automatically when you save your changes, you need to use the Nodemon package. Nodemon works by monitoring the file system for changes using a feature called file system events. When you modify a file, an event is triggered that Nodemon listens for, prompting it to restart your application as a hot reload. You need to add your nodemon index.js command to your Dockerfile. In this project, we add it through the package.json scripts, referenced from the Dockerfile as CMD [ "npm", "run", "dev" ].
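For reference, a minimal sketch of the package.json scripts this setup assumes (the script names, the entry file index.js, and the nodemon version are assumptions; adjust them to your project):

{
  "name": "docker-hands-on",
  "scripts": {
    "start": "node index.js",
    "dev": "nodemon index.js"
  },
  "devDependencies": {
    "nodemon": "^3.0.0"
  }
}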
Usually, when we want a change to affect the container's files, we need to remove the running container first, make the changes to the local files on the machine, then build the image, and finally run the container again. To bypass all these steps and make the container's files sync directly, we need to add a new flag to the docker run command: -v. This way, you don't have to rebuild the Docker image every time you make a small change to your code. When using tools like Nodemon, changes in the local directory will trigger automatic reloads inside the container.
The -v flag in Docker is used to mount volumes or bind directories between your local machine and a Docker container. It allows you to share files and directories between your host machine and the container, making it easier to work with files during development.
Your run command will look like: docker run --name <container-name> -v [host-path]:[container-path] -d -p PORT:PORT <image-name>
You may need this if you want your local machine's changes to reflect directly in your Docker container only (one way), while any changes made inside the Docker container will NOT be reflected on your local machine.
Your run command will look like: docker run --name <container-name> -v [host-path]:[container-path]:ro -d -p PORT:PORT <image-name>
We added :ro after the container path, indicating read-only.
But what happens if I delete, for example, node_modules from my local machine? As you have guessed, node_modules will also be deleted from the container's files, therefore crashing the running application. How should we solve that? Welcome to anonymous volumes.
These volumes have no host directory specified (unlike the previous ones), so Docker automatically creates an anonymous volume to store the data of the specified directory, such as /app/node_modules, inside the container. The container can write to this volume (since it's not read-only), and the data physically persists as long as the volume exists, meaning that even if you remove the container, your volume will still be there (not flushed).
Your run command will look like: docker run --name <container-name> -v [host-path]:[container-path]:ro -v </container/path/to/file> -d -p PORT:PORT <image-name>
Now, even if node_modules is deleted locally, your application in the container will keep running, but you might notice the modules appear to be removed from the container's files as well (when you use the docker exec... command to view them). Don't worry, node_modules is still there. As we said, the data persists as long as the volume exists, so if you remove the container and run it again, you will see the modules coming back (even though they are deleted from your local machine). You may navigate to Docker Desktop -> Volumes to view your available volumes. That explains why the application kept running and did not crash in the container.
The following are some of the commands you might need to interact with volumes:
- docker volume ls: lists all the available volumes.
- docker volume rm <volume-name>: removes the volume you specify (the name can be obtained from the docker volume ls command).
- docker volume prune: removes the volumes that are not being used by any running container (unused ones only).
Is there a simpler solution? Of course!
In an ideal world, we maintain our main source files within a folder called src. Therefore, when we do the one-way binding, we do it this way: -v $(pwd)/src:/app/src:ro. This ensures that only the files within the src folder mirror changes into the container's src folder, leaving the files outside it safe and untouched. If you wish to follow this approach, don't forget to update your package.json scripts to nodemon --legacy-watch src/index.js so Nodemon reads your files from src correctly.
If you face any challenges seeing your updates, try rebuilding the image and running it again.
Docker Compose is a tool for defining and running multi-container Docker applications. Instead of managing individual Docker containers manually, Compose helps automate and manage complex environments with a simple, declarative approach by defining containers, their configurations, and how they interact with each other in a single YAML file, usually called docker-compose.yml.
Key Concepts in Docker Compose:
- Service: a definition of a container that is part of your application (e.g. web, database, cache, backend).
- YAML File: Describes how your application is composed of multiple services in docker-compose.yml.
- Networking: Compose sets up networking between containers so they can communicate with each other.
- Volumes: Helps you manage persistent storage within the configuration.
- Scaling: You can scale services up or down for load balancing or redundancy.
To run your Docker Compose setup, run docker-compose up -d. The -d flag activates detached mode, as before.
To shut down your Docker Compose setup, run docker-compose down.
Here is a simple structure of how to define a basic Compose file for your project.
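A minimal sketch, assuming a single Node.js service (the service name, container name, and port are assumptions; adjust them to your project):

services:
  node-app:
    container_name: docker-hands-on-container   # any container name works
    build: .                                    # build from the Dockerfile in the current directory
    ports:
      - "4000:4000"                             # host:container port mapping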
The above commands will work if you have a single docker-compose.yml file in your project. The next section will guide you through managing multiple files (for multiple environments) in your project.
When publishing your application, you will need environment variables to be utilised within it. For that, in your docker-compose.yml, add the following block under your defined service:
env_file:
  - ./.env
This tells your Docker Compose file the path to the .env file where you store your environment variables. If you wish to declare specific variables for different Docker environments, use:
environment:
  - yourEnv=<whatever>
Whenever you plan to split your application into different environments (e.g. development, staging, production), you may need to create a specific Docker Compose file for each environment. Suppose we only have DEV and PROD for this example. We will need to create two Docker Compose files as follows:
Development: docker-compose.dev.yml
services:
  node-app:
    container_name: <your-container-name>
    build: .
    volumes:
      - ./src:/app/src:ro
    ports:
      - "<port>:<port>"
    environment:
      - NODE_ENV=dev
    env_file:
      - ./.env
Production: docker-compose.prod.yml
services:
  node-app:
    container_name: <your-container-name>
    build: .
    ports:
      - "<port>:<port>"
    environment:
      - NODE_ENV=prod
    env_file:
      - ./.env
Note: We have removed volumes as we don't want any mirroring in the prod env
In this scenario, we can use the following commands to run the specific docker compose environment file:
docker-compose -f <specific-docker-compose-file> up -d
docker-compose -f <specific-docker-compose-file> down
But in reality, we don't want to keep duplicating the content in each environment. What if we want to add or modify a service or a configuration? We would have to repeat the same setting in all Docker Compose files, making it cumbersome. Therefore, a common practice is to keep the common configuration in a docker-compose.yml file and keep only the parts that differ within the environment-specific file to avoid repetition. The following is an example:
Main Docker Compose File: docker-compose.yml (for all environments)
services:
  node-app:
    container_name: <your-container-name>
    build: .
    ports:
      - "<port>:<port>"
    env_file:
      - ./.env
Development: docker-compose.dev.yml
services:
  node-app:
    volumes:
      - ./src:/app/src:ro
    environment:
      - NODE_ENV=dev
Production: docker-compose.prod.yml
services:
  node-app:
    environment:
      - NODE_ENV=prod
If you wish to use this approach, use the following commands to run your specific environment's Docker Compose file. Bear in mind that <common-docker-compose-file> here refers to the docker-compose.yml file in this example, while <docker-compose-env-file> refers to either docker-compose.dev.yml or docker-compose.prod.yml:
docker-compose -f <common-docker-compose-file> -f <docker-compose-env-file> up -d
docker-compose -f <common-docker-compose-file> -f <docker-compose-env-file> down
For the environments that do not have volumes for mirroring, you might need to rebuild using the following command in case you have changes you want reflected:
docker-compose -f <common-docker-compose-file> -f <docker-compose-env-file> up -d --build
Once you allow your app to run in different environments as explained in the previous section, you might face a situation where you want to run npm run dev for development and npm start for production from the Dockerfile. Although maintaining a separate Dockerfile per environment is possible, it will not be practical: any change you want to add later on would need to be applied to all Dockerfiles one by one.
One way to allow your Dockerfile to work for multiple environments is the following:
If you want to override CMD [ "npm", "run", "dev" ] so that it runs npm run dev for dev and npm start for prod, add the following to your Docker Compose files:
- For docker-compose.dev.yml, add command: ["npm", "run", "dev"]
- For docker-compose.prod.yml, add command: ["npm", "start"]
Example:
services:
  node-app:
    build:
      context: .
      args:
        - NODE_ENV=dev
    volumes:
      - ./src:/app/src:ro
    environment:
      - NODE_ENV=dev
    command: ["npm", "run", "dev"]  # see the command added here
You may keep the CMD instruction as it is in your Dockerfile; the command entries shown above will override it so that it reflects your environment.
Now, we're done with overriding the run command! How about npm install? We are worried that in production we might install nodemon there as well. We want to keep it only for the dev environment. Here is how we can solve that:
- Remove build: . from docker-compose.yml, as that is no longer common among all environments.
- In the Dockerfile, remove RUN npm install and add the following lines instead:
ARG NODE_ENV
RUN if [ "$NODE_ENV" = "prod" ]; \
then npm install --omit=dev; \
else npm install; \
fi
ARG stands for Argument. It is the parameter we pass from Docker Compose to tell the Dockerfile which environment we are using.
The purpose of the above block is: if NODE_ENV is prod, run npm install --omit=dev; otherwise (e.g. dev), run npm install. So your Dockerfile would look like:
FROM node:20
WORKDIR /app
COPY package.json .
ARG NODE_ENV
RUN if [ "$NODE_ENV" = "prod" ]; \
then npm install --omit=dev; \
else npm install; \
fi
COPY . .
EXPOSE 4000
CMD [ "npm", "run", "dev" ]
Side note: --omit=dev skips all the devDependencies packages when installing, ensuring only the packages needed for production are installed (i.e. without nodemon).
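To see what that skip means in practice, here is a hypothetical dependency layout (package names and versions are assumptions): nodemon sits under devDependencies, so npm install --omit=dev leaves it out, while regular dependencies such as mongoose are still installed.

{
  "dependencies": {
    "mongoose": "^8.0.0"
  },
  "devDependencies": {
    "nodemon": "^3.0.0"
  }
}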
- In your docker-compose.dev.yml, add the following block:
build:
  context: .
  args:
    - NODE_ENV=dev
and for your docker-compose.prod.yml, add the following block:
build:
  context: .
  args:
    - NODE_ENV=prod
args here tells the Dockerfile which environment we are running.
Finally, we need to use this command in the terminal to run it:
docker-compose -f docker-compose.yml -f docker-compose.dev.yml up -d --build
Change docker-compose.dev.yml to docker-compose.prod.yml if you want to run your production containers.
Note: if you have already built and run docker-compose.dev.yml, then stop the container and try to build again to run docker-compose.prod.yml, your node_modules might get cached, so you might still see devDependency packages inside your production modules. Don't panic. Just use the following command to clean the Docker build:
docker-compose -f docker-compose.yml -f docker-compose.prod.yml down --rmi all
--rmi stands for remove images. It lets you specify whether Docker should remove the images used by your services when taking down the containers.
Another way to allow your Dockerfile to accommodate different stages or environments is to declare how you expect it to handle each stage using as <env>.
You may also use a base stage to hold the instructions that need to run in common before your environment-specific configuration starts. Your Dockerfile might look like this:
FROM node:20 as base
FROM base as dev
WORKDIR /app
COPY package.json .
ARG NODE_ENV
RUN npm install
COPY . .
EXPOSE 4000
CMD [ "npm", "run", "dev" ]
FROM base as prod
WORKDIR /app
COPY package.json .
ARG NODE_ENV
RUN npm install --omit=dev
COPY . .
EXPOSE 4000
CMD [ "npm", "start" ]
If you wonder how the Dockerfile will know which stage/environment you are targeting, we still need a slight adjustment in the docker-compose files as well.
Replace your args block with target: <env>, so your docker-compose.prod.yml file will look, for example, as follows:
services:
  node-app:
    build:
      context: .
      target: prod  # we added this
    environment:
      - NODE_ENV=prod
    command: ["npm", "start"]
Running multiple containers and having them communicate is very common in development. This section will lay out how you can run two containers: one for your Node.js application and the other for your MongoDB database.
Start by installing mongoose: npm install mongoose. Then look at your docker-compose.yml file, where we store our common services: we will need to add our MongoDB service there. By looking at the image's page on Docker Hub, we can see what configuration we need to add to our Compose file. A sample could be the following:
mongo:
  image: mongo
  restart: always
  environment:
    MONGO_INITDB_ROOT_USERNAME: root
    MONGO_INITDB_ROOT_PASSWORD: example
Please note that when you specify restart: always, the Docker daemon will try to restart the container indefinitely.
In your index.js file, we need to add the mongoose connection, as follows:
const mongoose = require('mongoose');
// assumption: these values come from environment variables (e.g. loaded from .env)
const { DB_USER, DB_PASSWORD, DB_HOST, DB_PORT } = process.env;
const URI = `mongodb://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:${DB_PORT}`;
mongoose.connect(URI)
  .then(() => console.log('connected to db'))
  .catch((err) => console.log('failed to connect to db', err));
We identified DB_USER and DB_PASSWORD from the settings we set in our Compose file; we still need to identify DB_HOST and DB_PORT.
To get the DB_PORT, we can start our container by building the image using the command explained previously, docker-compose -f docker-compose.yml -f docker-compose.dev.yml up -d --build, then running docker ps. You can then see the port that mongo is running on.
To get the DB_HOST, we need to check the network of the container. The following commands will help you do that:
docker network ls: lists all the networks the Engine daemon knows about. This includes the networks that span across multiple hosts in a cluster.
Usually the name of the network you want to target is your project name (check your IDE) with a _default suffix added to it. So, for example, if your project name is docker-hands-on, your network name would be docker-hands-on_default.
Once you get that name, run the following command:
docker network inspect <network-name>: returns information about one or more networks. By default, this command renders all results in a JSON object.
By checking the container object, you can see the mongo container's IPv4Address, and that is your DB host. However, it is not practical to always check this, as the IP address might change. Instead, you can simply use your service name, such as mongo, as your DB host; Docker is smart enough to resolve it.
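Putting it together, a hypothetical .env file for this setup might look like the following (the credentials mirror the Compose sample above, and 27017 is MongoDB's default port; adjust to your own values):

# hypothetical .env values for the mongoose connection string
DB_USER=root
DB_PASSWORD=example
DB_HOST=mongo
DB_PORT=27017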
When you shut down the containers, the data stored in the mongo container is erased. How can we avoid this issue? As discussed previously, via volumes. How can we create a volume here? As follows:
- First, create a volumes block (at the same level as the services block).
- Call the volume instance mongo-db, for example. Your block should look like this:
volumes:
  mongo-db:
- At your mongo service, add volumes and define it this way:
volumes:
  - mongo-db:/data/db
Without this mapping (i.e. with just volumes: - /data/db), a fresh anonymous /data/db volume would be created every time you run the container. You need to link it to the mongo-db volume you have initiated to ensure data consistency.
Bonus: You can interact with the database by opening a mongosh session inside the container with this command:
docker exec -it docker-hands-on-mongo-1 mongosh -u root -p example
then:
- show dbs: shows the available databases.
- use <db-name>: switches to (and creates) a database.
- db.<collection-name>.insertOne({mydata: "whatever"}): inserts data into your new collection.
- db.<collection-name>.find(): displays the data in your collection.
Some of your containers will need to depend on other containers to run their services. In our guide here, we have a Node.js app that depends on both MongoDB and Redis. If our Node.js container starts before the MongoDB service, the Node.js app will crash because its dependency is not ready yet. For that, we need to add a special configuration key called depends_on.
So for example, since your NodeJs depends on Mongo and Redis, you will add the config key as follows:
node-app:
  container_name: docker-hands-on-container
  ports:
    - "4000:4000"
  env_file:
    - ./.env
  depends_on:
    - mongo
    - redis
This way you can guarantee that the containers our Node.js app depends on are started before the Node.js container comes to life. Note that depends_on in this form only controls start-up order; it does not wait for the dependency to be fully ready to accept connections.
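If you do need Compose to wait until a dependency is actually ready, a common pattern (a sketch, assuming a recent Compose version and the mongosh binary available in the mongo image) is to combine depends_on with a healthcheck:

services:
  mongo:
    image: mongo
    healthcheck:                                   # periodically probe the database
      test: ["CMD", "mongosh", "--eval", "db.adminCommand('ping')"]
      interval: 10s
      timeout: 5s
      retries: 5
  node-app:
    depends_on:
      mongo:
        condition: service_healthy                 # wait until the healthcheck passes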
- [Connection Issues with DBs]: Some issues can be strange when working with Docker Compose; you might face a connection issue or something similar. If you are working with volumes, consider deleting them, as this might solve the "illogical" problem you are facing. The following command shuts down Docker Compose AND removes volumes: docker-compose -f <common-compose-file> -f <env-compose-file> down -v, where -v removes all the volumes being utilised. Use with care!
- [Volume paths on Windows]: On Windows, you may need to pass the absolute host path when mounting volumes, for example: docker run --name [container-name] -v C:/Users/Loai/Desktop/[project-name]:/app -d -p PORT:PORT [image-name]. macOS/Linux might not face the same issue, and relative paths generally work fine there.
If you do not wish to use the lengthy absolute path, you may shorten it using the $(pwd) command. If you're using Git Bash on Windows, it automatically converts Unix-style paths (/c/Users/...) into Windows-style paths (C:/Users/...). However, sometimes this automatic conversion interferes with Docker's ability to bind the mount properly.
You can disable Git Bash's path conversion by setting the MSYS_NO_PATHCONV=1 environment variable, like this:
MSYS_NO_PATHCONV=1 docker run --name <container-name> -v $(pwd):<container-path>:ro -d -p PORT:PORT <image-name>
After following the steps in this section, when running your localhost on Windows to test your local changes, you might notice that all local files mirror the container files as expected, but you still cannot see the changes. This happens because Docker on Windows has to "translate" between the Windows file system and the virtualized Linux file system inside the container, and during this translation, file change notifications may not be passed through correctly. To solve the issue, you need to add the --legacy-watch flag to your Nodemon command so it becomes nodemon --legacy-watch index.js.
--legacy-watch tells Nodemon to use an older, more compatible method of watching for file changes. It uses polling mode (the legacy method) instead of file system events (the newer method). In polling mode, Nodemon continuously checks file modification timestamps at regular intervals to see if a file has changed. While this is slightly less efficient than using file system events (because it uses more CPU cycles), it is much more reliable in environments where native file system watching doesn't work well (such as Docker + Windows). Using polling mode might introduce a slight delay between making a change and Nodemon detecting it; the polling interval is short, so the delay is typically not noticeable for small projects, but it can be an issue for larger ones.