Everything you need to know about Dockerfile

Rakesh Jain
9 min read · Nov 1, 2020


In this tutorial I am going to explain everything you need to know about a Dockerfile.

What is a Dockerfile?

A Dockerfile is just another text file. It contains a list of commands that the Docker client calls while creating an image.

The commands used here are very similar to their Linux equivalents, so you will hardly need to learn any new syntax.

Why do we need a Dockerfile?

The answer is simple and straightforward: “To automate the image creation process.”

Workflow around Dockerfile

The workflow will be like this:

  1. You create the Dockerfile and define the steps required to build up your image
  2. You then issue the docker build command which will build a Docker image
  3. Now you can use this image to start provisioning containers with the docker run command
  4. The Docker image you created can now be pushed to your private docker registry server to make it available for everyone within your organization.
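Put together on the command line, the workflow above looks roughly like this (the image and registry names below are placeholders, not from this article):

```shell
# 1. Build an image from the Dockerfile in the current directory
docker build -t registry.example.com/myteam/myapp:1.0 .

# 2. Start a container from the image you just built
docker run -d --name myapp registry.example.com/myteam/myapp:1.0

# 3. Push the image to your private registry so colleagues can pull it
docker push registry.example.com/myteam/myapp:1.0
```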

Dockerfile creation example

Let’s start with an example to understand the Dockerfile creation in detail.

Step 1: Create a file named Dockerfile
Create a file named “Dockerfile” and edit it using any text editor of your choice. Please note that the name of the file has to be “Dockerfile” with a capital “D”, and there is no file extension.

root@kmaster-rj:~/docker-demos# touch Dockerfile

Step 2: Add step-by-step instructions to your Dockerfile

root@kmaster-rj:~/docker-demos# cat Dockerfile
#This is a sample Dockerfile
FROM ubuntu
MAINTAINER rakeshrhcss@gmail.com
RUN apt-get update
RUN apt-get install -y nginx
CMD ["echo", "demo Docker Image created"]

Let’s decode the file step by step now.

  • The first line “#This is a sample Dockerfile” is a comment. Comments can be added to the Dockerfile using the # character.
  • Every Dockerfile must start with the FROM instruction. That means you need to mention a starting point. In our example, we are creating an image from the ubuntu image.
    Note: You can start FROM scratch, scratch is an explicitly empty image on the Docker store that is used to build base images like Alpine, Debian and so on.
  • The next instruction records the email id of the person who maintains this image. The keyword here is MAINTAINER. Note that MAINTAINER is deprecated in current Docker versions in favour of LABEL maintainer="...", but it still works.
  • Next is RUN command which is used to run instructions against the image we are creating. In our example, we first update our Ubuntu system and then install the nginx server on our ubuntu image.
  • In the last command we are displaying a message to the user who is using this image.

Step 3: Build our image
Run the following command in terminal:
docker build -t rakeshrhcss/nginx-ubuntu:1.0 .

root@kmaster-rj:~/docker-demos# docker build -t rakeshrhcss/nginx-ubuntu:1.0 .
Sending build context to Docker daemon 2.048kB
Step 1/5 : FROM ubuntu
latest: Pulling from library/ubuntu
6a5697faee43: Pull complete
ba13d3bc422b: Pull complete
a254829d9e55: Pull complete
Digest: sha256:fff16eea1a8ae92867721d90c59a75652ea66d29c05294e6e2f898704bdb8cf1
Status: Downloaded newer image for ubuntu:latest
---> d70eaf7277ea
Step 2/5 : MAINTAINER rakeshrhcss@gmail.com
---> Running in 7e3b8efc07c4
Removing intermediate container 7e3b8efc07c4
---> b75a5474a44c
Step 3/5 : RUN apt-get update
---> Running in 20d71dff0c3b
Get:1 http://security.ubuntu.com/ubuntu focal-security InRelease [107 kB]
Get:2 http://archive.ubuntu.com/ubuntu focal InRelease [265 kB]
Get:3 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [438 kB]
...
...
---> 82153e9637e8
Step 4/5 : RUN apt-get install nginx -y
---> Running in f0a0ac1aa338
Reading package lists...
Building dependency tree...
Reading state information...
...
...
Processing triggers for libc-bin (2.31-0ubuntu9.1) ...
Removing intermediate container f0a0ac1aa338
---> bb6194d2891d
Step 5/5 : CMD ["echo","demo Docker Image created"]
---> Running in b69b9826fdfd
Removing intermediate container b69b9826fdfd
---> d4823913e6a8
Successfully built d4823913e6a8
Successfully tagged rakeshrhcss/nginx-ubuntu:1.0

Let’s understand the command we ran above:

  • docker build is the command to build a Docker image from a Dockerfile
  • the -t option is to tag the image.
    Here I am using rakeshrhcss/nginx-ubuntu:1.0, which is essentially the name of the image. The first part is the maintainer’s namespace, followed by a human-readable name, nginx-ubuntu, and a version number, 1.0.

NOTE the . (dot) at the end of the line: you need to specify the directory where docker build should look for a Dockerfile, and . tells docker build to look in the current working directory.

Step 4: Verify the image
Docker created an image from your Dockerfile. Now you should see a new image in your image list. Run the docker image ls command.

root@kmaster-rj:~/docker-demos# docker image ls
REPOSITORY                 TAG       IMAGE ID       CREATED         SIZE
rakeshrhcss/nginx-ubuntu   1.0       d4823913e6a8   7 minutes ago   158MB

Step 5: Run a container using your image

root@kmaster-rj:~/docker-demos# docker run -it -d --name dockerfile-demo -p 80:80 rakeshrhcss/nginx-ubuntu:1.0
79fde9fc73da641b537c27243a1f6c96eaee561074744cdd1704ef87a6c88d7c

List out the container we just ran:

root@kmaster-rj:~/docker-demos# docker container ls | grep -i dockerfile-demo
71c6737df5e0        rakeshrhcss/nginx-ubuntu:1.0   "/bin/sh"                About a minute ago   Up About a minute   0.0.0.0:80->80/tcp   dockerfile-demo

So we have successfully run a container using our own image created from a Dockerfile.

Dockerfile key instructions

Let’s understand all the instructions we can use within a Dockerfile.
The good news is that there aren’t many.

FROM:
A valid Dockerfile must start with a FROM instruction.
The FROM instruction initializes a new build stage and sets the Base Image for subsequent instructions. As of version 17.05, you can have more than one FROM instruction in one Dockerfile.

COPY:
The COPY instruction copies new files or directories from <src> and adds them to the filesystem of the container at the path <dest>.

Few examples -
To add all files starting with “web”:
COPY web* /mydir/

The <dest> is an absolute path, or a path relative to WORKDIR, into which the source will be copied inside the destination container.

The example below uses a relative path, and adds “test.txt” to <WORKDIR>/relativePathDir/:

COPY test.txt relativePathDir/

Whereas this example uses an absolute path, and adds “test.txt” to /absolutePathDir/

COPY test.txt /absolutePathDir/

All new files and directories are created with a UID and GID of 0, unless the optional --chown flag specifies a given username, groupname, or UID/GID combination to request specific ownership of the copied content.

A few examples with the --chown flag:

COPY --chown=55:mygroup files* /somedir/
COPY --chown=bin files* /somedir/
COPY --chown=1 files* /somedir/
COPY --chown=10:11 files* /somedir/

NOTE:
If <dest> does not end with a trailing slash, it will be considered a regular file and the contents of <src> will be written at <dest>.

ADD:
The ADD instruction copies new files, directories or remote file URLs from <src> and adds them to the filesystem of the image at the path <dest>.

To add all files starting with “web”:

ADD web* /mydir/

COPY vs ADD
Both ADD and COPY are used to add directories and files to your Docker image.
COPY is the recommended one, because ADD has many extra features compared to COPY that make ADD more unpredictable and a bit over-designed.
ADD can pull files from URL sources; COPY cannot.
ADD can extract compressed files, assuming it can recognize and handle the format; COPY cannot.

If you want to pull files from the web into your image, I would suggest using RUN with curl, and uncompressing your files with RUN and the same commands you would use on the command line.
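As a sketch, an ADD of a remote archive could be replaced with something like this (the URL and paths are made up for illustration):

```dockerfile
# Fetch, unpack and clean up in a single RUN, so the archive
# does not linger in an intermediate image layer
RUN curl -fsSL https://example.com/app.tar.gz -o /tmp/app.tar.gz \
    && mkdir -p /usr/src/app \
    && tar -xzf /tmp/app.tar.gz -C /usr/src/app \
    && rm /tmp/app.tar.gz
```

On a bare ubuntu base image you would need to apt-get install curl first for this to work.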

ENV:
ENV is used to define environment variables.

  • You can use it to define environment variables that will be available in your container. The same variables are available when you run a container using your image.
  • The variable you specify by ENV in the Dockerfile can be used in all subsequent instructions within the same Dockerfile itself.
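A minimal sketch of both uses (the variable name here is invented for illustration):

```dockerfile
FROM ubuntu
ENV APP_HOME=/opt/myapp
# The variable is usable by subsequent Dockerfile instructions...
WORKDIR $APP_HOME
RUN echo "installed under $APP_HOME"
# ...and it is also set inside containers started from the image
CMD ["/bin/sh", "-c", "echo app lives in $APP_HOME"]
```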

RUN:
RUN will execute commands. It is the most used instruction.

RUN has 2 forms:

  • RUN <command> (shell form, the command is run in a shell, which by default is /bin/sh -c on Linux or cmd /S /C on Windows)
  • RUN ["executable", "param1", "param2"] (exec form)

The RUN instruction will execute any commands in a new layer on top of the current image and commit the results. The resulting committed image will be used for the next step in the Dockerfile.

In the shell form you can use a \ (backslash) to continue a single RUN instruction onto the next line. For example, consider these two lines:

RUN /bin/bash -c 'source $HOME/.bashrc; \
echo $HOME'

Together they are equivalent to this single line:

RUN /bin/bash -c 'source $HOME/.bashrc; echo $HOME'

To use a different shell, other than ‘/bin/sh’, use the exec form passing in the desired shell. For example:

RUN ["/bin/bash", "-c", "echo hello"]

Unlike the shell form, the exec form does not invoke a command shell.
For example, RUN [ "echo", "$HOME" ] will not do variable substitution on $HOME. If you want shell processing then either use the shell form or execute a shell directly, for example: RUN [ "sh", "-c", "echo $HOME" ]. When using the exec form and executing a shell directly, as in the case for the shell form, it is the shell that is doing the environment variable expansion, not docker.

VOLUME:
You can use the VOLUME instruction in a Dockerfile to instruct Docker that the data you store in that specific directory should be stored on the host file system not in the container file system. This implies that data stored in the volume will persist and be available also after you destroy the container.

The docker run command initializes the newly created volume with any data that exists at the specified location within the base image. For example, consider the following Dockerfile snippet:

FROM ubuntu
RUN mkdir /myvol
RUN echo "hello world" > /myvol/greeting
VOLUME /myvol
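Assuming the snippet above is built as an image called volume-demo (a name I am making up here), you can watch Docker create and seed an anonymous volume for it:

```shell
docker build -t volume-demo .
# The volume is initialized with the image's /myvol contents
docker run --name vol-test volume-demo cat /myvol/greeting
# Show the name of the anonymous volume backing /myvol
docker inspect -f '{{ (index .Mounts 0).Name }}' vol-test
```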

USER:
The USER instruction sets the user name (or UID) and optionally the user group (or GID) to use when running the image and for any RUN, CMD and ENTRYPOINT instructions that follow it in the Dockerfile.

NOTE: When the user doesn’t have a primary group then the image (or the next instructions) will be run with the root group.
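A short sketch — the user and group names below are made up, and note that USER does not create the account, you have to do that yourself first:

```dockerfile
FROM ubuntu
# Create the group and user before switching to them
RUN groupadd -r appgroup && useradd -r -g appgroup appuser
USER appuser:appgroup
# RUN, CMD and ENTRYPOINT from here on execute as appuser
CMD ["whoami"]
```

Running a container from this image would print appuser rather than root.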

WORKDIR:
The WORKDIR instruction sets the working directory for any RUN, CMD, ENTRYPOINT, COPY and ADD instructions that follow it in the Dockerfile. If the WORKDIR doesn’t exist, it will be created even if it’s not used in any subsequent Dockerfile instruction.

For example:

ENV DIRPATH=/path
WORKDIR $DIRPATH/$DIRNAME
RUN pwd

The output of the final pwd command here would be /path/$DIRNAME, since DIRNAME is not defined by an ENV instruction and is therefore not expanded.

EXPOSE:
The EXPOSE instruction informs Docker that the container listens on the specified network ports at runtime. You can specify whether the port listens on TCP or UDP, and the default is TCP if the protocol is not specified.

An important instruction to inform your users about the ports your application is listening on. EXPOSE will not publish the port, you need to use docker run -p... to do that when you start the container.
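For example, a small sketch (the port numbers are illustrative):

```dockerfile
# Document that the service listens on 80/tcp (TCP is the default)
EXPOSE 80
# An explicitly UDP port
EXPOSE 53/udp
```

At run time you still publish the port yourself, e.g. docker run -p 8080:80 myimage, or use docker run -P to publish all EXPOSEd ports to random host ports.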

ONBUILD:
You can specify instructions with ONBUILD that will be executed when your image is used as the base image of another Dockerfile. :)

This is useful when you want to create a generic base image to be used in different variations by many Dockerfiles, or in many projects or by many parties.

So you do not need to add the specific stuff immediately, like you don’t need to copy the source code or config files in the base image. How could you even do that, when these things will be available only later?

So what you do instead is to add ONBUILD instructions. So you can do something like this:

ONBUILD COPY . /usr/src/app
ONBUILD RUN /usr/src/app/mybuild.sh

ONBUILD instructions will be executed right after the FROM instruction in the downstream Dockerfile.

CMD and ENTRYPOINT:
Both the CMD and ENTRYPOINT instructions define what command gets executed when running a container. There are a few rules that describe their cooperation.

  • Dockerfile should specify at least one of CMD or ENTRYPOINT commands.
  • ENTRYPOINT should be defined when using the container as an executable.
  • CMD should be used as a way of defining default arguments for an ENTRYPOINT command or for executing an ad-hoc command in a container.
  • CMD will be overridden when running the container with alternative arguments.

With the CMD instruction you can specify what component is to be run by your image, with arguments, in the following form: CMD ["executable", "param1", "param2"…].

You can override CMD when you’re starting up your container by specifying your command after the image name like this: $ docker run [OPTIONS] IMAGE[:TAG|@DIGEST] [COMMAND] [ARG...].

You can only specify one CMD in a Dockerfile (if you specify more than one, only the last one takes effect).

It is good practice to specify a CMD even if you are developing a generic container; in this case an interactive shell is a good CMD entry. So you do CMD ["python"] or CMD ["php", "-a"] to give your users something to work with.

So what’s the deal with ENTRYPOINT? When you specify an entry point, your image will work a bit differently. You use ENTRYPOINT as the main executable of your image. In this case whatever you specify in CMD will be added to ENTRYPOINT as parameters.

ENTRYPOINT ["git"]
CMD ["--help"]

This way you can build Docker images that mimic the behavior of the main executable you specify in ENTRYPOINT.
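Assuming the two-line git example above is built as an image called git-image (my name, not part of the example), the interplay looks like this:

```shell
# No arguments: ENTRYPOINT plus default CMD, i.e. "git --help"
docker run --rm git-image

# Extra arguments replace CMD but keep ENTRYPOINT, i.e. "git version"
docker run --rm git-image version

# Overriding the ENTRYPOINT itself needs an explicit flag
docker run --rm --entrypoint /bin/sh git-image -c 'echo not git'
```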

ARG:
The ARG instruction defines a variable that users can pass at build-time to the builder with the docker build command using the --build-arg <varname>=<value> flag. If a user specifies a build argument that was not defined in the Dockerfile, the build outputs a warning.

A Dockerfile may include one or more ARG instructions.
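A minimal sketch (the variable name is invented):

```dockerfile
FROM ubuntu
# Default value, used when no --build-arg is passed
ARG APP_VERSION=1.0
RUN echo "building version $APP_VERSION" > /version.txt
```

Build it with docker build --build-arg APP_VERSION=2.0 -t myapp . to override the default. Unlike ENV, an ARG value is only available during the build, not inside running containers.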

That’s all!

It’s a bit of a lengthy article, but it should help you a lot in understanding the core concepts of a Dockerfile.

Hope you like the article. Please let me know your feedback in the response section.

Thanks. Happy learning!

Ref: https://docs.docker.com/engine/reference/builder/
