Dockerfile Tips

Keep your layer count small

• More layers usually means a larger image. The larger the image, the longer it takes to build, push, and pull from a registry, so fewer layers means faster builds and deploys.
• Smaller images also have a smaller attack surface, which makes them more secure.

So how can I reduce the number of layers?

• Use shared base images. If you have a number of services that all start with a similar base configuration, build that out into a custom image and push it to your registry. Use this as your base image for all similar services.
• Limit the amount of data written to the container layer.
• Chain RUN commands. If you have a number of run commands in a row, chain them instead. Each RUN command creates a new layer.
• Switching USER adds layers:
- • Every time you switch user, you commit a layer.
- • So if your image is something like Node.js, which comes with its own user when you install it, stick to that user if you can.
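
As a rough sketch of the last two points, the Dockerfile below chains its root-only setup into a single RUN and switches user exactly once. It assumes the official node image, which ships with a non-root `node` user; the tag, the `curl` package, and the file layout (`package.json`, `src/index.js`) are placeholders chosen for illustration.

```dockerfile
# Assumption: official Node.js image, which already provides a "node" user.
FROM node:18-alpine
WORKDIR /home/node/app

# Everything that needs root is chained into one RUN (one layer).
RUN apk add --no-cache curl \
    && chown node:node /home/node/app

# Switch user once, after the root-only work is done, and stay there.
USER node

# Hypothetical project layout.
COPY --chown=node:node package.json ./
RUN npm install
COPY --chown=node:node src ./src
CMD ["node", "src/index.js"]
```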

Use the `--no-install-recommends` flag

For example, on Ubuntu apt installs recommended (but not suggested) packages by default. With --no-install-recommends, only the main dependencies (packages in the Depends field) are installed, which reduces image size.

Remove Package Manager Cache

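RUN apt-get update \
    && apt-get -y install --no-install-recommends \
        openjdk-8-jdk \
    && rm -rf /var/lib/apt/lists/*

You don’t need the package lists after installing the packages, so why keep them in the image? The cleanup has to happen in the same RUN statement, otherwise the files remain in an earlier layer.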

Exclude with .dockerignore

To exclude files not relevant to the build (without restructuring your source repository) use a .dockerignore file (similar to .gitignore).

Always combine `RUN apt-get update` with `apt-get install` in the same `RUN` statement.

Using apt-get update alone in a RUN statement causes caching issues: that layer is cached, so a subsequent apt-get install instruction may run against stale package lists and fail.
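
A minimal sketch of the difference, assuming an Ubuntu base image; the tag and the package names are placeholders, not recommendations.

```dockerfile
# Assumption: any Debian/Ubuntu base behaves the same way here.
FROM ubuntu:22.04

# Risky: the update layer is cached on its own, so editing the install
# line later reuses whatever package lists were fetched the first time.
# RUN apt-get update
# RUN apt-get -y install --no-install-recommends curl

# Safer: update and install share one RUN, so changing the package list
# invalidates the cache for both and the lists are always fresh.
RUN apt-get update \
    && apt-get -y install --no-install-recommends \
        curl \
        nginx \
    && rm -rf /var/lib/apt/lists/*
```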

After building the image, all layers are in the Docker cache. So on that note…

Use the cache. Make the build cache your friend.

Order is important!
Order your steps from least to most frequently changing steps to optimize caching.
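
One way this ordering could look, as a sketch only. It assumes a jar that was built outside the image; the `config/` directory and `target/app.jar` are hypothetical paths.

```dockerfile
FROM openjdk:8-jre-slim
WORKDIR /app

# Changes rarely: OS packages.
RUN apt-get update \
    && apt-get -y install --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Changes occasionally: configuration (hypothetical directory).
COPY config/ ./config/

# Changes on almost every build: the application artifact.
COPY target/app.jar ./app.jar

CMD ["java", "-jar", "/app/app.jar"]
```

With this order, a new app.jar only invalidates the last COPY; the package installation and the config layer are served from cache.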

Only copy what’s needed

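Avoid `COPY .` if possible: any change to any file in the build context invalidates the cache for that step and every step after it.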
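
For instance, if the image only needs a pre-built jar, it can copy just that file. This is a sketch; `target/app.jar` is an assumed path.

```dockerfile
FROM openjdk:8-jre-alpine
WORKDIR /app

# Instead of copying the whole build context:
# COPY . /app

# Copy only the artifact the container actually runs.
COPY target/app.jar ./app.jar

CMD ["java", "-jar", "/app/app.jar"]
```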

Don’t use the `latest` tag

Do not use the latest tag. It’s a rolling tag. It has the convenience of always being available for official images on Docker Hub but there can be breaking changes over time. Depending on how far apart in time you rebuild the Dockerfile without cache, you may have failing builds due to unexpected changes in your base image.
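
In a Dockerfile the difference is a single line. The tag below is only an example of a version-specific tag, not a recommendation.

```dockerfile
# Rolling tag: may resolve to a different image on every uncached build.
# FROM openjdk:latest

# Version-specific tag: rebuilds start from a known Java version and flavor.
FROM openjdk:8-jre-alpine
```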

Look for minimal flavors if possible

REPOSITORY  TAG             SIZE
openjdk     8               624MB
openjdk     8-jre           443MB
openjdk     8-jre-slim      204MB
openjdk     8-jre-alpine     83MB

Use multi-stage builds where possible:

This is great for languages like Java and Golang where there’s a lot of tool-chain work required to build something, but not to run it.

e.g.

FROM maven:3.6-jdk-8-alpine
WORKDIR /app
COPY pom.xml .
COPY src ./src
RUN mvn -e -B package
CMD ["java", "-jar", "/app/target/app.jar"]

This is okay, because we are building the source in a consistent environment, but now every time we make a code change, all the dependencies are fetched again. We don’t want that! So identify cacheable units.

Better:

FROM maven:3.6-jdk-8-alpine
WORKDIR /app
COPY pom.xml .
RUN mvn -e -B dependency:resolve
COPY src ./src
RUN mvn -e -B package
CMD ["java", "-jar", "/app/target/app.jar"]

But there’s a problem here. Image size has increased, because we’re also shipping all the build tools in the final image. We don’t need to deploy our build tools, because we certainly don’t need them at runtime. So… let’s use a multi-stage build.

FROM maven:3.6-jdk-8-alpine AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn -e -B dependency:resolve
COPY src ./src
RUN mvn -e -B package

FROM openjdk:8-jre-alpine
COPY --from=builder /app/target/app.jar /
CMD ["java", "-jar", "/app.jar"]

So now only the final stage (the build following the second FROM) will be pushed to your registry and deployed.

Multi-Stage Builds : Use Cases

• Separate build from runtime environment (reduces image size)
• When there are only slight variations between images
• When there are platform specific changes between builds
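
The second use case can be sketched with named stages and the `--target` flag of docker build. The stage names, the `app:debug` tag, and the extra `curl` package are illustrative assumptions; the build stage mirrors the Maven example above.

```dockerfile
FROM maven:3.6-jdk-8-alpine AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn -e -B dependency:resolve
COPY src ./src
RUN mvn -e -B package

# Shared runtime base for both variants.
FROM openjdk:8-jre-alpine AS runtime
COPY --from=builder /app/target/app.jar /
CMD ["java", "-jar", "/app.jar"]

# Debug variant with an extra troubleshooting tool:
#   docker build --target debug -t app:debug .
FROM runtime AS debug
RUN apk add --no-cache curl

# Release variant: being the last stage, this is what a plain
# `docker build .` produces.
FROM runtime AS release
```

Both variants reuse the cached builder and runtime stages, so the only extra work for the debug image is its one additional layer.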

Maintainability

If possible use official images:

• reduces time spent on maintenance
• written for containers, so usually best practices have been applied
• frequently updated with fixes

DON’TS

Don’t store secrets in your image!!

FROM baseimage
RUN ...
ENV AWS_ACCESS_KEY_ID=...
ENV AWS_SECRET_ACCESS_KEY=...
RUN ./fetch-s3-assets.sh
RUN ./build-script.sh

Even with a private registry, this is not a good idea. Things leak!

Using build arguments instead of environment variables is not a solution either!

STILL BAD:

$ docker build --build-arg AWS_ACCESS_KEY_ID=... .

Using build arguments allows you to NOT commit the values as environment variables in the final image. However, every RUN command that uses those build args has their values recorded in the Docker history. So they’re still there in the image, just not as environment variables.

A better solution is to use the --secret flag of docker build.

New Docker Build secret information

The new --secret flag for docker build allows the user to pass secret information to be used in the Dockerfile for building docker images in a safe way that will not end up stored in the final image.

id is the identifier to pass into docker build --secret. This identifier is matched against the id given to RUN --mount in the Dockerfile. Docker does not use the filename of where the secret is kept outside of the Dockerfile, since this may be sensitive information.

dst sets the path at which the secret file is made available to the RUN command in the Dockerfile.
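
Put together, the earlier S3 example might look like the sketch below. It requires BuildKit; the id `aws`, the source path, and the assumption that fetch-s3-assets.sh reads credentials from /root/.aws/credentials are all illustrative.

```dockerfile
# syntax=docker/dockerfile:1
FROM baseimage

# Build with BuildKit enabled, passing the secret from the host
# (hypothetical id and source path):
#   DOCKER_BUILDKIT=1 docker build --secret id=aws,src=$HOME/.aws/credentials .

# The secret is mounted at dst for this RUN only and is not written to a layer.
RUN --mount=type=secret,id=aws,dst=/root/.aws/credentials ./fetch-s3-assets.sh
RUN ./build-script.sh
```

If dst is omitted, the secret is mounted at /run/secrets/ followed by the id.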