Dockerfile: Creating Custom Images
Overview
A Dockerfile is a text file that contains instructions for building Docker images. It defines the environment, dependencies, and configuration needed to run your application in a container.
Dockerfile Basics
What is a Dockerfile?
A Dockerfile is a recipe for creating Docker images. It contains step-by-step instructions that Docker follows to build an image automatically.
Basic Structure
# Comment
INSTRUCTION arguments
Simple Example
# Use official Python runtime as base image
FROM python:3.11-slim
# Set working directory
WORKDIR /app
# Copy requirements file
COPY requirements.txt .
# Install dependencies
RUN pip install -r requirements.txt
# Copy application code
COPY . .
# Expose port
EXPOSE 8000
# Run application
CMD ["python", "app.py"]
Dockerfile Instructions
FROM - Base Image
# Official image
FROM ubuntu:20.04
# Specific version
FROM node:18.16.0
# Multi-stage build
FROM node:18 AS builder
FROM nginx:alpine AS runtime
# Scratch (empty base)
FROM scratch
RUN - Execute Commands
# Single command
RUN apt-get update
# Multiple commands (recommended)
RUN apt-get update && \
apt-get install -y \
curl \
wget \
vim && \
rm -rf /var/lib/apt/lists/*
# Exec (JSON array) form: runs the command directly, without a shell, so no variable expansion
RUN ["apt-get", "update"]
RUN ["apt-get", "install", "-y", "curl"]
COPY vs ADD
# COPY (recommended for simple file copying)
COPY source destination
COPY app.py /app/
COPY requirements.txt /app/
COPY . /app/
# ADD (has additional features)
ADD source destination
ADD https://example.com/file.tar.gz /app/ # Downloads from URL
ADD archive.tar.gz /app/ # Auto-extracts archives
# Best practice: Use COPY unless you need ADD's features
WORKDIR - Set Working Directory
# Set working directory
WORKDIR /app
# All subsequent commands run from this directory
COPY . .
RUN ls -la # Lists contents of /app
# Create directory if it doesn't exist
WORKDIR /app/data/logs
ENV - Environment Variables
# Set environment variables
ENV NODE_ENV=production
ENV PORT=3000
ENV DATABASE_URL=postgresql://localhost/mydb
# Multiple variables
ENV NODE_ENV=production \
PORT=3000 \
DEBUG=false
# Use variables
ENV APP_HOME=/app
WORKDIR $APP_HOME
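Values set with ENV are baked into the image but can be overridden per container at run time. A quick sketch (the image name is illustrative):
# Override an ENV value for a single container
docker run -e NODE_ENV=development myimage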
ARG - Build Arguments
# Define build arguments
ARG NODE_VERSION=18
ARG APP_ENV=production
# Use in FROM instruction
FROM node:${NODE_VERSION}
# Use in other instructions
ENV ENVIRONMENT=${APP_ENV}
# Build with arguments
# docker build --build-arg NODE_VERSION=16 .
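Note that an ARG declared before FROM is only in scope for the FROM line itself; to use its value again after FROM, it must be re-declared inside the stage. A minimal sketch:
ARG NODE_VERSION=18
FROM node:${NODE_VERSION}
# Re-declare to bring the value back into scope for later instructions
ARG NODE_VERSION
RUN echo "Built with Node ${NODE_VERSION}"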
EXPOSE - Document Ports
# Document which ports the container listens on
EXPOSE 80
EXPOSE 443
EXPOSE 8000/tcp
EXPOSE 53/udp
# Multiple ports
EXPOSE 80 443 8080
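EXPOSE only documents the ports; it does not publish them. Ports are published when the container is started (image name is illustrative):
# Map host port 8080 to container port 80
docker run -p 8080:80 myimage
# Or publish all EXPOSEd ports to random host ports
docker run -P myimage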
USER - Set User
# Create user
RUN groupadd -r appuser && useradd -r -g appuser appuser
# Switch to user
USER appuser
# Switch back to root if needed
USER root
# Use numeric IDs
USER 1000:1000
CMD vs ENTRYPOINT
# CMD - Default command (can be overridden)
CMD ["python", "app.py"]
CMD python app.py  # Shell form (runs via /bin/sh -c)
# ENTRYPOINT - Fixed command: docker run arguments are appended to it, not substituted for it (replace it only with --entrypoint)
ENTRYPOINT ["python", "app.py"]
# Combination (ENTRYPOINT + CMD)
ENTRYPOINT ["python"]
CMD ["app.py"]
# This allows: docker run myimage script.py
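The difference shows up when extra arguments are passed to docker run (image names are illustrative):
# With CMD only: anything after the image name replaces CMD entirely
docker run myimage python other_script.py
# With ENTRYPOINT ["python"] and CMD ["app.py"]: extra arguments replace only CMD
docker run myimage script.py    # effectively runs: python script.py
# ENTRYPOINT itself is replaced only when asked for explicitly
docker run --entrypoint /bin/sh -it myimage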
VOLUME - Mount Points
# Create mount points
VOLUME ["/data"]
VOLUME ["/var/log", "/var/db"]
# Or
VOLUME /data
VOLUME /var/log /var/db
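At run time, a declared mount point can be backed by a named volume (names are illustrative):
# Back the /data mount point with a named volume
docker run -v mydata:/data myimage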
LABEL - Metadata
# Add metadata
LABEL version="1.0"
LABEL description="My web application"
LABEL maintainer="yourname@example.com"
# Multiple labels
LABEL version="1.0" \
description="My web application" \
maintainer="yourname@example.com"
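Labels can be read back from a built image, for example (image name is illustrative):
# Show all labels on an image as JSON
docker inspect --format '{{ json .Config.Labels }}' myimage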
Real-World Examples
Node.js Application
# Use official Node.js runtime
FROM node:18-alpine
# Set working directory
WORKDIR /usr/src/app
# Copy package files
COPY package*.json ./
# Install dependencies
RUN npm ci --only=production  # newer npm versions prefer --omit=dev
# Create non-root user
RUN addgroup -g 1001 -S nodejs
RUN adduser -S nextjs -u 1001
# Copy application code
COPY --chown=nextjs:nodejs . .
# Switch to non-root user
USER nextjs
# Expose port
EXPOSE 3000
# Health check (busybox wget ships with alpine; curl does not)
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD wget -qO- http://localhost:3000/health || exit 1
# Start application
CMD ["npm", "start"]
Python Web Application
# Multi-stage build
FROM python:3.11-slim AS builder
# Install system dependencies
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential \
gcc && \
rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /app
# Copy requirements
COPY requirements.txt .
# Install Python dependencies
RUN pip wheel --no-cache-dir --no-deps --wheel-dir /app/wheels -r requirements.txt
# Production stage
FROM python:3.11-slim
# Create app user
RUN groupadd -r app && useradd -r -g app app
# Install system dependencies
RUN apt-get update && \
apt-get install -y --no-install-recommends \
curl && \
rm -rf /var/lib/apt/lists/*
# Copy wheels from builder stage
COPY --from=builder /app/wheels /wheels
COPY --from=builder /app/requirements.txt .
# Install Python dependencies
RUN pip install --no-cache /wheels/*
# Set working directory
WORKDIR /app
# Copy application code
COPY --chown=app:app . .
# Switch to app user
USER app
# Expose port
EXPOSE 8000
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:8000/health/ || exit 1
# Start application
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "myapp.wsgi:application"]
Java Spring Boot Application
# Multi-stage build
FROM maven:3.8-openjdk-17 AS builder
# Set working directory
WORKDIR /app
# Copy POM file
COPY pom.xml .
# Download dependencies (cached layer)
RUN mvn dependency:go-offline -B
# Copy source code
COPY src ./src
# Build application
RUN mvn clean package -DskipTests
# Runtime stage (the official openjdk images do not publish a 17 JRE variant; Temurin does)
FROM eclipse-temurin:17-jre
# Install curl for health checks
RUN apt-get update && \
apt-get install -y curl && \
rm -rf /var/lib/apt/lists/*
# Create app user
RUN groupadd -r spring && useradd -r -g spring spring
# Set working directory
WORKDIR /app
# Copy jar from builder stage
COPY --from=builder /app/target/*.jar app.jar
# Change ownership
RUN chown spring:spring app.jar
# Switch to app user
USER spring
# Expose port
EXPOSE 8080
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
CMD curl -f http://localhost:8080/actuator/health || exit 1
# JVM options
ENV JAVA_OPTS="-Xmx512m -Xms256m"
# Start application
ENTRYPOINT ["sh", "-c", "java $JAVA_OPTS -jar app.jar"]
.NET Application
# Multi-stage build
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src
# Copy project files
COPY *.csproj ./
RUN dotnet restore
# Copy source code
COPY . ./
RUN dotnet publish -c Release -o /app/publish
# Runtime stage
FROM mcr.microsoft.com/dotnet/aspnet:8.0 AS runtime
# Install curl for health checks
RUN apt-get update && \
apt-get install -y curl && \
rm -rf /var/lib/apt/lists/*
# The aspnet:8.0 image already includes a non-root 'app' user, so none needs to be created
WORKDIR /app
# Copy published app
COPY --from=build /app/publish .
# Change ownership
RUN chown -R app:app /app
# Switch to app user
USER app
# Expose port (ASP.NET Core 8 images listen on 8080 by default)
EXPOSE 8080
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \
CMD curl -f http://localhost:8080/health || exit 1
# Start application
ENTRYPOINT ["dotnet", "MyApp.dll"]
Multi-Stage Builds
Benefits of Multi-Stage Builds
- Smaller final images
- Separation of build and runtime dependencies
- Better security (no build tools in final image)
- Faster deployments
Advanced Multi-Stage Example
# Build stage
FROM node:18-alpine AS dependencies
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production && npm cache clean --force  # newer npm versions prefer --omit=dev
# Development dependencies
FROM node:18-alpine AS dev-dependencies
WORKDIR /app
COPY package*.json ./
RUN npm ci
# Build stage
FROM dev-dependencies AS build
COPY . .
RUN npm run build && npm run test
# Runtime stage
FROM node:18-alpine AS runtime
RUN addgroup -g 1001 -S nodejs
RUN adduser -S nextjs -u 1001
WORKDIR /app
# Copy dependencies
COPY --from=dependencies --chown=nextjs:nodejs /app/node_modules ./node_modules
# Copy built application
COPY --from=build --chown=nextjs:nodejs /app/dist ./dist
COPY --from=build --chown=nextjs:nodejs /app/package.json ./package.json
USER nextjs
EXPOSE 3000
CMD ["npm", "start"]
Best Practices
1. Use Official Base Images
# Good
FROM node:18-alpine
FROM python:3.11-slim
FROM eclipse-temurin:17-jre
# Avoid
FROM ubuntu:latest # Too generic, large size
2. Minimize Layers
# Bad (multiple layers)
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y wget
RUN rm -rf /var/lib/apt/lists/*
# Good (single layer)
RUN apt-get update && \
apt-get install -y \
curl \
wget && \
rm -rf /var/lib/apt/lists/*
3. Use .dockerignore
# .dockerignore file
node_modules
npm-debug.log
.git
.gitignore
README.md
.env
.nyc_output
coverage
.coverage
.pytest_cache
.DS_Store
*.pyc
*.pyo
*.pyd
__pycache__
.vscode
.idea
4. Leverage Build Cache
# Copy dependency files first (cached layer)
COPY package*.json ./
RUN npm install
# Copy source code last (changes frequently)
COPY . ./
5. Run as Non-Root User
# Create user
RUN groupadd -r appuser && useradd -r -g appuser appuser
# Set ownership
COPY --chown=appuser:appuser . /app
# Switch to user
USER appuser
6. Use Specific Tags
# Good
FROM node:18.16.0-alpine
# Bad
FROM node:latest
FROM node
7. Add Health Checks
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD curl -f http://localhost:3000/health || exit 1
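Once a container is running, the status reported by its health check can be inspected (container name is illustrative):
# Show the current health status of a container
docker inspect --format '{{ .State.Health.Status }}' mycontainer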
Common Patterns
Configuration Management
# Environment-specific configuration
ARG ENV=production
COPY config/config.${ENV}.json /app/config.json
# Template-based configuration (envsubst is provided by the gettext-base package on Debian/Ubuntu bases)
RUN apt-get update && apt-get install -y --no-install-recommends gettext-base && rm -rf /var/lib/apt/lists/*
COPY config/app.conf.template /app/
RUN envsubst < /app/app.conf.template > /app/app.conf
Secret Management
# syntax=docker/dockerfile:1
# Use build secrets (BuildKit); the syntax directive above must be the very first line of the Dockerfile
FROM alpine
RUN --mount=type=secret,id=mypassword \
cat /run/secrets/mypassword
# Build with: docker build --secret id=mypassword,src=./password.txt .
Caching Strategies
# Cache package installations
FROM python:3.11-slim
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
# Use build cache mounts
RUN --mount=type=cache,target=/root/.cache/pip \
pip install -r requirements.txt
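Cache mounts require BuildKit, which is the default builder in recent Docker releases; on older versions it can be enabled for a single build:
# Enable BuildKit explicitly on older Docker versions
DOCKER_BUILDKIT=1 docker build -t myapp .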
Building and Optimization
Build Commands
# Basic build
docker build -t myapp .
# Build with arguments
docker build --build-arg ENV=production -t myapp .
# Build specific stage
docker build --target runtime -t myapp .
# Build with custom Dockerfile
docker build -f Dockerfile.prod -t myapp .
# Build without cache
docker build --no-cache -t myapp .
Image Optimization
# Use alpine variants
FROM node:18-alpine
# Remove package manager caches
RUN apt-get update && \
apt-get install -y package && \
rm -rf /var/lib/apt/lists/*
# Use multi-stage builds
FROM builder AS runtime
# Remove unnecessary files
RUN rm -rf /tmp/* /var/tmp/*
Dockerfile Linting
Using hadolint
# Install hadolint (macOS, via Homebrew)
brew install hadolint
# Lint Dockerfile
hadolint Dockerfile
# Ignore specific rules
hadolint --ignore DL3008 Dockerfile
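If installing a binary is not convenient, hadolint can also be run from its official image:
# Lint a Dockerfile without installing hadolint locally
docker run --rm -i hadolint/hadolint < Dockerfile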
Common Issues
# DL3008: Pin versions in apt-get install
RUN apt-get install -y curl=7.68.0-1ubuntu2
# DL3009: Delete apt-get lists
RUN apt-get update && \
apt-get install -y curl && \
rm -rf /var/lib/apt/lists/*
# DL3006: Always tag the version of an image explicitly
FROM ubuntu:20.04
Debugging Dockerfiles
Debug Techniques
# Build up to specific stage
docker build --target debug -t myapp-debug .
# Run intermediate stage
docker run -it myapp-debug /bin/bash
# Check build history
docker history myapp
# Inspect final image
docker inspect myapp
Debug Dockerfile
FROM ubuntu:20.04 AS debug
# Install debugging tools
RUN apt-get update && \
apt-get install -y \
curl \
wget \
vim \
htop \
strace && \
rm -rf /var/lib/apt/lists/*
# Your application setup
FROM debug AS app
COPY . /app
WORKDIR /app
# Production stage
FROM ubuntu:20.04 AS production
COPY --from=app /app /app
WORKDIR /app
CMD ["./start.sh"]
Testing Dockerfiles
Test Structure
#!/bin/bash
# Test script
set -e
# Build image
docker build -t test-app .
# Test 1: Image builds successfully
echo "✓ Image builds successfully"
# Test 2: Application starts
docker run -d --name test-container test-app
sleep 5
# Test 3: Health check passes
docker exec test-container curl -f http://localhost:3000/health
# Test 4: Cleanup
docker stop test-container
docker rm test-container
echo "✓ All tests passed"
Next Steps
Creating effective Dockerfiles is essential for building efficient, secure, and maintainable container images. Start with simple examples and gradually incorporate advanced patterns as your applications grow more complex.