Dockerfile Complete Guide
A Dockerfile is a text file containing instructions to build Docker images. Master Dockerfile creation to build efficient, secure, and maintainable container images.
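What is a Dockerfile?
A Dockerfile is a script containing a series of instructions used to build a Docker image automatically. Instructions that change the filesystem (RUN, COPY, ADD) each create a new layer in the image; the other instructions only record metadata in the image configuration.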
Basic Structure:
# Comment
INSTRUCTION arguments
Dockerfile Instructions
FROM - Base Image
Sets the base image for subsequent instructions
# Use official image
FROM node:18-alpine
# Use specific version
FROM python:3.11-slim
# Use scratch (empty image)
FROM scratch
# Multi-stage build
FROM node:18-alpine AS builder
Best Practices:
- Use official images when possible
- Specify exact versions for reproducibility
- Use minimal base images (alpine, slim)
WORKDIR - Working Directory
Sets the working directory for subsequent instructions
# Set working directory
WORKDIR /app
# Creates directory if it doesn't exist
WORKDIR /app/src
# Use absolute paths
WORKDIR /usr/src/app
COPY vs ADD
COPY - Copy files/directories
# Copy single file
COPY package.json .
# Copy directory
COPY src/ ./src/
# Copy with ownership
COPY --chown=node:node package.json .
# Copy from build stage
COPY --from=builder /app/dist ./dist
ADD - Copy with additional features
# Copy and extract tar
ADD app.tar.gz /app/
# Download from URL (not recommended)
ADD https://example.com/file.tar.gz /app/
# Copy files (same as COPY)
ADD package.json .
When to use:
- COPY: For simple file copying (recommended)
- ADD: Only when you need auto-extraction or URL download
RUN - Execute Commands
Execute commands during image build
# Single command
RUN npm install
# Multiple commands (inefficient)
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean
# Multiple commands (efficient)
RUN apt-get update && \
apt-get install -y curl && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Using arrays (exec form)
RUN ["npm", "install"]
CMD - Default Command
Specify default command when container starts
# Shell form
CMD npm start
# Exec form (recommended)
CMD ["npm", "start"]
# With parameters
CMD ["node", "server.js"]
# Can be overridden at runtime
# docker run my-app python app.py
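The difference between the two forms can be reproduced in a plain shell. The sketch below only approximates what Docker does: shell form wraps the command in /bin/sh -c, so variables are expanded, while exec form starts the program directly and passes arguments through untouched. GREETING is a made-up variable used for illustration.

```shell
# GREETING is a hypothetical variable used only for this demo
GREETING=hello
export GREETING

# Shell form, e.g. CMD echo $GREETING, runs roughly as /bin/sh -c 'echo $GREETING'
shell_form=$(/bin/sh -c 'echo $GREETING')

# Exec form, e.g. CMD ["echo", "$GREETING"], starts the program directly:
# no shell is involved, so the argument arrives as literal text
exec_form=$(env echo '$GREETING')

echo "shell form prints: $shell_form"
echo "exec form prints:  $exec_form"
```

Because shell form makes /bin/sh the container's main process, signals such as SIGTERM from docker stop may not reach the application, which is one reason exec form is usually preferred.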
ENTRYPOINT - Entry Point
Configure container as executable
# Exec form (recommended)
ENTRYPOINT ["python", "app.py"]
# Shell form
ENTRYPOINT python app.py
# Combined with CMD
ENTRYPOINT ["python"]
CMD ["app.py"]
# Runtime: docker run my-app script.py
# Executes: python script.py
ENTRYPOINT vs CMD:
- ENTRYPOINT: Always executes and receives runtime arguments; it can only be replaced with docker run --entrypoint
- CMD: Can be overridden by runtime arguments
- Both: ENTRYPOINT + CMD for flexible defaults
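How the two instructions combine can be modelled in a few lines of shell. This is a simplified sketch, not Docker's implementation: compose_command is a hypothetical helper that treats ENTRYPOINT and CMD as plain strings and ignores the docker run --entrypoint flag.

```shell
# compose_command ENTRYPOINT DEFAULT_CMD [runtime args...]
# Hypothetical helper: prints the command line a container would start with.
compose_command() {
  entrypoint=$1
  default_cmd=$2
  shift 2
  if [ "$#" -gt 0 ]; then
    # Arguments given to `docker run image ...` replace CMD
    echo "$entrypoint $*"
  else
    # No runtime arguments: CMD supplies the defaults
    echo "$entrypoint $default_cmd"
  fi
}

compose_command python app.py             # prints: python app.py
compose_command python app.py script.py   # prints: python script.py
```

The second call mirrors docker run my-app script.py from the example above: the ENTRYPOINT stays, and only the CMD part is swapped out.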
ENV - Environment Variables
Set environment variables
# Single variable
ENV NODE_ENV=production
# Multiple variables
ENV NODE_ENV=production \
PORT=3000 \
DEBUG=false
# Using in other instructions
ENV APP_HOME=/app
WORKDIR $APP_HOME
ARG - Build Arguments
Define build-time variables
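# Define argument (an ARG declared before FROM is only visible to FROM)
ARG VERSION=18
# Use in FROM
FROM node:${VERSION}-alpine
# Declare inside the stage to use the value after FROM
ARG BUILD_DATE
LABEL build-date=${BUILD_DATE}
# Build with arguments (quote the date so spaces do not split the argument)
# docker build --build-arg VERSION=20 --build-arg BUILD_DATE="$(date -u +%Y-%m-%dT%H:%M:%SZ)" .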
EXPOSE - Document Ports
Document which ports the container listens on
# Single port
EXPOSE 3000
# Multiple ports
EXPOSE 3000 8080
# With protocol
EXPOSE 3000/tcp
EXPOSE 53/udp
# Note: EXPOSE doesn't publish ports
# Use -p flag: docker run -p 3000:3000 my-app
VOLUME - Mount Points
Create mount points for external volumes
# Single volume
VOLUME /data
# Multiple volumes
VOLUME ["/data", "/logs"]
# Best practice: Use at runtime instead
# docker run -v my-data:/data my-app
USER - Switch User
Set user for subsequent instructions
# Create user and switch
RUN addgroup -g 1001 -S nodejs && \
adduser -S nextjs -u 1001
USER nextjs
# Switch to existing user
USER node
# Use numeric ID
USER 1001:1001
LABEL - Metadata
Add metadata to image
# Single label
LABEL version="1.0"
# Multiple labels
LABEL version="1.0" \
description="My application" \
maintainer="developer@example.com"
# Standard labels
LABEL org.opencontainers.image.title="My App"
LABEL org.opencontainers.image.version="1.0.0"
Multi-Stage Builds
Build efficient images by using multiple FROM statements
Basic Multi-Stage Example
# Build stage
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
# Production stage
FROM node:18-alpine AS production
WORKDIR /app
RUN addgroup -g 1001 -S nodejs && \
adduser -S nextjs -u 1001
COPY --from=builder /app/node_modules ./node_modules
COPY . .
USER nextjs
EXPOSE 3000
CMD ["npm", "start"]
Advanced Multi-Stage Example
# Base stage with common dependencies
FROM node:18-alpine AS base
WORKDIR /app
COPY package*.json ./
# Development dependencies
FROM base AS dev-deps
RUN npm ci
# Production dependencies
FROM base AS prod-deps
RUN npm ci --only=production
# Build stage
FROM dev-deps AS build
COPY . .
RUN npm run build
# Test stage
FROM dev-deps AS test
COPY . .
RUN npm test
# Production stage
FROM node:18-alpine AS production
WORKDIR /app
RUN addgroup -g 1001 -S nodejs && \
adduser -S nextjs -u 1001
COPY --from=prod-deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
USER nextjs
EXPOSE 3000
CMD ["node", "dist/server.js"]
Build Specific Stage
# Build only test stage
docker build --target test -t my-app:test .
# Build production stage
docker build --target production -t my-app:prod .
Best Practices
Image Size Optimization
1. Use Minimal Base Images
# Good: Alpine-based images
FROM node:18-alpine
FROM python:3.11-alpine
# Better: Distroless images
FROM gcr.io/distroless/nodejs18-debian11
# Best: Scratch for static binaries
FROM scratch
2. Minimize Layers
# Bad: Multiple RUN instructions
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean
# Good: Single RUN instruction
RUN apt-get update && \
apt-get install -y curl && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
3. Use .dockerignore
node_modules
npm-debug.log
.git
.gitignore
README.md
.env
.nyc_output
coverage
Security Best Practices
1. Don't Run as Root
# Create non-root user
RUN addgroup -g 1001 -S appgroup && \
adduser -S appuser -u 1001 -G appgroup
# Switch to non-root user
USER appuser
2. Use Specific Versions
# Bad: Latest tag
FROM node:latest
# Good: Specific version
FROM node:18.17.0-alpine
3. Minimize Attack Surface
# Install only what is needed and drop the apt package lists
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
# For a final image with no package manager at all, use a distroless or scratch base
Performance Optimization
1. Order Instructions by Change Frequency
# Dependencies change less frequently
COPY package*.json ./
RUN npm ci --only=production
# Source code changes more frequently
COPY . .
2. Use Build Cache Effectively
# Cache npm dependencies
COPY package*.json ./
RUN npm ci
# Copy source code after dependencies
COPY . .
RUN npm run build
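Why this ordering helps can be shown with a toy model of layer caching. The sketch below is not how Docker computes its cache keys; it only assumes that a step is reused when its own inputs and every earlier step are unchanged. layer_key and the demo files are hypothetical.

```shell
# layer_key PREVIOUS_KEY FILE...
# Toy cache key: depends on the previous layer's key and this step's input files.
layer_key() {
  prev=$1
  shift
  { printf '%s\n' "$prev"; cat "$@"; } | cksum | cut -d' ' -f1
}

demo=$(mktemp -d)
printf '{"name":"demo"}\n' > "$demo/package.json"
printf 'console.log(1)\n' > "$demo/index.js"

# Step 1 copies package.json, step 2 copies the source
deps_before=$(layer_key base "$demo/package.json")
src_before=$(layer_key "$deps_before" "$demo/index.js")

# Edit only the source file and recompute both keys
printf 'console.log(2)\n' > "$demo/index.js"
deps_after=$(layer_key base "$demo/package.json")
src_after=$(layer_key "$deps_after" "$demo/index.js")

[ "$deps_before" = "$deps_after" ] && echo "dependency layer: cache hit"
[ "$src_before" != "$src_after" ] && echo "source layer: rebuilt"
```

In this model a source edit leaves the dependency step untouched, so the expensive npm ci would be served from cache. Copying the source first would change the key of every step after it.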
3. Leverage Multi-Stage Builds
# Build stage with dev dependencies
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Production stage with only runtime dependencies
FROM node:18-alpine AS production
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY --from=builder /app/dist ./dist
Real-World Examples
Node.js Application
# Multi-stage build for Node.js app
FROM node:18-alpine AS base
WORKDIR /app
COPY package*.json ./
# Install dependencies
FROM base AS deps
RUN npm ci --only=production && npm cache clean --force
# Build application
FROM base AS build
RUN npm ci
COPY . .
RUN npm run build
# Production image
FROM node:18-alpine AS production
WORKDIR /app
# Create non-root user
RUN addgroup -g 1001 -S nodejs && \
adduser -S nextjs -u 1001
# Copy dependencies and built application
COPY --from=deps --chown=nextjs:nodejs /app/node_modules ./node_modules
COPY --from=build --chown=nextjs:nodejs /app/dist ./dist
COPY --chown=nextjs:nodejs package*.json ./
# Switch to non-root user
USER nextjs
# Expose port and add health check (Alpine ships BusyBox wget; curl is not installed)
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD wget -q -O /dev/null http://localhost:3000/health || exit 1
# Start application
CMD ["node", "dist/server.js"]
Python Flask Application
# Use Python slim image
FROM python:3.11-slim AS base
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
PIP_NO_CACHE_DIR=1 \
PIP_DISABLE_PIP_VERSION_CHECK=1
# Install system dependencies
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential \
curl && \
rm -rf /var/lib/apt/lists/*
# Create non-root user
RUN useradd --create-home --shell /bin/bash app
# Set working directory
WORKDIR /app
# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY --chown=app:app . .
# Switch to non-root user
USER app
# Expose port
EXPOSE 5000
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:5000/health || exit 1
# Start application
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--workers", "4", "app:app"]
Go Application
# Build stage
FROM golang:1.21-alpine AS builder
# Install git (required for go modules)
RUN apk add --no-cache git
# Set working directory
WORKDIR /app
# Copy go mod files
COPY go.mod go.sum ./
# Download dependencies
RUN go mod download
# Copy source code
COPY . .
# Build binary
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o main .
# Production stage
FROM scratch
# Copy CA certificates for HTTPS
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
# Copy binary
COPY --from=builder /app/main /main
# Expose port
EXPOSE 8080
# Run binary
ENTRYPOINT ["/main"]
Advanced Dockerfile Features
Health Checks
# Basic health check
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
CMD curl -f http://localhost:3000/health || exit 1
# Custom health check script
COPY healthcheck.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/healthcheck.sh
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD /usr/local/bin/healthcheck.sh
# Disable inherited health check
HEALTHCHECK NONE
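The custom script referenced above is not shown, so here is one possible shape for it. The file name healthcheck.sh comes from the Dockerfile; the HEALTH_URL variable, the check_health function, and the /health endpoint on port 3000 are assumptions. Docker treats exit status 0 as healthy and 1 as unhealthy.

```shell
#!/bin/sh
# Sketch of healthcheck.sh; HEALTH_URL and check_health are assumed names.
HEALTH_URL="${HEALTH_URL:-http://localhost:3000/health}"

check_health() {
  # -f: treat HTTP errors (status >= 400) as failure
  # -s: no progress output; --max-time: give up instead of hanging
  curl -fs --max-time 5 "$1" > /dev/null
}

if check_health "$HEALTH_URL"; then
  status=0   # healthy
else
  status=1   # unhealthy
fi
echo "health status: $status"
```

In the real script, finish with exit "$status" so that Docker sees the result. The script also needs curl to be present in the image.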
Build Arguments and Environment Variables
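# Build arguments (declared before FROM, so only FROM can see them)
ARG NODE_VERSION=18
ARG APP_VERSION=1.0.0
FROM node:${NODE_VERSION}-alpine
# Redeclare without a value to bring the argument into this stage
ARG APP_VERSION
# Environment variables
ENV NODE_ENV=production \
    APP_VERSION=${APP_VERSION} \
    PORT=3000
# Use in labels
LABEL version=${APP_VERSION}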
Conditional Instructions
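# Using shell conditions (NODE_ENV must be declared as ARG or ENV to be visible in RUN)
ARG NODE_ENV=production
RUN if [ "$NODE_ENV" = "development" ]; then \
        npm install; \
    else \
        npm ci --only=production; \
    fi
# Using build arguments (build-essential stands in for whatever dev tooling you need)
ARG INSTALL_DEV=false
RUN if [ "$INSTALL_DEV" = "true" ]; then \
        apt-get update && \
        apt-get install -y --no-install-recommends build-essential && \
        rm -rf /var/lib/apt/lists/*; \
    fi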
Dockerfile Linting and Validation
Using Hadolint
# Install hadolint
brew install hadolint
# Lint Dockerfile
hadolint Dockerfile
# Ignore specific rules
hadolint --ignore DL3008 --ignore DL3009 Dockerfile
Common Dockerfile Issues
- DL3008: Pin versions in apt-get install
- DL3009: Delete apt-get lists after installing
- DL3015: Avoid additional packages by specifying --no-install-recommends
- DL4006: Set SHELL option -o pipefail before RUN with a pipe
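DL4006 is easier to appreciate with a quick experiment. The sketch below assumes bash is installed and shows that, without pipefail, a failing command on the left side of a pipe goes unnoticed.

```shell
# Without pipefail the pipeline reports the status of its last command (cat)
bash -c 'false | cat'
without=$?

# With pipefail the failure of `false` is propagated to the pipeline
bash -c 'set -o pipefail; false | cat'
with=$?

echo "without pipefail: exit $without"   # exit 0
echo "with pipefail:    exit $with"      # exit 1
```

In a Dockerfile the equivalent is to place SHELL ["/bin/bash", "-o", "pipefail", "-c"] before the RUN instruction that uses a pipe, provided the base image ships bash.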
Testing Dockerfiles
Build and Test
# Build image
docker build -t my-app:test .
# Test image
docker run --rm my-app:test
# Test with different build args
docker build --build-arg NODE_ENV=development -t my-app:dev .
# Test specific stage
docker build --target test -t my-app:test .
Image Analysis
# Analyze image layers
docker history my-app:latest
# Check image size
docker images my-app
# Inspect image configuration
docker inspect my-app:latest
# Scan for vulnerabilities
docker scout cves my-app:latest
Troubleshooting Dockerfile Issues
Common Problems
Build Context Too Large
# Use .dockerignore
echo "node_modules" >> .dockerignore
echo ".git" >> .dockerignore
# Check build context size (BuildKit reports it as "transferring context")
docker build --progress=plain .
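The context size can also be estimated before starting a build, without Docker. The sketch below is a rough upper bound that ignores .dockerignore rules; context_size_kb and largest_entries are hypothetical helper names.

```shell
# Print the size of a directory in kilobytes (upper bound for the build context)
context_size_kb() {
  du -sk "$1" | cut -f1
}

# List the five largest top-level entries, biggest first (hidden entries included)
largest_entries() {
  du -sk "$1"/* "$1"/.[!.]* 2>/dev/null | sort -rn | head -n 5
}

echo "context size: $(context_size_kb .) KB"
largest_entries .
```

Entries that dominate this list, typically node_modules or .git, are good candidates for .dockerignore.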
Layer Caching Issues
# Disable cache
docker build --no-cache .
# Clear build cache
docker builder prune
Permission Issues
# Fix ownership
COPY --chown=user:group files /app/
# Set permissions
RUN chmod +x /app/script.sh
Ready to build efficient Docker images? Start with simple Dockerfiles and gradually implement advanced techniques!