Docker & Containerization: A Complete Guide

Let me tell you something - before containers, deploying applications was a nightmare. You’d write code on your laptop, it works perfectly, then you send it to the server and boom - nothing works. “But it works on my machine!” became the most frustrating phrase in development. Docker changed all of that.

What is Containerization?

Think of containerization like shipping containers in the real world. Before standardized shipping containers, moving goods was chaos - different sizes, different handling methods, complete mess. Then someone said “what if everything went into standard boxes?” Game changer.

That’s exactly what containerization does for software. It packages your application with everything it needs to run - code, runtime, system tools, libraries - into a standardized unit called a container. Ship it anywhere, and it’ll work the same way.

Why does this matter?

  • Consistency: the same container behaves the same on your laptop, the CI server, and production.
  • Isolation: each app ships its own dependencies, so version conflicts between projects disappear.
  • Portability: anything with a container runtime can run it - no more hand-configured servers.

Images and Containers: Understanding the Difference

This confuses people at first, but it’s simple:

Docker Image = The blueprint

Docker Container = The actual running thing

When you run docker run redis, you’re taking the Redis image and creating a running container from it.
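
One image can back any number of containers. A quick way to see the difference for yourself (the container names are just examples, and -d runs them in the background - covered below):

docker pull redis                    # download the image once
docker run -d --name cache-a redis   # container #1 from that image
docker run -d --name cache-b redis   # container #2 from the same image
docker ps                            # two running containers, one shared image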

Layers: The Secret Sauce of Docker Images

Here’s where Docker gets clever. Images aren’t just one big blob - they’re made of layers, stacked on top of each other like a cake.

┌─────────────────────────────┐
│   Your Application Code     │  ← Layer 4 (2 MB)
├─────────────────────────────┤
│   Application Dependencies  │  ← Layer 3 (150 MB)
├─────────────────────────────┤
│   Runtime (Node/Python/etc) │  ← Layer 2 (200 MB)
├─────────────────────────────┤
│   Base OS (Alpine Linux)    │  ← Layer 1 (5 MB)
└─────────────────────────────┘

Why is this brilliant?

  1. Storage Efficiency: If you have 10 Node.js apps, they all share the same Node.js layer. Only store it once.

  2. Fast Builds: Changed one line of code? Docker only rebuilds the top layer, not everything.

  3. Fast Downloads: Pulling an image? If you already have some layers, Docker only downloads what’s new.

Each layer is read-only and cached. When you modify something, Docker creates a new layer on top. This is why the order of commands in a Dockerfile matters (we’ll get to that).
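
You can see the layers of any image with docker history - one row per layer, with its size and the instruction that created it:

docker pull redis:7.0-alpine
docker history redis:7.0-alpine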

Container Repositories and Docker Hub

You can’t email Docker images around (well, you could, but don’t). Instead, we use container repositories - like GitHub but for Docker images.

Docker Hub is the public repository. It’s got:

  • Official images for common software (postgres, redis, node, nginx, and so on)
  • Community-published images for just about everything else
  • Your own repositories, public or private

Think of it as npm or PyPI for containers. When you run docker pull postgres, it’s grabbing that image from Docker Hub by default.

Running Your First Container

Let’s get practical. Here’s the simplest way to run PostgreSQL:

docker run -e POSTGRES_PASSWORD=mysecret postgres

What just happened?

  1. Docker checked if you have the postgres image locally
  2. Didn’t find it? Downloaded it from Docker Hub
  3. Created a container from that image
  4. Set an environment variable POSTGRES_PASSWORD
  5. Started PostgreSQL inside the container

Your terminal is now attached to the container’s output. Press Ctrl+C and the container stops.

Check what’s running:

docker ps

This shows all running containers - like the ps command in Linux, but for containers.

Docker vs Virtual Machines

“Wait, isn’t this just VMs?” No. This is where people get confused, so let’s clear it up.

Your operating system has two main layers:

  1. Kernel - talks to hardware, manages memory, processes, etc.
  2. Application Layer - where your apps run

Virtual Machines:

  • Virtualize the whole operating system - each VM brings its own kernel plus application layer
  • Images are gigabytes in size and take minutes to boot

Docker Containers:

  • Virtualize only the application layer and share the host’s kernel
  • Images are megabytes in size and start in seconds

┌─────────────────────────────────────┐
│           Applications              │
├──────────────┬──────────────────────┤
│   Container  │  Container │Container│  ← Containers
├──────────────┴──────────────────────┤
│         Docker Engine               │
├─────────────────────────────────────┤
│        Host OS Kernel               │  ← Shared kernel
├─────────────────────────────────────┤
│         Infrastructure              │
└─────────────────────────────────────┘

vs.

┌─────────────────────────────────────┐
│    App  │  App  │  App              │
├─────────┼───────┼───────────────────┤
│   OS    │  OS   │  OS               │  ← Each VM has full OS
├─────────┼───────┼───────────────────┤
│      Hypervisor                     │
├─────────────────────────────────────┤
│        Host OS                      │
├─────────────────────────────────────┤
│         Infrastructure              │
└─────────────────────────────────────┘

The Linux Catch: Docker containers use Linux kernel features. So on Mac and Windows, you need Docker Desktop, which runs a lightweight Linux VM in the background. Your containers run inside that VM. It’s still way more efficient than traditional VMs.

Docker Architecture: How It All Works

Docker isn’t just one thing - it’s a system with multiple components working together:

┌─────────────┐
│   Docker    │  ← What you interact with
│     CLI     │     (docker run, docker ps, etc.)
└──────┬──────┘
       │ REST API

┌─────────────────────────────────────┐
│        Docker Server (Daemon)        │
│  ┌───────────────────────────────┐  │
│  │   Container Runtime           │  │ ← Actually runs containers
│  │   (containerd, runc)          │  │    Alternatives: CRI-O
│  └───────────────────────────────┘  │
│  ┌───────────────────────────────┐  │
│  │   Image Builder (BuildKit)    │  │ ← Builds images from Dockerfile
│  └───────────────────────────────┘  │
│  ┌───────────────────────────────┐  │
│  │   Volumes                     │  │ ← Persistent storage
│  └───────────────────────────────┘  │
│  ┌───────────────────────────────┐  │
│  │   Networks                    │  │ ← Container networking
│  └───────────────────────────────┘  │
└─────────────────────────────────────┘

Components breakdown:

  1. Docker CLI: The command-line tool you use. It’s actually just a client that sends requests to the Docker daemon.

  2. Docker Daemon (dockerd): The server that does the heavy lifting. Listens for API requests and manages everything.

  3. Container Runtime: Actually executes containers. Docker uses containerd by default, but you could swap it for CRI-O if needed.

  4. Volumes: Handles persistent storage (more on this later).

  5. Networks: Manages how containers talk to each other and the outside world.

  6. BuildKit: Modern image building system (faster, better caching).
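
You can see the client/server split for yourself: docker version prints a separate Client section (the CLI) and Server section (the daemon and its container runtime), and docker info reports daemon-level details.

docker version   # Client and Server sections listed separately
docker info      # storage driver, runtimes, networks, and more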

Essential Docker Commands

Let’s go through the commands you’ll use daily:

Pulling Images

docker pull redis

Downloads the image to your machine without running it. Good for pre-downloading images.

Listing Images

docker images

Shows all images you have locally.

Image Tags/Versions

Images have tags (versions). Default is latest, but you should be specific:

docker pull redis:7.0-alpine
docker pull postgres:15.2

Running Containers

docker run redis

Creates and starts a container. But this blocks your terminal. Better:

docker run -d redis

The -d flag runs it in detached mode (background).

Checking Running Containers

docker ps          # Running only
docker ps -a       # All containers (running + stopped)

Stopping Containers

docker stop abc123  # Use container ID or name

Gracefully stops the container (sends SIGTERM, waits, then SIGKILL if needed).

Starting Stopped Containers

docker start abc123

Starts an existing, stopped container again. Unlike docker run, it doesn’t create a new container - the old one comes back with its data and configuration.

Port Mapping: Critical Concept

Containers have their own network space. If Redis runs on port 6379 inside the container, you can’t access it from your laptop unless you expose it.

docker run -p 3000:6379 redis

Format: -p HOST_PORT:CONTAINER_PORT

Now Redis is accessible at localhost:3000 on your machine, but it’s still 6379 inside the container.

Why different ports? You might run multiple Redis containers for different projects:

docker run -p 3001:6379 redis  # Project A
docker run -p 3002:6379 redis  # Project B
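
If you forget which host port a container publishes, docker port lists its mappings (abc123 stands in for a real container ID or name):

docker port abc123   # prints something like: 6379/tcp -> 0.0.0.0:3001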

Viewing Logs

docker logs abc123
docker logs -f abc123  # Follow mode (like tail -f)

Naming Containers

docker run --name my-redis redis

Now you can use the name instead of the ID:

docker stop my-redis
docker logs my-redis

Executing Commands in Running Containers

docker exec -it abc123 /bin/bash

Super useful for debugging. You’re now inside the container - poke around, check files, whatever you need.

Force Killing Containers

docker kill abc123

Immediately stops (SIGKILL). Use when docker stop doesn’t work.

Cleaning Up

docker rm container_id    # Remove stopped container
docker rmi image_id       # Remove image

Docker in CI/CD

Here’s where Docker shines in real-world development:

Traditional CI/CD problem:

  • Build and deployment servers need the right runtimes, package managers, and system libraries pre-installed - and kept in sync.
  • Environments drift apart over time, so a build that passes in CI can still fail in production.

With Docker:

  1. Developer writes code and Dockerfile
  2. CI server pulls code
  3. CI server runs: docker build -t myapp:1.0 .
  4. CI server runs tests in a container
  5. If tests pass, push image to registry
  6. Deployment server pulls and runs the image

Everything is in the image. The CI server doesn’t need anything pre-installed except Docker itself.
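
A minimal sketch of such a pipeline, assuming a placeholder registry URL and a $GIT_SHA variable provided by the CI system:

# Build the image, tagged with the commit SHA
docker build -t registry.example.com/myapp:$GIT_SHA .

# Run the test suite inside the freshly built image
docker run --rm registry.example.com/myapp:$GIT_SHA npm test

# Push only if the tests passed
docker push registry.example.com/myapp:$GIT_SHA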

Docker Networks: Making Containers Talk

By default, containers are isolated. If you have a Node.js app and a MongoDB database in separate containers, they can’t talk to each other. Networks solve this.

Creating a Network

docker network create test-network

Running Containers on Same Network

# Start MongoDB
docker run -d \
  --name db \
  --net test-network \
  -e MONGO_INITDB_ROOT_USERNAME=admin \
  -e MONGO_INITDB_ROOT_PASSWORD=password \
  mongo

# Start Mongo Express (web UI for MongoDB)
docker run -d \
  --name mongo-ui \
  --network test-network \
  -e ME_CONFIG_MONGODB_SERVER=db \
  -e ME_CONFIG_MONGODB_ADMINUSERNAME=admin \
  -e ME_CONFIG_MONGODB_ADMINPASSWORD=password \
  -p 8081:8081 \
  mongo-express

The magic: Mongo Express connects to MongoDB using the hostname db - which is the container name. Docker’s internal DNS resolves this.
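
If the UI can’t reach the database, a quick sanity check is to confirm both containers actually joined the network (this assumes the container names from the example above):

docker network inspect test-network   # lists the attached containers and their internal IPs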

Listing Networks

docker network ls

You’ll see default networks: bridge, host, none.

Docker Compose: Sanity for Multi-Container Apps

Running two containers manually is annoying. Imagine 10 microservices. Docker Compose lets you define everything in a YAML file.

Here’s the same MongoDB + Mongo Express setup:

version: '3.8'

services:
  mongodb:
    image: mongo
    container_name: db
    ports:
      - "27017:27017"
    environment:
      - MONGO_INITDB_ROOT_USERNAME=admin
      - MONGO_INITDB_ROOT_PASSWORD=password
    networks:
      - test-network

  mongo-express:
    image: mongo-express
    container_name: mongo-ui
    ports:
      - "8081:8081"
    restart: always  # Restart if crashes (useful if MongoDB isn't ready yet)
    environment:
      - ME_CONFIG_MONGODB_SERVER=db
      - ME_CONFIG_MONGODB_ADMINUSERNAME=admin
      - ME_CONFIG_MONGODB_ADMINPASSWORD=password
    networks:
      - test-network
    depends_on:
      - mongodb

networks:
  test-network:
    driver: bridge

Breaking it down:

  • services - each entry (mongodb, mongo-express) becomes its own container.
  • image and container_name - which image to run and what to call the container.
  • ports - host:container mappings, the same as -p on the command line.
  • environment - environment variables, the same as -e.
  • depends_on - start mongodb before mongo-express.
  • networks - both services join test-network, so they can reach each other by name.

Running it:

docker-compose up         # Start in foreground
docker-compose up -d      # Start in background
docker-compose down       # Stop and remove containers
docker-compose logs -f    # View logs

Custom file name:

docker-compose -f custom-compose.yml up
docker-compose -f custom-compose.yml down

Docker Compose automatically creates a network for your services, so they can communicate using service names as hostnames.

Dockerfile: Building Your Own Images

A Dockerfile is a recipe for building an image. Let’s create one for a Node.js app:

# Start from an official base image
FROM node:18-alpine

# Set environment variables
ENV NODE_ENV=production \
    PORT=3000

# Set working directory in container
WORKDIR /app

# Copy package files first (better caching - explained below)
COPY package*.json ./

# Install dependencies
RUN npm ci --only=production

# Copy application code
COPY . .

# Expose port (documentation, doesn't actually publish)
EXPOSE 3000

# Default command to run
CMD ["node", "server.js"]

Command explanations:

  • FROM - the base image to build on.
  • ENV - environment variables baked into the image.
  • WORKDIR - sets (and creates) the working directory for the instructions that follow.
  • COPY - copies files from the build context into the image.
  • RUN - executes a command at build time, producing a new layer.
  • EXPOSE - documents the port the app listens on; it doesn’t publish anything.
  • CMD - the default command when a container starts from this image.

Building the image:

docker build -t myapp:1.0 .

Docker sends all files in the build context to the daemon. That’s why .dockerignore is important.

Layer Caching Optimization

Look at this Dockerfile order:

COPY package*.json ./
RUN npm install
COPY . .

Why copy package.json separately? Caching.

Each instruction creates a layer. Docker caches layers. If nothing changed in a layer, Docker reuses the cached version.

Scenario 1: You change application code

package*.json didn’t change, so the COPY package*.json layer and the npm install layer are reused from cache. Only the final COPY . . layer is rebuilt - the build takes seconds.

Scenario 2: If we copied everything first

COPY . .              # Code changed - rebuild
RUN npm install       # Have to reinstall all dependencies - slow!

Every time you change a single line of code, you’d reinstall all dependencies. With the optimized order, npm install is cached.

.dockerignore: Don’t Send Everything

Create a .dockerignore file to exclude stuff from the build context:

node_modules
npm-debug.log
.git
.gitignore
README.md
.env
.env.local
.DS_Store
Dockerfile
docker-compose.yml
.dockerignore
dist
build
coverage
.cache
*.log

Why?

  • Smaller build context - node_modules and .git can be huge, and everything in the context gets sent to the daemon on every build.
  • No secrets in the image - .env files never end up in a layer.
  • Better caching - changes to irrelevant files don’t invalidate layers.

Multi-Stage Builds: Build vs Runtime

Some files are needed to build your app but not to run it. For example:

  • devDependencies (TypeScript, bundlers, test frameworks)
  • Compilers and other build toolchains
  • Source files that get compiled into dist/

Multi-stage builds solve this:

# Stage 1: Build
FROM node:18-alpine AS builder

WORKDIR /app

COPY package*.json ./
RUN npm install  # Install ALL dependencies including devDependencies

COPY . .
RUN npm run build  # Build your app

# Stage 2: Runtime
FROM node:18-alpine

WORKDIR /app

COPY package*.json ./
RUN npm ci --only=production  # Only production dependencies

# Copy built artifacts from builder stage
COPY --from=builder /app/dist ./dist

EXPOSE 3000

CMD ["node", "dist/server.js"]

What happens:

  1. First stage installs everything and builds
  2. Second stage starts fresh with a clean image
  3. Only copies the built artifacts and runtime dependencies
  4. Build tools and devDependencies are left behind

Final image is much smaller and doesn’t contain build tools.
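
When you need to debug the build stage, you can build just that stage with --target (the tag names here are arbitrary):

docker build --target builder -t myapp:build .   # stop after the "builder" stage
docker build -t myapp:1.0 .                      # full multi-stage build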

Security: Don’t Run as Root

By default, containers run as root. That’s a security risk. If someone exploits your app and breaks out of the container, they have root access.

Create a non-privileged user:

FROM node:18-alpine

# Create app user
RUN addgroup -g 1001 -S appgroup && \
    adduser -u 1001 -S appuser -G appgroup

WORKDIR /app

# Copy files as root
COPY package*.json ./
RUN npm ci --only=production

COPY . .

# Change ownership to app user
RUN chown -R appuser:appgroup /app

# Switch to non-root user
USER appuser

EXPOSE 3000

CMD ["node", "server.js"]

Some base images (like node) already include a non-root user:

FROM node:18-alpine

WORKDIR /app

COPY --chown=node:node package*.json ./
RUN npm ci --only=production

COPY --chown=node:node . .

USER node

CMD ["node", "server.js"]

Private Docker Registries

In companies, you don’t push images to public Docker Hub. You use private registries.

AWS ECR (Elastic Container Registry)

Image naming in registries:

registryDomain/imageName:tag

Examples:

  • docker pull nginx really means docker pull docker.io/library/nginx:latest - Docker Hub is the implicit default registry.
  • 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:1.0 - the same pattern with an explicit private registry domain.

Complete ECR workflow:

  1. Create repository in ECR

    • One repo per application
    • Each repo contains different tags of the same image
  2. Authenticate Docker to ECR

aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin \
  123456789.dkr.ecr.us-east-1.amazonaws.com

  3. Tag your local image

docker tag myapp:1.0 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:1.0

The docker tag command creates a copy/alias of an image with a new name. You’re not duplicating data, just creating a new reference.

  4. Push to ECR

docker push 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:1.0

Version management: When you update your app, push with a new tag:

docker tag myapp:1.1 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:1.1
docker push 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:1.1

Keep old versions for rollback capability.

Nexus as Private Registry

Nexus is another popular private registry option, especially in enterprises.

Setup Nexus:

  1. Create Docker hosted repository

    • Enable HTTP (not HTTPS for simplicity, but use HTTPS in production)
    • Choose a port (e.g., 8082, different from Nexus UI port 8081)
  2. Configure security groups

    • Allow traffic on the Docker repository port
    • Allow Nexus UI port
  3. Create role and user

    • Create role with Docker view/upload permissions
    • Create user and assign the role
  4. Enable Docker Bearer Token

    • Go to Security → Realms
    • Activate “Docker Bearer Token Realm”
  5. Configure Docker daemon to allow insecure registry

On your machine, edit /etc/docker/daemon.json:

{
  "insecure-registries": ["your-nexus-ip:8082"]
}

Restart Docker:

sudo systemctl restart docker

  6. Login and push

docker login your-nexus-ip:8082
# Enter username and password

docker tag myapp:1.0 your-nexus-ip:8082/myapp:1.0
docker push your-nexus-ip:8082/myapp:1.0

Docker Volumes: Persistent Data

Containers are ephemeral - delete one, and its data is gone. For databases and stateful apps, this is a problem.

The problem:

docker run --name db postgres
# Add some data to the database
docker rm db
# All data is gone!

The solution: Volumes

Volumes are directories on your host system that are mounted into containers. Data written to a volume persists even after the container is deleted.

Host Machine                Container
┌─────────────────┐        ┌──────────────┐
│                 │        │              │
│ /var/lib/docker/│        │              │
│   volumes/      │◄──────►│ /data/db     │
│   mydata/       │        │              │
│                 │        │              │
└─────────────────┘        └──────────────┘

Data written to /data/db in the container is actually stored in /var/lib/docker/volumes/mydata/ on the host.

Three Types of Volumes

1. Host Volumes (you choose the path)

docker run -v /home/user/data:/var/lib/mysql/data mysql

2. Anonymous Volumes (Docker chooses the path)

docker run -v /var/lib/mysql/data mysql

3. Named Volumes (Docker manages, but you name it)

docker run -v mysql-data:/var/lib/mysql/data mysql

Managing Volumes

# Create a volume
docker volume create my-volume

# List volumes
docker volume ls

# Inspect a volume (see where it's actually stored)
docker volume inspect my-volume

# Remove a volume
docker volume rm my-volume

# Remove all unused volumes
docker volume prune

Volumes in Docker Compose

version: '3.8'

services:
  mongodb:
    image: mongo
    ports:
      - "27017:27017"
    volumes:
      - mongo-data:/data/db  # Named volume
    environment:
      - MONGO_INITDB_ROOT_USERNAME=admin
      - MONGO_INITDB_ROOT_PASSWORD=password

  postgres:
    image: postgres:15
    volumes:
      - postgres-data:/var/lib/postgresql/data  # Another named volume
    environment:
      - POSTGRES_PASSWORD=secret

# Define volumes at root level
volumes:
  mongo-data:
    driver: local
  postgres-data:
    driver: local

How it works:

  1. Define volumes at the bottom
  2. Reference them in services
  3. Docker Compose creates and manages these volumes
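
Because the volumes are named, they outlive the containers. A sketch of the lifecycle with the Compose file above:

docker-compose up -d     # creates containers plus mongo-data and postgres-data
docker-compose down      # removes containers; the named volumes remain
docker-compose up -d     # new containers reattach to the existing volumes - data intact
docker-compose down -v   # -v deletes the named volumes too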

Docker Best Practices

1. Use Official Base Images

Bad:

FROM alpine:latest
RUN apk add --no-cache nodejs npm
# Install Node manually

Good:

FROM node:18-alpine
# Official image, maintained, secure, optimized

Official images are:

  • Maintained and regularly patched
  • Built following current best practices
  • Documented, with sensible defaults

2. Always Use Specific Versions

Bad:

FROM node:latest

Good:

FROM node:18.17.1-alpine3.18

Why?

  • latest points to a different image over time, so the same Dockerfile can produce different results on different days.
  • Pinned versions make builds reproducible and make upgrades a deliberate choice instead of a surprise.

3. Use Minimal Base Images

Bad:

FROM ubuntu:22.04  # ~77 MB

Good:

FROM alpine:3.18  # ~7 MB

Alpine Linux is tiny but has everything you need. Most official images have alpine variants:

  • node:18-alpine
  • python:3.11-alpine
  • postgres:15-alpine
  • nginx:1.25-alpine

Smaller images = faster builds, faster pulls, less attack surface.

4. Optimize Layer Caching

Good order:

FROM node:18-alpine
COPY package*.json ./
RUN npm install
COPY . .

Change application code → only rebuild the last COPY layer. Dependencies are cached.

General rule: Order commands from least likely to change to most likely to change.

5. Scan Images for Vulnerabilities

Use Docker Scout (built into Docker Desktop):

# Scan an image for known CVEs (Scout ships with recent Docker Desktop releases)
docker scout cves myapp:1.0

This scans your image for known security vulnerabilities in:

  • The base image’s OS packages
  • Your application’s dependencies

Fix critical and high severity issues before deploying.

6. Don’t Leak Secrets

Never do this:

FROM node:18-alpine
COPY .env .  # ❌ .env is now in the image!

Better approaches:

  • Pass secrets at runtime with -e or --env-file instead of baking them into layers.
  • Use a secret manager (Docker secrets, Kubernetes secrets, AWS Secrets Manager, and the like).
  • For build-time secrets, use BuildKit secret mounts so they never land in a layer.
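
A minimal sketch of the runtime approach, with placeholder names - the value lives outside the image and is only injected when the container starts:

# Pass a single secret as an environment variable at run time
docker run -d -e DATABASE_URL=postgres://user:pass@db:5432/app myapp:1.0

# Or load a whole env file that is never COPY'd into the image
docker run -d --env-file .env myapp:1.0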

Advanced Topics

Health Checks

Add health checks to your Dockerfile:

HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1

Docker runs the check on that schedule and marks the container healthy or unhealthy - you’ll see the status in docker ps. Note that Docker itself won’t restart an unhealthy container; orchestrators like Swarm or Kubernetes act on the status. Also make sure the probe tool (curl here) actually exists in your image.
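
You can check the current health status of a running container (the name is a placeholder):

docker inspect --format '{{.State.Health.Status}}' myapp   # starting | healthy | unhealthy
docker ps                                                  # also shows "(healthy)" next to the status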

Resource Limits

Prevent one container from hogging all resources:

docker run -d \
  --memory="512m" \
  --cpus="1.5" \
  myapp:1.0

Or in Docker Compose:

services:
  api:
    image: myapp:1.0
    deploy:
      resources:
        limits:
          cpus: '1.5'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M

Container Logging Best Practice

Don’t write logs to files inside containers. Write to stdout/stderr. Docker captures these:

// Good - goes to stdout
console.log('Request received');

// Bad - file inside container, lost on restart
fs.appendFileSync('/var/log/app.log', 'Request received');

View logs:

docker logs myapp
docker logs -f myapp  # Follow
docker logs --tail 100 myapp  # Last 100 lines

In production, use a logging driver to send logs to external systems (CloudWatch, Splunk, etc.):

docker run -d \
  --log-driver=awslogs \
  --log-opt awslogs-region=us-east-1 \
  --log-opt awslogs-group=my-app \
  myapp:1.0

Quick Reference Cheat Sheet

# Images
docker images                          # List images
docker pull image:tag                  # Download image
docker build -t name:tag .             # Build image
docker rmi image_id                    # Remove image
docker tag source target               # Tag image

# Containers
docker run image                       # Create and start
docker run -d image                    # Detached mode
docker run -p 8080:80 image           # Port mapping
docker run -e VAR=value image         # Environment variable
docker run -v volume:/path image      # Volume mount
docker run --name myapp image         # Custom name
docker ps                              # List running
docker ps -a                           # List all
docker stop container_id               # Stop gracefully
docker kill container_id               # Force stop
docker start container_id              # Start stopped
docker restart container_id            # Restart
docker rm container_id                 # Remove container
docker logs container_id               # View logs
docker logs -f container_id            # Follow logs
docker exec -it container_id bash     # Interactive shell

# Networks
docker network ls                      # List networks
docker network create name             # Create network
docker network inspect name            # Network details
docker run --network name image       # Connect to network

# Volumes
docker volume ls                       # List volumes
docker volume create name              # Create volume
docker volume inspect name             # Volume details
docker volume rm name                  # Remove volume

# Docker Compose
docker-compose up                      # Start services
docker-compose up -d                   # Detached mode
docker-compose down                    # Stop and remove
docker-compose down -v                 # Also remove volumes
docker-compose ps                      # List services
docker-compose logs -f                 # Follow logs
docker-compose build                   # Build images
docker-compose pull                    # Pull images

# Cleanup
docker system prune                    # Remove unused data
docker system prune -a                 # Remove all unused
docker container prune                 # Remove stopped containers
docker image prune                     # Remove unused images
docker volume prune                    # Remove unused volumes

# Info & Debugging
docker version                         # Docker version
docker info                            # System info
docker inspect container_id            # Container details
docker stats                           # Resource usage
docker top container_id                # Running processes
docker system df                       # Disk usage

Final Thoughts

Docker fundamentally changed how we build and deploy software. What used to take hours of environment setup now takes minutes. What used to break across different machines now “just works.”

But like any tool, Docker is only as good as how you use it. Build images thoughtfully, follow security best practices, and don’t skip the boring stuff like .dockerignore and multi-stage builds.

The beauty of Docker is that once you containerize your app properly, you can run it anywhere - your laptop, a server, the cloud, doesn’t matter. That’s the real power.

Start simple: get one app running in a container. Then add Docker Compose for multiple services. Then move to a private registry. Before you know it, you’ll be deploying containers in production and wondering how you ever lived without them.

Now go build something. And remember: if it works in the container, it works everywhere.

