Docker and Kubernetes for Beginners: From Zero to Deployment
A practical, hands-on guide to Docker and Kubernetes — covering containers, images, Dockerfiles, docker-compose, Kubernetes pods, services, deployments, and when to use each tool.
Why Every Developer Needs to Understand Containers
Here is a scenario that has happened to every developer at least once: your application works perfectly on your machine, you push it to a server, and it breaks. Different OS version, different library versions, missing dependencies, conflicting configurations. You spend hours debugging environment differences instead of writing features.
Containers solve this problem completely. A container packages your application with everything it needs to run — code, runtime, libraries, system tools, configuration — into a single portable unit that runs identically on any machine. My laptop, your laptop, a staging server in Mumbai, a production server in AWS us-east-1 — the same container produces the same result everywhere.
Docker is the tool that made containers practical and mainstream. Kubernetes is the tool that orchestrates containers at scale — managing hundreds or thousands of containers across multiple servers. Together, they form the backbone of modern application deployment.
I remember feeling completely overwhelmed when I first encountered Docker. The terminology was alien, the documentation assumed prior knowledge I did not have, and most tutorials jumped to complex setups without explaining the basics. This guide is my attempt to write the tutorial I wished existed back then.
Part 1: Docker Fundamentals
What Is a Container, Really?
A container is a lightweight, isolated environment that shares the host operating system's kernel but has its own filesystem, processes, network, and resource limits. Think of it as a very efficient virtual machine, but without the overhead of running a separate OS.
Containers vs Virtual Machines:
| Feature | Container | Virtual Machine |
|---|---|---|
| Boot time | Seconds | Minutes |
| Size | Megabytes | Gigabytes |
| OS | Shares host kernel | Full guest OS |
| Performance | Near-native | 5-20% overhead |
| Isolation | Process-level | Hardware-level |
| Resource usage | Minimal | Significant |
| Use case | Application packaging | Full OS isolation |
A single server that can run 2-3 VMs can easily run 20-50 containers. That efficiency difference is why containers have taken over the deployment world.
Installing Docker
Docker Desktop is available for Windows, macOS, and Linux. On Ubuntu (which most Indian developers use for servers), you can install Docker Engine directly:
# Install Docker on Ubuntu
sudo apt-get update
sudo apt-get install -y docker.io
# Start Docker and enable it on boot
sudo systemctl start docker
sudo systemctl enable docker
# Add your user to the docker group (so you don't need sudo)
sudo usermod -aG docker $USER
# Log out and back in, then verify
docker --version
docker run hello-world
If hello-world runs successfully and prints a greeting message, Docker is working.
Images and Containers: The Key Distinction
This trips up most beginners. An image is a blueprint — a read-only template that contains the instructions for creating a container. A container is a running instance of an image. You can create multiple containers from the same image, just like you can create multiple objects from the same class in OOP.
# Pull an image from Docker Hub
docker pull nginx:latest
# Create and run a container from the image
docker run -d -p 8080:80 --name my-web-server nginx:latest
# List running containers
docker ps
# Stop the container
docker stop my-web-server
# Remove the container
docker rm my-web-server
In this example, nginx:latest is the image. my-web-server is the container. The -d flag runs it in the background (detached). The -p 8080:80 flag maps port 8080 on your machine to port 80 inside the container.
Writing a Dockerfile
A Dockerfile is a text file that defines how to build an image. Each instruction creates a layer, and Docker caches these layers for fast rebuilds.
Let me walk through a practical example — containerising a Node.js Express application:
# Use the official Node.js 20 image as the base
FROM node:20-alpine
# Set the working directory inside the container
WORKDIR /app
# Copy package files first (for better caching)
COPY package.json package-lock.json ./
# Install production dependencies only (--omit=dev replaces the deprecated --only=production flag)
RUN npm ci --omit=dev
# Copy the rest of the application code
COPY . .
# Expose the port the app runs on
EXPOSE 3000
# Define the command to run the app
CMD ["node", "server.js"]
Why copy package.json before the rest of the code? Docker caches each layer. If your application code changes but package.json does not, Docker reuses the cached dependency layer and only rebuilds from the COPY . . step. This makes rebuilds much faster — installing node_modules might take 30 seconds, but copying application code takes milliseconds.
Build and run this image:
# Build the image (the dot means "use current directory as build context")
docker build -t my-node-app .
# Run a container from the image
docker run -d -p 3000:3000 --name my-app my-node-app
# View logs
docker logs my-app
# Execute a command inside the running container
docker exec -it my-app sh
Docker Compose: Multi-Container Applications
Real applications rarely run as a single container. You typically need an app server, a database, a cache, and maybe a reverse proxy. Docker Compose lets you define and run multi-container applications with a single YAML file.
Here is a docker-compose.yml for a Node.js app with PostgreSQL and Redis:
version: "3.8"
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: postgresql://postgres:secret@db:5432/myapp
      REDIS_URL: redis://cache:6379
    depends_on:
      db:
        condition: service_healthy
      cache:
        condition: service_started
    restart: unless-stopped
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: myapp
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5
  cache:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
volumes:
  postgres_data:
  redis_data:
# Start all services
docker compose up -d
# View logs for all services
docker compose logs -f
# Stop everything
docker compose down
# Stop and remove volumes (deletes data!)
docker compose down -v
Notice how services reference each other by name (db, cache). Docker Compose creates a network where containers can communicate using service names as hostnames. Your app connects to PostgreSQL at db:5432 and Redis at cache:6379 — no IP addresses needed.
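To make that wiring concrete, here is how the app side might read and parse those connection strings. This is a minimal sketch using Node's built-in URL class; the variable names and fallback defaults are assumptions for illustration, not part of the compose file above:

```javascript
// Hypothetical config helper: parse the connection strings that
// docker-compose injects as environment variables.
const dbUrl = new URL(
  process.env.DATABASE_URL ?? "postgresql://postgres:secret@db:5432/myapp"
);
const redisUrl = new URL(process.env.REDIS_URL ?? "redis://cache:6379");

const config = {
  db: {
    host: dbUrl.hostname, // "db" — resolved by Compose's internal DNS
    port: Number(dbUrl.port), // 5432
    user: dbUrl.username,
    database: dbUrl.pathname.slice(1), // strip the leading "/"
  },
  redis: {
    host: redisUrl.hostname, // "cache"
    port: Number(redisUrl.port), // 6379
  },
};
```

The hostnames the app ends up with are just the service names from the YAML, which is exactly why no IP addresses appear anywhere in the configuration.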
Docker Best Practices
Before we move on to Kubernetes, here are some Docker best practices that will save you pain down the road:
- Use specific image tags, not `latest`. Instead of `FROM node:latest`, use `FROM node:20.11-alpine`. The `latest` tag can change unexpectedly and break your build.
- Use multi-stage builds to reduce image size:
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Production stage
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
EXPOSE 3000
CMD ["node", "dist/server.js"]
This produces a much smaller final image because the build tools, source code, and dev dependencies are discarded.
- Do not run as root. Add a non-root user in your Dockerfile:
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
- Use `.dockerignore` to exclude unnecessary files (node_modules, .git, tests) from the build context.
- Keep images small. Use Alpine-based images when possible (`node:20-alpine` is ~50MB vs `node:20` at ~350MB).
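As an example of the `.dockerignore` advice above, a minimal file for the Node.js project might look like this (the entries are illustrative; tailor them to your repository):

```
# .dockerignore — keep the build context small and secrets out of the image
node_modules
npm-debug.log
.git
.gitignore
.env
coverage
tests
```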
Part 2: Kubernetes — Orchestration at Scale
Docker is excellent for running a few containers on a single machine. But what happens when your application needs to run on multiple servers? When you need automatic scaling based on traffic? When a container crashes and needs to be restarted without human intervention? When you need to deploy a new version without downtime?
That is where Kubernetes comes in.
What Is Kubernetes?
Kubernetes (often abbreviated as K8s — "K" + 8 middle letters + "s") is a container orchestration platform. It manages the deployment, scaling, and operation of containerised applications across a cluster of machines.
Kubernetes was originally designed by Google, based on their internal system called Borg, and is now maintained by the Cloud Native Computing Foundation (CNCF). It is the industry standard for container orchestration.
Core Kubernetes Concepts
Cluster: A set of machines (nodes) that run containerised applications managed by Kubernetes.
Node: A single machine in the cluster. Nodes can be physical servers or virtual machines. There are two types:
- Control plane node: Runs the Kubernetes management components (API server, scheduler, controller manager)
- Worker node: Runs your application containers
Pod: The smallest deployable unit in Kubernetes. A pod encapsulates one or more containers that share storage and network. In most cases, a pod runs a single container.
Service: An abstraction that defines how to access a set of pods. Services provide stable IP addresses and DNS names, load balancing across pods, and service discovery.
Deployment: A declarative configuration that describes the desired state for your pods — how many replicas, which image to use, resource limits, update strategy. Kubernetes continuously works to make the actual state match the desired state.
Namespace: A way to divide cluster resources between multiple users or teams. Think of it as virtual clusters within a physical cluster.
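The "desired state" idea behind Deployments can be sketched as a tiny control loop. This is an illustration of the concept only, not the real controller code:

```javascript
// Illustrative reconciliation: compare the desired replica count with what
// is actually running and decide what to do next. Kubernetes runs control
// loops like this continuously for every Deployment.
function reconcile(desiredReplicas, runningPods) {
  const healthy = runningPods.filter((p) => p.status === "Running").length;
  if (healthy < desiredReplicas) {
    return { action: "create", count: desiredReplicas - healthy };
  }
  if (healthy > desiredReplicas) {
    return { action: "delete", count: healthy - desiredReplicas };
  }
  return { action: "none", count: 0 };
}
```

If one of three pods crashes, the loop sees two healthy pods against a desired count of three and creates a replacement, with no human involved.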
Setting Up Minikube for Local Development
Minikube runs a single-node Kubernetes cluster on your local machine. It is perfect for learning and development.
# Install minikube (Linux)
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube
# Start a cluster
minikube start
# Verify the cluster is running
kubectl cluster-info
kubectl get nodes
# Enable useful addons
minikube addons enable dashboard
minikube addons enable metrics-server
kubectl is the command-line tool for interacting with Kubernetes clusters. You will use it constantly.
Deploying Your App to Kubernetes
Let us deploy the Node.js application we containerised earlier. First, push the Docker image to a registry (Docker Hub, GitHub Container Registry, or a private registry):
# Tag the image for Docker Hub
docker tag my-node-app yourusername/my-node-app:1.0.0
# Push to Docker Hub
docker push yourusername/my-node-app:1.0.0
Now create a Kubernetes Deployment:
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-node-app
  labels:
    app: my-node-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-node-app
  template:
    metadata:
      labels:
        app: my-node-app
    spec:
      containers:
        - name: my-node-app
          image: yourusername/my-node-app:1.0.0
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "500m"
          readinessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 20
And a Service to expose it:
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-node-app-service
spec:
  selector:
    app: my-node-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: LoadBalancer
Apply both configurations:
# Apply the deployment
kubectl apply -f deployment.yaml
# Apply the service
kubectl apply -f service.yaml
# Check the status
kubectl get deployments
kubectl get pods
kubectl get services
# Watch pods come up in real-time
kubectl get pods -w
This creates three replicas of your application, each running in its own pod. The Service load-balances incoming traffic across all three pods. If a pod crashes, Kubernetes automatically creates a new one to maintain the desired count of three.
Scaling Your Application
Scaling is as simple as changing the replica count:
# Scale to 5 replicas
kubectl scale deployment my-node-app --replicas=5
# Or use autoscaling based on CPU usage
kubectl autoscale deployment my-node-app --min=3 --max=10 --cpu-percent=70
With autoscaling, Kubernetes monitors CPU usage across your pods and automatically adds or removes replicas to maintain approximately 70% CPU utilisation. During traffic spikes, it scales up. When traffic drops, it scales down. No manual intervention required.
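The scaling decision itself follows the documented Horizontal Pod Autoscaler formula: desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), clamped to the min/max bounds. A quick sketch:

```javascript
// HPA scaling formula (per the Kubernetes autoscaler documentation):
// desired = ceil(current * currentMetric / targetMetric), clamped to [min, max]
function desiredReplicas(current, currentCpuPercent, targetCpuPercent, min, max) {
  const desired = Math.ceil(current * (currentCpuPercent / targetCpuPercent));
  return Math.min(max, Math.max(min, desired));
}
```

With the settings above (min 3, max 10, target 70%), three pods running at 140% CPU would scale to six; three pods idling at 20% would stay at the floor of three.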
Rolling Updates (Zero-Downtime Deployments)
When you deploy a new version, Kubernetes performs a rolling update by default — it gradually replaces old pods with new ones, ensuring that some pods are always available to serve traffic.
# Update the image to a new version
kubectl set image deployment/my-node-app my-node-app=yourusername/my-node-app:2.0.0
# Watch the rollout progress
kubectl rollout status deployment/my-node-app
# If something goes wrong, roll back
kubectl rollout undo deployment/my-node-app
The rollout strategy is configurable. By default, Kubernetes allows at most 25% of pods to be unavailable and at most 25% extra pods above the desired count during a rollout (the maxUnavailable and maxSurge settings), waiting for new pods to pass their readiness probes before continuing. This keeps the service available throughout the deployment.
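The 25% defaults translate into concrete pod counts per the documented rounding rules: maxSurge rounds up, maxUnavailable rounds down. A sketch:

```javascript
// Rolling-update budget with Kubernetes defaults: maxSurge 25% (rounded up),
// maxUnavailable 25% (rounded down), per the Deployment documentation.
function rolloutBudget(replicas, surgePct = 25, unavailablePct = 25) {
  return {
    maxSurge: Math.ceil((replicas * surgePct) / 100), // extra pods allowed
    maxUnavailable: Math.floor((replicas * unavailablePct) / 100), // pods that may be down
  };
}
```

For the three-replica Deployment above this yields maxSurge 1 and maxUnavailable 0: Kubernetes starts one new pod, waits for it to pass its readiness probe, then retires an old one, repeating until all pods run the new version.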
Monitoring with kubectl
Here are the kubectl commands you will use most frequently:
# List all resources in the default namespace
kubectl get all
# Describe a specific pod (detailed info including events)
kubectl describe pod my-node-app-abc123
# View logs from a pod
kubectl logs my-node-app-abc123
# Stream logs in real-time
kubectl logs -f my-node-app-abc123
# Execute a command inside a pod
kubectl exec -it my-node-app-abc123 -- sh
# View resource usage
kubectl top pods
kubectl top nodes
# Delete a resource
kubectl delete pod my-node-app-abc123
kubectl delete -f deployment.yaml
Part 3: When to Use Docker Alone vs Kubernetes
Not every application needs Kubernetes. In fact, Kubernetes adds significant operational complexity that is not justified for smaller deployments.
Use Docker (Without Kubernetes) When:
- You are running a small number of containers (fewer than 10-15)
- Your application runs on a single server
- You do not need auto-scaling
- You are in early development or running personal projects
- Your team does not have Kubernetes expertise
- Docker Compose handles your orchestration needs
Use Kubernetes When:
- You need to run your application across multiple servers for reliability
- You need automatic scaling based on traffic or resource usage
- You require zero-downtime deployments
- You have multiple services that need service discovery and load balancing
- Your organisation has dedicated DevOps or platform engineering expertise
- You are running a SaaS product or high-traffic application
The Middle Ground: Managed Container Services
If Kubernetes feels like overkill but Docker Compose feels too simple, several managed services offer a middle ground:
| Service | Provider | Complexity | Cost |
|---|---|---|---|
| AWS ECS (Fargate) | Amazon | Medium | Pay per container |
| Google Cloud Run | Google | Low | Pay per request |
| Azure Container Apps | Microsoft | Low-Medium | Pay per container |
| Railway | Railway | Very Low | Pay per resource |
| Fly.io | Fly.io | Low | Pay per resource |
Google Cloud Run deserves special mention. You give it a Docker image, and it handles everything — scaling to zero when idle, scaling up on traffic, HTTPS, custom domains. For many applications, especially those with variable traffic, Cloud Run is the ideal deployment target.
Part 3.5: Managed Kubernetes Options
If you do need Kubernetes but do not want to manage the control plane yourself, every major cloud provider offers managed Kubernetes:
| Service | Provider | Starting Cost | Best For |
|---|---|---|---|
| EKS | AWS | ~$73/month (control plane) | AWS-heavy shops |
| GKE | Google Cloud | Free tier available | Best managed K8s experience |
| AKS | Azure | Free (control plane) | Microsoft/Azure shops |
GKE (Google Kubernetes Engine) is generally considered the best managed Kubernetes experience — which makes sense, since Google created Kubernetes. It offers an Autopilot mode that manages node pools automatically, reducing operational overhead further.
For Indian startups, GKE Autopilot or AWS ECS Fargate are usually the most practical choices. They let you focus on your application instead of managing infrastructure.
Common Pitfalls and How to Avoid Them
After years of working with Docker and Kubernetes, here are the mistakes I see most often:
- Using `latest` tags in production. Always pin your image versions. `my-app:latest` today might be a completely different image tomorrow.
- Not setting resource limits. A single pod without memory limits can consume all available memory on a node and crash other pods. Always set `requests` and `limits`.
- Storing secrets in plain text. Never put passwords, API keys, or certificates in your Dockerfile or deployment YAML. Use Kubernetes Secrets or external secret management tools like HashiCorp Vault.
- Not using health checks. Without readiness and liveness probes, Kubernetes cannot know if your application is actually healthy. A pod might be running but unable to serve traffic — health checks catch this.
- Over-engineering early. Starting with Kubernetes for a two-container app is like renting a warehouse for your bicycle. Start with Docker Compose. Graduate to Kubernetes when the complexity justifies it.
- Ignoring image size. Large images mean slow deployments, higher storage costs, and larger attack surfaces. Use multi-stage builds and Alpine-based images.
A Practical Learning Path
If this guide has sparked your interest, here is how I would suggest learning Docker and Kubernetes systematically:
Week 1-2: Docker basics
- Install Docker, run some images, learn the CLI
- Write Dockerfiles for your own projects
- Use Docker Compose for multi-container setups
Week 3-4: Docker in practice
- Containerise a real project (not just a hello-world app)
- Set up a CI/CD pipeline that builds Docker images (GitHub Actions is great for this)
- Push images to Docker Hub or GitHub Container Registry
Week 5-6: Kubernetes concepts
- Install Minikube, learn kubectl
- Deploy a simple app with Deployments and Services
- Experiment with scaling, rolling updates, and rollbacks
Week 7-8: Kubernetes in practice
- Deploy a multi-service application
- Set up Ingress for HTTP routing
- Learn ConfigMaps and Secrets for configuration management
- Explore Helm charts for packaging Kubernetes applications
The infrastructure side of software development can feel intimidating, but Docker and Kubernetes are tools, not magic. They are complicated tools, sure, but they follow logical patterns that make sense once you work through the fundamentals. Every senior developer I know says the same thing: understanding how your code gets from your editor to a running production environment makes you a dramatically better engineer. This is knowledge that pays dividends for your entire career.
Priya Patel
Senior Tech Writer
Covers AI, machine learning, and emerging technologies. Previously at TechCrunch India.