System Design Interview Prep: A Practical Guide for Indian Engineers
A hands-on guide to cracking system design interviews at FAANG and top Indian companies, with step-by-step design walkthroughs and key concepts explained.
The Interview Round That Terrifies Everyone
I bombed my first system design interview spectacularly. It was for a senior engineer role at a well-known Bangalore startup, and the question was simple enough: "Design a URL shortener like bit.ly." I knew the concept. I had used URL shorteners a thousand times. But when the whiteboard was in front of me and the interviewer was watching, my brain froze. I started rambling about databases without first understanding the requirements. I forgot to discuss scale. I never mentioned caching. The interviewer politely thanked me after 30 minutes, and I knew it was over.
That experience taught me something important: system design is not about knowing the right answer (there is no single right answer). It is about demonstrating a structured thinking process. Companies want to see how you break down ambiguous problems, make trade-offs, communicate your reasoning, and respond to pushback.
After that failure, I spent four months preparing systematically. I cracked system design rounds at two FAANG companies and a top Indian fintech firm. This guide is what I wish someone had given me before that first disastrous interview.
How Companies Ask System Design Questions
The format varies, but the fundamentals are consistent across companies.
FAANG (Google, Amazon, Meta, Apple, Netflix)
- Duration: 45 minutes (sometimes 60)
- Format: Open-ended problem, interviewer acts as a collaborative partner
- Evaluation: Requirements gathering, high-level design, deep dive into components, trade-off analysis, scalability discussion
- Expected level: Senior (L5+) candidates almost always face this round; some companies include it for mid-level (L4) too
Indian Companies (Flipkart, Razorpay, Zerodha, Swiggy, PhonePe)
- Duration: 45-60 minutes
- Format: Similar to FAANG but often with more emphasis on practical, India-specific constraints (high traffic during sales, payment system reliability, vernacular support)
- Unique aspects: Flipkart might ask you to design for Big Billion Days traffic. Razorpay wants you to think about payment idempotency. Zerodha cares about real-time data at scale with minimal latency.
Common Questions Asked
| Company | Frequently Asked Questions |
|---|---|
| Google | Design YouTube, Google Maps, Google Docs |
| Amazon | Design an e-commerce platform, delivery tracking system |
| Meta | Design Facebook News Feed, Instagram Stories, Messenger |
| Flipkart | Design a product catalog, flash sale system, search |
| Razorpay | Design a payment gateway, recurring payments |
| Zerodha | Design a stock trading platform, real-time order book |
| Swiggy | Design a food delivery system, driver allocation |
The Framework: How to Structure Your 45 Minutes
I use a four-phase framework that keeps me on track and ensures I cover everything the interviewer expects. The timing is approximate but helps prevent spending too long on any one phase.
Phase 1: Requirements and Scope (5-7 minutes)
This is the most important phase. Many candidates skip it and jump straight into drawing boxes and arrows. That is a mistake. Spend the first few minutes asking clarifying questions to narrow down the scope.
Functional requirements: What should the system do?
- What are the core features?
- Who are the users?
- What are the key user flows?
Non-functional requirements: How should the system behave?
- What is the expected scale? (DAU, requests per second, data volume)
- What are the latency requirements?
- What is the availability target? (99.9%? 99.99%?)
- Is consistency or availability more important? (CAP theorem territory)
Back-of-the-envelope estimation:
- Quick math on storage, bandwidth, and QPS
- This shows the interviewer you think about scale practically
Phase 2: High-Level Design (10-12 minutes)
Sketch the major components and how they interact:
- API design (key endpoints)
- Core services/microservices
- Database choices (SQL vs NoSQL, why)
- Data flow between components
Do not go deep into any single component yet. The goal is to establish the overall architecture so the interviewer can see the big picture and steer the conversation.
Phase 3: Deep Dive (15-20 minutes)
The interviewer will usually pick one or two components to explore in detail. This is where your knowledge of specific technologies and patterns matters:
- Database schema design
- Caching strategy
- Data partitioning and sharding
- Consistency models
- API rate limiting
- Failure handling
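To make one of these concrete: API rate limiting is most often explained as a token bucket. Here is a minimal single-process sketch (the class name and parameters are illustrative; a production limiter would keep its counters in Redis or sit at the API gateway):

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: `rate` tokens/second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)   # start full: allow an initial burst
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens proportionally to the time elapsed, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1            # spend one token for this request
            return True
        return False                    # bucket empty: reject (HTTP 429)
```

Being able to write this quickly is useful, because "how would you rate limit this endpoint?" is a common deep-dive follow-up.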
Phase 4: Scaling and Trade-offs (5-8 minutes)
Discuss how the system handles growth:
- Horizontal scaling strategies
- Single points of failure and how to eliminate them
- Monitoring and alerting
- Trade-offs you made and why
Design Walkthrough 1: URL Shortener
Since this is the question that humbled me, let me walk through how I would answer it now.
Phase 1: Requirements
Functional:
- Users can submit a long URL and receive a shortened URL
- Clicking the short URL redirects to the original URL
- Optional: custom aliases, expiration dates, click analytics
Non-functional:
- High availability (links should always work)
- Low latency for redirection (< 100ms)
- Short URLs should be as short as possible
- Not easily guessable (no sequential IDs)
Scale estimation:
- 100 million new URLs per month (write-heavy creation)
- 10 billion redirects per month (read-heavy usage)
- Read:Write ratio = 100:1
- Average URL length: 200 bytes
- Storage per year: 100M * 12 * 200B = ~240 GB (manageable)
- QPS for reads: 10B / (30 * 24 * 3600) = ~3,850 reads/second
- Peak QPS: ~10,000 reads/second
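The arithmetic above is simple enough to verify inline, and practicing it in this form makes it easy to reproduce on a whiteboard:

```python
urls_per_month = 100_000_000          # 100M new URLs per month
redirects_per_month = 10_000_000_000  # 10B redirects per month
bytes_per_url = 200

storage_per_year_gb = urls_per_month * 12 * bytes_per_url / 1e9
avg_read_qps = redirects_per_month / (30 * 24 * 3600)

# storage_per_year_gb -> 240.0 (GB/year)
# avg_read_qps       -> ~3,858 reads/second
```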
Phase 2: High-Level Design
```
Client -> Load Balancer -> API Servers -> Cache (Redis)
                                       -> Database (read replicas)
                                       -> ID Generator Service
```
API Design:
```
POST /api/shorten
Body:     { "url": "https://example.com/very/long/path", "custom_alias": "mylink" }
Response: { "short_url": "https://short.ly/abc123" }

GET /:shortCode
Response: 301 Redirect to original URL
```
Key decision: How to generate the short code?
Option A: Hash the URL (MD5/SHA256) and take the first 7 characters. Problem: collisions.
Option B: Pre-generate unique IDs using a distributed ID generator (like Snowflake or a counter-based service) and encode them in Base62. This is cleaner.
I would go with Base62 encoding of a unique ID. A 7-character Base62 string gives 62^7 = 3.5 trillion possible URLs, which is more than enough.
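A Base62 encoder is only a few lines, and interviewers sometimes ask you to write it. A sketch in Python (the alphabet ordering is a convention, not a standard):

```python
import string

# 0-9, a-z, A-Z: 62 symbols total.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def base62_encode(n: int) -> str:
    """Encode a non-negative integer ID as a Base62 string."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, remainder = divmod(n, 62)   # peel off one base-62 digit at a time
        out.append(ALPHABET[remainder])
    return "".join(reversed(out))      # digits come out least-significant first
```

For example, `base62_encode(62)` is `"10"`, and any ID below 62^7 fits in at most 7 characters.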
Phase 3: Deep Dive -- Database and Caching
Database choice: A simple key-value store works here. The access pattern is straightforward: given a short code, return the long URL. Options include:
- DynamoDB or Cassandra for high write throughput and horizontal scalability
- PostgreSQL with good indexing if you need relational features (analytics, user accounts)
I would choose DynamoDB for the core URL mapping (short_code -> long_url) because the access pattern is pure key-value lookup, and DynamoDB handles this at massive scale with single-digit millisecond latency.
Schema:
```
Table: url_mappings
  short_code (PK): String           -- "abc123"
  long_url:        String           -- "https://example.com/..."
  created_at:      Timestamp
  expires_at:      Timestamp (optional)
  user_id:         String (optional)
  click_count:     Number
```
Caching strategy: Since reads outnumber writes 100:1, caching is critical. Place a Redis cache in front of the database:
- On redirect request, check Redis first
- If cache hit, redirect immediately (< 5ms)
- If cache miss, query DynamoDB, cache the result, then redirect
- Use LRU eviction policy
- Cache size: top 20% of URLs likely handle 80% of traffic (Pareto principle). If we have 1.2 billion URLs, cache 240 million. At ~200 bytes each, that is ~48 GB of Redis -- feasible with a Redis cluster.
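The cache-aside flow above can be sketched in a few lines, with in-memory dicts standing in for Redis and DynamoDB (eviction is omitted; the sample data is illustrative):

```python
from typing import Optional

cache = {}  # stand-in for a Redis cluster
db = {"abc123": "https://example.com/very/long/path"}  # stand-in for DynamoDB

def resolve(short_code: str) -> Optional[str]:
    url = cache.get(short_code)      # 1. check the cache first
    if url is not None:
        return url                   # cache hit: redirect immediately
    url = db.get(short_code)         # 2. miss: fall through to the database
    if url is not None:
        cache[short_code] = url      # 3. populate the cache for next time
    return url                       # None means the short code does not exist
```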
Phase 4: Scaling
- API servers are stateless, so horizontal scaling is straightforward behind a load balancer
- DynamoDB scales automatically with provisioned or on-demand capacity
- Redis cluster can be sharded by short_code hash
- 301 vs 302 redirect: Use 301 (permanent) to allow browsers to cache, reducing server load. Use 302 if you need accurate click analytics (since 301 may bypass your server on subsequent clicks).
Design Walkthrough 2: Chat System (WhatsApp-like)
Phase 1: Requirements
Functional:
- One-on-one messaging
- Group messaging (up to 256 members)
- Online/offline status
- Message delivery receipts (sent, delivered, read)
- Media sharing (images, videos)
Non-functional:
- Real-time message delivery (< 500ms)
- Messages must never be lost
- High availability
- End-to-end encryption
Scale: 500 million DAU, each sending 40 messages/day = 20 billion messages/day = ~230,000 messages/second.
Phase 2: High-Level Design
```
Client <-> WebSocket Gateway <-> Chat Service <-> Message Queue (Kafka)
                                              -> Message Storage (Cassandra)
                                              -> User Service
                                              -> Media Service (S3)
                                              -> Push Notification Service
```
Why WebSockets? Chat requires bidirectional, real-time communication. HTTP polling wastes resources; WebSocket maintains a persistent connection where the server can push messages instantly.
Phase 3: Deep Dive -- Message Delivery
When both users are online:
- User A sends a message through their WebSocket connection
- The Chat Service identifies User B's WebSocket gateway server
- Message is forwarded directly to User B through their WebSocket connection
- Delivery receipt is sent back to User A
When the recipient is offline:
- Message is stored in Cassandra with delivery status = "sent"
- A push notification is sent via FCM (Android) or APNs (iOS)
- When User B comes online, their client pulls all undelivered messages
- Delivery status is updated to "delivered"
How to find which server a user is connected to?
- Maintain a presence service in Redis mapping user_id -> gateway_server_id
- When a user connects via WebSocket, register their gateway server in Redis
- When they disconnect, remove the entry
- Use a heartbeat mechanism to detect stale connections
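The presence flow can be sketched with a plain dict standing in for Redis (the gateway IDs and the 30-second staleness window are illustrative assumptions):

```python
import time

presence = {}        # user_id -> (gateway_server_id, last_heartbeat)
STALE_AFTER = 30.0   # seconds without a heartbeat before we treat a user as offline

def on_connect(user_id: str, gateway_id: str) -> None:
    presence[user_id] = (gateway_id, time.monotonic())

def on_heartbeat(user_id: str) -> None:
    if user_id in presence:
        gateway_id, _ = presence[user_id]
        presence[user_id] = (gateway_id, time.monotonic())

def on_disconnect(user_id: str) -> None:
    presence.pop(user_id, None)

def locate(user_id: str):
    """Return the gateway serving this user, or None if offline/stale."""
    entry = presence.get(user_id)
    if entry is None:
        return None
    gateway_id, last = entry
    if time.monotonic() - last > STALE_AFTER:
        presence.pop(user_id, None)   # missed heartbeats: treat as disconnected
        return None
    return gateway_id
```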
Message storage in Cassandra:
```
Table: messages
  Partition Key:  chat_id (conversation ID)
  Clustering Key: message_id (TimeUUID for chronological ordering)
  Columns:        sender_id, content, timestamp, status, media_url
```
Cassandra is excellent here because:
- Write-heavy workload (230K writes/second)
- Time-series access pattern (load messages in chronological order)
- Easy horizontal scaling through partitioning
- Tunable consistency (can prioritize availability)
Phase 4: Scaling Considerations
- WebSocket gateways need to handle millions of concurrent connections. Each server handles ~100K connections. For 500M DAU with ~100M concurrent at peak, you need ~1,000 gateway servers.
- Kafka acts as a buffer between the chat service and storage, preventing message loss during traffic spikes
- End-to-end encryption means the server never sees plaintext messages -- only the clients hold the keys. Implementing this correctly is non-trivial; WhatsApp uses the Signal Protocol.
Key Concepts You Must Know
Load Balancing
Distributes traffic across multiple servers. Know the algorithms:
- Round Robin: Simple rotation, works for stateless services
- Least Connections: Routes to the server with fewest active connections
- Consistent Hashing: Routes based on a hash of the request, useful for cache-friendly routing
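Consistent hashing is worth being able to sketch, since interviewers often probe it. A minimal ring with virtual nodes (the vnode count is an illustrative choice):

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # A stable hash -- Python's built-in hash() is randomized per process.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, nodes, vnodes: int = 100):
        # Each physical node gets `vnodes` points on the ring for smoother balance.
        self._ring = sorted((_hash(f"{node}#{i}"), node)
                            for node in nodes for i in range(vnodes))
        self._keys = [h for h, _ in self._ring]

    def get_node(self, key: str) -> str:
        # Walk clockwise to the first vnode at or after the key's hash position.
        idx = bisect.bisect(self._keys, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

The payoff: when a node is removed, only the keys that mapped to its vnodes move to a new server, instead of nearly all keys moving as they would with `hash(key) % N`.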
Caching
Layers of caching from closest to the user:
- Browser cache (client-side)
- CDN (edge servers geographically close to users)
- Application cache (Redis/Memcached)
- Database query cache
Cache invalidation strategies: Write-through (update cache on write), Write-behind (async cache update), Cache-aside (application manages cache), TTL-based expiration.
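TTL-based expiration is the easiest of these to sketch. A minimal in-process version with lazy eviction on read (Redis does roughly this, plus active sampling of expired keys):

```python
import time

class TTLCache:
    """Entries expire `ttl` seconds after being written; evicted lazily on read."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self.store = {}  # key -> (value, expires_at)

    def set(self, key, value) -> None:
        self.store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key, default=None):
        entry = self.store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.store[key]   # expired: evict and report a miss
            return default
        return value
```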
Database Sharding
Splitting a large database across multiple servers. Common strategies:
- Range-based: Shard by ID range (1-1M on shard 1, 1M-2M on shard 2)
- Hash-based: Hash the shard key and mod by number of shards
- Geography-based: Indian users on India shard, US users on US shard
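Hash-based routing is one line of real logic. A sketch (md5 is used here only because it is a stable hash; the key names are illustrative):

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    # Hash the shard key with a stable hash, then take it modulo the shard count.
    # (Python's built-in hash() is randomized per process, so avoid it here.)
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards
```

The weakness to call out: adding a shard changes the modulus and remaps almost every key, which is why resharding is painful and why consistent hashing exists.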
CAP Theorem
In a distributed system, you can only guarantee two out of three:
- Consistency: Every read gets the most recent write
- Availability: Every request gets a response
- Partition tolerance: System works despite network failures
Since network partitions are unavoidable, the real choice is between CP (consistent but may reject requests) and AP (always responds but may serve stale data). Banking systems choose CP. Social media feeds choose AP.
Message Queues
Asynchronous communication between services. Kafka for high-throughput event streaming, RabbitMQ for task queues, SQS for simple cloud-native queuing.
When to use a message queue:
- Decoupling services
- Handling traffic spikes (buffering)
- Ensuring message delivery (at-least-once)
- Enabling event-driven architectures
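The buffering role is easy to demonstrate with Python's in-process `queue.Queue` standing in for Kafka: the producer bursts, and the consumer drains at its own pace without losing messages.

```python
import queue
import threading

buf = queue.Queue()   # the buffer between producer and consumer
stored = []           # stand-in for a (slow) database

def consumer():
    while True:
        msg = buf.get()
        if msg is None:        # sentinel value: shut down cleanly
            break
        stored.append(msg)     # stand-in for a slow write to storage
        buf.task_done()

t = threading.Thread(target=consumer)
t.start()
for i in range(1000):          # a burst of producer traffic
    buf.put(f"msg-{i}")
buf.put(None)                  # signal the consumer to stop after draining
t.join()
```

With a single consumer on a FIFO queue, all 1,000 messages arrive in order; Kafka gives the same per-partition guarantee at cluster scale.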
Common Mistakes I See (and Made)
1. Not Clarifying Requirements
Jumping into the solution without understanding the problem. Always spend the first 5 minutes asking questions. Interviewers specifically evaluate whether you clarify ambiguity.
2. Over-Engineering
Designing for billions of users when the question implies thousands. Start simple and scale up when the interviewer pushes. A monolith is fine for the initial design -- you can discuss microservices as a scaling step.
3. Ignoring Trade-offs
Every design decision has trade-offs. SQL vs NoSQL, consistency vs availability, latency vs throughput. The interviewer wants to hear you articulate these trade-offs, not just pick a technology because it is popular.
4. Not Doing Back-of-the-Envelope Math
"We will use a database" is weak. "We need to store 240 GB per year, so a single PostgreSQL instance handles this easily" shows you understand the actual scale.
5. Forgetting About Failure
What happens when a server crashes? When the database goes down? When a network partition occurs? Discussing failure modes and recovery strategies shows maturity.
Resources That Actually Helped Me
Books
- "Designing Data-Intensive Applications" by Martin Kleppmann: The single best book on distributed systems concepts. Dense but incredibly thorough. I read it twice.
- "System Design Interview" by Alex Xu (Vol 1 & 2): Practical, structured walkthroughs of common design problems. Great for pattern recognition.
YouTube Channels
- Gaurav Sen: Excellent explanations of system design concepts with clear diagrams. His videos on consistent hashing and load balancing are outstanding.
- Tech Dummies Narendra L: Detailed, practical system design walkthroughs. Very popular among Indian engineers.
- ByteByteGo (Alex Xu): Clean animations explaining complex distributed systems concepts.
Practice Platforms
- interviewing.io: Anonymous mock interviews with real engineers from FAANG companies
- Pramp: Free peer-to-peer mock interview platform
- HelloInterview: AI-powered system design practice with feedback
Open-Source Study Guides
- system-design-primer (GitHub): Comprehensive collection of system design resources and examples
- ByteByteGo newsletter: Weekly deep dives into system design topics
How to Handle the 45-Minute Format
Time management is everything. Here is my minute-by-minute breakdown:
| Time | Activity | What to Say |
|---|---|---|
| 0:00-0:02 | Listen to the question | Take notes, do not interrupt |
| 0:02-0:07 | Ask clarifying questions | "What is the expected scale? Which features are in scope?" |
| 0:07-0:10 | Back-of-envelope estimation | "Given 100M DAU, that is roughly X QPS..." |
| 0:10-0:22 | High-level design | Draw the architecture, explain data flow |
| 0:22-0:38 | Deep dive (interviewer-guided) | Schema, caching, algorithms for 1-2 components |
| 0:38-0:43 | Scaling and trade-offs | "If traffic 10x, here is how we scale..." |
| 0:43-0:45 | Summary and questions | Recap key decisions, ask if they want to explore anything else |
Pro tip: Narrate your thinking out loud. The interviewer cannot evaluate what they cannot hear. Saying "I am choosing Cassandra over PostgreSQL because our write-heavy workload benefits from Cassandra's LSM-tree storage engine" is far better than silently writing "Cassandra" on the whiteboard.
India-Specific Preparation Tips
Prepare for Indian Scale
Indian companies operate at unique scales. Flipkart processes millions of orders during Big Billion Days. PhonePe handles billions of UPI transactions monthly. Zerodha absorbs a surge of lakhs of orders in the minutes around market open. Understanding these scale characteristics gives you an edge.
Know the Indian Tech Stack
Many Indian companies use specific technologies:
- Kafka is ubiquitous for event streaming
- Kubernetes for container orchestration
- PostgreSQL and MySQL for relational data
- Redis for caching (almost everywhere)
- Go and Java for backend services (more than Node.js at scale)
Cultural Differences in Interviews
Indian company interviews tend to be slightly more technical and less collaborative than FAANG. You may be asked more specific implementation questions ("How would you implement the LRU cache?") alongside the high-level design. Be prepared for both.
The system design interview is a skill that improves with practice. Do at least 10-15 mock interviews before your real ones. Record yourself, review the recording, and identify where you got stuck or rambled. Every iteration makes you sharper. The framework, the concepts, and the practice -- that is what separates candidates who pass from those who do not.
Anurag Sharma
Founder & Editor
Tech enthusiast and founder of Tech Tips India. Passionate about making technology accessible to everyone across India.