System Design Interview Prep for Indian Engineers
A hands-on guide to cracking system design interviews at FAANG and top Indian companies, with step-by-step design walkthroughs and key concepts explained.

I Froze in Front of the Whiteboard
Let me tell you about the most embarrassing 30 minutes of my career. A well-known Bangalore startup had called me in for a senior engineer position, and the interviewer asked me to design a URL shortener like bit.ly. Simple enough, right? I'd used URL shorteners a thousand times. Knew the concept cold. But the second that whiteboard marker hit my hand and the interviewer locked eyes with me — nothing. My brain just went blank. I started rambling about databases without bothering to ask about requirements first. Didn't mention scale. Didn't bring up caching. Forgot to discuss trade-offs. After 30 painful minutes, the interviewer thanked me politely, and I knew I'd blown it before the door closed behind me.
That humiliation taught me something I think a lot of engineers learn the hard way: system design isn't about having the "right" answer. There isn't one right answer. What companies actually want is to watch you demonstrate a structured thinking process. Can you break down a fuzzy problem? Can you weigh trade-offs out loud? Can you respond when someone pushes back on your design? Those are the things that matter.
After that disaster, I spent four months preparing with real structure. Eventually I cleared system design rounds at two FAANG companies and a top Indian fintech firm. Everything in this guide is what I wish someone had handed me before I walked into that room and drew a blank.
How Companies Frame These Questions
Formats differ from company to company, but the core expectations stay pretty consistent.
FAANG (Google, Amazon, Meta, Apple, Netflix)
- Duration: 45 minutes (sometimes 60)
- Format: Open-ended problem, interviewer acts as a collaborative partner
- Evaluation: Requirements gathering, high-level design, deep dive into components, trade-off analysis, scalability discussion
- Expected level: Senior (L5+) candidates almost always face this round; some companies include it for mid-level (L4) too
Indian Companies (Flipkart, Razorpay, Zerodha, Swiggy, PhonePe)
- Duration: 45-60 minutes
- Format: Similar to FAANG but often with more emphasis on practical, India-specific constraints (high traffic during sales, payment system reliability, vernacular support)
- Unique aspects: Flipkart might ask you to design for Big Billion Days traffic. Razorpay wants you to think about payment idempotency. Zerodha cares about real-time data at scale with minimal latency.
Commonly Asked Questions
| Company | Frequently Asked Questions |
|---|---|
| Google | Design YouTube, Google Maps, Google Docs |
| Amazon | Design an e-commerce platform, delivery tracking system |
| Meta | Design Facebook News Feed, Instagram Stories, Messenger |
| Flipkart | Design a product catalog, flash sale system, search |
| Razorpay | Design a payment gateway, recurring payments |
| Zerodha | Design a stock trading platform, real-time order book |
| Swiggy | Design a food delivery system, driver allocation |
A Four-Phase Framework for Your 45 Minutes
Here's the approach I use now. It keeps me on track and makes sure I cover everything the interviewer expects. Timing is approximate, but it prevents you from burning 20 minutes on one section and having nothing left for the rest.
Phase 1: Requirements and Scope (5-7 minutes)
Don't skip this. Seriously. So many candidates jump straight into drawing boxes and arrows, and it's a guaranteed way to go sideways. Spend those first few minutes asking clarifying questions to narrow down what you're actually building.
Functional requirements: What should the system do?
- What are the core features?
- Who are the users?
- What are the key user flows?
Non-functional requirements: How should the system behave?
- What's the expected scale? (DAU, requests per second, data volume)
- What are the latency requirements?
- What's the availability target? (99.9%? 99.99%?)
- Is consistency or availability more important? (CAP theorem territory)
Back-of-the-envelope estimation:
- Quick math on storage, bandwidth, and QPS
- Shows the interviewer you think about scale practically, not abstractly
Phase 2: High-Level Design (10-12 minutes)
Sketch out the major components and how they interact:
- API design (key endpoints)
- Core services/microservices
- Database choices (SQL vs NoSQL, why)
- Data flow between components
Don't go deep into any single component yet. You're painting the big picture so the interviewer can see where you're headed and steer the conversation if they want to.
Phase 3: Deep Dive (15-20 minutes)
Now the interviewer will probably pick one or two components to explore in detail. Here's where your knowledge of specific technologies and patterns actually matters:
- Database schema design
- Caching strategy
- Data partitioning and sharding
- Consistency models
- API rate limiting
- Failure handling
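If the interviewer picks rate limiting, be ready to go from the name to a concrete mechanism. Here's a minimal token-bucket sketch in Python — the class name and parameters are illustrative, not from any particular library:

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: allows bursts up to
    `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)  # 5 req/s steady, burst of 10
allowed = sum(bucket.allow() for _ in range(20))  # burst: first 10 pass, rest throttled
```

In a distributed setup the counters would live in Redis rather than process memory, but the refill logic is the same.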
Phase 4: Scaling and Trade-offs (5-8 minutes)
Discuss how the system handles growth:
- Horizontal scaling strategies
- Single points of failure and how to eliminate them
- Monitoring and alerting
- Trade-offs you made and why
Design Walkthrough 1: URL Shortener
Since this is the question that humbled me, let me walk through how I'd answer it now. Probably the most frequently asked system design question out there, and for good reason — it's simple enough to scope in 45 minutes but deep enough to test real knowledge.
Phase 1: Requirements
Functional:
- Users can submit a long URL and receive a shortened URL
- Clicking the short URL redirects to the original URL
- Optional: custom aliases, expiration dates, click analytics
Non-functional:
- High availability (links should always work)
- Low latency for redirection (< 100ms)
- Short URLs should be as short as possible
- Not easily guessable (no sequential IDs)
Scale estimation:
- 100 million new URLs per month (write-heavy creation)
- 10 billion redirects per month (read-heavy usage)
- Read:Write ratio = 100:1
- Average URL length: 200 bytes
- Storage per year: 100M * 12 * 200B = ~240 GB (manageable)
- QPS for reads: 10B / (30 * 24 * 3600) = ~3,850 reads/second
- Peak QPS: ~10,000 reads/second
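The estimation above is plain arithmetic, and it's worth being fluent enough to do it on the whiteboard. The same numbers in Python:

```python
# Back-of-the-envelope numbers from the requirements above
new_urls_per_month = 100_000_000
redirects_per_month = 10_000_000_000
avg_url_bytes = 200

storage_per_year_gb = new_urls_per_month * 12 * avg_url_bytes / 1e9
read_qps = redirects_per_month / (30 * 24 * 3600)

print(f"storage/year: ~{storage_per_year_gb:.0f} GB")  # ~240 GB
print(f"avg read QPS: ~{read_qps:,.0f}")               # ~3,858
```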
Phase 2: High-Level Design
Client -> Load Balancer -> API Servers -> Cache (Redis)
                                       -> Database (read replicas)
                                       -> ID Generator Service
API Design:
POST /api/shorten
Body: { "url": "https://example.com/very/long/path", "custom_alias": "mylink" }
Response: { "short_url": "https://short.ly/abc123" }
GET /:shortCode
Response: 301 Redirect to original URL
Key decision: How to generate the short code?
Option A: Hash the URL (MD5/SHA256) and take the first 7 characters. Problem: collisions.
Option B: Pre-generate unique IDs using a distributed ID generator (like Snowflake or a counter-based service) and encode them in Base62. Cleaner approach.
I'd go with Base62 encoding of a unique ID. A 7-character Base62 string gives 62^7 = 3.5 trillion possible URLs, which is way more than enough for our scale.
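A minimal sketch of the Base62 encoding step — the distributed ID generator is assumed to hand us a unique integer, and we just convert it:

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(num: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if num == 0:
        return ALPHABET[0]
    chars = []
    while num > 0:
        num, rem = divmod(num, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))

# A 7-character code covers 62**7 = 3,521,614,606,208 IDs (~3.5 trillion)
print(base62_encode(125))  # "21"
```

Note that sequential IDs produce guessable codes; in practice you'd shuffle or offset the IDs before encoding to satisfy the "not easily guessable" requirement.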
Phase 3: Deep Dive -- Database and Caching
Database choice: A simple key-value store works perfectly here. Given a short code, return the long URL — that's the access pattern. Options include:
- DynamoDB or Cassandra for high write throughput and horizontal scalability
- PostgreSQL with good indexing if you need relational features (analytics, user accounts)
I'd choose DynamoDB for the core URL mapping (short_code -> long_url) because the access pattern is pure key-value lookup, and DynamoDB handles this at massive scale with single-digit millisecond latency. Seems like the cleanest fit.
Schema:
Table: url_mappings
short_code (PK): String -- "abc123"
long_url: String -- "https://example.com/..."
created_at: Timestamp
expires_at: Timestamp (optional)
user_id: String (optional)
click_count: Number
Caching strategy: Since reads outnumber writes 100:1, caching is absolutely worth it. Place a Redis cache in front of the database (our Redis caching practical guide covers the implementation details in depth):
- On redirect request, check Redis first
- If cache hit, redirect immediately (< 5ms)
- If cache miss, query DynamoDB, cache the result, then redirect
- Use LRU eviction policy
- Cache size: top 20% of URLs likely handle 80% of traffic (Pareto principle). With 1.2 billion URLs, cache 240 million. At ~200 bytes each, that's ~48 GB of Redis — feasible with a Redis cluster.
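The cache-aside flow described above looks like this in code. A plain dict stands in for both Redis and DynamoDB here so the sketch is self-contained; in production you'd use a Redis client with a TTL and a real DynamoDB query:

```python
# Cache-aside lookup: check the cache first, fall back to the database,
# then populate the cache for the next request.
cache: dict = {}  # stands in for Redis
database = {"abc123": "https://example.com/very/long/path"}  # stands in for DynamoDB

def resolve(short_code):
    url = cache.get(short_code)
    if url is not None:
        return url                   # cache hit: redirect immediately
    url = database.get(short_code)   # cache miss: query the database
    if url is not None:
        cache[short_code] = url      # populate the cache
    return url

resolve("abc123")  # first call misses the cache and fills it
resolve("abc123")  # second call is served straight from the cache
```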
Phase 4: Scaling
- API servers are stateless, so horizontal scaling is straightforward behind a load balancer
- DynamoDB scales automatically with provisioned or on-demand capacity
- Redis cluster can be sharded by short_code hash
- 301 vs 302 redirect: Use 301 (permanent) to allow browsers to cache, reducing server load. Use 302 if you need accurate click analytics (since 301 may bypass your server on subsequent clicks).
Design Walkthrough 2: Chat System (WhatsApp-like)
Phase 1: Requirements
Functional:
- One-on-one messaging
- Group messaging (up to 256 members)
- Online/offline status
- Message delivery receipts (sent, delivered, read)
- Media sharing (images, videos)
Non-functional:
- Real-time message delivery (< 500ms)
- Messages must never be lost
- High availability
- End-to-end encryption
Scale: 500 million DAU, each sending 40 messages/day = 20 billion messages/day = ~230,000 messages/second. Huge numbers.
Phase 2: High-Level Design
Client <-> WebSocket Gateway <-> Chat Service <-> Message Queue (Kafka)
                                               -> Message Storage (Cassandra)
                                               -> User Service
                                               -> Media Service (S3)
                                               -> Push Notification Service
Why WebSockets? Chat requires bidirectional, real-time communication. HTTP polling wastes resources. WebSocket maintains a persistent connection where the server can push messages instantly. No contest here.
Phase 3: Deep Dive -- Message Delivery
When both users are online:
- User A sends a message through their WebSocket connection
- Chat Service identifies User B's WebSocket gateway server
- Message gets forwarded directly to User B through their WebSocket connection
- Delivery receipt is sent back to User A
When the recipient is offline:
- Message is stored in Cassandra with delivery status = "sent"
- A push notification is sent via FCM (Android) or APNs (iOS)
- When User B comes online, their client pulls all undelivered messages
- Delivery status updates to "delivered"
How do you find which server a user is connected to?
- Maintain a presence service using Redis: user_id -> gateway_server_id
- When a user connects via WebSocket, register their server in Redis
- When they disconnect, remove the entry
- Heartbeat mechanism to detect stale connections
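The presence lookup described above can be sketched like this. A dict stands in for Redis to keep it self-contained; with real Redis you'd use SET with a TTL so stale entries expire on their own when heartbeats stop:

```python
import time

HEARTBEAT_TIMEOUT = 30  # seconds without a heartbeat = stale connection
presence: dict = {}     # user_id -> (gateway_server_id, last_heartbeat)

def on_connect(user_id, gateway_id):
    presence[user_id] = (gateway_id, time.monotonic())

def heartbeat(user_id):
    if user_id in presence:
        gateway_id, _ = presence[user_id]
        presence[user_id] = (gateway_id, time.monotonic())

def lookup(user_id):
    """Return the gateway holding this user's WebSocket, or None if offline/stale."""
    entry = presence.get(user_id)
    if entry is None:
        return None
    gateway_id, last_seen = entry
    if time.monotonic() - last_seen > HEARTBEAT_TIMEOUT:
        del presence[user_id]  # stale entry: treat the user as offline
        return None
    return gateway_id

on_connect("user_b", "gateway-17")
print(lookup("user_b"))  # "gateway-17"
```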
Message storage in Cassandra:
Table: messages
Partition Key: chat_id (conversation ID)
Clustering Key: message_id (TimeUUID for chronological ordering)
Columns: sender_id, content, timestamp, status, media_url
Why Cassandra? I think it's an excellent fit here because:
- Write-heavy workload (230K writes/second)
- Time-series access pattern (load messages in chronological order)
- Easy horizontal scaling through partitioning
- Tunable consistency (can prioritize availability)
Phase 4: Scaling Considerations
- WebSocket gateways need to handle millions of concurrent connections. Each server handles ~100K connections. For 500M DAU with ~100M concurrent at peak, you'd need ~1,000 gateway servers.
- Kafka acts as a buffer between the chat service and storage, preventing message loss during traffic spikes
- End-to-end encryption means the server never sees plaintext messages — only the clients have the keys. Not a trivial implementation; it typically uses the Signal Protocol.
Key Concepts You Can't Skip
Load Balancing
Distributes traffic across multiple servers. Know these algorithms:
- Round Robin: Simple rotation, works for stateless services
- Least Connections: Routes to the server with fewest active connections
- Consistent Hashing: Routes based on a hash of the request, useful for cache-friendly routing
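Consistent hashing is worth being able to sketch on the spot. Here's a minimal ring, with virtual nodes omitted for brevity (real implementations add them to even out the key distribution):

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Each key maps to the first server clockwise on the ring, so adding
    or removing a server only remaps keys in that server's neighborhood."""

    def __init__(self, servers):
        self.ring = sorted((_hash(s), s) for s in servers)
        self.keys = [h for h, _ in self.ring]

    def get_server(self, key: str) -> str:
        idx = bisect.bisect(self.keys, _hash(key)) % len(self.keys)
        return self.ring[idx][1]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
print(ring.get_server("user:42"))  # same key always routes to the same server
```

Contrast this with naive `hash(key) % num_servers` routing, where adding one server remaps almost every key and wipes out your cache.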
Caching
Layers of caching from closest to the user:
- Browser cache (client-side)
- CDN (edge servers geographically close to users)
- Application cache (Redis/Memcached)
- Database query cache
Cache invalidation strategies: Write-through (update cache on write), Write-behind (async cache update), Cache-aside (application manages cache), TTL-based expiration. Pick the wrong one and you'll spend weeks debugging stale data issues. I suspect cache invalidation causes more production incidents than people admit.
Database Sharding
Splitting a large database across multiple servers. Common strategies:
- Range-based: Shard by ID range (IDs 1 to 1M on shard 1, 1M+1 to 2M on shard 2)
- Hash-based: Hash the shard key and mod by number of shards
- Geography-based: Indian users on India shard, US users on US shard
Each has trade-offs. Range-based can create hotspots. Hash-based makes range queries harder. Geography-based works well for user-specific data but not for global data.
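The first two strategies fit in a few lines each, with their trade-offs visible (a hypothetical 4-shard setup):

```python
import hashlib

NUM_SHARDS = 4

def shard_hash(user_id: str) -> int:
    """Hash-based: uniform spread, but range queries must fan out
    to every shard."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def shard_range(numeric_id: int, shard_size: int = 1_000_000) -> int:
    """Range-based: range scans stay on one shard, but sequential IDs
    all land on the newest shard (a classic hotspot)."""
    return numeric_id // shard_size

print(shard_hash("user_12345"))   # deterministic value in 0..3
print(shard_range(1_500_000))     # 1
```

Note that resharding is the hard part either way: with plain modulo hashing, changing NUM_SHARDS remaps nearly every key, which is exactly the problem consistent hashing addresses.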
CAP Theorem
In a distributed system, you can only guarantee two out of three:
- Consistency: Every read gets the most recent write
- Availability: Every request gets a response
- Partition tolerance: System works despite network failures
Since network partitions are unavoidable, the real choice is between CP (consistent but may reject requests) and AP (always responds but may serve stale data). Banking systems choose CP. Social media feeds choose AP. Knowing when to pick which one is probably the most important thing you can signal in an interview.
Message Queues
Asynchronous communication between services. Kafka for high-throughput event streaming, RabbitMQ for task queues, SQS for simple cloud-native queuing.
When to use a message queue:
- Decoupling services
- Handling traffic spikes (buffering)
- Ensuring message delivery (at-least-once)
- Enabling event-driven architectures
Don't throw a message queue into every design just because it sounds impressive. Use one when there's a clear reason for async processing.
Mistakes I've Made (and Keep Seeing Others Make)
1. Not Clarifying Requirements
Jumping straight into the solution without understanding the problem. Always spend those first 5 minutes asking questions. Interviewers specifically evaluate whether you clarify ambiguity. Skip this and you're telling them you build first, think later.
2. Over-Engineering
Designing for billions of users when the question implies thousands. Start simple. Scale up when the interviewer pushes you. A monolith is perfectly fine for the initial design — you can discuss microservices as a scaling step. I've seen candidates draw 15 microservices for a system that could be a single Node.js server with a PostgreSQL database.
3. Ignoring Trade-offs
Every design decision has trade-offs. SQL vs NoSQL. Consistency vs availability. Latency vs throughput. Interviewers want to hear you articulate these trade-offs, not just pick a technology because it's popular on tech Twitter. "I chose Cassandra because everyone uses it" is a terrible answer. "I chose Cassandra because our write-heavy workload benefits from its LSM-tree storage engine, though we're trading away strong consistency" — that's what gets you hired.
4. Not Doing Back-of-the-Envelope Math
"We'll use a database" is weak. "We need to store 240 GB per year, so a single PostgreSQL instance handles this easily" shows you understand actual scale. Quick math isn't hard, but it separates candidates who think in abstractions from those who think in real systems.
5. Forgetting About Failure
What happens when a server crashes? When the database goes down? When a network partition occurs? Discussing failure modes and recovery strategies shows maturity. I think most junior candidates forget this entirely, and it's one of the easiest ways to stand out.
Resources That Actually Helped Me Prepare
Books
- "Designing Data-Intensive Applications" by Martin Kleppmann: Arguably the single best book on distributed systems concepts. Dense but incredibly thorough. I read it twice and got something new the second time through.
- "System Design Interview" by Alex Xu (Vol 1 & 2): Practical, structured walkthroughs of common design problems. Great for pattern recognition and building muscle memory.
YouTube Channels
- Gaurav Sen: Excellent explanations of system design concepts with clear diagrams. His videos on consistent hashing and load balancing are outstanding.
- Tech Dummies Narendra L: Detailed, practical system design walkthroughs. Very popular among Indian engineers for good reason.
- ByteByteGo (Alex Xu): Clean animations explaining complex distributed systems concepts in digestible chunks.
Practice Platforms
- interviewing.io: Anonymous mock interviews with real engineers from FAANG companies
- Pramp: Free peer-to-peer mock interview platform
- HelloInterview: AI-powered system design practice with feedback
Open-Source Study Guides
- system-design-primer (GitHub): Massive collection of system design resources and examples
- ByteByteGo newsletter: Weekly deep dives into system design topics
Managing Your 45 Minutes Like a Pro
Time management is everything. Seriously. Here's my minute-by-minute breakdown:
| Time | Activity | What to Say |
|---|---|---|
| 0:00-0:02 | Listen to the question | Take notes, don't interrupt |
| 0:02-0:07 | Ask clarifying questions | "What's the expected scale? Which features are in scope?" |
| 0:07-0:10 | Back-of-envelope estimation | "Given 100M DAU, that's roughly X QPS..." |
| 0:10-0:22 | High-level design | Draw the architecture, explain data flow |
| 0:22-0:38 | Deep dive (interviewer-guided) | Schema, caching, algorithms for 1-2 components |
| 0:38-0:43 | Scaling and trade-offs | "If traffic 10x, here's how we scale..." |
| 0:43-0:45 | Summary and questions | Recap key decisions, ask if they want to explore anything else |
Pro tip: Narrate your thinking out loud. The interviewer can't evaluate what they can't hear. Saying "I'm choosing Cassandra over PostgreSQL because our write-heavy workload benefits from Cassandra's LSM-tree storage engine" is far better than silently writing "Cassandra" on the whiteboard. Silence is your enemy in this round.
India-Specific Preparation Tips
Prepare for Indian Scale
Indian companies operate at unique scales that you won't find in a US-focused study guide. Flipkart processes millions of orders during Big Billion Days. PhonePe handles billions of UPI transactions monthly. Zerodha absorbs lakhs of orders in the minutes around market open. Understanding these scale characteristics gives you a real edge over candidates who only prepare with generic examples.
Know the Indian Tech Stack
Many Indian companies use specific technologies worth knowing:
- Kafka is everywhere for event streaming
- Kubernetes for container orchestration (see our Docker and Kubernetes beginner's guide if you need a refresher)
- PostgreSQL and MySQL for relational data
- Redis for caching (used almost universally)
- Go and Java for backend services (more than Node.js at scale)
Cultural Differences in Interviews
Indian company interviews tend to be slightly more technical and less collaborative than FAANG interviews, in my experience. You might get asked more specific implementation questions ("How would you implement the LRU cache?") alongside the high-level design. Be prepared for both. Not sure if every company follows this pattern, but I've noticed it enough to mention.
System design interviews pair naturally with coding rounds — if you're also preparing for DSA, our DSA roadmap for placements covers how to structure that preparation alongside system design.
Practice Beats Memorization Every Single Time
Here's what I want to leave you with. You can memorize every system design pattern in every textbook, watch 200 YouTube videos, and read Alex Xu's books cover to cover. None of that matters if you haven't practiced explaining your thinking out loud under time pressure. Memorization gets you started. Practice gets you hired.
Do at least 10-15 mock interviews before your real ones. Record yourself. Watch the recording back — it's painful, but you'll spot every awkward pause, every moment you rambled, every time you forgot to mention a trade-off. Each iteration makes you sharper. Find a friend who's also preparing and take turns interviewing each other. Use Pramp or interviewing.io if you don't have someone available.
The framework. The concepts. And the practice. That's what separates candidates who pass from those who don't. Everything else is just noise.
Anurag Sharma
Founder & Editor
Software engineer with 8+ years of experience in full-stack development and cloud architecture. Founder of Tech Tips India, where he breaks down complex tech concepts into practical, actionable guides for Indian developers and enthusiasts.