Capacity Estimation in System Design



What is Capacity Estimation?

Capacity estimation is the process of calculating how much load your system needs to handle. It helps you estimate the required hardware, bandwidth, database size, and other resources based on expected usage.

This step is typically performed early in the system design phase, both in interviews and in real-world architecture planning. It helps ensure that the system will be scalable, responsive, and cost-effective.

Why is Capacity Estimation Important?

Without proper capacity estimation, a system might crash under high traffic, or it might be over-provisioned and waste resources. Capacity estimation helps in:

  • Planning infrastructure and budget
  • Scaling the system appropriately
  • Avoiding system failures under peak load

Example 1: Designing a URL Shortener - Estimating Capacity

Let’s say we are designing a URL shortener like Bit.ly. First, we must understand the usage pattern.

Assumptions:

  • 100 million users
  • 10% of users are active daily → 10 million daily active users (DAU)
  • Each active user shortens 5 URLs per day

Step-by-Step Estimation:

  • Total URLs created per day: 10M users × 5 = 50M URLs/day
  • Storage for each URL record:
    • Short URL: 8 characters ≈ 8 bytes (1 byte per ASCII character)
    • Long URL: 100 bytes on average
    • Total ≈ 108 bytes per record
  • Daily Storage Requirement: 50M × 108 bytes ≈ 5.4 GB/day
  • Monthly Storage (30 days): 5.4 GB × 30 = 162 GB
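
The arithmetic in these steps can be checked with a short Python sketch. The constant and function names are illustrative, and decimal units (1 GB = 10^9 bytes) are assumed, matching the figures above.

```python
# Inputs copied from the assumptions above; names are illustrative.
TOTAL_USERS = 100_000_000
DAILY_ACTIVE_PERCENT = 10
URLS_PER_USER_PER_DAY = 5
SHORT_URL_BYTES = 8
LONG_URL_BYTES = 100
GB = 10**9  # decimal gigabyte, as used in the text


def daily_storage_bytes() -> int:
    """Bytes of new URL records written per day."""
    dau = TOTAL_USERS * DAILY_ACTIVE_PERCENT // 100
    urls_per_day = dau * URLS_PER_USER_PER_DAY
    return urls_per_day * (SHORT_URL_BYTES + LONG_URL_BYTES)


daily_gb = daily_storage_bytes() / GB
monthly_gb = daily_gb * 30  # 30-day month
print(f"Daily: {daily_gb:.1f} GB, monthly: {monthly_gb:.0f} GB")
```

Keeping each assumption as a named constant makes it easy to see how the result moves when one input changes, for example a longer average URL.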

Question:

If we wanted to support the service for 3 years without archiving any data, how much storage would we need?

Answer:

3 years = 36 months → 162 GB × 36 = 5,832 GB ≈ 5.8 TB
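
As a rough check, the projection can be written as a small Python function. This is a sketch that counts raw record bytes only; the function name is illustrative, and replication or index overhead is not part of the original estimate, so it is left out here too.

```python
DAILY_GB = 5.4  # daily storage figure from this example


def storage_tb(daily_gb: float, days: int) -> float:
    """Total storage in decimal terabytes after `days` of writes, with nothing deleted."""
    return daily_gb * days / 1000


# 36 months of 30 days each, as in the answer above.
print(f"36 x 30-day months: {storage_tb(DAILY_GB, 36 * 30):.2f} TB")
# Three 365-day years, for comparison.
print(f"3 x 365-day years:  {storage_tb(DAILY_GB, 3 * 365):.2f} TB")
```

The two conventions differ by 15 days of writes, which is small enough to ignore in a back-of-the-envelope estimate.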

What Else Do We Estimate in Capacity Planning?

  • Read/Write Traffic: How many reads/writes per second?
  • Database QPS (Queries per second): Important to size DB clusters
  • Bandwidth: Needed for upload/download, API traffic, video content, etc.
  • Peak vs Average Load: Systems must be designed to handle peak load
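
To make the traffic items concrete, here is a minimal Python sketch that converts a daily request count into queries per second. It reuses the 50M URLs/day figure from Example 1. The read-to-write ratio and the peak multiplier are hypothetical values chosen only for illustration; they do not come from the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(requests_per_day: float) -> float:
    """Average requests per second, assuming traffic is spread evenly over the day."""
    return requests_per_day / SECONDS_PER_DAY


def peak_qps(requests_per_day: float, peak_factor: float) -> float:
    """Peak requests per second, modeled as a simple multiple of the average."""
    return average_qps(requests_per_day) * peak_factor


WRITES_PER_DAY = 50_000_000  # URLs created per day in Example 1
READS_PER_WRITE = 100        # hypothetical read:write ratio, not from the text
PEAK_FACTOR = 3              # hypothetical peak-to-average multiplier

write_qps = average_qps(WRITES_PER_DAY)
read_qps = average_qps(WRITES_PER_DAY * READS_PER_WRITE)
write_peak = peak_qps(WRITES_PER_DAY, PEAK_FACTOR)

print(f"avg writes/s:  {write_qps:.0f}")
print(f"avg reads/s:   {read_qps:.0f}")
print(f"peak writes/s: {write_peak:.0f}")
```

Real traffic is rarely uniform, so the peak figure, not the average, is the one to size servers and database clusters against.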

Example 2: Estimating Capacity for a Video Streaming Service

Imagine you’re designing a basic video streaming service.

Assumptions:

  • 5 million daily active users
  • Each user watches 2 videos/day
  • Average video length = 10 minutes
  • Bitrate = 2 Mbps

Step-by-Step Estimation:

  • Total video minutes per day: 5M × 2 × 10 = 100M minutes
  • Total data streamed:
    • 10 minutes = 600 seconds
    • 600 sec × 2 Mbps = 1.2 Gb (per video)
    • 1.2 Gb = 150 MB per video
  • Daily data transfer: 5M users × 2 videos × 150 MB = 1.5 PB (petabytes) per day
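
The unit conversions are the easiest place to slip, so a short Python sketch of the same steps may help. Names are illustrative, and decimal units are assumed (1 MB = 10^6 bytes, 1 PB = 10^15 bytes).

```python
# Inputs copied from the assumptions above; names are illustrative.
DAU = 5_000_000
VIDEOS_PER_USER_PER_DAY = 2
VIDEO_SECONDS = 10 * 60            # 10 minutes
BITRATE_BITS_PER_SEC = 2_000_000   # 2 Mbps


def bytes_per_video() -> int:
    """Size of one streamed video: duration x bitrate, converted from bits to bytes."""
    return VIDEO_SECONDS * BITRATE_BITS_PER_SEC // 8


def daily_bytes_streamed() -> int:
    """Total bytes delivered to all users in one day."""
    return DAU * VIDEOS_PER_USER_PER_DAY * bytes_per_video()


print(f"Per video: {bytes_per_video() / 10**6:.0f} MB")
print(f"Per day:   {daily_bytes_streamed() / 10**15:.1f} PB")
```

Note the division by 8: bitrates are quoted in bits per second, while storage and transfer volumes are usually quoted in bytes.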

Question:

How many CDN servers would we need if each server can handle 5 Gbps of traffic?

Answer:

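  • 1.5 PB/day ÷ 86,400 seconds/day ≈ 17.36 GB/s
  • 17.36 GB/s × 8 = 138.88 Gbps
  • 138.88 ÷ 5 ≈ 27.8, rounded up to 28 CDN servers minimum. This covers the average load only; peak-hour traffic would require more.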
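
The same calculation as a minimal Python sketch. The function name and the `peak_factor` parameter are illustrative additions; the default of 1.0 reproduces the average-load answer, and the 2x multiplier in the second call is a hypothetical value, not a figure from the example.

```python
import math

SECONDS_PER_DAY = 86_400


def cdn_servers_needed(bytes_per_day: int, server_gbps: float,
                       peak_factor: float = 1.0) -> int:
    """Servers required to carry the daily volume, rounded up to a whole server."""
    avg_gbps = bytes_per_day * 8 / SECONDS_PER_DAY / 10**9
    return math.ceil(avg_gbps * peak_factor / server_gbps)


DAILY_BYTES = 1_500_000_000_000_000  # 1.5 PB/day from the estimate above

print(cdn_servers_needed(DAILY_BYTES, 5))       # average load
print(cdn_servers_needed(DAILY_BYTES, 5, 2.0))  # hypothetical 2x peak
```

Rounding up with `math.ceil` matters here: 27.8 servers' worth of traffic cannot be served by 27 machines.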

How to Approach Capacity Estimation in Interviews

Always follow a logical, layered approach:

  1. Start with user base
  2. Estimate daily active users (DAU)
  3. Calculate requests per second (QPS)
  4. Estimate data size and storage needs
  5. Project bandwidth and compute capacity
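
The five steps can be sketched as one small Python pipeline. This is only an illustration of the layered approach: the `Workload` class and `estimate` function are hypothetical names, and the sample values are the URL shortener assumptions from Example 1.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Workload:
    total_users: int               # step 1: user base
    daily_active_percent: int      # step 2: share of users active per day
    actions_per_user_per_day: int  # drives requests per second
    bytes_per_action: int          # drives storage and bandwidth


def estimate(w: Workload) -> dict:
    """Walk the layers in order: users -> DAU -> QPS -> storage -> bandwidth."""
    dau = w.total_users * w.daily_active_percent // 100
    actions_per_day = dau * w.actions_per_user_per_day
    bytes_per_day = actions_per_day * w.bytes_per_action
    return {
        "dau": dau,
        "avg_qps": actions_per_day / SECONDS_PER_DAY,
        "storage_gb_per_day": bytes_per_day / 10**9,
        # Inbound write traffic only; read traffic is not modeled here.
        "avg_write_mbps": bytes_per_day * 8 / SECONDS_PER_DAY / 10**6,
    }


url_shortener = Workload(100_000_000, 10, 5, 108)
result = estimate(url_shortener)
for name, value in result.items():
    print(f"{name}: {value:,.1f}")
```

Each layer depends only on the one before it, which is why stating the assumptions in this order keeps the estimate easy to follow and to correct.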

Quick Tip:

Assume 1 month = 30 days and 1 year = 365 days unless specified otherwise.

Practice Question:

You are building a chat app. If 2 million users each send 20 messages per day (average message size 200 bytes), how much storage is needed per day?

Answer:

  • 2M × 20 = 40M messages
  • 40M × 200 bytes = 8,000,000,000 bytes = 8 GB/day
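
A quick Python check of this answer, with the average message rate added as an extra data point. Variable names are illustrative, and 1 GB is taken as 10^9 bytes.

```python
# Inputs copied from the practice question; names are illustrative.
USERS = 2_000_000
MESSAGES_PER_USER_PER_DAY = 20
BYTES_PER_MESSAGE = 200
SECONDS_PER_DAY = 86_400

messages_per_day = USERS * MESSAGES_PER_USER_PER_DAY
storage_bytes_per_day = messages_per_day * BYTES_PER_MESSAGE
avg_messages_per_sec = messages_per_day / SECONDS_PER_DAY

print(f"Messages/day: {messages_per_day:,}")
print(f"Storage/day:  {storage_bytes_per_day / 10**9:.0f} GB")
print(f"Avg msgs/sec: {avg_messages_per_sec:.0f}")
```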

Key Takeaways

  • Capacity estimation is about making realistic assumptions and stating them explicitly
  • Break the problem down into users, data size, and frequency of actions
  • A sound estimate helps ensure the performance and scalability of your system
