Introduction
YouTube is one of the most popular video-sharing platforms in the world. It allows users to upload, stream, comment on, and search for videos. Designing such a system is a classic system design interview question that tests your understanding of storage, bandwidth, scalability, and user experience.
Functional Requirements
- Users should be able to upload videos.
- Users can stream videos on demand.
- Support for comments, likes, dislikes, and subscriptions.
- Video recommendations on the homepage and sidebar.
- Search for videos by title, tags, and creators.
Non-Functional Requirements
- Highly available and fault-tolerant system.
- Scalable to billions of videos and users.
- Low latency for both uploads and playback.
Step 1: How do users upload videos?
When a user uploads a video, the frontend sends the video file to a backend server. This server stores the file temporarily and sends it to a storage service (e.g., Amazon S3 or Google Cloud Storage).
Question: Why not directly store the video in the database?
Answer: Video files are large binary files. Storing them in a relational database (like MySQL or PostgreSQL) would bloat the tables, slow down queries, and make backup and restore more complex. Instead, we store metadata in the database and the actual video in a distributed object storage service.
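As a minimal sketch of this split, the example below keeps the video bytes in an object store and only a small record in the metadata store. Everything here is a stand-in for illustration: InMemoryObjectStore, the mem:// URL scheme, the "uploaded" status value, and the dict used as a metadata database are assumptions, not a real storage API.

```python
import uuid
from dataclasses import dataclass


@dataclass
class VideoRecord:
    """Small metadata row; the video bytes themselves are NOT stored here."""
    video_id: str
    uploader_id: str
    title: str
    status: str
    video_url: str


class InMemoryObjectStore:
    """Hypothetical stand-in for an object storage bucket."""

    def __init__(self):
        self._blobs = {}

    def put(self, key: str, data: bytes) -> str:
        self._blobs[key] = data
        return f"mem://videos/{key}"  # made-up URL scheme for the sketch

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def handle_upload(metadata_db: dict, store: InMemoryObjectStore,
                  uploader_id: str, title: str, data: bytes) -> VideoRecord:
    """Write the large blob to object storage, then record a reference to it."""
    video_id = uuid.uuid4().hex
    url = store.put(f"raw/{video_id}", data)
    record = VideoRecord(video_id, uploader_id, title, "uploaded", url)
    metadata_db[video_id] = record
    return record
```

The metadata record stays a few hundred bytes no matter how large the video is, which is the property the answer above relies on.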
Step 2: Video Transcoding
After uploading, the video must be transcoded into different resolutions (144p, 360p, 720p, 1080p). This is important to support various devices and internet speeds.
Transcoding is usually done using a job queue. For example:
- Upload triggers a message to a VideoProcessingQueue.
- A worker service picks up the message and runs FFmpeg to create multiple versions of the video.
- These processed versions are then saved to object storage.
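To make the worker step more concrete, here is a rough sketch of how a worker might build one FFmpeg command per target resolution. The file naming, the choice of the libx264 and aac encoders, and the MP4 output are assumptions for illustration; a real pipeline would also tune bitrates and usually produce streaming segments. The sketch only builds the argument lists and does not run FFmpeg.

```python
# Target heights taken from the resolutions listed above.
RENDITIONS = {"144p": 144, "360p": 360, "720p": 720, "1080p": 1080}


def build_ffmpeg_command(src: str, dst: str, height: int) -> list[str]:
    """Return the argument list for one rendition (not executed here)."""
    return [
        "ffmpeg",
        "-i", src,
        # scale=-2:H keeps the aspect ratio and rounds the width to an even number
        "-vf", f"scale=-2:{height}",
        "-c:v", "libx264",  # assumed video encoder
        "-c:a", "aac",      # assumed audio encoder
        dst,
    ]


def build_all_commands(src: str, out_prefix: str) -> dict[str, list[str]]:
    """One command per rendition, keyed by rendition name."""
    return {
        name: build_ffmpeg_command(src, f"{out_prefix}_{name}.mp4", height)
        for name, height in RENDITIONS.items()
    }
```

A worker could pass each list to subprocess.run and upload the resulting file to object storage once the command exits successfully.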
Question: What happens if the transcoding fails?
Answer: The system should implement retries and send alerts. Failed jobs are usually logged and stored in a dead-letter queue for manual or automated reprocessing.
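The retry and dead-letter behaviour can be sketched with an in-process queue. This is only an illustration: the limit of three attempts, the job dictionary layout, and the transcode callable are assumptions, and a production system would use a real message broker with delays between retries.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry budget


def process_queue(video_ids, transcode, max_attempts=MAX_ATTEMPTS):
    """Run transcode() for each job, retrying failures up to max_attempts.

    Returns (done, dead_letter): the IDs that succeeded, and the job
    dicts that exhausted their retries.
    """
    queue = deque({"video_id": vid, "attempts": 0} for vid in video_ids)
    done, dead_letter = [], []
    while queue:
        job = queue.popleft()
        job["attempts"] += 1
        try:
            transcode(job["video_id"])
            done.append(job["video_id"])
        except Exception as exc:
            job["last_error"] = str(exc)  # kept for later inspection
            if job["attempts"] < max_attempts:
                queue.append(job)         # retry later
            else:
                dead_letter.append(job)   # give up; needs reprocessing
    return done, dead_letter
```

Jobs in the dead-letter list keep their last error message, so an operator or an automated job can decide whether to requeue them.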
Step 3: Storage and Video Metadata
The actual videos are stored in object storage (like Amazon S3). Only a reference (URL or ID) is stored in the database along with metadata:
- Video ID
- Uploader User ID
- Title, description, and tags
- Upload time and status
Step 4: Content Delivery via CDN
To serve videos efficiently across the globe, we use a Content Delivery Network (CDN). When a user requests a video, the system serves it from the nearest CDN edge location.
Question: What if the video is not available at a nearby CDN node?
Answer: The CDN will fetch it from the origin (e.g., main storage bucket) and cache it at the edge for future users.
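The cache-miss path can be illustrated with a toy edge node. The Origin and EdgeCache classes below are hypothetical names for this sketch; real CDNs also handle expiry, eviction, and partial (range) requests, all of which are left out here.

```python
class Origin:
    """Stand-in for the main storage bucket."""

    def __init__(self, objects: dict):
        self.objects = objects
        self.fetch_count = 0

    def fetch(self, key: str) -> bytes:
        self.fetch_count += 1
        return self.objects[key]


class EdgeCache:
    """Toy CDN edge: serve from the local cache, fall back to the origin."""

    def __init__(self, origin: Origin):
        self.origin = origin
        self._cache = {}

    def get(self, key: str):
        if key in self._cache:
            return self._cache[key], "HIT"
        data = self.origin.fetch(key)  # cache miss: go to the origin
        self._cache[key] = data        # keep a copy for future users
        return data, "MISS"
```

Only the first request for a given video segment reaches the origin; later requests at the same edge are served locally.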
Step 5: Streaming the Video
The player does not download the full video at once. Instead, it streams the video in chunks using HLS (HTTP Live Streaming) or MPEG-DASH. The video is split into small segments (e.g., 2–10 seconds long), and the player requests them one after another.
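To show what segmenting looks like from the player's side, the sketch below generates a minimal HLS media playlist for a video of a given length. The 6-second segment length and the segment file names are assumptions; in practice the packager that cuts the segments writes this file.

```python
import math


def build_media_playlist(total_seconds: float, segment_seconds: int = 6,
                         prefix: str = "segment") -> str:
    """Return a minimal HLS VOD playlist listing every segment in order."""
    if total_seconds <= 0 or segment_seconds <= 0:
        raise ValueError("durations must be positive")
    count = math.ceil(total_seconds / segment_seconds)
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{segment_seconds}",  # upper bound on segment length
        "#EXT-X-MEDIA-SEQUENCE:0",
        "#EXT-X-PLAYLIST-TYPE:VOD",
    ]
    for i in range(count):
        # The last segment may be shorter than the others.
        duration = min(segment_seconds, total_seconds - i * segment_seconds)
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(f"{prefix}_{i:05d}.ts")
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"
```

The player reads this playlist, then fetches the listed segments over plain HTTP, which is why a CDN can cache them like any other file.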
Step 6: Search and Recommendations
For search, we need to index the metadata (title, tags, etc.) using a search engine like Elasticsearch. For recommendations, YouTube uses collaborative filtering and machine learning models.
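The core idea behind a search engine index can be sketched as a tiny inverted index that maps each word to the videos containing it. This is a simplified illustration and not how Elasticsearch is actually used or implemented; the VideoSearchIndex class and its tokenizer are assumptions, and there is no ranking, stemming, or typo tolerance.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class VideoSearchIndex:
    """Toy inverted index over title, tags, and creator name."""

    def __init__(self):
        self._postings = defaultdict(set)  # token -> set of video IDs

    def add(self, video_id: str, title: str, tags: list[str], creator: str) -> None:
        for token in tokenize(" ".join([title, creator, *tags])):
            self._postings[token].add(video_id)

    def search(self, query: str) -> set[str]:
        """Return the videos that contain every word in the query."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        result = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self._postings.get(token, set())
        return result
```

Because lookups go from word to video IDs, a query touches only the postings for its own words rather than scanning every video row.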
Example: Simple Recommendation Strategy
For beginners, we can implement a rule-based recommendation system:
- Show most viewed videos from the same category.
- Show videos liked by users who watched the same content.
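Both rules can be sketched in a few lines. The data shapes below (a videos dict with category and views, and per-user sets for watch history and likes) are assumptions chosen for the example, not a prescribed schema.

```python
from collections import Counter


def most_viewed_in_category(videos: dict, video_id: str, limit: int = 3) -> list[str]:
    """Rule 1: other videos in the same category, highest view count first."""
    category = videos[video_id]["category"]
    candidates = [
        (info["views"], vid)
        for vid, info in videos.items()
        if info["category"] == category and vid != video_id
    ]
    return [vid for _, vid in sorted(candidates, reverse=True)[:limit]]


def liked_by_co_watchers(watch_history: dict, likes: dict,
                         video_id: str, limit: int = 3) -> list[str]:
    """Rule 2: videos liked by users who also watched video_id."""
    co_watchers = {user for user, seen in watch_history.items() if video_id in seen}
    counts = Counter()
    for user in co_watchers:
        for liked in likes.get(user, set()):
            if liked != video_id:
                counts[liked] += 1
    # Sort by like count (descending), then by ID so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [vid for vid, _ in ranked[:limit]]
```

The second rule is a very crude form of collaborative filtering: it uses overlap in user behaviour instead of video content.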
Step 7: Comments, Likes, Subscriptions
These features involve separate microservices:
- Comment service – stores threaded comments.
- Like/dislike service – stores user preferences.
- Subscription service – notifies users of new uploads.
Question: Should these services update video metadata instantly?
Answer: No. They should write to their own databases. A separate batch job or event system can aggregate the data for analytics and update the video stats periodically.
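A periodic aggregation job might look roughly like the sketch below. The event format (a dict with video_id and type) and the stat field names are assumptions; in a real system the events would come from a log or message stream and the totals would be written back to the metadata database.

```python
from collections import Counter, defaultdict

# Assumed mapping from event type to the counter it increments.
EVENT_TO_FIELD = {"like": "likes", "dislike": "dislikes", "comment": "comments"}


def aggregate_events(events: list[dict]) -> dict:
    """Collapse a batch of raw events into per-video deltas."""
    totals = defaultdict(Counter)
    for event in events:
        field = EVENT_TO_FIELD.get(event["type"])
        if field is not None:  # ignore event types we do not track
            totals[event["video_id"]][field] += 1
    return totals


def apply_batch(video_stats: dict, events: list[dict]) -> dict:
    """Apply one batch of deltas to the stored video stats in place."""
    for video_id, deltas in aggregate_events(events).items():
        row = video_stats.setdefault(
            video_id, {field: 0 for field in EVENT_TO_FIELD.values()}
        )
        for field, amount in deltas.items():
            row[field] += amount
    return video_stats
```

Each video row is updated once per batch rather than once per like, which keeps write load on the metadata store low for popular videos.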
Step 8: Scaling the System
- Video Storage: Use a distributed file system or cloud storage.
- Metadata DB: Shard by video ID or uploader ID.
- Load Balancer: Distribute user requests among servers.
- CDN: Serve cached videos close to the user.
- Microservices: Independent scaling and deployment.
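As one possible way to shard the metadata database by video ID, the sketch below hashes the ID and takes the result modulo the number of shards. The shard count and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib


def shard_for(video_id: str, num_shards: int) -> int:
    """Map a video ID to a shard index in the range [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # A stable hash is used because Python's built-in hash() for strings
    # can differ between processes.
    digest = hashlib.sha256(video_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Simple modulo sharding moves most keys when the shard count changes; consistent hashing is a common alternative when shards are added or removed often.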
Step 9: Handling Abuse and Security
- Spam filtering in comments.
- Rate limiting to prevent abuse.
- Moderation pipelines to detect copyright violations.
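Rate limiting is often implemented with a token bucket; the sketch below is one minimal version, not the only approach. The capacity and refill rate are assumed values, and the current time is passed in explicitly so the behaviour is easy to test. A real service would keep one bucket per user or per IP address in a shared store.

```python
class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at a steady rate."""

    def __init__(self, capacity: int, refill_per_second: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start with a full bucket
        self.updated_at = now

    def allow(self, now: float) -> bool:
        """Return True if a request at time `now` (in seconds) is permitted."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = max(self.updated_at, now)
        if self.tokens >= 1:
            self.tokens -= 1  # spend one token for this request
            return True
        return False
```

For example, a bucket with capacity 3 and a refill rate of 1 token per second lets a user post three comments at once, then one more per second.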
Step 10: Database Schema Example
Let’s look at a simplified video metadata schema:
Table: Videos
-------------
video_id (PK)
uploader_id (FK)
title
description
tags
upload_time
status
video_url
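The same schema could be written as SQL, shown here with Python's built-in sqlite3 module so it can be run locally. The column types, the minimal users table, the comma-separated tags column, and the index on uploader_id are assumptions for this sketch; a production system would likely use a different database and a separate tags table.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id TEXT PRIMARY KEY
);
CREATE TABLE videos (
    video_id    TEXT PRIMARY KEY,
    uploader_id TEXT NOT NULL REFERENCES users(user_id),
    title       TEXT NOT NULL,
    description TEXT,
    tags        TEXT,           -- comma-separated in this sketch (assumption)
    upload_time TEXT NOT NULL,  -- ISO 8601 timestamp
    status      TEXT NOT NULL,  -- e.g. 'uploaded', 'processing', 'ready' (assumed values)
    video_url   TEXT NOT NULL   -- reference to the file in object storage
);
CREATE INDEX idx_videos_uploader ON videos(uploader_id);
"""


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with the schema above."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    db.executescript(SCHEMA)
    return db


def videos_by_uploader(db: sqlite3.Connection, uploader_id: str) -> list[tuple]:
    """List (video_id, title, status) for one uploader, newest first."""
    return db.execute(
        "SELECT video_id, title, status FROM videos "
        "WHERE uploader_id = ? ORDER BY upload_time DESC",
        (uploader_id,),
    ).fetchall()
```

Note that video_url holds only a pointer to object storage, so the table stays small even as the video library grows.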
Points to Remember
- Separate video storage from metadata storage.
- Use job queues for time-consuming operations like transcoding.
- CDNs improve latency for global delivery.
- Design with scale in mind from day one.
- Keep services decoupled to improve maintainability and fault isolation.