Introduction to Caching
Caching is a technique used in system design to store copies of frequently accessed data in a temporary, fast storage location called a cache. The main goals of caching are to reduce the time it takes to access data and to lighten the load on backend systems such as databases or remote servers.
Imagine you're frequently visiting a website. Instead of the server fetching the same data from the database every time you refresh the page, it can temporarily store that data in a cache. The next time you visit, the data is served from the cache—faster and with less effort.
Why Use Caching?
Caching offers several benefits:
- Faster response times: Data is served from memory instead of disk or database.
- Reduced load: Backend services like databases or APIs are accessed less frequently.
- Better scalability: Systems can handle more users or requests.
Where is Cache Used?
Caching can be applied at multiple levels in a system:
- Browser cache: Stores static assets (like images, CSS files) on the user’s device.
- Server-side cache: Temporarily stores results of expensive operations.
- Database cache: Caches frequently queried database records.
Example 1: Web Server Caching
Let’s say your website has a "Top News" section that updates every hour. Every time a user visits the homepage, the server queries the database for the top news articles.
Instead of hitting the database for every request, the server can cache the news results in memory (e.g., using Redis) for 1 hour. Now, all visitors within that hour will receive the cached data instantly.
Question:
What happens when the news changes before the 1-hour cache expires?
Answer:
This depends on the cache invalidation strategy. You can either:
- Wait until the cache entry expires at the end of its time-to-live (TTL), here 1 hour.
- Manually update (or "invalidate") the cache when news updates happen.
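The two strategies can be sketched against the same cache: the TTL path simply lets an entry lapse on its own, while explicit invalidation deletes it the moment the underlying data changes. This is an illustrative sketch, not a specific library's API; the function names are made up.

```python
import time

_cache = {}  # key -> (value, expires_at)

def put(key, value, ttl_seconds):
    _cache[key] = (value, time.time() + ttl_seconds)

def get(key):
    entry = _cache.get(key)
    if entry is None:
        return None                  # miss: never cached or already invalidated
    value, expires_at = entry
    if time.time() >= expires_at:
        del _cache[key]              # TTL strategy: entry lapsed on its own
        return None
    return value

def invalidate(key):
    # Explicit strategy: call this from the code path that updates the news,
    # forcing the next read to go back to the database.
    _cache.pop(key, None)
```

TTL trades a window of staleness for simplicity; explicit invalidation keeps data fresh but requires every update path to remember to call `invalidate`.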
Example 2: API Response Caching
Suppose you have a weather application that fetches weather details for New York City. Thousands of users request the same data within minutes.
Instead of calling the weather API for every request, you can store the response in a cache for, say, 10 minutes. During those 10 minutes, all requests are served from the cache, reducing the number of external API calls and improving speed.
Question:
What if a user wants the absolute latest weather rather than data that may be up to 10 minutes old?
Answer:
In such cases, a hybrid approach can be used:
- Show cached data by default (fast and efficient).
- Allow users to click “refresh” to fetch real-time data, bypassing the cache.
Example 3: Database Query Caching
Consider an e-commerce website where users frequently visit the “Best Selling Products” page. This data may be derived from a heavy database query involving sorting, filtering, and aggregating sales data.
Instead of executing this costly query every time, the application can cache the query result in memory for a few minutes. This drastically improves page load time and reduces stress on the database.
Question:
How do we ensure the cached data doesn’t show outdated products?
Answer:
We set an appropriate expiration time (e.g., 5 minutes) or invalidate the cache whenever a new sale occurs that could affect the best sellers list.
Types of Caching
- In-Memory Caching: Uses RAM (e.g., Redis, Memcached). Fastest but volatile.
- Disk-Based Caching: Stores cached data on disk (slower but persistent).
- Distributed Caching: Shared cache across multiple servers (e.g., in microservices).
Key Concepts in Caching
- TTL (Time to Live): How long the cached data is valid.
- Cache Hit: Data was found in cache.
- Cache Miss: Data not in cache; fallback to database/API.
- Eviction Policies: Determine which entries to discard when the cache is full (e.g., LRU - Least Recently Used).
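Several of these concepts come together in a small LRU cache, sketched below with Python's `OrderedDict`: a fixed capacity forces eviction of the least recently used entry, and hit/miss counters show how the cache is performing. (Python's standard library also offers `functools.lru_cache` for function results; this hand-rolled version is only to make the mechanics visible.)

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)   # mark as most recently used
            self.hits += 1                # cache hit
            return self.store[key]
        self.misses += 1                  # cache miss: caller falls back to DB/API
        return None

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict least recently used
```

With capacity 2, inserting a third key evicts whichever of the first two was touched least recently.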
Summary
Caching is a vital part of system design, especially in large-scale systems. It reduces latency, offloads backend systems, and improves user experience. While powerful, caching also introduces challenges such as stale data and the need for a cache invalidation strategy.
Intuition-Building Question
Imagine your website crashes every time traffic spikes. What’s the first thing you would check?
Answer:
You should check whether the backend (e.g., the database) is being overwhelmed with repeated requests. Introducing a cache layer may solve this by reducing the number of repetitive backend calls.
Next Steps
In the next lesson, we will explore Client-Side vs Server-Side Caching and learn how caching strategies differ based on where the data is stored and used.