
Cache Invalidation Strategies



What is Cache Invalidation?

Cache invalidation is the process of updating or removing stale data from the cache to ensure that users get the most accurate and up-to-date information. It plays a critical role in maintaining consistency between the cache and the underlying database or source of truth.

Why Do We Need Cache Invalidation?

When data changes in the database, the cache may still hold the old value. If the cache isn't updated or invalidated, users may see outdated information, leading to data inconsistency. Cache invalidation helps solve this problem.

Common Cache Invalidation Strategies

There are three widely used strategies:

1. Write-Through Caching

In write-through caching, data is written to both the cache and the database at the same time. This ensures that the cache is always consistent with the database.

Example:

Let’s say we are building an e-commerce product catalog. When an admin updates the price of a product:

1. The new price is written to the cache.
2. The new price is written to the database as part of the same operation.

This keeps both layers in sync. When users query the product price, they get the latest value directly from the cache.
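A minimal sketch of this flow in Python, assuming a plain dict stands in for the cache and save_price_to_db is a hypothetical placeholder for the real database call (not part of any specific library):

cache = {}

def save_price_to_db(product_id, price):
    # Hypothetical stand-in for the real database update.
    print(f"DB: product {product_id} price set to {price}")

def update_price_write_through(product_id, price):
    # Write-through: update the database and the cache as one logical operation.
    save_price_to_db(product_id, price)
    cache[f"product:{product_id}:price"] = price

def get_price(product_id):
    # Reads are served straight from the cache, which always holds the latest value.
    return cache.get(f"product:{product_id}:price")

# Admin updates the price; the very next read sees the new value.
update_price_write_through(42, 19.99)
print(get_price(42))  # 19.99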

Question:

What happens if the database write fails but the cache update succeeds?

Answer:

This leads to inconsistency. In production systems, we use transaction support or retry mechanisms to ensure both cache and database are updated successfully or rolled back together.
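One way to approximate "updated together or rolled back together" is a retry loop with a compensating cache delete. This is only a sketch under those assumptions; write_price_to_db is a hypothetical callable standing in for the real database write:

import time

def update_price_safely(product_id, price, cache, write_price_to_db, retries=3):
    key = f"product:{product_id}:price"
    cache[key] = price                       # optimistic cache update
    for attempt in range(retries):
        try:
            write_price_to_db(product_id, price)
            return True                      # both layers now agree
        except Exception:
            time.sleep(0.1 * (attempt + 1))  # simple backoff before retrying
    # All retries failed: remove the cache entry so we stop serving a value
    # the database never accepted (a "rollback" of the cache side).
    cache.pop(key, None)
    return False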

Use Case:

Write-through caching is good when read performance is critical and we want the cache to always be up-to-date.

2. Write-Around Caching

In write-around caching, data is written directly to the database but not to the cache. When a read request comes, the system checks the cache first. If not found, it fetches from the database and stores it in the cache.

Example:

Continuing with our product catalog, when a product price is updated:

1. The new price is written directly to the database.
2. The cache is not written to; any existing cached copy is invalidated or simply left to expire.

When a customer views the product later, the system checks the cache. Since the cache doesn’t have the updated price, it fetches it from the database and then stores it in the cache for next time.
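A sketch of the write-around flow under the same assumptions (plain dicts standing in for the cache and the database table):

cache = {}
db = {}  # stands in for the product table

def update_price_write_around(product_id, price):
    # Write-around: the write goes to the database only.
    db[product_id] = price
    # Drop any stale cached copy so the next read reloads the fresh value.
    cache.pop(f"product:{product_id}:price", None)

def get_price(product_id):
    key = f"product:{product_id}:price"
    if key in cache:              # cache hit
        return cache[key]
    price = db.get(product_id)    # cache miss: read from the database
    if price is not None:
        cache[key] = price        # populate the cache for next time
    return price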

Question:

Why would we not write to cache immediately?

Answer:

To reduce write load on the cache, especially when write operations are frequent and the data is not read immediately after the update.

Use Case:

Write-around is useful when writes are frequent and reads are less frequent. It saves cache space and reduces cache write overhead.

3. Write-Back (Write-Behind) Caching

In write-back caching, data is written to the cache first and the database is updated later asynchronously. This improves write performance but introduces complexity.

Example:

Let’s say we have a user profile system. When a user updates their name:

1. The new name is written to the cache and the request returns immediately.
2. The corresponding database write is queued and applied asynchronously a short time later (for example, in a batch).

This gives a fast response to the user, but the database will have stale data for a short period.
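A simplified write-back sketch, again with in-memory stand-ins; pending writes are queued and flushed later. Here flush_pending_writes is called by hand for clarity, whereas a real system would run it asynchronously (a timer, a worker thread, or the cache product's own write-behind feature):

from collections import deque

cache = {}
db = {}                    # stands in for the user table
pending_writes = deque()   # writes waiting to be persisted

def update_name_write_back(user_id, name):
    # 1. Write to the cache and return immediately: the user sees the change.
    cache[f"user:{user_id}:name"] = name
    # 2. Remember that the database still needs this write.
    pending_writes.append((user_id, name))

def flush_pending_writes():
    # Persist queued writes to the source of truth.
    while pending_writes:
        user_id, name = pending_writes.popleft()
        db[user_id] = name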

Question:

What happens if the system crashes before writing to the database?

Answer:

Data may be lost. That’s why write-back caching is usually used with a persistent cache and retry mechanisms to ensure durability.

Use Case:

Write-back caching is great for high-throughput systems where speed is critical and occasional delays in persistence are acceptable.

Which Strategy Should You Use?

It depends on your use case:

- Write-through: reads are frequent and the cache must always hold the latest data; writes pay a small extra cost.
- Write-around: writes are frequent and the updated data is not read immediately; the cache stays smaller, at the cost of a miss on the first read after an update.
- Write-back: write throughput and latency matter most, and a short window of staleness (or a small risk of data loss on a crash) is acceptable.

Bonus: Time-To-Live (TTL) and Expiry

Many systems also use TTL (Time-To-Live) to auto-expire cache entries. For example, with a 10-minute TTL, an entry is discarded 10 minutes after it was cached, so the next read fetches fresh data from the database and re-populates the cache.
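A tiny TTL sketch using only the Python standard library; each entry stores an expiry timestamp, and expired entries are treated as cache misses:

import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}            # key -> (value, expires_at)

    def set(self, key, value):
        self.store[key] = (value, time.time() + self.ttl)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() > expires_at:
            del self.store[key]    # expired: evict and treat as a miss
            return None
        return value

# A 10-minute TTL, as in the example above.
prices = TTLCache(ttl_seconds=600)
prices.set("product:42:price", 19.99)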

Question:

Is TTL a form of cache invalidation?

Answer:

Yes. TTL ensures that stale data is automatically removed after a fixed time, making it a passive cache invalidation strategy.

Summary

- Cache invalidation keeps the cache consistent with the database (the source of truth).
- Write-through writes to the cache and the database together: strongest consistency, slightly slower writes.
- Write-around writes to the database only: less cache churn, but the first read after an update misses the cache.
- Write-back writes to the cache first and persists asynchronously: fastest writes, but data can be lost on a crash.
- TTL expires entries after a fixed time and acts as a passive invalidation mechanism.

Interview Tip

In system design interviews, always discuss your cache invalidation strategy whenever you propose a caching layer, and show awareness of the trade-offs between performance and consistency.


