Yandex

SQL Denormalization
Balancing Performance with Redundancy



Introduction

You've learned about normalization — the art of splitting data to remove redundancy. But what if your queries now need to join five tables just to show a student's marks and teacher name? That's where denormalization comes in. It’s about breaking the rules — but for a reason: performance.

What is Denormalization?

Denormalization is the process of combining data from multiple related tables into one to reduce joins and improve query speed. While it introduces redundancy, the trade-off can be worth it — especially for read-heavy systems.

Why Denormalize?

  • To improve query performance
  • To reduce the number of JOINs
  • To simplify reporting or analytics queries
  • When data is mostly read and rarely updated

Example Use Case – School System

Normalized Design

-- Table: students
CREATE TABLE students (
  roll_no INT PRIMARY KEY,
  name VARCHAR(50),
  class VARCHAR(10)
);

-- Table: results
CREATE TABLE results (
  roll_no INT,
  subject VARCHAR(30),
  marks INT,
  FOREIGN KEY (roll_no) REFERENCES students(roll_no)
);

-- Table: subjects
CREATE TABLE subjects (
  subject VARCHAR(30) PRIMARY KEY,
  teacher VARCHAR(50)
);

To get full student result with teacher:

SELECT s.name, s.class, r.subject, r.marks, sub.teacher
FROM students s
JOIN results r ON s.roll_no = r.roll_no
JOIN subjects sub ON r.subject = sub.subject;

This query is correct and efficient for small data. But if millions of records grow and performance lags, consider denormalization.

Denormalized Version

CREATE TABLE student_report (
  roll_no INT,
  name VARCHAR(50),
  class VARCHAR(10),
  subject VARCHAR(30),
  marks INT,
  teacher VARCHAR(50)
);

Now to fetch student result with teacher:

SELECT name, class, subject, marks, teacher
FROM student_report
WHERE roll_no = 1;
name           | class | subject | marks | teacher
----------------+--------+---------+--------+-----------
Aarav Sharma   | 10A   | Maths   | 85    | Mr. Nair
Aarav Sharma   | 10A   | Science | 88    | Ms. Banerjee

Risks of Denormalization

  • Data redundancy – same teacher name stored multiple times
  • Update anomalies – if a teacher is reassigned, you must update many rows
  • More storage – although storage is cheap, it's still a factor

When to Denormalize?

  • You have read-heavy workloads (e.g., dashboards, analytics)
  • Joins are causing slow queries
  • Updates are infrequent and acceptable to handle manually or via scripts
  • You want to simplify query logic for reporting teams

Hybrid Approach: Materialized Views

Some systems use materialized views (precomputed queries stored as tables) to balance normalization with performance — often refreshed daily or hourly.

Best Practices

  • Keep normalized base tables for core operations
  • Use denormalized views or tables only for fast access/reporting
  • Document where data is duplicated to avoid confusion
  • Schedule updates to sync denormalized data regularly

Real-World School Examples

  • Denormalized student_report table for parent reports
  • Exam performance table with pre-joined subject, student, and teacher info
  • Attendance summary tables with student name, class, and monthly percentage

Summary

Denormalization is not the opposite of good design — it’s a practical extension of it. When your database grows and performance becomes critical, denormalizing wisely can simplify your queries and boost speed. Just remember: with great performance comes great responsibility — keep your redundant data in check.

What’s Next?

Coming up next: ER Diagrams – how to visually design and document your database schema for teams and stakeholders.

QUIZ

Question 1:What is the main motivation behind using denormalization in a relational database?

Question 2:Denormalization is a reverse process of normalization.

Question 3:Which of the following are common denormalization techniques?

Question 4:Consider a school database where student names are repeated in both students and attendance tables. What does this imply?

Question 5:Denormalization always improves database performance.

Question 6:In which of the following scenarios might denormalization be considered beneficial?



Welcome to ProgramGuru

Sign up to start your journey with us

Support ProgramGuru.org

You can support this website with a contribution of your choice.

When making a contribution, mention your name, and programguru.org in the message. Your name shall be displayed in the sponsors list.

PayPal

UPI

PhonePe QR

MALLIKARJUNA M