SQL GROUP BY Clause - Summarizing Data
SQL GROUP BY
Imagine your school wants to know how many students are in each class, or the average marks scored by subject. The GROUP BY
clause is how you group similar data points together and summarize them — turning rows into meaningful insights.
Syntax of GROUP BY
SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1;
GROUP BY column1
The GROUP BY
clause is used to group rows in a result set based on the values in one or more columns. In this syntax, column1
is the grouping column. All rows with the same value in column1
are grouped together.
Once the rows are grouped, aggregate functions like SUM()
, AVG()
, COUNT()
, MIN()
, or MAX()
can be applied to each group to produce summary results for each group.
Key Points:
- Every column in the
SELECT
list that is not inside an aggregate function must appear in theGROUP BY
clause. - The
GROUP BY
clause comes after theWHERE
clause and before theORDER BY
clause (if used). - You can group by one or more columns, depending on the level of detail needed.
Sample Table – student_marks
Let’s consider the following table:
CREATE TABLE student_marks (
roll_no INT,
name VARCHAR(50),
class VARCHAR(10),
subject VARCHAR(30),
marks INT
);
INSERT INTO student_marks VALUES
(1, 'Aarav Sharma', '10A', 'Maths', 85),
(2, 'Diya Iyer', '9B', 'Maths', 90),
(3, 'Rohit Menon', '10A', 'Maths', 78),
(4, 'Sneha Patil', '8C', 'Maths', 88),
(5, 'Mehul Agarwal', '9B', 'Maths', 92),
(1, 'Aarav Sharma', '10A', 'Science', 91),
(2, 'Diya Iyer', '9B', 'Science', 87),
(3, 'Rohit Menon', '10A', 'Science', 85),
(4, 'Sneha Patil', '8C', 'Science', 89),
(5, 'Mehul Agarwal', '9B', 'Science', 93);
roll_no | name | class | subject | marks |
---|---|---|---|---|
1 | Aarav Sharma | 10A | Maths | 85 |
2 | Diya Iyer | 9B | Maths | 90 |
3 | Rohit Menon | 10A | Maths | 78 |
4 | Sneha Patil | 8C | Maths | 88 |
5 | Mehul Agarwal | 9B | Maths | 92 |
1 | Aarav Sharma | 10A | Science | 91 |
2 | Diya Iyer | 9B | Science | 87 |
3 | Rohit Menon | 10A | Science | 85 |
4 | Sneha Patil | 8C | Science | 89 |
5 | Mehul Agarwal | 9B | Science | 93 |
1. Count Students per Class
Let’s count how many marks entries exist per class:
Use case: A school administrator wants to find out how many student mark records exist for each class to analyze data entry volume by class.
roll_no | name | class | subject | marks |
---|---|---|---|---|
1 | Aarav Sharma | 10A | Maths | 85 |
2 | Diya Iyer | 9B | Maths | 90 |
3 | Rohit Menon | 10A | Maths | 78 |
4 | Sneha Patil | 8C | Maths | 88 |
5 | Mehul Agarwal | 9B | Maths | 92 |
1 | Aarav Sharma | 10A | Science | 91 |
2 | Diya Iyer | 9B | Science | 87 |
3 | Rohit Menon | 10A | Science | 85 |
4 | Sneha Patil | 8C | Science | 89 |
5 | Mehul Agarwal | 9B | Science | 93 |
SQL query description: SQL query to select the class
column and count the number of entries for each class from the student_marks
table using the GROUP BY
clause.
SELECT class, COUNT(*) AS total_entries
FROM student_marks
GROUP BY class;
Result description: Result contains each class along with the total number of student mark entries recorded for that class.
class | total_entries |
---|---|
10A | 4 |
9B | 4 |
8C | 2 |
2. Average Marks by Subject
To calculate the average marks for each subject:
Use case: A school administrator wants to calculate the average marks obtained by students in each subject to analyze overall subject performance.
roll_no | name | class | subject | marks |
---|---|---|---|---|
1 | Aarav Sharma | 10A | Maths | 85 |
2 | Diya Iyer | 9B | Maths | 90 |
3 | Rohit Menon | 10A | Maths | 78 |
4 | Sneha Patil | 8C | Maths | 88 |
5 | Mehul Agarwal | 9B | Maths | 92 |
1 | Aarav Sharma | 10A | Science | 91 |
2 | Diya Iyer | 9B | Science | 87 |
3 | Rohit Menon | 10A | Science | 85 |
4 | Sneha Patil | 8C | Science | 89 |
5 | Mehul Agarwal | 9B | Science | 93 |
SQL query description: SQL query to select the subject
and the average of marks
(renamed as avg_marks
) from the student_marks
table, grouping the results by subject
.
SELECT subject, AVG(marks) AS avg_marks
FROM student_marks
GROUP BY subject;
Result description: Result contains each subject along with the corresponding average marks scored by all students in that subject.
subject | avg_marks |
---|---|
Maths | 86.6 |
Science | 89.0 |
3. Highest Marks per Class
Let’s find the top marks in each class (across all subjects):
Use case: A school administrator wants to find the highest marks scored in each class to identify top performers across all classes.
roll_no | name | class | subject | marks |
---|---|---|---|---|
1 | Aarav Sharma | 10A | Maths | 85 |
2 | Diya Iyer | 9B | Maths | 90 |
3 | Rohit Menon | 10A | Maths | 78 |
4 | Sneha Patil | 8C | Maths | 88 |
5 | Mehul Agarwal | 9B | Maths | 92 |
1 | Aarav Sharma | 10A | Science | 91 |
2 | Diya Iyer | 9B | Science | 87 |
3 | Rohit Menon | 10A | Science | 85 |
4 | Sneha Patil | 8C | Science | 89 |
5 | Mehul Agarwal | 9B | Science | 93 |
SQL query description: SQL query to select the class
and the maximum value of marks
(aliased as top_score
) from the student_marks
table, grouped by class
.
SELECT class, MAX(marks) AS top_score
FROM student_marks
GROUP BY class;
Result description: Result contains one row per class, showing the class name and the highest marks obtained in that class.
class | top_score |
---|---|
10A | 91 |
9B | 93 |
8C | 89 |
4. Grouping by Multiple Columns
Now, let’s see average marks per subject per class:
Use case: A school administrator wants to calculate the average marks for each subject across all classes to analyze academic performance trends.
roll_no | name | class | subject | marks |
---|---|---|---|---|
1 | Aarav Sharma | 10A | Maths | 85 |
2 | Diya Iyer | 9B | Maths | 90 |
3 | Rohit Menon | 10A | Maths | 78 |
4 | Sneha Patil | 8C | Maths | 88 |
5 | Mehul Agarwal | 9B | Maths | 92 |
1 | Aarav Sharma | 10A | Science | 91 |
2 | Diya Iyer | 9B | Science | 87 |
3 | Rohit Menon | 10A | Science | 85 |
4 | Sneha Patil | 8C | Science | 89 |
5 | Mehul Agarwal | 9B | Science | 93 |
SQL query description: SQL query to select the class
, subject
, and the average of marks
(aliased as avg_marks
) from the student_marks
table, grouped by class
and subject
.
SELECT class, subject, AVG(marks) AS avg_marks
FROM student_marks
GROUP BY class, subject;
Result description: Result contains the average marks for each subject in every class, showing one row per class-subject combination.
class | subject | avg_marks |
---|---|---|
10A | Maths | 81.5 |
10A | Science | 88.0 |
9B | Maths | 91.0 |
9B | Science | 90.0 |
8C | Maths | 88.0 |
8C | Science | 89.0 |
5. Using GROUP BY with WHERE
What if we only want to group Maths marks?
Use case: A school administrator wants to calculate the average marks in Maths for each class to analyze overall class performance in the subject.
roll_no | name | class | subject | marks |
---|---|---|---|---|
1 | Aarav Sharma | 10A | Maths | 85 |
2 | Diya Iyer | 9B | Maths | 90 |
3 | Rohit Menon | 10A | Maths | 78 |
4 | Sneha Patil | 8C | Maths | 88 |
5 | Mehul Agarwal | 9B | Maths | 92 |
1 | Aarav Sharma | 10A | Science | 91 |
2 | Diya Iyer | 9B | Science | 87 |
3 | Rohit Menon | 10A | Science | 85 |
4 | Sneha Patil | 8C | Science | 89 |
5 | Mehul Agarwal | 9B | Science | 93 |
SQL query description: SQL query to select the class
and the average of marks
(renamed as avg_maths
) from the student_marks
table where the subject
is 'Maths', grouping the results by class
.
SELECT class, AVG(marks) AS avg_maths
FROM student_marks
WHERE subject = 'Maths'
GROUP BY class;
Result description: Result contains the average Maths marks for each class, with one row per class.
class | avg_maths |
---|---|
10A | 81.5 |
8C | 88.0 |
9B | 91.0 |
6. HAVING vs WHERE
WHERE
filters before grouping. HAVING
filters after. For example:
Use case: A school administrator wants to identify subjects where the average student marks are higher than 88, in order to recognize high-performing subjects or teachers.
roll_no | name | class | subject | marks |
---|---|---|---|---|
1 | Aarav Sharma | 10A | Maths | 85 |
2 | Diya Iyer | 9B | Maths | 90 |
3 | Rohit Menon | 10A | Maths | 78 |
4 | Sneha Patil | 8C | Maths | 88 |
5 | Mehul Agarwal | 9B | Maths | 92 |
1 | Aarav Sharma | 10A | Science | 91 |
2 | Diya Iyer | 9B | Science | 87 |
3 | Rohit Menon | 10A | Science | 85 |
4 | Sneha Patil | 8C | Science | 89 |
5 | Mehul Agarwal | 9B | Science | 93 |
SQL query description: SQL query to select the subject
and the average of marks
(aliased as avg_marks
) from the student_marks
table. It groups the records by subject
and includes only those groups where the average marks are greater than 88 using the HAVING
clause.
SELECT subject, AVG(marks) AS avg_marks
FROM student_marks
GROUP BY subject
HAVING AVG(marks) > 88;
Result description: Result contains the names of subjects and their average marks, limited to those subjects where the average is greater than 88.
subject | avg_marks |
---|---|
Science | 89.0 |
Best Practices
- Always include grouped columns in SELECT if not using an aggregate function.
- Use HAVING when filtering aggregated results.
- Group logically: by class, subject, or both — depending on your analysis goal.
Summary
GROUP BY
is your tool for summaries. It transforms lists into reports, raw data into insights. Whether you're calculating subject-wise averages or class-level toppers, this clause does the heavy lifting with just a few lines.
QUIZ
Question 1:What is the main purpose of the GROUP BY clause in SQL?
Question 2:The GROUP BY clause can be used without an aggregate function like COUNT or AVG.
Question 3:Which of the following SQL queries use GROUP BY correctly?
Question 4:What is the result of this query?
SELECT class, COUNT(*) FROM students GROUP BY class;
SELECT class, COUNT(*) FROM students GROUP BY class;
Comments
Loading comments...