SQL GROUP BY


SQL GROUP BY Statement

The SQL GROUP BY statement is used to arrange identical data into groups. This statement is commonly used with aggregate functions such as COUNT(), MAX(), MIN(), SUM(), and AVG() to perform operations on each group of data.


Syntax

SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1;

The GROUP BY statement has the following components:

  • column1: The column to group by.
  • aggregate_function(column2): The aggregate function to apply to the grouped data.
  • table_name: The name of the table from which to retrieve the data.
  • condition: The condition for selecting rows (optional).

Example SQL GROUP BY Statement Queries

Let's look at some examples of SQL GROUP BY statement queries:

1. Basic GROUP BY Example

SELECT department, COUNT(*) AS num_employees
FROM employees
GROUP BY department;

This query groups the data in the employees table by the department column and counts the number of employees in each department. The result will be a list of departments with the number of employees in each.

2. GROUP BY with HAVING Clause

SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 50000;

This query groups the data in the employees table by the department column and calculates the average salary for each department. The HAVING clause filters the results to include only departments with an average salary greater than 50000. The result will be a list of departments with their average salaries, filtered by the condition.

3. GROUP BY Multiple Columns

SELECT department, job_title, COUNT(*) AS num_employees
FROM employees
GROUP BY department, job_title;

This query groups the data in the employees table by both the department and job_title columns and counts the number of employees in each group. The result will be a list of departments and job titles with the number of employees in each group.


Full Example

Let's go through a complete example that includes creating a table, inserting data, and querying the table with the GROUP BY statement.

Step 1: Creating a Table

This step involves creating a new table named employees to store employee data.

CREATE TABLE employees (
    id INT PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    department VARCHAR(50),
    job_title VARCHAR(50),
    salary DECIMAL(10, 2)
);

In this example, we create a table named employees with columns for id, first_name, last_name, department, job_title, and salary.

Step 2: Inserting Data into the Table

This step involves inserting some sample data into the employees table.

INSERT INTO employees (first_name, last_name, department, job_title, salary)
VALUES ('John', 'Doe', 'Engineering', 'Developer', 60000.00);

INSERT INTO employees (first_name, last_name, department, job_title, salary)
VALUES ('Jane', 'Smith', 'Engineering', 'Developer', 65000.00);

INSERT INTO employees (first_name, last_name, department, job_title, salary)
VALUES ('Jim', 'Brown', 'HR', 'Manager', 50000.00);

INSERT INTO employees (first_name, last_name, department, job_title, salary)
VALUES ('Jake', 'White', 'HR', 'Recruiter', 40000.00);

Here, we insert data into the employees table.

Step 3: Querying the Table with GROUP BY

This step involves selecting the data from the employees table and grouping the result set using the GROUP BY statement.

SELECT department, COUNT(*) AS num_employees
FROM employees
GROUP BY department;

This query retrieves all the rows from the employees table, groups them by department, and counts the number of employees in each department. The result will be:

department   num_employees
-----------  --------------
Engineering  2
HR           2

Conclusion

The SQL GROUP BY statement is a powerful tool for arranging identical data into groups and performing aggregate operations on each group. Understanding how to use the GROUP BY statement and its syntax is essential for effective data analysis and reporting in SQL databases.