Introduction to Aggregation
Aggregation in MongoDB processes a large number of documents and transforms them into summarized results. It is comparable to GROUP BY queries in SQL, but more flexible, because the same pipeline can also filter, reshape, and sort documents along the way.
The Aggregation Framework uses a concept called the aggregation pipeline, which consists of multiple stages. Each stage performs an operation on the input documents and passes the result to the next stage.
What is an Aggregation Pipeline?
An aggregation pipeline is an array of stages. Each stage transforms the data as it passes through.
Common stages include:
- $match – Filters documents (like find())
- $group – Groups documents and computes aggregates such as sum and avg
- $project – Reshapes documents (e.g., show or hide fields)
- $sort – Sorts the documents
- $limit – Limits the number of documents
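One way to picture a pipeline is as a list of functions, each taking an array of documents and returning a new array. The sketch below models that idea in plain JavaScript. It is only a mental model, not how MongoDB is implemented, and the helper names (matchStage, limitStage, runPipeline) and the sample documents are made up for illustration.

```javascript
// Conceptual model only: each "stage" is a function from documents to documents.
// matchStage, limitStage, and runPipeline are hypothetical helpers, not MongoDB APIs.
const matchStage = (predicate) => (docs) => docs.filter(predicate);
const limitStage = (n) => (docs) => docs.slice(0, n);

// Feed the output of each stage into the next one, in order.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

// Hypothetical sample documents.
const docs = [
  { name: "a", score: 10 },
  { name: "b", score: 45 },
  { name: "c", score: 80 },
  { name: "d", score: 95 },
];

const result = runPipeline(docs, [
  matchStage((doc) => doc.score > 40), // keeps b, c, d
  limitStage(2),                       // keeps the first two of those
]);

console.log(result.map((doc) => doc.name)); // [ 'b', 'c' ]
```

Because each stage only sees the output of the previous one, the order of stages matters: swapping the two stages above would limit first and filter second, leaving only "b".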
Why use Aggregation?
Aggregation is used to:
- Compute totals, averages, and counts
- Group documents by a field
- Transform documents into new shapes
Example Dataset
Let’s say we have a sales collection with the following documents:
db.sales.insertMany([
  { item: "Laptop", price: 70000, quantity: 2, region: "North" },
  { item: "Monitor", price: 12000, quantity: 5, region: "South" },
  { item: "Laptop", price: 70000, quantity: 1, region: "South" },
  { item: "Mouse", price: 500, quantity: 10, region: "North" },
  { item: "Keyboard", price: 1500, quantity: 3, region: "West" }
])
Example 1: Total Sales Amount per Item
We want to group documents by item and calculate the total sales amount (price × quantity).
db.sales.aggregate([
  {
    $project: {
      item: 1,
      totalSale: { $multiply: ["$price", "$quantity"] }
    }
  },
  {
    $group: {
      _id: "$item",
      totalRevenue: { $sum: "$totalSale" }
    }
  }
])
Output (document order may vary):
{ _id: "Laptop", totalRevenue: 210000 }
{ _id: "Monitor", totalRevenue: 60000 }
{ _id: "Mouse", totalRevenue: 5000 }
{ _id: "Keyboard", totalRevenue: 4500 }
Explanation:
- The $project stage creates a new field, totalSale, by multiplying price and quantity.
- The $group stage then sums these totalSale values for each item.
Intuition Check
Q: Why didn’t we calculate price * quantity directly inside $group?
A: We could have. $group accumulators accept expressions, so { $sum: { $multiply: ["$price", "$quantity"] } } computes the same totals in a single stage. The separate $project stage is used here only to make each step easier to follow.
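For comparison, here is a minimal sketch of the single-stage form, in which the multiplication sits inside the $sum accumulator. It assumes the sales collection from the example dataset; the variable name revenuePerItem is just an illustrative choice.

```javascript
// Same totals as Example 1, computed in a single $group stage.
const revenuePerItem = [
  {
    $group: {
      _id: "$item",
      // The accumulator evaluates price * quantity for each document, then sums.
      totalRevenue: { $sum: { $multiply: ["$price", "$quantity"] } }
    }
  }
];

// `db` exists only inside mongosh; the guard lets this snippet load elsewhere too.
if (typeof db !== "undefined") {
  db.sales.aggregate(revenuePerItem).forEach((doc) => printjson(doc));
}
```

Run against the sample documents, this should yield the same per-item totals as the two-stage pipeline in Example 1.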
Example 2: Total Quantity Sold by Region
db.sales.aggregate([
  {
    $group: {
      _id: "$region",
      totalUnits: { $sum: "$quantity" }
    }
  }
])
Output (document order may vary):
{ _id: "North", totalUnits: 12 }
{ _id: "South", totalUnits: 6 }
{ _id: "West", totalUnits: 3 }
Explanation: We directly group by the region field and sum the quantity field to get total units sold per region.
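The same $group stage can carry several accumulators at once. As a rough sketch, assuming the same sales collection, the pipeline below counts the documents in each region and averages their quantity. The field names orderCount and avgQuantity are illustrative choices, not part of the original example.

```javascript
// One $group stage with two accumulators per region.
const regionSummary = [
  {
    $group: {
      _id: "$region",
      orderCount: { $sum: 1 },            // adds 1 per document, i.e. a count
      avgQuantity: { $avg: "$quantity" }  // mean of the quantity field
    }
  }
];

// `db` exists only inside mongosh; the guard lets this snippet load elsewhere too.
if (typeof db !== "undefined") {
  db.sales.aggregate(regionSummary).forEach((doc) => printjson(doc));
}
```

With the five sample documents, North has two sales with quantities 2 and 10, so it should come out with orderCount 2 and avgQuantity 6.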
Example 3: Show Only Item and Region
Let’s say we want to only view item and region, hiding everything else:
db.sales.aggregate([
  {
    $project: {
      _id: 0,
      item: 1,
      region: 1
    }
  }
])
Output:
{ item: "Laptop", region: "North" }
{ item: "Monitor", region: "South" }
{ item: "Laptop", region: "South" }
{ item: "Mouse", region: "North" }
{ item: "Keyboard", region: "West" }
Explanation: The $project stage is used to control which fields are shown. Setting _id: 0 hides the default _id field.
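$project can also work by exclusion: fields set to 0 are dropped and everything else is kept. The sketch below assumes the same sales collection, and the variable name hidePriceAndQuantity is illustrative. Note that a single $project cannot mix 1 and 0 for ordinary fields; _id is the one exception.

```javascript
// Exclusion-style projection: drop the listed fields, keep the rest.
const hidePriceAndQuantity = [
  { $project: { _id: 0, price: 0, quantity: 0 } }
];

// `db` exists only inside mongosh; the guard lets this snippet load elsewhere too.
if (typeof db !== "undefined") {
  db.sales.aggregate(hidePriceAndQuantity).forEach((doc) => printjson(doc));
}
```

Since the sample documents contain only item, price, quantity, and region besides _id, this should leave just item and region, the same shape as Example 3. The exclusion form is handy when documents have many fields and only a few need hiding.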
Key Takeaways
- Aggregation in MongoDB is done through pipelines made of stages.
- $project is used to compute fields or reshape documents.
- $group is used to aggregate data, e.g. sum, avg, and count.
Next Step
In the next lesson, we'll dive deeper into each aggregation stage and build more advanced pipelines using $match, $sort, and $limit.
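As a small preview, here is a sketch of how those stages might be chained, again assuming the sales collection from this lesson. The variable name topSouthItem is illustrative.

```javascript
// Best-selling item in the South region, by units sold.
const topSouthItem = [
  { $match: { region: "South" } },                                 // keep South sales only
  { $group: { _id: "$item", totalUnits: { $sum: "$quantity" } } }, // units per item
  { $sort: { totalUnits: -1 } },                                   // highest first
  { $limit: 1 }                                                    // keep the top result
];

// `db` exists only inside mongosh; the guard lets this snippet load elsewhere too.
if (typeof db !== "undefined") {
  db.sales.aggregate(topSouthItem).forEach((doc) => printjson(doc));
}
```

With the sample documents, the South region has 5 Monitors and 1 Laptop, so the single result should be the Monitor with totalUnits 5.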