Using $match, $group, $project, $sort in MongoDB Aggregation
The MongoDB Aggregation Framework allows you to process and transform large volumes of documents in stages. Think of it as a data pipeline where each stage performs a specific task such as filtering, grouping, transforming, or sorting the data.
The most commonly used stages are:
$match
– Filter documents (like WHERE in SQL)$group
– Group and aggregate values (like GROUP BY)$project
– Include, exclude, or transform fields$sort
– Sort documents in ascending or descending order
Example Dataset: Sales Orders
Let’s assume you have a collection called orders
with documents like this:
db.orders.insertMany([
{ item: "pen", category: "stationery", price: 10, quantity: 3, store: "A" },
{ item: "notebook", category: "stationery", price: 50, quantity: 2, store: "A" },
{ item: "eraser", category: "stationery", price: 5, quantity: 5, store: "B" },
{ item: "apple", category: "grocery", price: 20, quantity: 10, store: "B" },
{ item: "banana", category: "grocery", price: 15, quantity: 8, store: "A" }
]);
$match – Filter Documents
$match
is used to filter documents based on conditions. It’s similar to WHERE in SQL.
db.orders.aggregate([
{ $match: { category: "stationery" } }
]);
Output:
[ { item: "pen", category: "stationery", price: 10, quantity: 3, store: "A" }, { item: "notebook", category: "stationery", price: 50, quantity: 2, store: "A" }, { item: "eraser", category: "stationery", price: 5, quantity: 5, store: "B" } ]
Explanation: This filters only the documents where category
is "stationery"
.
Question:
Can you combine multiple conditions in $match
like price > 10 and category = "stationery"?
Answer: Yes, by using operators like $gt
, $and
, etc.
db.orders.aggregate([
{ $match: { category: "stationery", price: { $gt: 10 } } }
]);
$group – Aggregate Data
$group
helps us to group documents and perform operations like sum
, avg
, count
, etc.
Example: Calculate total quantity sold per category
db.orders.aggregate([
{ $group: { _id: "$category", totalQuantity: { $sum: "$quantity" } } }
]);
Output:
[ { _id: "stationery", totalQuantity: 10 }, { _id: "grocery", totalQuantity: 18 } ]
Explanation: We grouped by category
and summed the quantity
field for each category.
Question:
What if you want to group by store
and get the total revenue (price × quantity) from each store?
Answer: You’ll need to first calculate revenue, and then group. Let’s use $project
to create a new field.
$project – Transform Fields
$project
is used to reshape documents — you can include/exclude fields or create new ones.
Example: Create a new field revenue = price * quantity
db.orders.aggregate([
{
$project: {
item: 1,
store: 1,
revenue: { $multiply: ["$price", "$quantity"] }
}
}
]);
Output:
[ { item: "pen", store: "A", revenue: 30 }, { item: "notebook", store: "A", revenue: 100 }, { item: "eraser", store: "B", revenue: 25 }, { item: "apple", store: "B", revenue: 200 }, { item: "banana", store: "A", revenue: 120 } ]
Explanation: $multiply
is used to calculate revenue by multiplying price
and quantity
.
Now combine $project and $group
Let’s find total revenue per store.
db.orders.aggregate([
{
$project: {
store: 1,
revenue: { $multiply: ["$price", "$quantity"] }
}
},
{
$group: {
_id: "$store",
totalRevenue: { $sum: "$revenue" }
}
}
]);
Output:
[ { _id: "A", totalRevenue: 250 }, { _id: "B", totalRevenue: 225 } ]
$sort – Sorting Results
Use $sort
to arrange documents in ascending (1
) or descending (-1
) order.
Example: Sort stores by total revenue (highest first)
db.orders.aggregate([
{
$project: {
store: 1,
revenue: { $multiply: ["$price", "$quantity"] }
}
},
{
$group: {
_id: "$store",
totalRevenue: { $sum: "$revenue" }
}
},
{ $sort: { totalRevenue: -1 } }
]);
Output:
[ { _id: "A", totalRevenue: 250 }, { _id: "B", totalRevenue: 225 } ]
Summary
$match
filters documents$project
transforms or adds new fields$group
aggregates data$sort
arranges data in a specific order
These four stages form the backbone of MongoDB's data processing capabilities. Practice chaining them to build powerful queries for real-world use cases.
Next: Dive deeper into aggregation pipelines and explore advanced operators like $unwind
and $lookup
.