MongoDB provides two powerful tools for processing and transforming data: MapReduce and the Aggregation Framework. Both are used to perform operations like grouping, filtering, and summarizing data. However, they differ in syntax, performance, and use cases.
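MapReduce is a data processing model borrowed from distributed systems like Hadoop. It consists of two main steps: a map step, which emits a key-value pair for each document, and a reduce step, which combines all the values emitted for the same key into a single result.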
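The two steps can be sketched in plain JavaScript, outside MongoDB. This is only an in-memory illustration of the model, not how the server implements it. The helper `runMapReduce`, the `docs` array, and its field names are hypothetical and chosen just for this example. Unlike MongoDB, where `emit` is available globally inside the map function, this sketch passes `emit` in as an argument.

```javascript
// In-memory sketch of the map and reduce steps (illustrative only;
// runMapReduce is a hypothetical helper, not a MongoDB API).
function runMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();

  // Collect every emitted value under its key.
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };

  // Map step: each document emits zero or more (key, value) pairs.
  for (const doc of docs) {
    mapFn.call(doc, emit);
  }

  // Reduce step: combine all values emitted for the same key.
  const results = [];
  for (const [key, values] of groups) {
    results.push({ _id: key, value: reduceFn(key, values) });
  }
  return results;
}

// Hypothetical sample documents.
const docs = [
  { category: "books", price: 10 },
  { category: "games", price: 30 },
  { category: "books", price: 5 }
];

const totals = runMapReduce(
  docs,
  function (emit) { emit(this.category, this.price); },
  (key, values) => values.reduce((a, b) => a + b, 0)
);

console.log(totals);
```

The map function is written as a regular `function` rather than an arrow function so that `this` refers to the current document, mirroring how MongoDB map functions are written.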
The Aggregation Framework uses a pipeline-based approach, where data passes through multiple stages such as $match, $group, $sort, and $project. In most scenarios it is faster and easier to use than MapReduce, because its stages run as native database operations rather than as interpreted JavaScript.
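To make the pipeline idea concrete, here is a minimal in-memory sketch in plain JavaScript. Each stage is a function that takes an array of documents and returns a new array, and the output of one stage feeds the next. The helpers `match`, `groupSum`, `sortBy`, and `runPipeline`, along with the `sales` data, are hypothetical stand-ins for this illustration and are not MongoDB APIs.

```javascript
// In-memory sketch of a pipeline. These helpers only imitate the idea
// behind $match, $group (with $sum), and $sort; they are hypothetical.
const match = (predicate) => (docs) => docs.filter(predicate);

const groupSum = (keyField, valueField) => (docs) => {
  const sums = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    sums.set(key, (sums.get(key) || 0) + doc[valueField]);
  }
  return [...sums].map(([key, total]) => ({ _id: key, total }));
};

// direction is 1 for ascending and -1 for descending.
const sortBy = (field, direction) => (docs) =>
  [...docs].sort((a, b) => direction * (a[field] - b[field]));

// Run the stages in order, feeding each stage's output into the next.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

// Hypothetical sample documents.
const sales = [
  { region: "north", units: 4 },
  { region: "south", units: 9 },
  { region: "north", units: 1 },
  { region: "south", units: 7 }
];

const report = runPipeline(sales, [
  match((doc) => doc.units > 1), // similar in spirit to $match
  groupSum("region", "units"),   // similar in spirit to $group with $sum
  sortBy("total", -1)            // similar in spirit to $sort, descending
]);

console.log(report);
```

Filtering first means the later stages handle fewer documents, which is the usual reason for placing a filter stage early in a pipeline.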
Let’s use a sample collection, orders:
db.orders.insertMany([
  { customer: "Alice", amount: 250 },
  { customer: "Bob", amount: 400 },
  { customer: "Alice", amount: 150 },
  { customer: "Bob", amount: 100 },
  { customer: "Charlie", amount: 300 }
])
We want to calculate the total amount spent by each customer.
db.orders.aggregate([
  {
    $group: {
      _id: "$customer",
      totalSpent: { $sum: "$amount" }
    }
  }
])
Output:
[ { "_id": "Alice", "totalSpent": 400 }, { "_id": "Bob", "totalSpent": 500 }, { "_id": "Charlie", "totalSpent": 300 } ]
Explanation: The $group stage groups documents by customer name and calculates the sum of their amount fields.
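// Map step: emit one (customer, amount) pair per order document.
var mapFunction = function() {
  emit(this.customer, this.amount);
};
// Reduce step: add up all amounts emitted for the same customer.
var reduceFunction = function(key, values) {
  return Array.sum(values);
};
// Run the job and write the results to the order_totals collection.
db.orders.mapReduce(
  mapFunction,
  reduceFunction,
  { out: "order_totals" }
)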
Output:
db.order_totals.find().pretty()
{ "_id": "Alice", "value": 400 }
{ "_id": "Bob", "value": 500 }
{ "_id": "Charlie", "value": 300 }
Explanation: The map function emits each customer and their order amount. The reduce function adds up all values for each customer. The result is stored in a new collection, order_totals.
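One detail worth keeping in mind when writing a reduce function: MongoDB may call it more than once for the same key, passing earlier results back in as values, so the function should return the same answer either way. The plain JavaScript check below illustrates this for a sum. The names `sumReduce` and `avgReduce`, the key `"k"`, and the numbers are hypothetical, and the sum uses `Array.prototype.reduce` because `Array.sum` is provided by MongoDB's JavaScript environment rather than by standard JavaScript.

```javascript
// A reduce function that sums its values, like reduceFunction above.
const sumReduce = (key, values) => values.reduce((a, b) => a + b, 0);

// Reducing all values in one call...
const allAtOnce = sumReduce("k", [250, 150, 100]);

// ...should match reducing a partial result with the remaining values.
const partial = sumReduce("k", [250, 150]);
const inTwoPasses = sumReduce("k", [partial, 100]);

console.log(allAtOnce === inTwoPasses);

// A naive average lacks this property, so it is not safe as a reduce
// function on its own: averaging averages changes the answer.
const avgReduce = (key, values) => sumReduce(key, values) / values.length;
const avgOnce = avgReduce("k", [250, 150, 100]);
const avgTwice = avgReduce("k", [avgReduce("k", [250, 150]), 100]);

console.log(avgOnce === avgTwice);
```

A common workaround for averages is to have reduce carry a running sum and count and to compute the final division afterwards.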
Q: Which one should I use if I just want to sum values or filter records?
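A: Use the Aggregation Framework. It’s faster, more concise, and optimized for simple tasks like grouping, filtering, and projecting data. Note also that MongoDB has deprecated map-reduce as of version 5.0 in favor of the aggregation pipeline.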
MapReduce can still be useful when:
Now that you understand both approaches, the next lesson will dive deeper into building real-world aggregation pipelines using multiple stages like $match, $group, $project, and more.