Aggregation Pipeline Stages Explained

In MongoDB, the aggregation pipeline is a framework for filtering, transforming, and summarizing documents on the server. Think of it as an assembly line: each stage transforms the documents it receives and passes its results to the next stage.

The stages are executed in order, and the final result is returned after all transformations.

What is an Aggregation Pipeline?

An aggregation pipeline is an array of stages passed to aggregate(). Each stage is a document with a single stage operator, such as $match or $group, and it receives the output of the previous stage as its input.

Basic Syntax:


    db.collection.aggregate([
      { stage1 },
      { stage2 },
      ...
    ]);
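
The flow from one stage to the next can be pictured with a small plain-JavaScript sketch, in which each stage is modeled as a function that takes an array of documents and returns a new array. The names runPipeline, keepEven, and addSquare are made up for this illustration, and the sketch says nothing about how MongoDB executes a pipeline internally.

```javascript
// Hypothetical in-memory model: a "stage" is a function from docs to docs.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

// Two toy stages, for illustration only.
const keepEven = (docs) => docs.filter((d) => d.n % 2 === 0);
const addSquare = (docs) => docs.map((d) => ({ ...d, square: d.n * d.n }));

const input = [{ n: 1 }, { n: 2 }, { n: 3 }, { n: 4 }];

// Stages run in array order: filter first, then add the computed field.
const output = runPipeline(input, [keepEven, addSquare]);
console.log(output);
// [ { n: 2, square: 4 }, { n: 4, square: 16 } ]
```

Swapping the order of the two stages would compute squares for all four documents before filtering, which is why stage order matters.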

Key Aggregation Stages

$match – Filters documents (like WHERE in SQL)
$group – Groups documents and performs aggregations (like GROUP BY)
$project – Reshapes the document, adds/removes fields
$sort – Sorts the documents
$limit and $skip – Cap the number of documents and skip past an offset (used together for pagination)

Example Collection

Let’s use a sales collection:


    db.sales.insertMany([
      { item: "laptop", price: 800, quantity: 5, region: "North" },
      { item: "phone", price: 500, quantity: 10, region: "North" },
      { item: "tablet", price: 300, quantity: 8, region: "South" },
      { item: "laptop", price: 800, quantity: 3, region: "South" },
      { item: "phone", price: 500, quantity: 7, region: "East" }
    ]);

    { acknowledged: true, insertedIds: [...] }

$match – Filtering Documents

Use $match to filter documents based on a condition.


    db.sales.aggregate([
      { $match: { region: "North" } }
    ]);

    { item: "laptop", price: 800, quantity: 5, region: "North" }
    { item: "phone", price: 500, quantity: 10, region: "North" }

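Explanation: This stage selects only documents from the "North" region. The _id field that MongoDB adds to every inserted document is left out of the output shown here for brevity.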
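
$match uses the standard query syntax, so comparison operators such as $gte can be combined with equality tests. The sketch below writes such a pipeline as a plain array (which could be passed to db.sales.aggregate) and, for intuition only, applies the equivalent filter to an in-memory copy of the sample data. The variable names and the threshold of 6 are illustrative assumptions.

```javascript
// In-memory copy of the sample sales documents.
const sales = [
  { item: "laptop", price: 800, quantity: 5, region: "North" },
  { item: "phone", price: 500, quantity: 10, region: "North" },
  { item: "tablet", price: 300, quantity: 8, region: "South" },
  { item: "laptop", price: 800, quantity: 3, region: "South" },
  { item: "phone", price: 500, quantity: 7, region: "East" },
];

// Pipeline that could be run in the shell as db.sales.aggregate(pipeline).
const pipeline = [
  { $match: { region: "North", quantity: { $gte: 6 } } },
];

// Plain-JavaScript equivalent of the same condition.
const matched = sales.filter((d) => d.region === "North" && d.quantity >= 6);
console.log(matched);
// [ { item: 'phone', price: 500, quantity: 10, region: 'North' } ]
```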

$group – Aggregating Data

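Use $group to aggregate documents. The _id field is required and defines the group key: documents with the same _id value end up in the same group. To aggregate across all documents as a single group, set _id to null.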


    db.sales.aggregate([
      { $group: { _id: "$item", totalQty: { $sum: "$quantity" } } }
    ]);

    { _id: "phone", totalQty: 17 }
    { _id: "tablet", totalQty: 8 }
    { _id: "laptop", totalQty: 8 }

Explanation: This groups all sales by item and sums the quantity.
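
What this $group stage computes can be reproduced in plain JavaScript, which may help when checking results by hand. The sketch below is an in-memory emulation over a copy of the sample data, not MongoDB's implementation; the variable names are illustrative.

```javascript
// In-memory copy of the sample sales documents.
const sales = [
  { item: "laptop", price: 800, quantity: 5, region: "North" },
  { item: "phone", price: 500, quantity: 10, region: "North" },
  { item: "tablet", price: 300, quantity: 8, region: "South" },
  { item: "laptop", price: 800, quantity: 3, region: "South" },
  { item: "phone", price: 500, quantity: 7, region: "East" },
];

// Emulates { $group: { _id: "$item", totalQty: { $sum: "$quantity" } } }.
const totals = new Map();
for (const doc of sales) {
  totals.set(doc.item, (totals.get(doc.item) || 0) + doc.quantity);
}

// Convert the map into documents shaped like the $group output.
const grouped = [...totals].map(([_id, totalQty]) => ({ _id, totalQty }));
console.log(grouped);
```

MongoDB does not guarantee the order of documents coming out of $group, so add a $sort stage afterwards if the order matters.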

Q&A to Build Intuition

Q: Can I group by multiple fields in MongoDB?

A: Yes. You can group by a compound key using a sub-document:


    db.sales.aggregate([
      {
        $group: {
          _id: { item: "$item", region: "$region" },
          totalQty: { $sum: "$quantity" }
        }
      }
    ]);

    { _id: { item: "phone", region: "East" }, totalQty: 7 }
    { _id: { item: "tablet", region: "South" }, totalQty: 8 }
    { _id: { item: "laptop", region: "South" }, totalQty: 3 }
    ...

$project – Shaping the Output

Use $project to select, rename, or compute fields in the output.


    db.sales.aggregate([
      {
        $project: {
          item: 1,
          revenue: { $multiply: ["$price", "$quantity"] },
          _id: 0
        }
      }
    ]);

    { item: "laptop", revenue: 4000 }
    { item: "phone", revenue: 5000 }
    { item: "tablet", revenue: 2400 }
    ...

Explanation: This calculates a new field revenue and excludes _id from the output.
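
For intuition, the same projection corresponds to a map() over the documents in plain JavaScript. This is an in-memory sketch over a copy of the sample data; the variable names are illustrative.

```javascript
// In-memory copy of the sample sales documents.
const sales = [
  { item: "laptop", price: 800, quantity: 5, region: "North" },
  { item: "phone", price: 500, quantity: 10, region: "North" },
  { item: "tablet", price: 300, quantity: 8, region: "South" },
  { item: "laptop", price: 800, quantity: 3, region: "South" },
  { item: "phone", price: 500, quantity: 7, region: "East" },
];

// Emulates the $project stage: keep item, compute revenue, drop the rest.
const projected = sales.map(({ item, price, quantity }) => ({
  item,
  revenue: price * quantity,
}));
console.log(projected);
```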

$sort – Sorting Results

Use $sort to order results by a field.


    db.sales.aggregate([
      { $sort: { quantity: -1 } }
    ]);

    { item: "phone", quantity: 10, ... }
    { item: "tablet", quantity: 8, ... }
    ...

Explanation: This sorts the documents by quantity in descending order (-1). Use 1 for ascending order.
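
Pagination with $skip and $limit normally follows a $sort, so that each page is cut from a stable ordering. The sketch below writes such a pipeline as a plain array and emulates it in memory with sort() and slice(). The page size, page number, and variable names are hypothetical values chosen for the example.

```javascript
// In-memory copy of the sample sales documents.
const sales = [
  { item: "laptop", price: 800, quantity: 5, region: "North" },
  { item: "phone", price: 500, quantity: 10, region: "North" },
  { item: "tablet", price: 300, quantity: 8, region: "South" },
  { item: "laptop", price: 800, quantity: 3, region: "South" },
  { item: "phone", price: 500, quantity: 7, region: "East" },
];

const pageSize = 2;   // hypothetical page size
const pageNumber = 2; // hypothetical 1-based page number

// Pipeline that could be run in the shell as db.sales.aggregate(pipeline).
const pipeline = [
  { $sort: { quantity: -1 } },
  { $skip: (pageNumber - 1) * pageSize },
  { $limit: pageSize },
];

// In-memory emulation: sort a copy, then slice out the requested page.
const page = [...sales]
  .sort((a, b) => b.quantity - a.quantity)
  .slice((pageNumber - 1) * pageSize, pageNumber * pageSize);
console.log(page.map((d) => `${d.item}:${d.quantity}`));
// [ 'phone:7', 'laptop:5' ]
```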

Combining Stages

You can combine multiple stages in a pipeline to process documents step-by-step:


    db.sales.aggregate([
      { $match: { region: "North" } },
      { $project: { item: 1, total: { $multiply: ["$price", "$quantity"] }, _id: 0 } },
      { $sort: { total: -1 } }
    ]);

    { item: "phone", total: 5000 }
    { item: "laptop", total: 4000 }

Explanation: This pipeline filters by region, calculates total revenue, and sorts the results in descending order.
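
Stages can also feed a $group into a $sort. As a rough sketch, the pipeline below totals revenue per region by nesting $multiply inside $sum and then orders the regions by revenue; the loop underneath emulates the same computation in memory so the numbers can be checked by hand. The variable names are illustrative.

```javascript
// In-memory copy of the sample sales documents.
const sales = [
  { item: "laptop", price: 800, quantity: 5, region: "North" },
  { item: "phone", price: 500, quantity: 10, region: "North" },
  { item: "tablet", price: 300, quantity: 8, region: "South" },
  { item: "laptop", price: 800, quantity: 3, region: "South" },
  { item: "phone", price: 500, quantity: 7, region: "East" },
];

// Pipeline that could be run in the shell as db.sales.aggregate(pipeline).
const pipeline = [
  { $group: { _id: "$region", revenue: { $sum: { $multiply: ["$price", "$quantity"] } } } },
  { $sort: { revenue: -1 } },
];

// In-memory emulation: accumulate price * quantity per region, then sort.
const byRegion = {};
for (const { region, price, quantity } of sales) {
  byRegion[region] = (byRegion[region] || 0) + price * quantity;
}
const result = Object.entries(byRegion)
  .map(([_id, revenue]) => ({ _id, revenue }))
  .sort((a, b) => b.revenue - a.revenue);
console.log(result);
// [ { _id: 'North', revenue: 9000 }, { _id: 'South', revenue: 4800 }, { _id: 'East', revenue: 3500 } ]
```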

Summary

$match filters documents
$group aggregates documents
$project reshapes documents
$sort orders documents
$limit and $skip cap and offset the results for pagination

Aggregation pipelines let you run analytics directly within MongoDB, without first pulling the documents into application code.

Up next: we’ll explore real-life aggregation use cases and practice building pipelines with complex conditions and transformations.
