MongoDB - Getting Started

Schema Design Patterns: Embedding vs Referencing



MongoDB allows flexible schema design, but choosing the right approach is key to performance and maintainability. The two primary patterns are embedding and referencing.

What is Embedding?

Embedding means nesting one document inside another. It is useful when related data is mostly accessed together and doesn’t grow unbounded.

Example: Consider an e-commerce system where each order includes a list of items.


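    // Insert one order with its line items nested in the same document.
    db.orders.insertOne({
      _id: 1,
      customer: "John Doe",
      // Stored as a Date rather than a string so range queries and sorting work on it.
      date: new Date("2025-05-01"),
      // Embedded array: each element is one item of this order.
      items: [
        { product: "Keyboard", quantity: 1, price: 1200 },
        { product: "Mouse", quantity: 2, price: 500 }
      ]
    });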
    

Output:

    {
      acknowledged: true,
      insertedId: 1
    }
    

Explanation: The items are embedded inside the orders document. This structure allows fast reads since all order data is in one place.
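To illustrate why having everything in one place is convenient, here is a minimal sketch in plain JavaScript. It works on an in-memory copy of the order document rather than on a live database, and `orderTotal` is a hypothetical helper name, not a MongoDB API. Because the items travel with the order, the total can be computed from a single document without a second query.

```javascript
// In-memory copy of the embedded order document from the example above.
const order = {
  _id: 1,
  customer: "John Doe",
  date: "2025-05-01",
  items: [
    { product: "Keyboard", quantity: 1, price: 1200 },
    { product: "Mouse", quantity: 2, price: 500 }
  ]
};

// Hypothetical helper: sums quantity * price over the embedded items.
function orderTotal(doc) {
  return doc.items.reduce((sum, item) => sum + item.quantity * item.price, 0);
}

console.log(orderTotal(order)); // 2200
```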

Question: Is embedding suitable when items grow too large (e.g., thousands per order)?

Answer: No. A single MongoDB document cannot exceed 16 MB. If an embedded array can keep growing toward that limit, store the items in their own collection and reference them instead.
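The effect of array growth on document size can be sketched roughly in plain JavaScript. This is only an approximation under stated assumptions: it measures the length of the JSON text, while MongoDB stores documents as BSON, so real sizes will differ. `buildOrder` and `approxBytes` are hypothetical helper names, and the item fields are made up for illustration.

```javascript
// Maximum document size in bytes (16 MB).
const MAX_DOC_BYTES = 16 * 1024 * 1024;

// Hypothetical helper: builds an order document with n embedded items.
function buildOrder(n) {
  const items = [];
  for (let i = 0; i < n; i++) {
    items.push({ product: "Item " + i, quantity: 1, price: 100 });
  }
  return { _id: 1, customer: "John Doe", items };
}

// Rough estimate only: byte length of the JSON text, not the BSON encoding.
function approxBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Print how the estimate grows as more items are embedded.
for (const n of [10, 1000, 100000]) {
  const bytes = approxBytes(buildOrder(n));
  console.log(n, "items ->", bytes, "bytes, over limit:", bytes > MAX_DOC_BYTES);
}
```

The estimate grows linearly with the number of items, which is why an array with no upper bound will eventually become a problem.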

What is Referencing?

Referencing separates data into multiple collections and links them using IDs. This is better for reusable or large datasets.

Example: Let’s say we have users and posts. Each post references the user who created it.


    db.users.insertOne({
      _id: ObjectId("644d1f95f0e2fc7a7f91a1c1"),
      name: "Alice"
    });

    db.posts.insertOne({
      title: "My first blog",
      content: "This is my post",
      authorId: ObjectId("644d1f95f0e2fc7a7f91a1c1")
    });
    

Output of the second insert:

    {
      acknowledged: true,
      insertedId: ObjectId("...")
    }
    

Explanation: Each document in the posts collection stores only a reference to a user, in the authorId field. To read a post together with its author, we join the two collections ourselves, either in application code or with the $lookup aggregation stage.
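A manual join in application code might look like the following sketch. It uses in-memory arrays as stand-ins for the two collections and plain strings in place of ObjectId values so that it runs without a database driver; `joinAuthors` is a hypothetical helper name. The `lookupPipeline` constant shows the shape of an equivalent `$lookup` stage, which one would pass to `db.posts.aggregate(...)` in a real session.

```javascript
// In-memory stand-ins for the users and posts collections.
// Plain strings replace ObjectId values so the sketch runs without a driver.
const users = [
  { _id: "644d1f95f0e2fc7a7f91a1c1", name: "Alice" }
];
const posts = [
  { title: "My first blog", content: "This is my post", authorId: "644d1f95f0e2fc7a7f91a1c1" }
];

// Hypothetical helper: attaches the matching user document to each post.
// Posts whose authorId matches no user get author: null.
function joinAuthors(postDocs, userDocs) {
  const byId = new Map(userDocs.map((u) => [u._id, u]));
  return postDocs.map((p) => ({ ...p, author: byId.get(p.authorId) ?? null }));
}

console.log(joinAuthors(posts, users)[0].author.name); // Alice

// Aggregation counterpart, defined here only as data.
// Note that $lookup writes the matches into an array field ("author" here).
const lookupPipeline = [
  { $lookup: { from: "users", localField: "authorId", foreignField: "_id", as: "author" } }
];
```

Building the lookup map once keeps the join linear in the number of posts, instead of scanning the users array for every post.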

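When to Embed?

Embed when the related data is mostly read together with its parent and the nested array stays bounded in size, as with the items of an order.

When to Reference?

Reference when the data is reused across documents or can grow large, as with users and their posts.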
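The criteria this tutorial gives for choosing between the two patterns (whether data is read together, whether it can grow without bound, whether it is reused) can be written down as a small decision function. This is a hypothetical heuristic for illustration only, not an official MongoDB rule; the function name and its three flags are assumptions.

```javascript
// Hypothetical heuristic based on the criteria discussed in this tutorial.
// readTogether: the child data is usually read with its parent.
// unbounded: the child data can keep growing without a clear upper limit.
// sharedAcrossParents: the same child data is reused by several parents.
function suggestPattern({ readTogether, unbounded, sharedAcrossParents }) {
  if (unbounded || sharedAcrossParents) return "reference";
  if (readTogether) return "embed";
  return "either";
}

// Items of an order: read together, bounded, not shared.
console.log(suggestPattern({ readTogether: true, unbounded: false, sharedAcrossParents: false })); // embed

// Comments on a popular blog post: can grow without bound.
console.log(suggestPattern({ readTogether: true, unbounded: true, sharedAcrossParents: false })); // reference
```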

Real-World Example: Blog with Comments

Embedding Comments Inside Blog


    db.blogs.insertOne({
      title: "MongoDB Schema Design",
      body: "This is about embedding vs referencing",
      comments: [
        { user: "Tom", text: "Great post!" },
        { user: "Jane", text: "Very helpful." }
      ]
    });
    

Output:

    {
      acknowledged: true,
      insertedId: ObjectId("...")
    }
    

Pros: The post and its comments are retrieved in a single read.
Cons: The document grows with every comment, so a post with thousands of comments can approach the 16 MB document size limit.

Referencing Comments in Another Collection


    db.comments.insertMany([
      { blogId: ObjectId("..."), user: "Tom", text: "Great post!" },
      { blogId: ObjectId("..."), user: "Jane", text: "Very helpful." }
    ]);
    

Output:

    {
      acknowledged: true,
      insertedIds: { '0': ObjectId("..."), '1': ObjectId("...") }
    }
    

Explanation: Comments live in a separate collection, and each one points back to its post through the blogId field. You fetch them by querying on blogId only when they are needed. This scales better as comments accumulate and keeps the blog documents small.
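Fetching comments by their blogId reference, one page at a time, can be sketched as follows. The sketch filters an in-memory array instead of querying a live collection. The string ids ("blog-1", "blog-2"), the third comment, and the `commentsForBlog` helper are all assumptions made for illustration. Against a real database the same idea would be a `find` on the comments collection filtered by blogId.

```javascript
// In-memory stand-in for the comments collection.
// The blogId strings and the third comment are made up for this sketch.
const comments = [
  { blogId: "blog-1", user: "Tom", text: "Great post!" },
  { blogId: "blog-1", user: "Jane", text: "Very helpful." },
  { blogId: "blog-2", user: "Sam", text: "Thanks." }
];

// Hypothetical helper: returns one page of comments for a given blog.
function commentsForBlog(allComments, blogId, skip = 0, limit = 10) {
  return allComments
    .filter((c) => c.blogId === blogId)
    .slice(skip, skip + limit);
}

console.log(commentsForBlog(comments, "blog-1").length); // 2
console.log(commentsForBlog(comments, "blog-1", 1, 1)[0].user); // Jane
```

Reading comments in pages is what keeps this pattern workable when a post has thousands of them, since the blog document itself never has to carry the full list.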

Summary

MongoDB gives you flexibility in how you structure your data. Choose between embedding and referencing based on access patterns, data growth, and reusability. Embedding keeps reads fast and simple when related data is bounded and read together; referencing scales better when data is large or shared across documents.

Next Up

We'll learn about implementing schema validation to ensure data consistency in MongoDB collections.


