
MongoDB Schema Design: Embedded vs Referenced with Examples

Learn the difference between embedded and referenced data in MongoDB using simple examples from a streaming platform database.

MongoDB schema design example in Visualeaf, comparing embedded data inside a users document with referenced data between users and payments.

When you design a MongoDB database, one of the first choices is this:

Should this data stay inside the same document, or should it live in another collection?

That is the difference between embedding and referencing.

Embedding means the data is stored inside the document.

Referencing means the data is stored in another collection and linked through a field that holds the other document's _id.

Let’s use a simple database called streaming_platform_db.

It has collections like users, movies, series, episodes, payments, ratings, watch_history, and activity_logs.

The database works like a basic streaming platform.

A user can have a subscription, profiles, and devices. The same user can also make payments, watch content, rate movies, and create activity logs.

That makes this database a good example because it uses both embedded data and referenced data.

Visual Schema Design

In Visualeaf, I generated a schema diagram for the streaming_platform_db database.

This makes the structure easier to understand because I can see the collections and their relationships in one place.

VisuaLeaf schema diagram for streaming_platform_db, showing embedded fields and referenced collections.

For example, the users collection has fields like:

subscription
profiles
devices

These fields are stored inside each user document.

That means they are embedded.

But other collections, like payments, ratings, watch_history, and activity_logs, are connected to users with user_id.

That means they are referenced.

So the diagram shows the main idea very clearly:

Some data lives inside a document.

Some data lives in another collection and points back with an ID.

A Quick Example in Mongo Shell

To make this easier to see, here is a small example from the users collection.

In this document, subscription, profiles, and devices are stored inside the user document.

That means they are embedded.

db.users.insertOne({
  full_name: "Oliver Smith",
  email: "oliver.smith@streaming.test",
  country: "UK",
  city: "London",

  subscription: {
    plan: "standard",
    status: "active"
  },

  profiles: [
    {
      profile_name: "Oliver",
      type: "adult",
      preferences: {
        favorite_genres: ["Action", "Sci-Fi"],
        subtitles: ["en"]
      }
    },
    {
      profile_name: "Kids",
      type: "kids",
      kids_mode: true
    }
  ],

  devices: [
    {
      type: "laptop",
      os: "Windows 11"
    }
  ]
});

After creating this document in Mongo Shell, Visualeaf makes the embedded structure easier to see. The fields are inside the users document, not in separate collections.

Embedded data in the users collection. The subscription, profiles, preferences, and devices are stored inside the same user document.
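
Because the data is embedded, one read returns all of it. Here is a minimal sketch of how that could look in mongosh, assuming the document above was inserted. The variable names are my own, and the `typeof db` guard is only there so the snippet also loads outside mongosh.

```javascript
// Filters and projections are plain objects.
const oliverFilter = { email: "oliver.smith@streaming.test" };

// Return only the embedded fields (_id is included by default).
const embeddedProjection = { subscription: 1, profiles: 1, devices: 1 };

// Dot notation reaches into embedded documents and arrays.
const kidsProfileFilter = { "profiles.type": "kids" };
const activeStandardFilter = {
  "subscription.plan": "standard",
  "subscription.status": "active"
};

// `db` only exists inside mongosh.
if (typeof db !== "undefined") {
  // One query returns the user with subscription, profiles, and devices.
  printjson(db.users.findOne(oliverFilter, embeddedProjection));

  // Users with at least one kids profile.
  print(db.users.countDocuments(kidsProfileFilter));

  // Users on an active standard plan.
  print(db.users.countDocuments(activeStandardFilter));
}
```

No join is needed here. The nested fields are queried with dot notation, like "profiles.type".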

Now compare that with referenced data.

Payments are not stored inside the user document. They live in a separate collection and point back to the user with user_id.

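// Look up the user first so user_id holds a real _id.
// (ObjectId() needs a 24-character hex string; a text placeholder would throw.)
const user = db.users.findOne({ email: "oliver.smith@streaming.test" });

db.payments.insertOne({
  user_id: user._id,
  amount: 12.99,
  currency: "EUR",
  status: "paid",
  paid_at: new Date()
});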

Here, user_id connects the payment to the user.

That is a reference.

Visualeaf schema view showing the reference between users and payments through user_id.
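
To read a user together with their payments, the reference has to be followed at query time. One common way is the $lookup aggregation stage. This is a sketch under the assumption that payments store the user's _id in user_id, as shown above. The output field name payments and the projected fields are my choice.

```javascript
// Pipeline that joins each matching user with their payments.
const userWithPayments = [
  { $match: { email: "oliver.smith@streaming.test" } },
  {
    $lookup: {
      from: "payments",        // collection to join
      localField: "_id",       // users._id
      foreignField: "user_id", // payments.user_id
      as: "payments"           // output array field (name chosen here)
    }
  },
  { $project: { full_name: 1, "subscription.plan": 1, payments: 1 } }
];

// `db` only exists inside mongosh.
if (typeof db !== "undefined") {
  db.users.aggregate(userWithPayments).forEach(printjson);
}
```

With embedded data this extra step is not needed. With referenced data, the join is something you ask for explicitly.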

One important thing to know: a MongoDB reference is not the same as a SQL foreign key.

In SQL, a foreign key can enforce the relationship between two tables.

In MongoDB, user_id is just a field that stores another document’s _id.

MongoDB does not automatically check that the user exists.
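
If you want that check, the application has to do it. Below is a minimal sketch, not a built-in MongoDB feature. The helper name insertPaymentIfUserExists and the orphanedPayments pipeline are hypothetical, and the helper expects an object shaped like the mongosh db.

```javascript
// Hypothetical helper: refuses to insert a payment whose user_id matches no user.
// `database` is expected to look like the mongosh `db` object.
function insertPaymentIfUserExists(database, payment) {
  const found = database.users.countDocuments({ _id: payment.user_id }, { limit: 1 });
  if (found === 0) {
    throw new Error("No user with _id " + payment.user_id);
  }
  return database.payments.insertOne(payment);
}

// Pipeline that lists payments whose user_id matches no user document.
const orphanedPayments = [
  { $lookup: { from: "users", localField: "user_id", foreignField: "_id", as: "user" } },
  { $match: { user: { $size: 0 } } }, // the join found nothing
  { $project: { user: 0 } }           // drop the empty helper array
];

// `db` only exists inside mongosh.
if (typeof db !== "undefined") {
  db.payments.aggregate(orphanedPayments).forEach(printjson);
}
```

In mongosh the helper would be called as insertPaymentIfUserExists(db, { user_id: someUser._id, amount: 12.99 }). The check and the insert are two separate operations, so a user could still be deleted in between. That is why the sketch also includes a query for orphaned payments.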

In the main diagram, the split looks like this:

users.devices       // embedded
users.profiles      // embedded
users.subscription  // embedded

payments.user_id    // referenced
ratings.user_id     // referenced
watch_history.user_id // referenced
activity_logs.user_id // referenced

Embed data when it belongs to one document and is read with it.

Reference data when it grows over time or is queried on its own.

Where Embedding Fits

Embedding is a good choice when your data is associated with one primary document.

Examples:

users
 ├── subscription
 ├── profiles
 │    └── preferences
 └── devices

If you open a user account, you will likely want to see the subscription, profiles, and devices too.
So it makes sense to keep them together.
You do not need a separate collection for every small detail.

Simple rule: Embed data when it belongs to one parent and is usually read together.
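
Embedded data is also updated through its parent document. Here is a small sketch of three common updates in mongosh. The new values ("phone", "Android 14", "premium", "Children") are made up for illustration.

```javascript
const oliver = { email: "oliver.smith@streaming.test" };

// Add a device to the embedded devices array.
const addDevice = { $push: { devices: { type: "phone", os: "Android 14" } } };

// Change one field inside the embedded subscription document.
const upgradePlan = { $set: { "subscription.plan": "premium" } };

// Rename one array element. The positional $ operator points at the
// first profiles element matched by the filter.
const oliverKidsProfile = {
  email: "oliver.smith@streaming.test",
  "profiles.profile_name": "Kids"
};
const renameKidsProfile = { $set: { "profiles.$.profile_name": "Children" } };

// `db` only exists inside mongosh.
if (typeof db !== "undefined") {
  db.users.updateOne(oliver, addDevice);
  db.users.updateOne(oliver, upgradePlan);
  db.users.updateOne(oliverKidsProfile, renameKidsProfile);
}
```

Each updateOne call changes a single user document, and a write to a single document is atomic in MongoDB.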

Where Referencing Fits

Referencing is a good choice when the data keeps growing or needs to be queried and updated on its own.

For example, payments should not be stored inside the user document.

A user can have many payments over time. One payment this month, another payment next month, and so on.

So it is cleaner to store payments in a separate collection.

payments
└── user_id

The user_id field points back to the user.

The same idea works for ratings, watch_history, and activity_logs.

ratings
 ├── user_id
 └── content_id

watch_history
 ├── user_id
 └── content_id

activity_logs
 └── user_id

These records are related to the user, but they do not need to live inside the user document.

They can grow fast, and a single MongoDB document is limited to 16 MB, so they are better as separate collections.

Simple rule: Reference data when it can grow, repeat, or connect more than one collection.
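
A separate collection can be indexed and read in small pieces. The sketch below assumes payments are usually fetched per user, newest first. The index shape and the helper latestPayments are my own suggestion, not part of the original schema.

```javascript
// Compound index: equality on user_id, then newest payments first.
const paymentsByUserIndex = { user_id: 1, paid_at: -1 };

// Hypothetical helper: the latest payments for one user.
// `database` is expected to look like the mongosh `db` object.
function latestPayments(database, userId, limit) {
  return database.payments
    .find({ user_id: userId })
    .sort({ paid_at: -1 })
    .limit(limit)
    .toArray();
}

// `db` only exists inside mongosh.
if (typeof db !== "undefined") {
  db.payments.createIndex(paymentsByUserIndex);
  const someUser = db.users.findOne({ email: "oliver.smith@streaming.test" });
  if (someUser) {
    printjson(latestPayments(db, someUser._id, 10));
  }
}
```

The user document stays small, and the query returns only the ten most recent payments instead of the full history.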

When to Embed vs When to Reference

Here is a simple way to decide.

| Situation | Better choice | Why |
| --- | --- | --- |
| The data is small | Embed | It keeps related data in one place |
| The data belongs to one parent | Embed | The structure feels natural |
| You usually read the data together | Embed | You can get everything in one query |
| The data can grow a lot | Reference | It avoids huge documents |
| The data is shared by many records | Reference | It avoids copying the same data everywhere |
| You need to search or update it on its own | Reference | It is easier to query and index separately |
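
The table can also be written as a tiny helper. This is only a sketch of the checklist, not an official rule. The function name, the trait names, and the tie-break (any reason to reference wins) are my assumptions.

```javascript
// Hypothetical helper that encodes the table above.
// Assumption: any "reference" signal outweighs the "embed" signals.
function suggestPlacement(traits) {
  const {
    growsALot = false,
    sharedByManyRecords = false,
    queriedOnItsOwn = false,
    belongsToOneParent = false,
    readTogether = false
  } = traits;

  if (growsALot || sharedByManyRecords || queriedOnItsOwn) {
    return "reference";
  }
  if (belongsToOneParent && readTogether) {
    return "embed";
  }
  return "undecided";
}

// users.devices: small, owned by one user, shown with the account.
const devicesChoice = suggestPlacement({ belongsToOneParent: true, readTogether: true });

// payments: one user, but the list keeps growing and is queried on its own.
const paymentsChoice = suggestPlacement({
  belongsToOneParent: true,
  growsALot: true,
  queriedOnItsOwn: true
});

console.log(devicesChoice, paymentsChoice);
```

For the streaming platform database, this gives the same split as the diagram: devices are embedded, payments are referenced.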

Conclusion

MongoDB schema design is about deciding where your data should live.

Some data makes more sense inside the same document.

Other data makes more sense in a separate collection, connected with an ID.

That is the difference between embedding and referencing.

A GUI tool like Visualeaf makes this easier to understand because you can see the database structure visually.

Instead of reading only raw JSON, you can see how collections connect, where data is nested, and how the schema is organized.

This makes it easier to read, explain, and improve your MongoDB database.
