
MongoDB Data Modeling Guide for Blogging Apps

Learn some data modeling possibilities that include nested documents when designing a content management system (CMS) or blog app.
Nov 12, 2025

So you want to build your own content management system (CMS) or blog? It's a classic example for learning how to use a database, whether a relational database management system (RDBMS) or a NoSQL database, because it involves a potentially large amount of data as well as relationships within that data. The example of a blogging app also translates well to other data modeling needs.

In this article, we're going to explore some do's and don'ts when it comes to designing your NoSQL documents in MongoDB. However, we won't actually be developing a blogging app, only looking at things from a data perspective.

The Components of a Blogging App

Before we make an attempt at designing a document model for a blog, it's a good idea to take a step back and think about every component that might have data associated with it.

A typical blog might have the following features:

  • Many users or authors
  • Many blog posts for any one author
  • Many comments for any one blog post

Of course, that list of features can become more lengthy and complex depending on your business needs in a blog. To keep things simple, we'll model our data based on the above.

What You Can Do, But Shouldn't Do

Your first thought when looking at the above features might be to treat each bulleted item as its own document. This is the typical approach you'd use in a relational database like Postgres. In MongoDB, it might look something like the following:

{
	"_id": "author-1",
	"name": "Nic Raboy",
	"description": "I'm some dude named Nic."
}

The above might be a very trimmed-down representation of an author. Of course, you could have many other fields, like email and whatnot, but that's not truly important here.

Then, we might have the following for blog posts:

{
	"_id": "blog-1",
	"author_id": "author-1",
	"title": "MongoDB is Awesome!",
	"content": "This is some blog content..."
}

As with the author document, the blog document could have other fields, like tags or anything else you can think up. What matters here is the id reference that links the two documents. More on that in a moment, though.

The final document model in this scenario would be around comments:

{
	"_id": "comment-1",
	"blog_id": "blog-1",
	"username": "Anonymous User",
	"content": "Hey great article, it really helped a lot!"
}

Once again, the comment document in this scenario has a reference relationship based on an id value between the comment document and the blog document. This is all very common in a relational database, and the truth is that it would work fine in MongoDB too. However, this is not going to give you the best MongoDB experience.

One reason you wouldn't want to do this in MongoDB is performance.

To join these three kinds of documents, you'd use a $lookup stage within an aggregation pipeline. These $lookup operations are not cheap and could negatively impact your performance as your database scales, even more so when you have more than one $lookup operation in your pipeline.
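For a sense of what that join involves, here is a rough mongosh-style sketch of a pipeline that reassembles one post with its author and comments. The collection names (posts, authors, comments) and the helper name buildPostPipeline are assumptions made for illustration; the field names follow the documents above.

```javascript
// Build an aggregation pipeline that joins a post to its author and comments.
// The collection names "authors" and "comments" are assumed for this sketch.
function buildPostPipeline(postId) {
  return [
    { $match: { _id: postId } },
    // First join: the author referenced by author_id.
    {
      $lookup: {
        from: "authors",
        localField: "author_id",
        foreignField: "_id",
        as: "author"
      }
    },
    // $lookup always yields an array, so flatten the single author.
    { $unwind: "$author" },
    // Second join: every comment that points back at this post.
    {
      $lookup: {
        from: "comments",
        localField: "_id",
        foreignField: "blog_id",
        as: "comments"
      }
    }
  ];
}

// In mongosh, against a hypothetical "posts" collection:
// db.posts.aggregate(buildPostPipeline("blog-1"))
```

Every page view of a blog post would run both $lookup stages, which is the cost the rest of this article tries to avoid.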

There are better ways to get the job done in MongoDB.

A Better, More MongoDB-Friendly Approach to Data Modeling

One of the first rules of MongoDB is that data that is accessed together should be stored together. Storing each feature in a separate document or collection breaks that rule because we're no longer doing a single fetch operation for the data presented to the user.

Let's take another stab at the problem, modeling the documents for our blog like this:

{
	"_id": "author-1",
	"name": "Nic Raboy",
	"description": "I'm some dude named Nic.",
	"posts": [
		{
			"title": "MongoDB is Awesome!",
			"content": "This is some blog content...",
			"comments": [
				{
					"username": "Anonymous User",
					"content": "Hey great article, it really helped a lot!"
				}
			]
		}
	]
}

Alright, so one of those main MongoDB rules is satisfied in the above model. We have a single document that uses nesting to store all blog posts and all comments with the author.

While it could work, it's probably not a good idea to model your blogging app documents like that.

MongoDB does have document size limits (a single document can't exceed 16 MB), and when you have unbounded arrays in those documents, you run the risk of exceeding those limits. Not to mention you might hit performance problems as the arrays get large. In this case, we could have an unlimited number of blog posts and an unlimited number of comments per blog post. It's going to get messy at some point.
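To put a rough number on that risk, the following back-of-the-envelope sketch estimates how many comments of a given size would fit in one document. It uses JSON byte length as a stand-in for the real BSON size, which is an assumption of this sketch, so treat the result as an order of magnitude only. The helper names are hypothetical.

```javascript
// JSON byte length is only a proxy for BSON size (an assumption here),
// so the estimate is approximate.
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024; // the 16 MB document limit

function approxBytes(value) {
  return Buffer.byteLength(JSON.stringify(value), "utf8");
}

// How many copies of sampleComment fit next to what is already stored?
function estimateMaxComments(baseDocument, sampleComment) {
  const perComment = approxBytes(sampleComment) + 1; // +1 for the array separator
  const available = MAX_DOCUMENT_BYTES - approxBytes(baseDocument);
  return Math.floor(available / perComment);
}

const authorDocument = {
  _id: "author-1",
  name: "Nic Raboy",
  description: "I'm some dude named Nic.",
  posts: [
    { title: "MongoDB is Awesome!", content: "This is some blog content...", comments: [] }
  ]
};
const sampleComment = {
  username: "Anonymous User",
  content: "Hey great article, it really helped a lot!"
};

console.log(estimateMaxComments(authorDocument, sampleComment));
```

Short comments leave a lot of headroom, but in this model every post and every comment by an author shares the same 16 MB budget, and longer post bodies shrink it quickly.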

What might work better in this example is a mix-and-match of the two strategies.

Remember, the authors, blog posts, and comments could have any number of fields associated with them. In this example, we kept it minimal. With that in mind, we know that the most frequently accessed data will be the blog post itself. Take the new design:

{
	"_id": "blog-1",
	"author": {
		"_id": "author-1",
		"name": "Nic Raboy"
	},
	"title": "MongoDB is Awesome!",
	"content": "This is some blog content...",
	"comments": [
		{
			"username": "Anonymous User",
			"content": "Hey great article, it really helped a lot!"
		}
	]
}

The second document we have in this example would still be the original document for author information:

{
	"_id": "author-1",
	"name": "Nic Raboy",
	"description": "I'm some dude named Nic."
}

We know that some of the author information should be included when displaying a blog article, but we don't necessarily need all the information. We can embed the author information that is needed along with the _id inside the blog post, and if the user needs to find out more about the author, the user can dig deeper in your application. The $lookup operation doesn't happen every single time, which improves performance.
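As a minimal sketch of that idea, the helper below copies only the author fields a post page needs into the post document and leaves the full profile for a separate, on-demand query. The function names, the authors collection name, and the choice of which fields to copy are all assumptions for illustration.

```javascript
// Copy only the author fields a post page needs into the post document.
// Which fields to copy ("_id" and "name" here) is an application decision.
function buildPostDocument(author, post) {
  return {
    _id: post._id,
    author: { _id: author._id, name: author.name },
    title: post.title,
    content: post.content,
    comments: []
  };
}

// Full profile lookup, run only when a reader opens the author page.
// "authors" is an assumed collection name; db is a mongosh-style handle.
function fetchAuthorProfile(db, post) {
  return db.authors.findOne({ _id: post.author._id });
}

const author = {
  _id: "author-1",
  name: "Nic Raboy",
  description: "I'm some dude named Nic."
};
const postDocument = buildPostDocument(author, {
  _id: "blog-1",
  title: "MongoDB is Awesome!",
  content: "This is some blog content..."
});
console.log(postDocument.author);
```

The trade-off is duplication: if an author changes their name, the copies embedded in their posts need to be updated as well.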

Now, we have the problem of the potentially unbounded number of comments on any given blog article, currently attached as an array.

There are a few approaches to handle this, and the choice is ultimately up to you:

  • You can take the naive approach and keep them as an array, which probably isn't a great idea.
  • You can move the comments back to the original design of one comment per document, each with a reference id to the blog article, and use $lookup operations to include them.
  • You can use an archival strategy, storing a fixed number of comments in an array within the blog article while also creating a separate comment document for each, allowing you to only query for extra comments if needed.
  • You can explore the bucketing pattern, grouping comments into "buckets" of around 100 comments per document, querying however many comment buckets you need at a time. Bucketing avoids the document size limit and also helps with pagination. You can also boost performance in this approach by creating an index on the blog_id field to quickly fetch comments for any particular blog article. More information on indexing can be found in the MongoDB documentation.
  • Mix, match, and more...
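The archival strategy from the list above could be sketched roughly as follows, using the $each and $slice modifiers of $push to cap the embedded array. The limit of 10, the collection names (posts, comments), and the helper names are assumptions, not something prescribed by this article.

```javascript
const RECENT_COMMENT_LIMIT = 10; // assumed number of comments kept on the post

// Update document that appends a comment and trims the embedded array to
// the most recent RECENT_COMMENT_LIMIT entries.
function buildRecentCommentsUpdate(comment) {
  return {
    $push: {
      comments: { $each: [comment], $slice: -RECENT_COMMENT_LIMIT }
    }
  };
}

// mongosh-style sketch; "comments" and "posts" are assumed collection names.
function addComment(db, blogId, comment) {
  // Full history: one document per comment, referencing the post.
  db.comments.insertOne({ blog_id: blogId, ...comment });
  // Preview: only the newest comments stay embedded in the post.
  return db.posts.updateOne({ _id: blogId }, buildRecentCommentsUpdate(comment));
}

// An index on blog_id supports fetching the older comments on demand:
// db.comments.createIndex({ blog_id: 1 })
```

Note that the two writes are separate operations, so an application that needs them to succeed or fail together would have to handle that itself, for example with a transaction.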

In the case of the bucketing pattern, your comments documents might look like the following:

{
	"_id": "comment-bucket-1",
	"blog_id": "blog-1",
	"comments": [
		{
			"username": "Anonymous User",
			"content": "Hey great article, it really helped a lot!"
		},
		// More comments here...
	],
	"comment_count": 57
}

In this scenario, as new comments are created, you query for a bucket that isn't full. You push the new comment into the array and increase the count value. While this strategy depends more on application-level logic to maintain the buckets, it can perform better than an ever-growing array or a $lookup on every request. If you've got an infinite scroll mechanism on your blog, even better, because each bucket can serve as one page of comments.

While out of the scope of this particular article, if you wanted a taste of what it takes to add a new comment, you’d filter the comment buckets collection (see the example query below) for buckets that match the blog_id as well as have a comment_count less than your chosen value.

For the matched bucket, you would use the $push operator in your update document to append the comment and the $inc operator to increase the comment count. All of this can be done in a single operation, and if you use the upsert flag, you can create a new comment bucket if one doesn’t already exist that matches your criteria.

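Example of the update query, where newComment is the comment document to add:

db.commentBuckets.updateOne(
  // Match a bucket for this post that still has room.
  { blog_id: "blog-1", comment_count: { $lt: 100 } },
  // Append the comment and bump the counter in the same operation.
  { $push: { comments: newComment }, $inc: { comment_count: 1 } },
  // Create a new bucket if no existing bucket matches.
  { upsert: true }
)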
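To see what that single update does over time without a running database, here is a small in-memory simulation of the same logic. It is only a sketch: a plain array stands in for the commentBuckets collection, the helper names are hypothetical, and pages are read in array order, which is an assumption about how buckets would be sorted in a real query.

```javascript
const BUCKET_SIZE = 100; // matches the $lt: 100 threshold in the update query

// Mimics the upsert: reuse a bucket with room, otherwise start a new one.
function addCommentToBuckets(buckets, blogId, comment) {
  let bucket = buckets.find(
    (b) => b.blog_id === blogId && b.comment_count < BUCKET_SIZE
  );
  if (!bucket) {
    bucket = { blog_id: blogId, comments: [], comment_count: 0 };
    buckets.push(bucket);
  }
  bucket.comments.push(comment); // the $push step
  bucket.comment_count += 1; // the $inc step
  return bucket;
}

// One bucket per page: page 0 is the first bucket created for the post.
function getCommentPage(buckets, blogId, pageIndex) {
  const forPost = buckets.filter((b) => b.blog_id === blogId);
  return pageIndex < forPost.length ? forPost[pageIndex].comments : [];
}

const buckets = [];
for (let i = 1; i <= 250; i++) {
  addCommentToBuckets(buckets, "blog-1", {
    username: "Anonymous User",
    content: `Comment ${i}`
  });
}

console.log(buckets.map((b) => b.comment_count)); // → [ 100, 100, 50 ]
console.log(getCommentPage(buckets, "blog-1", 2).length); // → 50
```

After 250 comments there are three buckets, and an infinite scroll could request one bucket per scroll step instead of loading all 250 comments at once.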

Conclusion

As much as it'd be great to say you should store everything about your blog in a single nested document, it probably wouldn't be a great idea. Likewise, it probably wouldn't be a great idea to carry over a data model that works well in a relational database but not necessarily in a document database like MongoDB.

With MongoDB, you want to reduce your referenced relationships as much as possible to get the best possible performance.

Keep in mind the following as you scale:

  • If you expect heavy commenting, consider the bucket pattern or some combination of it alongside partial embedding in the blog document.
  • Think about your pagination needs.
  • Index your collections based on how you plan to query in your application.

To reiterate, think about how you'll be presenting the blog data in your frontend. Do you really need all of the author information with the request, or is some of it fine? Do you really want to load one million comments with your blog article, or is it fine to load a small handful? When you really think about it, you'd be surprised by how many join operations you don't need.

If you're new to MongoDB, I recommend checking out the Introduction to MongoDB in Python course.

FAQs

Can I transfer my blog app data model from a relational database to MongoDB?

You can create flat documents with id references and it’d work, but you wouldn’t be getting the best performance MongoDB can offer.

Why not just store everything in a single document?

There are size limits in MongoDB and when you have unbounded arrays, you can easily reach those limits.

The bucket pattern was mentioned, but are there others?

There are other patterns that might work, but the bucket pattern is a good option when it comes to large arrays.

What are MongoDB’s document size limits?

MongoDB documents have a maximum size of 16MB.

How do I handle updates to nested data (e.g., editing a comment)?

For embedded data, you can use update operators like $set, and for bucketed comments, you can locate the bucket and update the specific comment within it.
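As a rough sketch of the bucketed case, the helper below builds a filter and update that rely on the positional $ operator to change one comment inside its bucket. It assumes each comment carries its own identifier in a comment_id field, which the sample documents in this article do not include, and the function names are hypothetical.

```javascript
// Assumes each embedded comment has a "comment_id" field (not part of the
// sample documents in this article).
function buildEditCommentOperation(blogId, commentId, newContent) {
  return {
    // Match the bucket for this post that contains the comment.
    filter: { blog_id: blogId, "comments.comment_id": commentId },
    // The positional $ operator targets the array element matched by the filter.
    update: { $set: { "comments.$.content": newContent } }
  };
}

// mongosh-style call; db is a database handle with a commentBuckets collection.
function editComment(db, blogId, commentId, newContent) {
  const { filter, update } = buildEditCommentOperation(blogId, commentId, newContent);
  return db.commentBuckets.updateOne(filter, update);
}
```

The same filter-plus-positional-update shape applies when the comments are embedded directly in the blog document, with the post's _id in the filter instead of blog_id.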


Author
Nic Raboy

Nic Raboy is a Developer Relations Lead at MongoDB where he leads a team of Python, Java, C#, and PHP developers who create awesome content to help developers be successful at including MongoDB in their projects. He has experience with Golang and JavaScript and often writes about many of his development adventures.
