How to Store and Query Embeddings in MongoDB

Learn how to store, index, and query embeddings in MongoDB using Atlas Vector Search.
Dec 19, 2025  · 8 min read

The rise of LLMs and semantic search has fundamentally changed how we build search, recommendation, and retrieval systems. Traditional keyword search, whether through SQL LIKE, Lucene inverted indexes, or full-text indexes, is increasingly insufficient when users expect natural-language understanding.

This is where embeddings and vector databases enter the picture.

MongoDB has evolved rapidly in this space with Atlas Vector Search, giving developers a single database for documents + metadata + vectors—all under one API. In this guide, we’ll walk through:

  • What MongoDB is.
  • What query embeddings are and why they matter.
  • When you should use embeddings.
  • How to store embeddings in MongoDB.
  • How to generate and query them using Python.

This tutorial is hands-on and ready to integrate into your retrieval-augmented generation (RAG), similarity search, or recommendation pipeline.

What is MongoDB?

MongoDB is a document-oriented NoSQL database that stores data in flexible BSON documents. BSON is a binary-encoded form of JavaScript Object Notation (JSON), the text-based format widely used to transmit and store data across web applications. Unlike rigid relational databases, MongoDB makes schema evolution simple and provides:

  • Horizontal scalability (sharding).
  • Rich aggregation framework.
  • Full-text search.
  • Vector search.
  • Time series collections.
  • Schema flexibility.

MongoDB Atlas—the fully managed cloud service—extends this with:

  • Atlas Search (built on Lucene).
  • Atlas Vector Search (semantic search via HNSW indexing).
  • Automatic scaling, backups, security, and monitoring.

MongoDB has become a strong choice for AI workloads because vectors often live alongside metadata, and MongoDB Atlas bundles all of that into one system.

What Are Embeddings and Why Do We Use Query Embeddings?

When you work with search or retrieval in AI applications, one of the first concepts you encounter is embeddings.

An embedding is essentially a numerical representation, a long list of floating-point values, that captures the meaning of a piece of text (or image, audio, etc.). Instead of looking at the exact words, embeddings map text into a vector space where similar ideas end up close together. For example, consider three different user statements:

  • Billing issue
  • Payment error
  • Salary not credited

Even though the words are different, all three describe a financial problem. A good embedding model places these texts near each other in vector space, allowing you to retrieve them based on meaning rather than keywords.

This becomes especially important when the user performs a search. The text they type—“Why is my salary delayed?”—is also converted into an embedding. This new vector is called a query embedding. You then compare this query vector against all stored embeddings to find which documents are semantically similar. The closer the vectors are, the more relevant the result.
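To make the comparison step concrete, here is a minimal sketch of a brute-force search over a handful of stored vectors. The three-dimensional vectors are hand-picked toy values, not output from a real embedding model, and the "How to reset my password" entry is an extra example added only for contrast.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional "embeddings", chosen by hand for illustration.
stored = {
    "Billing issue": [0.9, 0.1, 0.0],
    "Payment error": [0.8, 0.2, 0.1],
    "How to reset my password": [0.0, 0.1, 0.9],
}

# Stands in for the embedding of "Why is my salary delayed?"
query_vec = [0.85, 0.15, 0.05]

# Rank stored texts from most to least similar to the query vector.
ranked = sorted(
    stored,
    key=lambda text: cosine_similarity(query_vec, stored[text]),
    reverse=True,
)
print(ranked)
```

The two finance-related texts rank ahead of the unrelated one. A vector index does the same job without comparing the query against every stored vector.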

Atlas Vector Search makes this workflow straightforward. After storing your documents and their embeddings in a MongoDB collection, you define a vector index (backed by HNSW under the hood) that retrieves similar documents quickly at scale. For example, if you store support articles with their embeddings, a query like…

query = "Why isn't my salary showing up this month?"
query_vec = get_embedding(query)  # get_embedding is defined in the Python tutorial below

…can be used directly in MongoDB:

results = collection.aggregate([
    {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": query_vec,
            "numCandidates": 100,  # required for approximate search; must be >= limit
            "limit": 3
        }
    }
])

MongoDB will automatically return the top 3 support articles whose embeddings are closest to the query—without you having to do any keyword matching or manual scoring.

This is the power of embeddings in MongoDB Atlas: you store meaning, not just text, and you query based on intent, not just words. This shift is why embeddings have become a core building block for semantic search, RAG systems, AI assistants, and intelligent knowledge bases.

When to Use Query Embeddings

Use embeddings when you want semantic similarity, such as:

Retrieval-augmented generation (RAG)

  • Feed your LLM accurate context from your knowledge base
  • Replace brittle keyword search

Chatbots that require deep understanding

  • Portal helpdesk
  • Customer support

Product recommendation

  • “similar items to this item”

Document or duplicate detection

  • Clustering
  • Topic grouping

When keyword search fails

  • Queries like “salary credit delay” vs. “payment processing issue”

Do not use embeddings when:

  • You require exact match (IDs, invoice numbers).
  • Your dataset is tiny (simple search works).
  • You need numeric filtering/comparisons (metadata should handle that).

How to Store Embeddings in MongoDB

The beauty of MongoDB is that embeddings fit naturally inside documents. You just store them as arrays of floats.

Example document schema

Here’s a typical document structure:

{
  "_id": "uuid-123",
  "text": "This is a sample document about MongoDB vector search.",
  "embedding": [0.123, -0.241, 0.998, ...],
  "tags": ["mongodb", "vector", "ai"],
  "createdAt": "2025-01-01T10:20:00Z"
}

Important things to keep in mind:

  • Every embedding in a collection must have the same dimensionality.
  • Most embedding models produce float32 values; when you store them as a plain array of numbers, MongoDB keeps them as BSON doubles (float64).
  • If your documents are long, chunk them before embedding.
  • Store metadata separately—tags, category, URL, timestamps, etc. This makes hybrid search powerful.
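As a rough illustration of the chunking advice, the sketch below splits text into overlapping word-based chunks. The function name chunk_words and its defaults are assumptions made for this example, and word counts are only an approximation of model tokens.

```python
def chunk_words(text, max_words=200, overlap=20):
    """Split text into overlapping chunks of at most max_words words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    chunks = []
    step = max_words - overlap  # how far the window advances each time
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the current window already reached the end of the text
    return chunks

# Example: a synthetic 450-word text yields three chunks.
sample = " ".join(f"w{i}" for i in range(450))
chunks = chunk_words(sample)
print(len(chunks))
```

Each chunk would then be embedded and stored as its own document, usually with a reference back to the source document in its metadata.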

Creating a Vector Index in MongoDB (Critical Step)

Before you can query embeddings, you must create a vector search index in MongoDB Atlas. You can create it before or after inserting documents.

In MongoDB Atlas, go to: Atlas → Database → Collections → Search → Create Search Index

Use a definition like:

{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    }
  ]
}

This tells MongoDB:

  • The field embedding contains vectors.
  • Each vector has 1536 dimensions.
  • Use cosine similarity for ranking.

This index powers all vector search queries.
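If you prefer to create the index from code rather than the Atlas UI, the following sketch builds the same definition in Python. It assumes PyMongo 4.7 or later, where SearchIndexModel accepts a type argument; the helper names build_vector_index_definition and create_vector_index are hypothetical.

```python
def build_vector_index_definition(path="embedding", num_dimensions=1536,
                                  similarity="cosine"):
    """Return a vector index definition like the JSON shown above."""
    allowed = {"cosine", "euclidean", "dotProduct"}
    if similarity not in allowed:
        raise ValueError(f"similarity must be one of {sorted(allowed)}")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": num_dimensions,
                "similarity": similarity,
            }
        ]
    }

def create_vector_index(collection, name="vector_index", **definition_kwargs):
    """Create the index on an Atlas collection (assumes PyMongo >= 4.7)."""
    # Imported lazily so the definition builder works without PyMongo installed.
    from pymongo.operations import SearchIndexModel

    model = SearchIndexModel(
        definition=build_vector_index_definition(**definition_kwargs),
        name=name,
        type="vectorSearch",
    )
    return collection.create_search_index(model=model)

definition = build_vector_index_definition()
print(definition)
```

Index creation is asynchronous on the Atlas side, so the index may take a short while to become queryable after the call returns.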

MongoDB includes native vector search capabilities with support for:

  • Cosine similarity.
  • Euclidean distance.
  • dotProduct scoring.

Choose based on your embedding model—but how do we know what to pick? Let’s dive deep.

Cosine similarity

What it measures

It measures how aligned two vectors are—i.e., the angle between them. It ignores magnitude and focuses only on direction.

Intuition

  • If two vectors point in the same direction → similarity close to 1
  • Opposite directions → –1
  • Orthogonal (unrelated) → 0

When to use

Use cosine when magnitude doesn’t matter, only the pattern or concept being expressed.

Common uses

  • Text embeddings/NLP
  • Semantic search
  • Recommendation systems based on meaning, not scale

Cosine is the safest default for most embedding models.
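The three cases in the intuition list can be checked with a few lines of plain Python. This is a small sketch with made-up vectors; note how doubling a vector's length leaves the similarity unchanged.

```python
import math

def cosine_similarity(u, v):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

u = [1.0, 2.0, 3.0]
same_direction = [2.0, 4.0, 6.0]   # u scaled by 2
opposite = [-1.0, -2.0, -3.0]      # u reversed
orthogonal = [3.0, 0.0, -1.0]      # dot product with u is 0

print(round(cosine_similarity(u, same_direction), 6))
print(round(cosine_similarity(u, opposite), 6))
print(round(cosine_similarity(u, orthogonal), 6))
```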

Euclidean distance

What it measures

It measures the straight-line distance between two vectors in space.

Intuition

  • Small distance → vectors are similar
  • Large distance → vectors are far apart

When to use

Use Euclidean when absolute values and magnitudes matter, not just direction.

Common uses

  • Image embeddings
  • Physical measurements/sensor data
  • Clustering algorithms like K-means (which internally uses Euclidean)

If your data naturally lives in a geometric space (e.g., pixel intensities), Euclidean often makes more sense.
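A short sketch with made-up vectors shows the contrast with cosine: a vector pointing in exactly the same direction but twice as long ends up farther away than a vector that differs slightly in one coordinate.

```python
import math

def euclidean_distance(u, v):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

u = [1.0, 2.0, 2.0]
scaled = [2.0, 4.0, 4.0]   # same direction as u, twice the length
nearby = [1.0, 2.0, 3.0]   # differs from u in one coordinate

print(euclidean_distance(u, scaled))  # 3.0
print(euclidean_distance(u, nearby))  # 1.0
```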

Dot product

What it measures

It measures a combination of similarity and magnitude. 

Mathematically: dot(u, v) = |u| × |v| × cos(θ)

Intuition

  • Larger magnitude AND alignment → bigger score
  • Sensitive to vector length
  • Not normalized automatically

When to use

Use dot product when you want longer vectors to have more influence or when the embedding model is trained with dot product in mind.

Common uses

  • Large language model embeddings that assume inner product scoring
  • Neural network similarity layers
  • Recommendation models where “strength” of features is meaningful
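The sensitivity to vector length is easy to see in a small sketch. With the made-up vectors below, the raw dot product grows with magnitude, while the dot product of the unit-length versions equals the cosine similarity.

```python
import math

def dot(u, v):
    """Sum of element-wise products."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

u = [3.0, 4.0]
v = [6.0, 8.0]  # same direction as u, twice the length

print(dot(u, v))  # 50.0
print(dot(normalize(u), normalize(v)))  # approximately 1.0, the cosine similarity
```

This is why normalizing vectors before indexing and querying makes dot product behave like cosine.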

Quick rule of thumb

| Metric | Good for | Avoid when |
| --- | --- | --- |
| Cosine | Text, semantic search, general embeddings | Magnitude matters |
| Euclidean | Images, geometric features, clustering | Data has arbitrary scaling |
| Dot product | Models trained with inner product, recsys | Magnitudes distort similarity |

Generate, Store, and Query Embeddings in MongoDB (Python Tutorial)

Let’s walk through this as if we’re building a real system.

Install dependencies

pip install pymongo openai

Generate embeddings

Using OpenAI (swap in any embedding provider you prefer):

import os
from openai import OpenAI

os.environ["OPENAI_API_KEY"] = "your_api_key_here"
client = OpenAI()

def get_embedding(text: str):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

Connect to MongoDB and insert documents

With the embedding function in place, connect to your Atlas cluster and insert a document together with its embedding:

from pymongo import MongoClient
import uuid

mongo = MongoClient("<MONGODB_ATLAS_URI>")
db = mongo["vector_db"]
collection = db["documents"]

text = "This is a sample document about MongoDB vector search."
embedding = get_embedding(text)

document = {
    "_id": str(uuid.uuid4()),
    "text": text,
    "embedding": embedding,
    "tags": ["mongodb", "search"],
}

collection.insert_one(document)
print("Document inserted successfully!")

At this point, your data is ready for semantic search.
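In practice you will usually insert many documents at once. The sketch below builds documents in the same shape as the single-document example; build_documents is a hypothetical helper, and the embed argument is any function that maps a string to a list of floats, such as get_embedding.

```python
import uuid

def build_documents(texts, embed, tags=None):
    """Turn raw texts into documents shaped like the example above."""
    documents = []
    for text in texts:
        documents.append({
            "_id": str(uuid.uuid4()),
            "text": text,
            "embedding": embed(text),
            "tags": list(tags or []),
        })
    return documents

# Stand-in embedder so the example runs without an API key.
def fake_embed(text):
    return [float(len(text)), 0.0, 1.0]

docs = build_documents(["first text", "second, longer text"], fake_embed,
                       tags=["mongodb"])
print(len(docs), docs[0]["embedding"])

# With the real objects from this tutorial, the insert could look like:
# docs = build_documents(texts, get_embedding, tags=["mongodb", "search"])
# collection.insert_many(docs)
```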

Querying embeddings in MongoDB

Now comes the fun part: retrieving similar documents using a vector query. MongoDB uses $vectorSearch in an aggregation pipeline.

Generate query vector

# Reuses get_embedding and collection from the previous steps
query = "How does MongoDB handle semantic search?"
query_vec = get_embedding(query)

Perform vector search (query embeddings)

results = collection.aggregate([
    {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": query_vec,
            "numCandidates": 100,
            "limit": 5
            # The similarity metric comes from the index definition,
            # so it is not passed in the query.
        }
    },
    {
        "$project": {
            "text": 1,
            "score": {"$meta": "vectorSearchScore"}
        }
    }
])

for r in results:
    print(r)

This returns the five most semantically similar documents.

This is powerful for RAG pipelines where you want semantic search.

Step-by-step explanation of the above query

$vectorSearch → Finds documents whose embedding field is closest to queryVector using the similarity metric defined in the index

numCandidates → How many potential matches the engine should consider before selecting the top results

limit → Number of final similar results to return

$project → Returns the original text plus a special field called score
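The stages described above can be wrapped in a small helper so that every query is built the same way. This is a sketch only: build_vector_search_pipeline is a hypothetical name, and the check reflects the rule that numCandidates cannot be smaller than limit.

```python
def build_vector_search_pipeline(query_vector, index="vector_index",
                                 path="embedding", limit=5,
                                 num_candidates=100):
    """Return an aggregation pipeline for a $vectorSearch query."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be greater than or equal to limit")
    return [
        {
            "$vectorSearch": {
                "index": index,
                "path": path,
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            # Keep the text and expose the similarity score.
            "$project": {
                "text": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]

pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], limit=3)
print(pipeline[0]["$vectorSearch"]["limit"])

# With a real collection: results = collection.aggregate(pipeline)
```

Raising num_candidates generally improves recall at the cost of latency, so it is worth tuning against your own data.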

What is the score field?

MongoDB Atlas Vector Search returns a special metadata field, "score": {"$meta": "vectorSearchScore"}, which tells you how similar a stored vector is to your query vector.

Atlas normalizes the score so that a higher value always means a closer match. The formula depends on the similarity metric defined in the index:

| Similarity metric | How the score is computed | Higher score = more similar? | Score range |
| --- | --- | --- | --- |
| cosine | (1 + cosine similarity) / 2 | Yes | 0 to 1 |
| euclidean | 1 / (1 + Euclidean distance) | Yes | 0 to 1 |
| dotProduct | (1 + dot product) / 2, with vectors normalized to unit length | Yes | 0 to 1 |
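For reference, the Atlas Vector Search documentation describes vectorSearchScore as a normalized value between 0 and 1, where higher means more similar. The sketch below mirrors those normalization formulas as I understand them, so verify them against the current documentation before relying on exact values. The function name normalized_score is hypothetical.

```python
def normalized_score(similarity, value):
    """Map a raw similarity or distance to a 0-to-1 score.

    value is the raw cosine similarity, dot product (unit-length vectors),
    or Euclidean distance, depending on the metric.
    """
    if similarity in ("cosine", "dotProduct"):
        return (1 + value) / 2
    if similarity == "euclidean":
        return 1 / (1 + value)
    raise ValueError(f"unknown similarity metric: {similarity}")

print(normalized_score("cosine", 1.0))     # 1.0 (identical direction)
print(normalized_score("cosine", -1.0))    # 0.0 (opposite direction)
print(normalized_score("euclidean", 0.0))  # 1.0 (identical vectors)
print(normalized_score("euclidean", 3.0))  # 0.25
```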

Best Practices

  • Chunk long documents (300–500 tokens each): Better relevance + avoids embedding long blobs
  • Store metadata separately: Helps filtering and ranking
  • Batch embedding generation: Improves throughput and reduces model cost
  • Use float32 embeddings in your model: MongoDB internally uses float64 but accepts float32 arrays
  • Index only the vector field: Extra fields in the index increase memory and latency
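Batching is straightforward to sketch. In the example below, embed_batch is an assumed callable that takes a list of texts and returns one vector per text; both helper names are hypothetical.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def embed_in_batches(texts, embed_batch, batch_size=64):
    """Embed texts batch by batch and return the vectors in input order."""
    vectors = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors

# Stand-in batch embedder that records how many texts each call receives.
calls = []

def fake_embed_batch(batch):
    calls.append(len(batch))
    return [[float(len(text))] for text in batch]

vectors = embed_in_batches(["a", "bb", "ccc", "dddd", "eeeee"],
                           fake_embed_batch, batch_size=2)
print(calls, vectors)
```

With the OpenAI client used earlier, embed_batch could be a thin wrapper that passes the list as input to client.embeddings.create and returns the embedding of each item in response.data.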

Common Pitfalls to Avoid

| Problem | Why it happens |
| --- | --- |
| Dimension mismatch | Query vector dimension ≠ index dimension |
| Irrelevant results | Different embedding models used for storage and for queries |
| Slow search | numCandidates set too high, or the workload needs a larger cluster |
| Document too large | Document exceeds the 16MB limit |
| Wrong similarity metric | Model trained for cosine, but the index uses dot product |
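A cheap guard against the dimension mismatch pitfall is to validate every vector before it is stored or used in a query. This is a minimal sketch; validate_embedding is a hypothetical helper, and 1536 matches the index definition used in this article.

```python
import math

def validate_embedding(vector, expected_dims=1536):
    """Raise ValueError if the vector cannot match the index definition."""
    if len(vector) != expected_dims:
        raise ValueError(
            f"expected {expected_dims} dimensions, got {len(vector)}"
        )
    if not all(isinstance(x, (int, float)) and math.isfinite(x) for x in vector):
        raise ValueError("embedding must contain only finite numbers")
    return vector

ok = validate_embedding([0.1] * 1536)
print(len(ok))

try:
    validate_embedding([0.1] * 768)
except ValueError as error:
    print(error)
```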

Final Thoughts

Embeddings and semantic search are now foundational to modern AI products. Whether you’re building a chatbot, RAG pipeline, recommendation engine, or smart search layer, you need a reliable way to store vectors and query them efficiently.

MongoDB gives you a unified platform where:

  • Your text lives.
  • Your metadata lives.
  • Your embeddings live.
  • Your vector search runs.
  • Everything scales together.

This avoids the complexity of juggling separate databases and lets you focus on building product capabilities, not infrastructure.

FAQs

Do I need a separate vector database if I use MongoDB for embeddings?

No. MongoDB Atlas Vector Search is built-in, so you can store documents, metadata, and vectors in the same place without using a separate vector DB like Pinecone or Milvus.

What embedding dimensionality should I use in MongoDB?

Use the dimensionality provided by your embedding model (e.g., 768 or 1536). All documents in a collection must use the same dimension as the vector index.

Can MongoDB perform hybrid search (metadata + vector search)?

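Yes. Add a filter option to $vectorSearch to pre-filter on indexed metadata fields, or add a $match stage after $vectorSearch to post-filter the results. $vectorSearch must be the first stage in the pipeline, so a $match stage cannot come before it. This combination is ideal for RAG and recommendation systems.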
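One way to combine metadata and vector search is the filter option inside $vectorSearch, which requires the metadata field to be declared in the index with type "filter". The sketch below assumes a hypothetical string field named category on each document; the helper names are also hypothetical.

```python
def build_filtered_index_definition(num_dimensions=1536):
    """Index definition with a vector field plus a filterable metadata field."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": "embedding",
                "numDimensions": num_dimensions,
                "similarity": "cosine",
            },
            # Hypothetical metadata field used for pre-filtering.
            {"type": "filter", "path": "category"},
        ]
    }

def build_filtered_pipeline(query_vector, category, limit=5, num_candidates=100):
    """Vector search restricted to documents in one category."""
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",
                "path": "embedding",
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
                "filter": {"category": {"$eq": category}},
            }
        },
        {"$project": {"text": 1, "category": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]

pipeline = build_filtered_pipeline([0.1, 0.2, 0.3], "billing")
print(pipeline[0]["$vectorSearch"]["filter"])
```

Pre-filtering narrows the candidate set before the similarity ranking, whereas a later $match stage only trims results that were already ranked.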

Is MongoDB suitable for large-scale vector search workloads?

Yes. Atlas Vector Search uses HNSW-based indexing, which scales well to millions of documents. Throughput depends on index size, cluster tier, and workload patterns.

Can I use any embedding model with MongoDB?

Absolutely. MongoDB is model-agnostic. You can use OpenAI, HuggingFace, Cohere, local models, or custom embeddings—as long as dimensions match the index.


Author
Nilesh Soni

Staff Software Engineer @ Uber
