Building Multimodal AI Applications Using MongoDB and Voyage AI
Key Takeaways:
- Learn how to search and retrieve multimodal content using Voyage AI.
- Understand how to use MongoDB to store and serve data for AI applications.
- Build a simple, functional multimodal AI app from scratch.
Description
Session Pre-Requisites
- MongoDB cluster setup:
- Obtain a Voyage AI API key: Follow the steps here to get a Voyage AI API key.
- Obtain a Gemini API key: Follow the steps here to get a Gemini API key via Google AI Studio.
As AI applications expand beyond text, the ability to work with image, video, and other modalities is becoming a must-have skill. Building effective multimodal systems requires not only the right models, but also the right infrastructure to store, retrieve, and serve diverse data types at scale.
In this code-along, Apoorva Joshi, a Senior AI Developer Advocate at MongoDB, will teach you how to build a simple multimodal AI application using MongoDB and Voyage AI. You’ll learn how to structure and query image and video data, apply retrieval techniques with Voyage AI, and connect everything in a functional pipeline. This session is ideal for data scientists and AI engineers looking to expand their application-building toolkit.
Presenter Bio

Apoorva is a Data Scientist turned Developer Advocate, with over 7 years of experience applying machine learning to problems in domains such as cybersecurity and mental health. As an AI Developer Advocate at MongoDB, she now helps developers succeed at building AI applications through written content and hands-on workshops.