Imagine a world where you don't just watch a video; you step inside it. Today, we are exploring Genie 3, a breakthrough from Google DeepMind that is redefining how we build and interact with digital environments. You will learn how this foundation world model works, why it is a major milestone for artificial intelligence, and how you can get your hands on it.
What is Project Genie?
Project Genie is not a traditional video generator like OpenAI’s Sora. While Sora creates high-quality but passive video clips, Project Genie is a Foundation World Model. This means it doesn't just predict pixels; it predicts how a world should behave based on user input.
The engine behind this prototype is Genie 3, the latest iteration of DeepMind’s generative AI family. It is designed to understand the underlying physics of an environment. If you move a character left, the model calculates what the next frame should look like from that new perspective.
This matters because it represents a significant leap toward Artificial General Intelligence (AGI) and pushes forward the world of frontier models. The model must grasp concepts like cause and effect, object permanence, and spatial consistency.
Genie 3 demonstrates these capabilities by maintaining a stable environment even as you move through it. With just a bit of prompting, you can explore historical landmarks and fantasy lands, or step into the role of different characters.
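To make that concrete, here is a minimal Python sketch of what an action-conditioned generation loop looks like in principle. This is purely illustrative: the `WorldModel` class, the `predict_next_frame` method, and the action names are assumptions I made up for this example, not DeepMind's actual API.

```python
import numpy as np

class WorldModel:
    """Illustrative stand-in for an action-conditioned world model (not a real API)."""

    def __init__(self, initial_frame: np.ndarray):
        # The model keeps a memory of everything generated so far, which is
        # what lets it stay spatially consistent as you move around.
        self.history = [initial_frame]

    def predict_next_frame(self, action: str) -> np.ndarray:
        # A real model would encode the frame history plus a latent action,
        # then decode the next frame. Here we just return a placeholder frame.
        next_frame = np.zeros_like(self.history[-1])
        self.history.append(next_frame)
        return next_frame

# Seed the world with a starting image (in Genie's case, one generated from a
# text prompt), then step it forward one frame per user input.
world = WorldModel(initial_frame=np.zeros((720, 1280, 3), dtype=np.uint8))
for action in ["move_left", "move_left", "jump", "look_right"]:
    frame = world.predict_next_frame(action)  # render `frame` to the player
```

The key idea is that every new frame is conditioned on both the frame history and your latest input, which is what separates a world model from a plain video generator.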
Let’s take a look at an example:
World Prompt: A rolling hilly landscape filled with Hobbit-like homes. Each home has its own little garden with a wooden fence and vegetables. The region is idyllic with small meadows and wildflowers all around.
Character: A ranger venturing across the lands.
Key Features of Project Genie
Let’s look at some of the key features of Project Genie.
- Interactive world generation: Unlike a video, you are in the driver's seat. You control the camera or a character, and the model generates the next frame in real time based on your specific actions.
- World sketching: You can seed a world using a simple text prompt (e.g., "A neon-lit cyberpunk city") or an uploaded image. The AI then generates a playable 3D version of that static input.
- World remixing: This allows you to take an existing world and transform it. You can change the art style, add new elements, or alter the atmosphere while keeping the core layout the same.
- Action controllability: The model is trained to recognize inputs like jumping, flying, or walking. It maps these commands to logical visual changes, creating a sense of agency within the simulation.
Hands-on With Project Genie
Let’s get into some basic use cases of Project Genie to bring our ideas to life.
World sketching
Creating a world starts with the World Sketching phase. You enter a prompt or upload a photo, and the system uses Nano Banana Pro (Google's latest high-fidelity image engine) to refine the initial visual.
This integration ensures that the starting frame has the architectural depth and texture necessary for a complex 3D environment.
You can even "sketch" the logic of the world by defining how your character moves through the space before the simulation begins.
Below, you’ll see me write a sample environment and a description of a character. This gives the model what it needs to generate the initial world the character will start in.

World remixing
The World Remixing feature is a standout for creative workflows. For example, you can take a realistic photo of a local park and prompt the model to "Remix this into a Studio Ghibli cartoon style."
The resulting playable world retains the physical layout of the park (the trees, the paths) but overlays a completely new artistic aesthetic. The transitions are generally smooth, though complex textures can sometimes melt slightly during the transformation.
For instance, I ask the model to make the lake pink, and it reacts accordingly. The art style and overall physical layout remain the same, while only the lake is updated.

3D exploration
Once the world is generated, you enter 3D Exploration mode. This allows you to navigate the world in either first person or third person.
You use the WASD keys to move your character around the world and the arrow keys to look around. You can even use the space bar to jump over obstacles.
Each session gives you 60 seconds to explore the world.
In my experience, the overall environment feels remarkable. As you move, the AI generates new terrain that matches the established style. There are definitely some limitations with complex textures, and you can feel that the model is generating the world live, so the framerate is not always the smoothest.
However, the fact that a simple prompt I wrote in under a minute became a generated, explorable world just minutes later is absolutely mind-blowing.
How Can I Access Project Genie?
As of early 2026, Project Genie remains an experimental research prototype. It is not yet available to the general public.
Currently, access is limited to Google AI Ultra subscribers located in the United States.
If you have an active subscription, you can find the tool within the Google Labs FX portal.
Note that users must be 18 or older to participate in the research program.
How Good is Project Genie?
Navigating a world generated in real time is pretty different from playing a video game with pre-rendered graphics.
Of course, we can’t judge it the same way we would with a standard LLM since we can’t test a “world” by asking questions. We can only evaluate it through the visual and interactive experience.
First, the technical bits. The video is capped at 720p with a frame rate of around 24 fps. This makes it less “crisp” than Sora, but the goals are different: generating an interactive world in real time is far more computationally demanding than rendering a smooth, non-interactive video offline.
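To put those numbers in perspective, here is some quick back-of-the-envelope arithmetic. The only figures taken from the product are the 720p resolution, the roughly 24 fps frame rate, and the 60-second session length; the rest is just basic math in Python.

```python
# Rough frame-budget math for real-time generation at the stated 720p / ~24 fps cap.
width, height, fps = 1280, 720, 24
session_seconds = 60

pixels_per_frame = width * height            # 921,600 pixels per frame
frame_budget_ms = 1000 / fps                 # ~41.7 ms to produce and deliver each frame
frames_per_session = fps * session_seconds   # 1,440 frames in a 60-second session

print(f"Pixels per frame:   {pixels_per_frame:,}")
print(f"Frame budget:       {frame_budget_ms:.1f} ms")
print(f"Frames per session: {frames_per_session:,}")
```

In other words, the model has roughly 40 milliseconds to produce nearly a million pixels' worth of frame, and it has to do that about 1,440 times per session while also reacting to your inputs.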
Visually, the experience is quite nice. Again, I experienced some slightly stuttered frames due to the lower frame rate. But, generally, the world is pretty consistent.
However, I noticed in some runs that generated objects sometimes disappear or change when you walk past them and turn around to look back. This tends to happen with unique objects (like characters) rather than static objects like trees.
The interaction with the world worked most of the time. When you press a button to move or look around, the camera pans and moves accordingly.
In that sense, the world is fairly interactive. I did run into some delays with jumping, and I think that tends to happen when you are on a flat surface and the model does not expect you to jump.
It works more consistently when there are obstacles on the ground, and it “makes sense” for you to jump over them.
Generally speaking, it did feel a lot like the real world (aside from the stylized art style). For the most part, object distance and sizing made sense. Things like motion parallax were consistent throughout the experience as well.
I will say that I was able to consistently break the model by asking it to add multiple actions, like swimming and walking at the same time. I think that prompt wording has to be pretty precise to get what you want.
All in all, though, it was an amazing experience and hard to believe how easy it is to generate simple interactive worlds.
Use Cases of Project Genie
I can think of a lot of ways people of all kinds can use Project Genie.
- Rapid game prototyping: Designers can bypass months of environment modeling to test a level's "feel" instantly. Feed the model precise information about your game's physics and art style, and provide sketches of a level to see how it might play out.
- Training AI agents: World models provide a safe, infinite gym for robots to learn navigation and object manipulation before being deployed in the physical world (see the sketch after this list).
- Education: Students could generate a historical setting, such as Ancient Rome, and walk through the streets to understand the scale and architecture in a way a textbook cannot provide.
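Picking up the agent-training use case from the list above: the appeal is that a generated world can stand in for a hand-built simulator. Below is a minimal, purely hypothetical sketch of what that loop could look like behind a Gym-style interface. The `GeneratedWorldEnv` class, its action set, and its reward logic are placeholders I invented for illustration; Genie does not expose anything like this today.

```python
import random

class GeneratedWorldEnv:
    """Hypothetical Gym-style wrapper around a generated world (illustrative only)."""

    ACTIONS = ["forward", "back", "left", "right", "jump"]

    def reset(self):
        # A real wrapper would return the first rendered frame of the world.
        self.steps = 0
        return {"frame": None}

    def step(self, action: str):
        # A real wrapper would feed the action to the world model and return the
        # next generated frame; the reward here is just a throwaway placeholder.
        self.steps += 1
        observation = {"frame": None}
        reward = 1.0 if action == "forward" else 0.0
        done = self.steps >= 100
        return observation, reward, done

# A (random) agent can practice navigation here before touching real hardware.
env = GeneratedWorldEnv()
obs, done = env.reset(), False
while not done:
    action = random.choice(GeneratedWorldEnv.ACTIONS)
    obs, reward, done = env.step(action)
```

The sketch mirrors the standard reset/step pattern used by reinforcement learning environments, so an agent trained this way could, in principle, be moved between a generated world and a conventional simulator.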
Here’s a little visit to the Globe Theater in London. It’s not quite 100% accurate yet, but with further prompt refinement, we can start building a more faithful rendition of the real place.
Here’s the prompt I used:
World Prompt:
A hyper-realistic 360-degree immersive recreation of London's Southbank. Focus on the Globe Theater, a massive three-story open-air polygonal wooden structure with a thatched roof. The exterior is white lime-wash with dark oak timber framing. Nearby, include the Rose Theatre.
The streets are narrow, unpaved, and muddy. Include the River Thames and the distant silhouette of Old London Bridge.
Character: A person who wears a velvet doublet and stained leather breeches. He has ink stains and a small jagged scar across his eyebrow.
Final Thoughts
Project Genie marks the beginning of the shift from Generative Media to Generative Simulation. We are moving past an era of simply looking at AI-generated content and into an era of living inside it. We are beginning to see the growth of large action models that understand your goals as a human and provide the environments and actions needed to achieve them. We’re also seeing next-level video creation from tools like ByteDance’s Seedance 2.0, which we cover in a separate guide.
To dive deeper into what it takes to create these models, I recommend the following resources:
Project Genie FAQs
What is a Foundation World Model?
Unlike a standard AI that predicts text or pixels, a Foundation World Model understands the "rules" of an environment. It simulates physics, spatial consistency, and cause and effect, allowing users to interact with the generated space rather than just watching it.
How does Project Genie differ from OpenAI Sora?
Sora is a video generator that creates high-quality, passive video clips. Project Genie is a simulation engine. While Sora's output is static, Genie allows you to control a character or camera to navigate the world in real time.
What is the technology powering Project Genie?
The prototype is powered by the Genie 3 model family. It utilizes advanced latent action models to predict how the environment should change based on user inputs like moving or jumping.
Who can access Project Genie in 2026?
Access is currently limited to Google AI Ultra subscribers located in the United States. It is hosted within the Google Labs FX research portal for users 18 and older.
What are the main limitations of Genie 3?
Because the model must render frames in real time to respond to user actions, the visual fidelity is lower than "offline" video generators. It currently tops out at 720p resolution and may show occasional visual glitches during complex movements.
I am a data scientist with experience in spatial analysis, machine learning, and data pipelines. I have worked with GCP, Hadoop, Hive, Snowflake, Airflow, and other data science/engineering processes.


