
Gemini 2.0 Flash Thinking Experimental: A Guide With Examples

Learn about Gemini 2.0 Flash Thinking Experimental, including its features, benchmarks, limitations, and how it compares to other reasoning models.
Feb 6, 2025  · 8 min read

This year was predicted to be the year of agentic AI, but to me, it feels more like the year of reasoning models. Google has just introduced Gemini 2.0 Flash Thinking Experimental to Gemini app users, expanding access beyond its previous availability in Google AI Studio, the Gemini API, and Vertex AI.

The 2.0 Flash Thinking model is designed to compete with other reasoning models from OpenAI’s o-series and DeepSeek’s R-series.

Rather than just providing answers, Flash Thinking focuses on showing its thought process—breaking down steps, evaluating alternatives, and making reasoning more transparent. My first impression is that the chat-based version of Gemini 2.0 Flash Thinking is noticeably faster than its OpenAI and DeepSeek counterparts.

In this blog, I’ll cover what Gemini 2.0 Flash Thinking Experimental is, how you can use it, its benchmarks, and how and why to use it alongside tools like Search, Maps, and YouTube.

If you want to get more hands-on with Gemini 2.0, I recommend this video by my colleague Adel Nehme:

Multimodal Apps With Gemini 2.0 | Build a Local YouTube Content Creator Tool

What Is Gemini 2.0 Flash Thinking Experimental?

Google has introduced Gemini 2.0 Flash Thinking Experimental as a new AI model designed for advanced reasoning. Initially available through Google AI Studio, the Gemini API, and Vertex AI, it has now been rolled out to Gemini app users as well.

Unlike standard language models that prioritize generating fluent responses, Flash Thinking aims to break down its thought process, showing step-by-step reasoning, evaluating multiple options, and explaining its conclusions in a more structured way.

Gemini 2.0 Flash Thinking Experimental is a multimodal AI model, meaning it can process both text and images as input. We can use it for tasks that require visual context, such as interpreting diagrams, analyzing charts, or extracting insights from complex documents.

However, unlike some other multimodal models, Flash Thinking only produces text-based outputs, meaning it won’t generate images or visual data in response.

One of its standout features is its massive context window, supporting up to 1 million tokens for input and generating responses of up to 64,000 tokens. This makes it one of the most expansive AI models available for long-form reasoning, allowing it to analyze entire books, research papers, or extended conversations while maintaining coherence.

The large token limit ensures the model can track complex arguments over extended interactions, reducing the need for users to repeatedly reintroduce context.
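To get a feel for what a 1-million-token input budget means in practice, here's a back-of-the-envelope sketch using the common heuristic of roughly four characters per token for English text. The actual tokenizer count will differ, so treat this as an estimate only:

```python
# Rough estimate: does a document fit in the 1M-token input window?
# Uses the ~4 characters-per-token heuristic for English text;
# the model's real tokenizer will produce a different count.

INPUT_TOKEN_LIMIT = 1_000_000
CHARS_PER_TOKEN = 4  # rough heuristic, not the actual tokenizer

def estimated_tokens(text: str) -> int:
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str, limit: int = INPUT_TOKEN_LIMIT) -> bool:
    return estimated_tokens(text) <= limit

# A 300-page book at ~1,800 characters per page:
book = "x" * (300 * 1800)      # 540,000 characters
print(estimated_tokens(book))  # ~135,000 tokens
print(fits_in_context(book))   # True: well within the window
```

By this estimate, even a 300-page book consumes only a fraction of the input window, which is why the model can ingest entire books or long paper collections in a single prompt.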

One key limitation to keep in mind is that Flash Thinking has a knowledge cutoff of June 2024. This means it does not have built-in knowledge of events beyond that point.

For this reason, it can hallucinate or make wrong assumptions, as in the example below. Because it has no knowledge of Gemini 2.0 Flash Thinking Experimental, it wrongly assumes that the model is equivalent to Gemini Advanced and outputs an incorrect date.

gemini 2.0 flash experimental has a knowledge cutoff date of june 2024

Gemini 2.0 Flash Thinking Experimental With Apps

To address the knowledge cutoff mentioned above, Google has integrated YouTube, Maps, and Search functionality into Gemini 2.0 Flash Thinking Experimental. You can enable this functionality from the model drop-down menu:

gemini drop down menu

Note that using these tools doesn’t guarantee a correct answer. For instance, it still got the release date wrong, but it was much closer this time—the correct release date is February 5, 2025 (see example below). This happened because the model used the Search tool and sourced its information from two cited websites. Since these sites were published on February 6, 2025, the model wrongly inferred that was the release date of Gemini 2.0 Flash Thinking Experimental.

gemini 2.0 flash experimental with apps

The model automatically selects the tools it needs based on the prompt. For instance, when I asked for the best driving route from Bucharest to London, the model automatically chose the Google Maps tool. Notice the Google Maps icon below and the mention of the Google Maps tool in its reasoning process.

Gemini 2.0 Flash Thinking Experimental With Apps automatically chooses the tool

Note that I’m only using these trivial questions to explore the model—for the question above, the simple 2.0 Flash model is enough. As we’ll see in the section below, Gemini 2.0 Flash Thinking Experimental is best used for advanced math, science, and multimodal reasoning.

Benchmarks: Gemini 2.0 Flash Thinking Experimental

Google’s Gemini 2.0 Flash Thinking Experimental has shown significant improvements in reasoning-based tasks, outperforming its predecessors in multiple key benchmarks (with a focus on mathematics, science, and multimodal reasoning).

| Benchmark | Gemini 1.5 Pro 002 | Gemini 2.0 Flash Exp | Gemini 2.0 Flash Thinking Exp 01-21 |
| --- | --- | --- | --- |
| AIME2024 (Math) | 19.3% | 35.5% | 73.3% |
| GPQA Diamond (Science) | 57.6% | 58.6% | 74.2% |
| MMMU (Multimodal reasoning) | 64.9% | 70.7% | 75.4% |

Source: Google

On the AIME2024 (math) benchmark, Gemini 2.0 Flash Thinking Experimental achieved 73.3%, a large increase from the 35.5% score of Gemini 2.0 Flash Experimental and the 19.3% of Gemini 1.5 Pro. For comparison, OpenAI’s o3-mini (high) scored 87.3% on this benchmark.

Similarly, for GPQA Diamond (science), the model reached 74.2%, improving from the 58.6% of the previous version. For comparison, DeepSeek-R1 scored 71.5% (pass@1, i.e., on the first attempt) on this benchmark, while OpenAI's o1 scored 75.7% (pass@1), and o3-mini (high) scored 79.7%.

In MMMU (multimodal reasoning), which evaluates the model’s ability to integrate and interpret information across multiple modalities, Gemini 2.0 Flash Thinking Experimental scored 75.4%, again surpassing its earlier iterations.

Just as we’ve seen with other reasoning models, 2.0 Flash Thinking’s reasoning capabilities scale with increased inference compute (see graph below). Inference compute refers to the amount of computational power used after you give your prompt to the model.

gemini 2.0 flash thinking becomes better with more inference compute

Source: Google

How to Access Gemini 2.0 Flash Thinking Experimental

Google has made Gemini 2.0 Flash Thinking Experimental available through multiple platforms, including direct access for Gemini users.

Gemini Chat (Gemini App & Web)

The easiest way to access Gemini 2.0 Flash Thinking is through Google’s Gemini chat interface, available both in the Gemini web app and the Gemini mobile app.

At the time of publishing this article, access to 2.0 Flash Thinking is available for free to all users.

Use the drop-down menu on the upper left to access the model:

Accessing gemini 2.0 flash thinking experimental

Google AI Studio

Google AI Studio is another way to use Flash Thinking. This web-based platform is directed at more advanced users, and it allows you to experiment with the model’s reasoning capabilities, control parameters like temperature, test complex queries, explore structured responses, etc.

google ai studio
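To illustrate the kind of settings AI Studio exposes, here's what equivalent generation parameters look like as a `generationConfig` payload. The field names follow the Gemini API's public schema; the values here are arbitrary examples, not recommendations:

```python
import json

# Example generation settings, mirroring the sliders in Google AI Studio.
# Field names follow the Gemini API's generationConfig schema.
generation_config = {
    "temperature": 0.7,       # randomness: lower = more deterministic
    "topP": 0.95,             # nucleus sampling cutoff
    "maxOutputTokens": 8192,  # cap on the response length
}
print(json.dumps(generation_config, indent=2))
```

Lowering the temperature is often useful for reasoning tasks, where you want the model to converge on a single line of argument rather than explore stylistic variety.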

Gemini API

For developers looking to integrate Flash Thinking into their applications, Gemini 2.0 Flash Thinking is available via the Gemini API.
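As a minimal sketch of what an API call looks like, here's how to build the JSON body for the `generateContent` REST endpoint. The model id and URL are as documented at the time of writing and may change as the experimental model evolves:

```python
import json

# Sketch: a minimal request body for the Gemini API's generateContent endpoint.
# Model id and endpoint are as documented at the time of writing.
MODEL = "gemini-2.0-flash-thinking-exp-01-21"
URL = f"https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent"

def build_request(prompt: str) -> dict:
    """Build the JSON body expected by generateContent."""
    return {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}

body = build_request("How many prime numbers are there between 100 and 120?")
print(json.dumps(body))

# To actually send it (requires an API key from Google AI Studio):
#   import requests
#   resp = requests.post(URL, params={"key": "YOUR_API_KEY"}, json=body)
#   print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])
```

The same request shape works for the regular Flash models, so switching between them is just a matter of changing the model id in the URL.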

Conclusion

Gemini 2.0 Flash Thinking Experimental expands Google’s push into AI reasoning, now available to Gemini app users. It stands out by explaining its thought process, structuring responses step by step, and using tools like Search, YouTube, and Maps.

While promising, Flash Thinking still faces challenges like occasional inaccuracies and over-reliance on sources. With OpenAI and DeepSeek also advancing reasoning models, it will be interesting to see how competition in this space evolves.

FAQs

Can I fine-tune Gemini 2.0 Flash Thinking for my own use case?

No, Google has not yet provided fine-tuning options for Flash Thinking.

Can Gemini 2.0 Flash Thinking generate images?

No, while the model processes text and images as inputs, it only generates text-based responses. It does not produce visual content like images or diagrams.

What kind of questions should I use Flash Thinking for?

It excels at complex reasoning tasks, such as mathematical problem-solving, scientific analysis, and multimodal data interpretation. Day-to-day queries may not require Flash Thinking and can be handled by standard Gemini models.


Author: Alex Olteanu

I’m an editor and writer covering AI blogs, tutorials, and news, ensuring everything fits a strong content strategy and SEO best practices. I’ve written data science courses on Python, statistics, probability, and data visualization. I’ve also published an award-winning novel and spend my free time on screenwriting and film directing.
