
Gemini 2.0 Flash Thinking Experimental: A Guide With Examples

Learn about Gemini 2.0 Flash Thinking Experimental, including its features, benchmarks, limitations, and how it compares to other reasoning models.
Feb 6, 2025  · 8 min read

This year was predicted to be the year of agentic AI, but to me, it feels more like the year of reasoning models. Google has just introduced Gemini 2.0 Flash Thinking Experimental to Gemini app users, expanding access beyond its previous availability in Google AI Studio, the Gemini API, and Vertex AI.

The 2.0 Flash Thinking model is designed to compete with other reasoning models from OpenAI’s o-series and DeepSeek’s R-series.

Rather than just providing answers, Flash Thinking focuses on showing its thought process—breaking down steps, evaluating alternatives, and making reasoning more transparent. My first impression is that the chat-based version of Gemini 2.0 Flash Thinking is noticeably faster than its OpenAI and DeepSeek counterparts.

In this blog, I’ll cover what Gemini 2.0 Flash Thinking Experimental is, how you can use it, its benchmarks, and how and why to use it alongside tools like Search, Maps, and YouTube.

If you want to get more hands-on with Gemini 2.0, I recommend this video by my colleague Adel Nehme:

Multimodal Apps With Gemini 2.0 | Build a Local YouTube Content Creator Tool

What Is Gemini 2.0 Flash Thinking Experimental?

Google has introduced Gemini 2.0 Flash Thinking Experimental as a new AI model designed for advanced reasoning. Initially available through Google AI Studio, the Gemini API, and Vertex AI, it has now been rolled out to Gemini app users as well.

Unlike standard language models that prioritize generating fluent responses, Flash Thinking aims to break down its thought process, showing step-by-step reasoning, evaluating multiple options, and explaining its conclusions in a more structured way.

Gemini 2.0 Flash Thinking Experimental is a multimodal AI model, meaning it can process both text and images as input. We can use it for tasks that require visual context, such as interpreting diagrams, analyzing charts, or extracting insights from complex documents.

However, unlike some other multimodal models, Flash Thinking only produces text-based outputs, meaning it won’t generate images or visual data in response.

One of its standout features is its massive context window, supporting up to 1 million tokens for input and generating responses of up to 64,000 tokens. This makes it one of the most expansive AI models available for long-form reasoning, allowing it to analyze entire books, research papers, or extended conversations while maintaining coherence.

The large token limit ensures the model can track complex arguments over extended interactions, reducing the need for users to repeatedly reintroduce context.
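To get a feel for what a 1-million-token input budget means in practice, here's a back-of-the-envelope sketch using the common heuristic of roughly four characters per token for English text. The actual tokenizer count will differ, so treat this as an estimate only:

```python
# Rough estimate: does a document fit in the 1M-token input window?
# Uses the ~4 characters-per-token heuristic for English text;
# the model's real tokenizer will produce a different count.

INPUT_TOKEN_LIMIT = 1_000_000
CHARS_PER_TOKEN = 4  # rough heuristic, not the actual tokenizer

def estimated_tokens(text: str) -> int:
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str, limit: int = INPUT_TOKEN_LIMIT) -> bool:
    return estimated_tokens(text) <= limit

# A 300-page book at ~1,800 characters per page:
book = "x" * (300 * 1800)      # 540,000 characters
print(estimated_tokens(book))  # ~135,000 tokens
print(fits_in_context(book))   # True: well within the window
```

By this estimate, even a 300-page book consumes only a fraction of the input window, which is why the model can ingest entire books or long paper collections in a single prompt.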

One key limitation to keep in mind is that Flash Thinking has a knowledge cutoff of June 2024. This means it does not have built-in knowledge of events beyond that point.

For this reason, it can hallucinate or make wrong assumptions, as in the example below. Because it has no knowledge of Gemini 2.0 Flash Thinking Experimental, it wrongly assumes that the model is equivalent to Gemini Advanced and outputs an incorrect date.

gemini 2.0 flash experimental has a knowledge cutoff date of june 2024

Gemini 2.0 Flash Thinking Experimental With Apps

To address the knowledge cutoff mentioned above, Google has integrated YouTube, Maps, and Search functionality into Gemini 2.0 Flash Thinking Experimental. You can enable this functionality from the model drop-down menu:

gemini drop down menu

Note that using these tools doesn’t guarantee a correct answer. For instance, it still got the release date wrong, but it was much closer this time—the correct release date is February 5, 2025 (see example below). This happened because the model used the Search tool and sourced its information from two cited websites. Since these sites were published on February 6, 2025, the model wrongly inferred that was the release date of Gemini 2.0 Flash Thinking Experimental.

gemini 2.0 flash experimental with apps

The model automatically selects the tools it needs based on the prompt. For instance, when I asked for the best driving route from Bucharest to London, the model automatically chose the Google Maps tool. Notice the Google Maps icon below and the mention of the Google Maps tool in its reasoning process.

Gemini 2.0 Flash Thinking Experimental With Apps automatically chooses the tool

Note that I’m only using these trivial questions to explore the model—for the question above, the simple 2.0 Flash model is enough. As we’ll see in the section below, Gemini 2.0 Flash Thinking Experimental is best used for advanced math, science, and multimodal reasoning.

Benchmarks: Gemini 2.0 Flash Thinking Experimental

Google’s Gemini 2.0 Flash Thinking Experimental has shown significant improvements in reasoning-based tasks, outperforming its predecessors in multiple key benchmarks (with a focus on mathematics, science, and multimodal reasoning).

| Benchmark | Gemini 1.5 Pro 002 | Gemini 2.0 Flash Exp | Gemini 2.0 Flash Thinking Exp 01-21 |
| --- | --- | --- | --- |
| AIME2024 (Math) | 19.3% | 35.5% | 73.3% |
| GPQA Diamond (Science) | 57.6% | 58.6% | 74.2% |
| MMMU (Multimodal reasoning) | 64.9% | 70.7% | 75.4% |

Source: Google

On the AIME2024 (math) benchmark, Gemini 2.0 Flash Thinking Experimental achieved 73.3%, a large increase from the 35.5% score of Gemini 2.0 Flash Experimental and the 19.3% of Gemini 1.5 Pro. For comparison, OpenAI’s o3-mini (high) scored 87.3% on this benchmark.

Similarly, for GPQA Diamond (science), the model reached 74.2%, improving from the 58.6% of the previous version. For comparison, DeepSeek-R1 scored 71.5% (pass@1, i.e., on the first attempt) on this benchmark, while OpenAI's o1 scored 75.7% (pass@1), and o3-mini (high) scored 79.7%.

In MMMU (multimodal reasoning), which evaluates the model’s ability to integrate and interpret information across multiple modalities, Gemini 2.0 Flash Thinking Experimental scored 75.4%, again surpassing its earlier iterations.

Just as we’ve seen with other reasoning models, 2.0 Flash Thinking’s reasoning capabilities scale with increased inference compute (see graph below). Inference compute refers to the amount of computational power used after you give your prompt to the model.

gemini 2.0 flash thinking becomes better with more inference compute

Source: Google

How to Access Gemini 2.0 Flash Thinking Experimental

Google has made Gemini 2.0 Flash Thinking Experimental available through multiple platforms, including direct access for Gemini users.

Gemini Chat (Gemini App & Web)

The easiest way to access Gemini 2.0 Flash Thinking is through Google’s Gemini chat interface, available both in the Gemini web app and the Gemini mobile app.

At the time of publishing this article, access to 2.0 Flash Thinking is available for free to all users.

Use the drop-down menu on the upper left to access the model:

Accessing gemini 2.0 flash thinking experimental

Google AI Studio

Google AI Studio is another way to use Flash Thinking. This web-based platform is directed at more advanced users, and it allows you to experiment with the model’s reasoning capabilities, control parameters like temperature, test complex queries, explore structured responses, etc.

google ai studio
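To illustrate the kind of settings AI Studio exposes, here's what equivalent generation parameters look like as a `generationConfig` payload. The field names follow the Gemini API's public schema; the values here are arbitrary examples, not recommendations:

```python
import json

# Example generation settings, mirroring the sliders in Google AI Studio.
# Field names follow the Gemini API's generationConfig schema.
generation_config = {
    "temperature": 0.7,       # randomness: lower = more deterministic
    "topP": 0.95,             # nucleus sampling cutoff
    "maxOutputTokens": 8192,  # cap on the response length
}
print(json.dumps(generation_config, indent=2))
```

Lowering the temperature is often useful for reasoning tasks, where you want the model to converge on a single line of argument rather than explore stylistic variety.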

Gemini API

For developers looking to integrate Flash Thinking into their applications, Gemini 2.0 Flash Thinking is available via the Gemini API.
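As a minimal sketch of what an API call looks like, here's how to build the JSON body for the `generateContent` REST endpoint. The model id and URL are as documented at the time of writing and may change as the experimental model evolves:

```python
import json

# Sketch: a minimal request body for the Gemini API's generateContent endpoint.
# Model id and endpoint are as documented at the time of writing.
MODEL = "gemini-2.0-flash-thinking-exp-01-21"
URL = f"https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent"

def build_request(prompt: str) -> dict:
    """Build the JSON body expected by generateContent."""
    return {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}

body = build_request("How many prime numbers are there between 100 and 120?")
print(json.dumps(body))

# To actually send it (requires an API key from Google AI Studio):
#   import requests
#   resp = requests.post(URL, params={"key": "YOUR_API_KEY"}, json=body)
#   print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])
```

The same request shape works for the regular Flash models, so switching between them is just a matter of changing the model id in the URL.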

Conclusion

Gemini 2.0 Flash Thinking Experimental expands Google’s push into AI reasoning, now available to Gemini app users. It stands out by explaining its thought process, structuring responses step by step, and using tools like Search, YouTube, and Maps.

While promising, Flash Thinking still faces challenges like occasional inaccuracies and over-reliance on sources. With OpenAI and DeepSeek also advancing reasoning models, it will be interesting to see how competition in this space evolves.

FAQs

Can I fine-tune Gemini 2.0 Flash Thinking for my own use case?

No, Google has not yet provided fine-tuning options for Flash Thinking.

Can Gemini 2.0 Flash Thinking generate images?

No, while the model processes text and images as inputs, it only generates text-based responses. It does not produce visual content like images or diagrams.

What kind of questions should I use Flash Thinking for?

It excels at complex reasoning tasks, such as mathematical problem-solving, scientific analysis, and multimodal data interpretation. Day-to-day queries may not require Flash Thinking and can be handled by standard Gemini models.


Author: Alex Olteanu

I’m an editor and writer covering AI blogs, tutorials, and news, ensuring everything fits a strong content strategy and SEO best practices. I’ve written data science courses on Python, statistics, probability, and data visualization. I’ve also published an award-winning novel and spend my free time on screenwriting and film directing.
