Skip to main content

Gemini 3.5 Flash: Google's Fastest Agentic Model

Google launched Gemini 3.5 Flash at I/O 2026, a model that outperforms Gemini 3.1 Pro on agentic and coding benchmarks while running four times faster than competitors.
May 19, 2026  · 8 min read

Google announced Gemini 3.5 Flash at I/O 2026 on May 19, a model that beats Gemini 3.1 Pro on agentic and coding benchmarks while running four times faster than other frontier models at the same tier.

The release comes as the AI industry's competitive focus has shifted squarely toward agentic performance. Coding agents, multi-step workflow automation, and long-horizon task execution have become the primary battlegrounds, and Google is positioning 3.5 Flash as its answer to that moment.

For all kinds of professionals, including data scientists, ML engineers, and developers, this matters because 3.5 Flash is now (or will be shortly) the default model in the Gemini app and AI Mode in Search, and it's available today via the Gemini API. In this article, I'll cover what was announced, what stands out, the benchmark numbers, and what it means for your work.

You can also check out one of the other key highlights of the event - Gemini Omni.

What's New With Gemini 3.5 Flash

The headline claim with Gemii 3.5 Flash is speed combined with frontier-level performance. Google says 3.5 Flash is four times faster on output tokens per second than other frontier models, while outperforming Gemini 3.1 Pro on the benchmarks that matter most for agentic work.

On Terminal-Bench 2.1, it scores 76.2%. On GDPval-AA, it reaches 1,656 Elo. On MCP Atlas, it hits 83.6%. For multimodal understanding, it scores 84.2% on CharXiv Reasoning.

In short, these numbers mean the old 'fast, cheap, or smart; pick two' rule in AI is less applicable. We're getting a lightweight model capable of handling complex, multi-step agent workflows without the massive latency.

Google says the model will be generally available today across Google AI Studio, the Gemini API, Android Studio, Gemini Enterprise Agent Platform, and Gemini Enterprise. It is also the new default model in the Gemini app and AI Mode in Search globally.

Google also announced that Gemini 3.5 Pro is in development, already in internal use, and expected to roll out next month. The 3.5 Flash release is the opening move in what Google is calling a new model family built around agentic execution.

Gemini 3.5 Background

The Gemini 3 series established Google's current position in the frontier model race. Gemini 3.1 Pro, released in February 2026, led the Artificial Analysis Intelligence Index when it launched and scored 77.1% on ARC-AGI-2, more than doubling Gemini 3 Pro's 31.1% on that benchmark.

As we covered in our GPT-5.5 vs Gemini 3.1 Pro comparison, Gemini 3.1 Pro's strength was in complex visual reasoning and multimodal tasks.

The Flash naming convention in the Gemini family has always signaled speed-optimized models. What's different with 3.5 Flash is that Google is claiming frontier-level intelligence at Flash speeds, not a quality trade-off. The Artificial Analysis index places 3.5 Flash in the top-right quadrant (according to Google), meaning high intelligence and high output speed simultaneously.

The Antigravity harness, Google's framework for deploying collaborative subagents, is central to how 3.5 Flash is being positioned. It's not just a standalone model but a component in a multi-agent architecture that Google has been building out alongside the model itself.

Key Features of Gemini 3.5

Here's a breakdown of the most interesting information from the announcement.

Benchmark performance

Google's benchmark claims for 3.5 Flash are specific and worth examining directly. The model outperforms Gemini 3.1 Pro on the following:

  • Terminal-Bench 2.1: 76.2% (Gemini 3.1 Pro was benchmarked on Terminal-Bench 2.0 at 68.5%, per our prior coverage)
  • GDPval-AA: 1,656 Elo (Claude Opus 4.7 led this benchmark at 1,753 Elo when it launched, per our Claude Opus 4.7 vs Gemini 3.1 Pro review)
  • MCP Atlas: 83.6% (Gemini 3.1 Pro scored 73.9% on MCP Atlas in our prior testing)
  • CharXiv Reasoning: 84.2% for multimodal understanding

The speed claim is also notable: four times faster on output tokens per second than other frontier models. Google does not specify which models it is comparing against in the research notes, so treat that figure as directional rather than a precise head-to-head.

Agentic architecture and Antigravity

3.5 Flash is designed to work with the Antigravity harness, Google's framework for running collaborative subagents. With Antigravity, the model can deploy multiple subagents in parallel, execute multi-step workflows, and sustain performance across long-horizon tasks.

Google's examples include synthesizing the AlphaZero paper and coding a fully playable game in six hours using two agents, and transforming a legacy codebase to Next.js. These are not toy demos. They reflect the kind of multi-day developer tasks that agentic systems are now being asked to handle.

Real-world enterprise deployments

Several enterprises are already running 3.5 Flash in production or pilot. The specific use cases are worth noting because they illustrate where the model's agentic strengths are being applied:

  • Shopify: Running subagents in parallel to analyze complex data over a long horizon for merchant growth forecasts
  • Macquarie Bank: Piloting customer onboarding by reasoning over 100+ page documents with low latency
  • Salesforce: Integrating into Agentforce for multi-subagent enterprise task automation with multi-turn tool calling
  • Xero: Deploying agents to manage multi-week workflows, including 1099 tax form preparation for small businesses
  • Databricks: Using agentic workflows to monitor real-time information, diagnose issues, and propose solutions across large datasets
  • Ramp: Improving OCR accuracy on complex invoices through multimodal understanding combined with reasoning over historical patterns

Gemini Spark and consumer availability

3.5 Flash is also the model powering Gemini Spark, Google's new personal AI agent that runs 24/7 and takes action on behalf of users. Google is rolling out Spark to trusted testers now, with a Beta planned for Google AI Ultra subscribers in the US the week following the I/O announcement.

The model is available today to billions of users globally through the Gemini app and AI Mode in Search, making this one of the broadest simultaneous consumer and developer launches Google has done for a Gemini model.

Safety and safeguards

Google says 3.5 Flash was developed under its Frontier Safety Framework, with strengthened cyber and CBRN safeguards. The company is using interpretability tools that check the model's internal reasoning before it responds, which is meant to reduce both harmful outputs and false refusals on safe queries.

Gemini 3.5 for Data and AI Practitioners

The most immediate practical implication is that 3.5 Flash is available imminently via the Gemini API in Google AI Studio. If you are building agentic pipelines, the combination of the MCP Atlas score (83.6%) and the Antigravity multi-agent harness makes this worth testing against whatever you are currently using.

The GDPval-AA score of 1,656 Elo trails Claude Opus 4.7's 1,753 Elo from our earlier review, but 3.5 Flash's speed advantage may matter more depending on your latency requirements.

For teams running long-horizon workflows, the Xero and Shopify deployments are the most instructive signals. Multi-week workflows being compressed into automated agent runs is the use case Google is optimizing for, and the Antigravity harness is the infrastructure layer that makes that possible. If you are not already familiar with multi-agent orchestration patterns, this is a good moment to get up to speed.

One thing I'd watch carefully: Google says 3.5 Flash costs less than half the price of other frontier models for comparable tasks. That claim depends heavily on your specific workload, but if it holds up in practice, it changes the economics of running agentic systems at scale. The 3.5 Pro model, expected next month, will be the more interesting comparison point for teams doing the heaviest reasoning work.

Final Thoughts

Gemini 3.5 Flash shows that Google intends to compete on both ends of the performance-speed curve, not just at the flagship level. Outperforming Gemini 3.1 Pro on agentic benchmarks while running at Flash speeds is a meaningful shift, and the enterprise deployments at Shopify, Macquarie, and Salesforce suggest the model holds up outside of controlled benchmarks.

The broader picture is that Google is betting heavily on agentic infrastructure, with Antigravity, Gemini Spark, and 3.5 Flash all pointing in the same direction. Whether that bet pays off depends on how 3.5 Pro performs when it arrives next month, and how the Antigravity harness compares to competing multi-agent frameworks in real developer workflows.

If you want to get up to speed with agentic AI concepts and how to build with models like these, I recommend checking out the AI Agent Fundamentals skill track on DataCamp.


Matt Crabtree's photo
Author
Matt Crabtree
LinkedIn

A senior editor in the AI and edtech space. Committed to exploring data and AI trends.  

Topics

Top DataCamp Courses

Track

AI Agent Fundamentals

6 hr
Discover how AI agents can change how you work and deliver value for your organization!
See DetailsRight Arrow
Start Course
See MoreRight Arrow
Related

blog

Gemini 3: Google’s Most Powerful LLM

Learn about Gemini 3 Pro, Google’s latest and most powerful LLM, which is topping benchmarks across the board. Plus, discover Gemini 3 Deep Think mode and Google Antigravity.
Matt Crabtree's photo

Matt Crabtree

13 min

blog

Gemini 3.1: Features, Benchmarks, Hands-On Tests, and More

Learn about Gemini 3.1 Pro, Google's latest reasoning model. Explore its features, benchmarks, hands-on tests, and how it compares to Claude Opus 4.6, Claude Sonnet 4.6, and GPT-5.2.
Khalid Abdelaty's photo

Khalid Abdelaty

11 min

gemini 2.5 pro with a large context

blog

Gemini 2.5 Pro: Features, Tests, Access, Benchmarks, and More

Explore Google's Gemini 2.5 Pro, and learn about its impressive 1 million token context window, multimodal capabilities, hands-on test results, and how to access it.
Alex Olteanu's photo

Alex Olteanu

8 min

blog

Gemini 2.0 Flash Thinking Experimental: A Guide With Examples

Learn about Gemini 2.0 Flash Thinking Experimental, including its features, benchmarks, limitations, and how it compares to other reasoning models.
Alex Olteanu's photo

Alex Olteanu

8 min

Tutorial

Gemini 2.0 Flash: Step-by-Step Tutorial With Demo Project

Learn how to use Google's Gemini 2.0 Flash model to develop a visual assistant capable of reading on-screen content and answering questions about it using Python.
François Aubry's photo

François Aubry

Tutorial

Gemini 3 Deep Think: A Guide to AI Reasoning

Discover how Google's newest specialized reasoning model can accelerate your data science workflows, interpret complex datasets, and write robust code.
Tim Lu's photo

Tim Lu

See MoreSee More