Gemini 3.5 Flash: Google's Fastest Agentic Model

Google launched Gemini 3.5 Flash at I/O 2026, a model that outperforms Gemini 3.1 Pro on agentic and coding benchmarks while running four times faster than competitors.

May 19, 2026 · 8 min read

Google announced Gemini 3.5 Flash at I/O 2026 on May 19, a model that beats Gemini 3.1 Pro on agentic and coding benchmarks while running four times faster than other frontier models at the same tier.

The release comes as the AI industry's competitive focus has shifted squarely toward agentic performance. Coding agents, multi-step workflow automation, and long-horizon task execution have become the primary battlegrounds, and Google is positioning 3.5 Flash as its answer to that moment.

For data scientists, ML engineers, and developers, this matters because 3.5 Flash is now the default model in the Gemini app and AI Mode in Search, and it's available today via the Gemini API. In this article, I'll cover what was announced, what stands out, the benchmark numbers, and what it means for your work.

You can also check out some other key highlights of the event, Gemini Omni and Gemini Spark.

What's New With Gemini 3.5 Flash

The headline claim with Gemini 3.5 Flash is speed combined with frontier-level performance. Google says 3.5 Flash is four times faster on output tokens per second than other frontier models, while outperforming Gemini 3.1 Pro on the benchmarks that matter most for agentic work.

On Terminal-Bench 2.1, it scores 76.2%. On GDPval-AA, it reaches 1,656 Elo. On MCP Atlas, it hits 83.6%. For multimodal understanding, it scores 84.2% on CharXiv Reasoning.

In short, these numbers mean the old 'fast, cheap, or smart; pick two' rule in AI is less applicable. We're getting a lightweight model capable of handling complex, multi-step agent workflows without the massive latency.

Google says the model is generally available today across Google AI Studio, the Gemini API, Android Studio, Gemini Enterprise Agent Platform, and Gemini Enterprise. It is also the new default model in the Gemini app and AI Mode in Search globally.

Google also announced that Gemini 3.5 Pro is in development, already in internal use, and expected to roll out next month. The 3.5 Flash release is the opening move in what Google is calling a new model family built around agentic execution.

Gemini 3.5 Background

The Gemini 3 series established Google's current position in the frontier model race. Gemini 3.1 Pro, released in February 2026, led the Artificial Analysis Intelligence Index when it launched and scored 77.1% on ARC-AGI-2, more than doubling Gemini 3 Pro's 31.1% on that benchmark.

As we covered in our GPT-5.5 vs Gemini 3.1 Pro comparison, Gemini 3.1 Pro's strength was in complex visual reasoning and multimodal tasks.

The Flash naming convention in the Gemini family has always signaled speed-optimized models. What's different with 3.5 Flash is that Google is claiming frontier-level intelligence at Flash speeds, not a quality trade-off. According to Google, the Artificial Analysis index places 3.5 Flash in the top-right quadrant, meaning high intelligence and high output speed simultaneously.

The Antigravity harness, Google's framework for deploying collaborative subagents, is central to how 3.5 Flash is being positioned. It's not just a standalone model but a component in a multi-agent architecture that Google has been building out alongside the model itself.

Key Features of Gemini 3.5

Here's a breakdown of the most interesting information from the announcement.

Agentic architecture and Antigravity

3.5 Flash is designed to work with the Antigravity harness, Google's framework for running collaborative subagents. With Antigravity, the model can deploy multiple subagents in parallel, execute multi-step workflows, and sustain performance across long-horizon tasks.

Google's examples include synthesizing the AlphaZero paper and coding a fully playable game in six hours using two agents, and transforming a legacy codebase to Next.js. These are not toy demos. They reflect the kind of multi-day developer tasks that agentic systems are now being asked to handle.
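To make the parallel-subagent idea concrete, here is a minimal sketch of the fan-out pattern using Python's asyncio. It is a generic illustration, not Antigravity's actual API, which the announcement does not detail. The names `run_subagent` and `orchestrate` are hypothetical, and the subagent body is a stand-in for a real model call.

```python
import asyncio


async def run_subagent(name: str, task: str) -> dict:
    """Hypothetical subagent: a placeholder for a real model call."""
    await asyncio.sleep(0.01)  # simulate network and inference latency
    return {"agent": name, "task": task, "result": f"{name} finished: {task}"}


async def orchestrate(tasks: dict[str, str]) -> list[dict]:
    """Fan out one subagent per task and wait for all of them."""
    coros = [run_subagent(name, task) for name, task in tasks.items()]
    # gather runs the coroutines concurrently and preserves input order
    return await asyncio.gather(*coros)


def main() -> list[dict]:
    # Two illustrative tasks, loosely mirroring the two-agent demo
    tasks = {
        "reader": "summarize the paper",
        "coder": "implement the game loop",
    }
    return asyncio.run(orchestrate(tasks))


if __name__ == "__main__":
    for report in main():
        print(report["result"])
```

Because the subagents run concurrently, total wall-clock time is close to the slowest subagent rather than the sum of all of them, which is why output speed matters so much in this kind of setup.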

Real-world enterprise deployments

Several enterprises are already running 3.5 Flash in production or pilot. The specific use cases are worth noting because they illustrate where the model's agentic strengths are being applied:

Shopify: Running subagents in parallel to analyze complex data over a long horizon for merchant growth forecasts
Macquarie Bank: Piloting customer onboarding by reasoning over 100+ page documents with low latency
Salesforce: Integrating into Agentforce for multi-subagent enterprise task automation with multi-turn tool calling
Xero: Deploying agents to manage multi-week workflows, including 1099 tax form preparation for small businesses
Databricks: Using agentic workflows to monitor real-time information, diagnose issues, and propose solutions across large datasets
Ramp: Improving OCR accuracy on complex invoices through multimodal understanding combined with reasoning over historical patterns

Gemini Spark and consumer availability

3.5 Flash is also the model powering Gemini Spark, Google's new personal AI agent that runs 24/7 and takes action on behalf of users. Google is rolling out Spark to trusted testers now, with a Beta planned for Google AI Ultra subscribers in the US the week following the I/O announcement.

The model is available today to billions of users globally through the Gemini app and AI Mode in Search, making this one of the broadest simultaneous consumer and developer launches Google has done for a Gemini model.

Safety and safeguards

Google says 3.5 Flash was developed under its Frontier Safety Framework, with strengthened cyber and CBRN safeguards. The company is using interpretability tools that check the model's internal reasoning before it responds, which is meant to reduce both harmful outputs and false refusals on safe queries.

Gemini 3.5 Flash Benchmark Performance

Google's benchmark claims for 3.5 Flash are specific and worth examining directly. The model outperforms Gemini 3.1 Pro on the following:

Terminal-Bench 2.1: 76.2%, ranking second just behind GPT-5.5 (78.2%).
SWE-Bench Pro: 55.1% vs Gemini 3.1 Pro's 54.2%. Claude Opus 4.7 still leads here by a good margin at 64.3%.
GDPval-AA: 1,656 Elo. Claude Opus 4.7 led this benchmark at 1,753 Elo when it launched, per our Claude Opus 4.7 vs Gemini 3.1 Pro review.
MCP Atlas: 83.6%, with which Gemini 3.5 Flash takes the lead in agentic tool use.
OSWorld: 78.4%, on par with GPT-5.5 (78.7%) and Claude Opus 4.7 (78.0%).
CharXiv Reasoning: 84.2% for multimodal understanding, just edging over the previous leader, GPT-5.5 (84.1%).
Finance Agent v2: 57.9%, which makes Gemini 3.5 Flash the leader by quite a margin over the next-best GPT-5.5 (51.8%).

The most interesting finding is that Gemini 3.5 Flash takes the top spot on MCP Atlas (83.6%), CharXiv Reasoning (84.2%), and Finance Agent v2 (57.9%), which measure three quite different things. Google's new model seems to excel in agentic tool use, visual reasoning, and complex financial workflows.

This offers plenty of interesting use cases. One example of combining all three strengths would be an agent that reads financial charts from earnings reports, interprets the visual data, and then autonomously calls APIs to rebalance a portfolio based on what it finds.

Here's a more detailed comparison table:

Benchmark	3.5 Flash	3 Flash	3.1 Pro	Opus 4.7	GPT-5.5
Terminal-Bench 2.1	76.2%	58.0%	70.3%	66.1%	78.2%
SWE-Bench Pro	55.1%	49.6%	54.2%	64.3%	58.6%
MCP Atlas	83.6%	62.0%	78.2%	79.1%	75.3%
OSWorld	78.4%	65.1%	76.2%	78.0%	78.7%
Finance Agent v2	57.9%	42.6%	43.0%	51.5%	51.8%
CharXiv Reasoning	84.2%	80.3%	83.3%	82.1%	84.1%
Humanity's Last Exam	40.2%	33.7%	44.4%	46.9%	41.4%
ARC-AGI-2	72.1%	33.6%	77.1%	75.8%	84.6%
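If you want to check the table yourself, the short script below finds the leader on each benchmark and the gap between 3.5 Flash and 3.1 Pro. The scores are copied directly from the table; the function names are my own. Note that it also surfaces the two rows where 3.5 Flash trails 3.1 Pro.

```python
# Scores copied from the comparison table (percentages), in column order.
MODELS = ["3.5 Flash", "3 Flash", "3.1 Pro", "Opus 4.7", "GPT-5.5"]
SCORES = {
    "Terminal-Bench 2.1": [76.2, 58.0, 70.3, 66.1, 78.2],
    "SWE-Bench Pro": [55.1, 49.6, 54.2, 64.3, 58.6],
    "MCP Atlas": [83.6, 62.0, 78.2, 79.1, 75.3],
    "OSWorld": [78.4, 65.1, 76.2, 78.0, 78.7],
    "Finance Agent v2": [57.9, 42.6, 43.0, 51.5, 51.8],
    "CharXiv Reasoning": [84.2, 80.3, 83.3, 82.1, 84.1],
    "Humanity's Last Exam": [40.2, 33.7, 44.4, 46.9, 41.4],
    "ARC-AGI-2": [72.1, 33.6, 77.1, 75.8, 84.6],
}


def leader(benchmark: str) -> str:
    """Return the model with the highest score on a benchmark."""
    row = SCORES[benchmark]
    return MODELS[row.index(max(row))]


def delta_vs_31_pro(benchmark: str) -> float:
    """Percentage-point gap between 3.5 Flash and 3.1 Pro."""
    row = SCORES[benchmark]
    flash = row[MODELS.index("3.5 Flash")]
    pro = row[MODELS.index("3.1 Pro")]
    return round(flash - pro, 1)


flash_leads = [b for b in SCORES if leader(b) == "3.5 Flash"]
flash_trails_pro = [b for b in SCORES if delta_vs_31_pro(b) < 0]

if __name__ == "__main__":
    for name in SCORES:
        print(f"{name}: leader={leader(name)}, vs 3.1 Pro={delta_vs_31_pro(name):+.1f}")
```

The largest gain over 3.1 Pro is on Finance Agent v2, while Humanity's Last Exam and ARC-AGI-2 are the two benchmarks in the table where the older Pro model still scores higher.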

The speed claim is also notable: four times faster on output tokens per second than other frontier models. Google does not specify which models it is comparing against, so treat that figure as directional rather than a precise head-to-head.
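To get a feel for what a 4x output-speed difference could mean in an agent loop, here is a back-of-the-envelope calculation. Only the 4x multiplier comes from Google's claim. The baseline throughput, step count, and tokens per step are hypothetical values I picked for illustration, and the sketch ignores time to first token and tool-call latency.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time spent purely on generating output tokens."""
    return output_tokens / tokens_per_second


BASELINE_TPS = 50.0     # hypothetical baseline throughput (tokens/second)
SPEEDUP = 4.0           # the multiplier Google claims
STEPS = 30              # hypothetical number of agent steps in one run
TOKENS_PER_STEP = 800   # hypothetical output tokens per step

total_tokens = STEPS * TOKENS_PER_STEP
baseline_s = generation_seconds(total_tokens, BASELINE_TPS)
fast_s = generation_seconds(total_tokens, BASELINE_TPS * SPEEDUP)

if __name__ == "__main__":
    print(f"Baseline: {baseline_s / 60:.1f} min, 4x model: {fast_s / 60:.1f} min")
```

Under these assumed numbers, generation time for the run drops from 8 minutes to 2 minutes. The absolute figures will differ for your workload, but the proportional saving compounds with every additional step.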

Gemini 3.5 for Data and AI Practitioners

The most immediate practical implication is that 3.5 Flash is available today via the Gemini API and Google AI Studio. If you are building agentic pipelines, the combination of the MCP Atlas score (83.6%) and the Antigravity multi-agent harness makes this worth testing against whatever you are currently using.
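If you want to try it from code, a minimal sketch of a Gemini API call over REST looks like the following, using only the Python standard library. The model ID `gemini-3.5-flash` is my assumption and a placeholder; check the model list in Google AI Studio for the real identifier. The helper names `build_request` and `generate` are hypothetical, and `GEMINI_API_KEY` is simply the environment variable I chose to read the key from.

```python
import json
import os
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"
MODEL_ID = "gemini-3.5-flash"  # ASSUMPTION: placeholder ID, verify before use


def build_request(prompt: str, api_key: str, model: str = MODEL_ID) -> urllib.request.Request:
    """Build (but do not send) a generateContent request."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def generate(prompt: str, api_key: str) -> str:
    """Send the request and return the first candidate's text."""
    with urllib.request.urlopen(build_request(prompt, api_key), timeout=60) as resp:
        payload = json.load(resp)
    return payload["candidates"][0]["content"]["parts"][0]["text"]


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        print(generate("Summarize what an agent harness does.", key))
    else:
        print("Set GEMINI_API_KEY to send a live request.")
```

Keeping request construction separate from sending makes it easy to swap the model ID and run the same prompts against whatever model you use today.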

The GDPval-AA score of 1,656 Elo trails Claude Opus 4.7's 1,753 Elo from our earlier review, but 3.5 Flash's speed advantage may matter more depending on your latency requirements.
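To put that 97-point gap in perspective, the snippet below converts an Elo difference into an expected head-to-head score. It assumes GDPval-AA uses the conventional logistic Elo formula with a 400-point scale, which the announcement does not confirm, so read the result as a rough guide only.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard Elo formula.

    ASSUMPTION: the 400-point logistic scale; the benchmark may differ.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


FLASH_ELO = 1656  # Gemini 3.5 Flash, from the article
OPUS_ELO = 1753   # Claude Opus 4.7, from the article

flash_expected = elo_expected_score(FLASH_ELO, OPUS_ELO)

if __name__ == "__main__":
    print(f"Expected score for 3.5 Flash vs Opus 4.7: {flash_expected:.3f}")
```

Under that assumption, 3.5 Flash would be expected to win roughly 36% of pairwise comparisons against Opus 4.7, which is a real gap but not a blowout.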

For teams running long-horizon workflows, the Xero and Shopify deployments are the most instructive signals. Multi-week workflows being compressed into automated agent runs is the use case Google is optimizing for, and the Antigravity harness is the infrastructure layer that makes that possible. If you are not already familiar with multi-agent orchestration patterns, this is a good moment to get up to speed.

One thing I'd watch carefully: Google says 3.5 Flash costs less than half the price of other frontier models for comparable tasks. That claim depends heavily on your specific workload, but if it holds up in practice, it changes the economics of running agentic systems at scale. The 3.5 Pro model, expected next month, will be the more interesting comparison point for teams doing the heaviest reasoning work.
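Here is a small sketch of why the "less than half the price" claim depends on workload. All prices below are hypothetical placeholders, not published rates for any model. The point is only that the cost ratio shifts with your mix of input and output tokens.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    """Price in USD per million tokens. All values used here are hypothetical."""
    input_per_m: float
    output_per_m: float


def run_cost(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost of one workload under a given price sheet."""
    return (input_tokens / 1e6) * p.input_per_m + (output_tokens / 1e6) * p.output_per_m


# HYPOTHETICAL price sheets, for illustration only
other_model = Pricing(input_per_m=1.00, output_per_m=10.00)
cheaper_model = Pricing(input_per_m=0.60, output_per_m=4.00)

# Two hypothetical workloads: document-heavy vs generation-heavy
input_heavy = (1_000_000, 50_000)
output_heavy = (200_000, 1_000_000)

ratio_input_heavy = run_cost(cheaper_model, *input_heavy) / run_cost(other_model, *input_heavy)
ratio_output_heavy = run_cost(cheaper_model, *output_heavy) / run_cost(other_model, *output_heavy)

if __name__ == "__main__":
    print(f"Input-heavy cost ratio:  {ratio_input_heavy:.2f}")
    print(f"Output-heavy cost ratio: {ratio_output_heavy:.2f}")
```

With these made-up prices, the cheaper model costs less than half as much on the output-heavy workload but slightly more than half on the input-heavy one. Plugging in real prices and your own token counts is the only way to know which side of the line you land on.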

Final Thoughts

Gemini 3.5 Flash shows that Google intends to compete on both ends of the performance-speed curve, not just at the flagship level. Outperforming Gemini 3.1 Pro on agentic benchmarks while running at Flash speeds is a meaningful shift, and the enterprise deployments at Shopify, Macquarie, and Salesforce suggest the model holds up outside of controlled benchmarks.

The broader picture is that Google is betting heavily on agentic infrastructure, with Antigravity, Gemini Spark, and 3.5 Flash all pointing in the same direction. Whether that bet pays off depends on how 3.5 Pro performs when it arrives next month, and how the Antigravity harness compares to competing multi-agent frameworks in real developer workflows.

If you want to get up to speed with agentic AI concepts and how to build with models like these, I recommend checking out the AI Agent Fundamentals skill track on DataCamp.

Author

Matt Crabtree
