Evals for Agents with Arize

Tuesday, March 10, 11 AM ET

Description

As AI agents become more autonomous, testing and debugging their behavior becomes both more important and more challenging. Traditional metrics often fall short when agents reason, plan, and act across multiple steps. To build reliable agentic systems, teams need strong observability and evaluation practices baked in from day one.

In this code-along webinar, Laurie Voss, Head of Developer Relations at Arize, will show you how to automatically test and debug AI agents using the Arize AI engineering platform. You’ll learn core principles of agent evaluation, then build and instrument a simple agent to track performance, behavior, and failure modes. By the end of the session, you’ll have a practical framework for monitoring agents in development and beyond.

Presenter Bio

Laurie Voss, Head of Developer Relations at Arize
