Evals for Agents with Arize
Key Takeaways:
- Learn key principles for observing and evaluating AI agents.
- Get hands-on with the Arize platform for agent testing and monitoring.
- Build, evaluate, and analyze a simple AI agent end to end.
Description
As AI agents become more autonomous, testing and debugging their behavior becomes both more important and more challenging. Traditional metrics often fall short when agents reason, plan, and act across multiple steps. To build reliable agentic systems, teams need strong observability and evaluation practices baked in from day one.
In this code-along webinar, Laurie Voss, Head of Developer Relations at Arize, will show you how to automatically test and debug AI agents using the Arize AI engineering platform. You’ll learn core principles of agent evaluation, then build and instrument a simple agent to track performance, behavior, and failure modes. By the end of the session, you’ll have a practical framework for monitoring agents in development and beyond.
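To give a rough sense of what instrumenting an agent for observability can look like, here is a minimal sketch in Python. It assumes the arize-otel and openinference-instrumentation-openai packages and an OpenAI-backed agent; the project name and credential environment variables are placeholders, and the exact setup shown in the webinar may differ.

```python
# Minimal sketch: export traces from a simple OpenAI-backed agent to Arize
# so each step can be inspected and evaluated.
# Assumptions: arize-otel and openinference-instrumentation-openai are installed;
# ARIZE_SPACE_ID / ARIZE_API_KEY are placeholder credentials.
import os

from arize.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor
from openai import OpenAI

# Register an OpenTelemetry tracer provider that sends spans to Arize.
tracer_provider = register(
    space_id=os.environ["ARIZE_SPACE_ID"],
    api_key=os.environ["ARIZE_API_KEY"],
    project_name="simple-agent-demo",  # hypothetical project name
)

# Auto-instrument OpenAI calls so every LLM step appears as a traced span.
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

client = OpenAI()


def run_agent(question: str) -> str:
    """A deliberately simple two-step agent: plan, then answer. Both calls are traced."""
    plan = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Outline steps to answer: {question}"}],
    ).choices[0].message.content

    answer = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": f"Follow this plan:\n{plan}"},
            {"role": "user", "content": question},
        ],
    ).choices[0].message.content
    return answer


if __name__ == "__main__":
    print(run_agent("What are common failure modes of multi-step agents?"))
```

With tracing in place, each run of the agent produces spans that can be inspected for latency, token usage, and intermediate reasoning, which is the raw material for the evaluation and debugging workflows the session covers.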
Presenter Bio

Laurie is a web developer turned startup executive turned data and AI evangelist. With over 30 years of experience in tech, he runs the developer relations team at Arize, teaching people how to evaluate AI applications. Previously, Laurie was VP of Developer Relations at LlamaIndex, and was the founding CTO of npm, taking it from a hobby project to 5M active users. He also served as COO and CDO at npm.