Evaluating LLM Responses

Key Takeaways:

Tuesday, November 28, 11 AM ET

Description

LLMs should be considered hallucinatory until proven otherwise! Many of us have turned to augmenting LLMs with a knowledge store (such as Zilliz) to solve this problem. But this RAG setup can still hallucinate. In particular, hallucination can be caused by retrieving irrelevant context, not retrieving enough context, and more.

TruLens is built to solve this problem. TruLens sits as the evaluation layer for the LLM stack, allowing you to shorten the feedback loop and iterate on your LLM app faster. We'll also talk about the different metrics you can use for evaluation and why you should consider LLM-based evals when building your app.

Presenter Bio

Josh Reini, Developer Relations Engineer at TruEra

Josh is a core contributor to open-source TruLens and the founding Developer Relations Data Scientist at TruEra, where he is responsible for education initiatives and nurturing a thriving community of AI Quality practitioners.

Josh has delivered tech talks and workshops to more than a thousand developers at events including the Global AI Conference, NYC Dev Day 2023, LLMs and the Generative AI Revolution 2023, AI developer meetups and the AI Quality Workshop (both in live format and on-demand).

Prior to TruEra, Josh delivered end-to-end data and machine learning solutions to clients including the Department of State and the Walter Reed National Military Medical Center.
