
Evaluating LLM Responses

In this session, we cover the different evaluations that are useful for reducing hallucination and improving retrieval quality of LLMs.
Nov 2023

LLMs should be considered hallucinatory until proven otherwise! Many of us have turned to augmenting LLMs with a knowledge store (such as Zilliz) to solve this problem, but a RAG setup can still suffer from hallucination, in particular when it retrieves irrelevant context, too little context, and more.

TruLens is built to solve this problem: it sits as the evaluation layer of the LLM stack, shortening the feedback loop so you can iterate on your LLM app faster. We'll also talk about the different metrics you can use for evaluation and why you should consider LLM-based evals when building your app.
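To make the "evaluation layer" idea concrete, here is a minimal sketch of attaching LLM-based feedback functions to an existing LangChain RAG app with TruLens. It assumes the trulens_eval 0.x API from around the time of this session, an OpenAI API key in the environment, and a chain named `rag_chain` built elsewhere; the exact context selector is app-specific and may differ across library versions.

```python
import numpy as np
from trulens_eval import Feedback, Tru, TruChain
from trulens_eval.feedback.provider.openai import OpenAI as OpenAIProvider

provider = OpenAIProvider()  # uses an LLM to score each record

# Answer relevance: is the final answer on-topic for the user's question?
f_answer_relevance = Feedback(provider.relevance).on_input_output()

# Context relevance: is each retrieved chunk relevant to the question?
# (select_context points the feedback at the retrieved documents; the
# selector depends on your app and your trulens_eval version.)
context = TruChain.select_context(rag_chain)
f_context_relevance = (
    Feedback(provider.qs_relevance)
    .on_input()
    .on(context)
    .aggregate(np.mean)
)

# Wrap the app so every call is traced and scored by the feedback functions.
recorder = TruChain(
    rag_chain,
    app_id="rag_v1",
    feedbacks=[f_answer_relevance, f_context_relevance],
)

with recorder:
    rag_chain.invoke("What kinds of hallucination can RAG still produce?")

Tru().run_dashboard()  # browse feedback scores and traces in the local dashboard
```

TruLens's documentation describes these two feedback functions, plus a groundedness check on the generated answer, as the "RAG triad", which maps directly onto the failure modes mentioned above: irrelevant retrieval, weak grounding, and off-topic answers.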

Key Takeaways:

  • Learn about common failure modes for LLM apps
  • Learn which evaluations are useful for reducing hallucination, improving retrieval quality, and more
  • Learn about how to evaluate LLM apps with TruLens

Additional Resources

TruLens Documentation

TruLens GitHub

Find the prompts used for LLM-based feedback functions in TruLens' open-source GitHub repository.

[SKILL TRACK] AI Fundamentals

[COURSE] Working with the OpenAI API

[TUTORIAL] How to Build LLM Applications with LangChain

Related

blog

Attention Mechanism in LLMs: An Intuitive Explanation

Learn how the attention mechanism works and how it revolutionized natural language processing (NLP).

Yesha Shastri

8 min

blog

12 LLM Projects For All Levels

Discover 12 LLM project ideas with easy-to-follow visual guides and source codes, suitable for beginners, intermediate students, final-year scholars, and experts.

Abid Ali Awan

12 min

tutorial

Boost LLM Accuracy with Retrieval Augmented Generation (RAG) and Reranking

Discover the strengths of LLMs with effective information retrieval mechanisms. Implement a reranking approach and incorporate it into your own LLM pipeline.

Iván Palomares Carrascosa

11 min

tutorial

LLaMA-Factory WebUI Beginner's Guide: Fine-Tuning LLMs

Learn how to fine-tune LLMs on custom datasets, evaluate performance, and seamlessly export and serve models using the LLaMA-Factory's low/no-code framework.

Abid Ali Awan

12 min

tutorial

LLM Classification: How to Select the Best LLM for Your Application

Discover the family of LLMs available and the elements to consider when evaluating which LLM is the best for your use case.

Andrea Valenzuela

15 min

code-along

Retrieval Augmented Generation with LlamaIndex

In this session you'll learn how to get started with Chroma and perform Q&A on some documents using Llama 2, the RAG technique, and LlamaIndex.

Dan Becker
