HumanEval: A Benchmark for Evaluating LLM Code Generation Capabilities
Learn how to evaluate your LLM's code generation capabilities with the Hugging Face Evaluate library.
Nov 13, 2024 · 9 min read
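As a quick preview of the workflow the tutorial covers, the Evaluate library ships a code_eval metric that executes candidate completions against unit tests and reports pass@k, the score HumanEval is built around. The sketch below uses a made-up toy problem and completions for illustration; they are not drawn from the benchmark itself.

```python
# Minimal sketch of HumanEval-style scoring with Hugging Face Evaluate's
# "code_eval" metric. The sample problem and completions are illustrative.
import os
import evaluate

# code_eval runs model-generated code, so execution must be explicitly enabled.
os.environ["HF_ALLOW_CODE_EVAL"] = "1"

code_eval = evaluate.load("code_eval")

# One toy problem: a unit test (reference) and two candidate completions.
test_cases = ["assert add(2, 3) == 5"]
candidates = [[
    "def add(a, b):\n    return a + b",   # passes the test
    "def add(a, b):\n    return a - b",   # fails the test
]]

pass_at_k, results = code_eval.compute(
    references=test_cases,
    predictions=candidates,
    k=[1, 2],
)
print(pass_at_k)  # e.g. {'pass@1': 0.5, 'pass@2': 1.0}
```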
Related
Tutorial
Evaluate LLMs Effectively Using DeepEval: A Practical Guide
Learn to use DeepEval to create Pytest-like relevance tests, evaluate LLM outputs with the G-Eval metric, and benchmark Qwen 2.5 using MMLU.
Abid Ali Awan
6 min
Tutorial
Hugging Face's Text Generation Inference Toolkit for LLMs - A Game Changer in AI
A comprehensive guide to Hugging Face Text Generation Inference for self-hosting large language models on local devices.
Josep Ferrer
11 min
Tutorial
LLaMA-Factory WebUI Beginner's Guide: Fine-Tuning LLMs
Learn how to fine-tune LLMs on custom datasets, evaluate performance, and seamlessly export and serve models using LLaMA-Factory's low/no-code framework.
Abid Ali Awan
12 min
Tutorial
Evaluating LLMs with MLflow: A Practical Beginner’s Guide
Learn how to streamline your LLM evaluations with MLflow. This guide covers MLflow setup, logging metrics, tracking experiment versions, and comparing models to make informed decisions for optimized LLM performance!
Maria Eugenia Inzaugarat
13 min
Tutorial
Fine-Tuning LLMs: A Guide With Examples
Learn how fine-tuning large language models (LLMs) improves their performance in tasks like language translation, sentiment analysis, and text generation.
Josep Ferrer
11 min
Code-Along
Understanding LLMs for Code Generation
Explore the role of LLMs for coding tasks, focusing on hands-on examples that demonstrate effective prompt engineering techniques to optimize code generation.
Andrea Valenzuela