Course

Reinforcement Learning from Human Feedback (RLHF)

Advanced skill level

Updated October 2024

Learn how to make GenAI models truly reflect human values while gaining hands-on experience with advanced LLMs.

Python, Artificial Intelligence

4 hours

13 videos

38 exercises

2,900 XP

3,664

Statement of Accomplishment

Course description

Combine the efficiency of Generative AI with the understanding of human expertise in this course on Reinforcement Learning from Human Feedback. You’ll learn how to make GenAI models truly reflect human values and preferences while getting hands-on experience with LLMs. You’ll also navigate the complexities of reward models and learn how to build upon LLMs to produce AI that not only learns but also adapts to real-world scenarios.

Prerequisites

Deep Reinforcement Learning in Python

1

Foundational Concepts

This chapter introduces the basics of Reinforcement Learning from Human Feedback (RLHF), a technique that uses human input to help AI models learn more effectively. Get started with RLHF by understanding how it differs from traditional reinforcement learning and why human feedback can enhance AI performance in various domains.

Introduction to RLHF

Text generation with RLHF

Classifying generated text for RLHF

RL vs. RLHF

Exploring pre-trained LLMs

Tokenize a text dataset

Fine-tuning for review classification

Preparing data for RLHF

Preparing the preference dataset

Extracting prompts

2

Gathering Human Feedback

Discover how to set up systems for gathering human feedback in this chapter. Learn best practices for collecting high-quality data, from pairwise comparisons to uncertainty sampling, and explore strategies for improving your data collection.

Methods for high-quality feedback gathering

Understanding comparison and rating in RLHF

Comparing slogans for a gym campaign

Measuring feedback quality and relevance

Low confidence

K-means for feedback clustering

Active learning

Implementing an active learning pipeline

Active learning loop
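
As a rough illustration of the uncertainty sampling idea named in this chapter, the sketch below picks the items a classifier is least confident about so they can be sent to human annotators first. It is a minimal example and not the course's own code: the function name `least_confident` and the probability values are hypothetical, and confidence is assumed to mean the probability of the most likely class.

```python
def least_confident(probabilities, k):
    """Return indices of the k samples whose top class probability is lowest."""
    # Assumption: confidence = probability assigned to the most likely class.
    confidences = [max(p) for p in probabilities]
    # Rank sample indices from least to most confident and keep the first k.
    ranked = sorted(range(len(confidences)), key=lambda i: confidences[i])
    return ranked[:k]


# Hypothetical class probabilities for four unlabeled responses.
probs = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
]

to_label = least_confident(probs, k=2)
print(to_label)  # [3, 1]
```

In an active learning loop, the selected items would be labeled, added to the training set, and the model retrained before the next round of selection.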

3

Tuning Models with Human Feedback

In this chapter, you'll get to the core of Reinforcement Learning from Human Feedback training: fine-tuning with PPO, techniques for training efficiently, and handling cases where training drifts away from the objectives your metrics are meant to track.

Reward models explored

Initializing the reward

Setting up the reward trainer

Training with PPO

Initialize the PPO trainer

PPO fine-tuning

Efficient fine-tuning in RLHF

Prepare for 8-bit Training

Train with LoRA
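
Reward models for RLHF are commonly trained on pairs of responses with a Bradley-Terry style loss, which is small when the preferred response scores higher than the rejected one. The sketch below shows that loss for a single pair in plain Python. It is an assumption about what a reward trainer typically optimizes, not a description of the course's exercises, and `pairwise_loss` is a hypothetical helper name.

```python
import math


def pairwise_loss(reward_chosen, reward_rejected):
    """Compute -log(sigmoid(reward_chosen - reward_rejected)) for one pair."""
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(x)) equals log(1 + exp(-x)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical reward scores for a preferred and a rejected response.
good_ranking = pairwise_loss(2.0, 0.0)   # preferred response scores higher
bad_ranking = pairwise_loss(0.0, 2.0)    # preferred response scores lower
print(good_ranking < bad_ranking)  # True
```

When both responses receive the same score, the loss equals log 2, which corresponds to the model treating the two responses as equally likely to be preferred.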

4

Model Evaluation

Explore key techniques for assessing and improving model performance in this final chapter of Reinforcement Learning from Human Feedback (RLHF). From fine-tuning metrics to incorporating diverse feedback sources, you'll build a toolkit for refining your models effectively.

Model metrics and adjustments

Mitigating negative KL divergence

Checking the reward model

Incorporating diverse feedback sources

Majority voting on multiple data sources

Unreliable data source identification

Evaluating RLHF models

Interpreting curves

Evaluating RLHF with metrics

Wrapping up your RLHF journey
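
To make the majority voting lesson above concrete, here is a minimal sketch of combining preference labels from several feedback sources. The helper `majority_vote`, the tie-handling rule (return `None` on a tie), and the sample votes are all assumptions made for illustration, not the course's implementation.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most common label, or None if there is no clear winner."""
    counts = Counter(labels).most_common()
    if not counts:
        return None
    # Assumption: a tie between the top two labels yields no consensus.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


# Hypothetical preference labels from three feedback sources per comparison.
votes_per_item = [
    ["A", "A", "B"],
    ["B", "B", "B"],
    ["A", "B", "C"],
]

consensus = [majority_vote(votes) for votes in votes_per_item]
print(consensus)  # ['A', 'B', None]
```

Items without a consensus could be sent back for additional labeling, and a source that often disagrees with the consensus is a candidate for closer review.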
