Course

Efficient AI Model Training with PyTorch

Skill level: Advanced

Updated April 2026

Learn how to reduce training times for large language models by using Accelerator and Trainer for distributed training.


Python, Artificial Intelligence, 4 hours, 13 videos, 45 exercises, 3,850 XP, Statement of Accomplishment


Course Description

Distributed training is an essential skill in large-scale machine learning, helping you to reduce the time required to train large language models with trillions of parameters. In this course, you will explore the tools, techniques, and strategies essential for efficient distributed training using PyTorch, Accelerator, and Trainer.

Preparing Data for Distributed Training

You'll begin by preparing data for distributed training by splitting datasets across multiple devices and deploying model copies to each device. You'll gain hands-on experience in preprocessing data for distributed environments, including images, audio, and text.

Exploring Efficiency Techniques

Once your data is ready, you'll explore ways to make training and optimizer use more efficient with both the Accelerator and Trainer interfaces. Distributed training strains memory, device communication, and compute, and you'll address these challenges with techniques like gradient accumulation, gradient checkpointing, local stochastic gradient descent, and mixed precision training. You'll learn the tradeoffs between different optimizers so that you can decrease your model's memory footprint. By the end of this course, you'll be equipped with the knowledge and tools to build distributed AI-powered services.

Prerequisites

Intermediate Deep Learning with PyTorch

Working with Hugging Face

1. Data Preparation with Accelerator

You'll prepare data for distributed training by splitting the data across multiple devices and copying the model on each device. Accelerator provides a convenient interface for data preparation, and you'll learn how to preprocess images, audio, and text as a first step in distributed training.
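The idea of splitting a dataset across devices can be sketched in plain Python. This is a simplified illustration, not Accelerator's actual implementation: the helper `shard_indices` is hypothetical, and it uses a strided split in which each device takes every `num_devices`-th sample.

```python
def shard_indices(num_samples, num_devices, rank):
    """Return the sample indices assigned to device `rank`.

    Hypothetical helper: a strided split, so device 0 gets samples
    0, num_devices, 2 * num_devices, ... and so on for the other ranks.
    """
    if not 0 <= rank < num_devices:
        raise ValueError("rank must be in [0, num_devices)")
    return list(range(rank, num_samples, num_devices))


num_samples, num_devices = 10, 4
shards = [shard_indices(num_samples, num_devices, r) for r in range(num_devices)]

for rank, shard in enumerate(shards):
    print(f"device {rank}: {shard}")
```

With 10 samples and 4 devices the shards are uneven (two devices get three samples, two get two). Real distributed samplers usually pad or drop samples so that every device sees the same number of batches.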

Prepare models with AutoModel and Accelerator (50 XP)

Loading and inspecting pre-trained models (100 XP)

Automatic device placement with Accelerator (100 XP)

Preprocess images and audio for training (50 XP)

Preprocess image datasets (100 XP)

Preprocess audio datasets (100 XP)

Prepare datasets for distributed training (100 XP)

Preprocess text for training (50 XP)

Preprocess text with AutoTokenizer (100 XP)

Save and load the state of preprocessed text (100 XP)

2. Distributed Training with Accelerator and Trainer

In distributed training, each device trains on its data in parallel. You'll investigate two methods for distributed training: Accelerator enables custom training loops, and Trainer simplifies the interface for training.
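A minimal sketch of what "each device trains on its data in parallel" means for one optimization step, assuming a toy one-parameter model `y_hat = w * x` with mean squared error. Everything here (the data, `grad_mse`, the learning rate) is made up for illustration; the averaging line stands in for the all-reduce that a distributed backend would perform.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean squared error with respect to w for y_hat = w * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5
learning_rate = 0.01
num_devices = 2

# Each simulated device receives an equally sized shard of the batch.
shards = [(xs[r::num_devices], ys[r::num_devices]) for r in range(num_devices)]

# Every device computes a gradient on its own shard (in parallel on real hardware).
local_grads = [grad_mse(w, shard_x, shard_y) for shard_x, shard_y in shards]

# Stand-in for all-reduce: average the local gradients across devices.
avg_grad = sum(local_grads) / num_devices

# With equal shard sizes, the average matches the full-batch gradient.
full_grad = grad_mse(w, xs, ys)

# Every model copy applies the same update, so the copies stay in sync.
w_updated = w - learning_rate * avg_grad
print(local_grads, avg_grad, full_grad, w_updated)
```

Because the shards have equal size, averaging the per-device gradients gives the same result as computing the gradient on the whole batch, which is why the model copies remain identical after each step.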

Fine-tune models with Trainer (50 XP)

Define evaluation metrics (100 XP)

Specify the TrainingArguments (100 XP)

Set up the Trainer (100 XP)

Train models with Accelerator (50 XP)

Prepare a model for distributed training (100 XP)

Training loops before and after Accelerator (100 XP)

Building a training loop with Accelerator (100 XP)

Evaluate models with Accelerator (50 XP)

Setting the model in evaluation mode (100 XP)

Logging evaluation metrics (100 XP)

3. Improving Training Efficiency

Distributed training strains resources with large models and datasets, but you can address these challenges by improving memory usage, device communication, and computational efficiency. You'll discover the techniques of gradient accumulation, gradient checkpointing, local stochastic gradient descent, and mixed precision training.
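Gradient accumulation, the first technique named above, can be sketched without any framework. The example assumes a toy model `y_hat = w * x` and made-up data; it accumulates scaled gradients over several small micro-batches and updates the weight only once, then compares the result with a single large-batch step. In PyTorch the same pattern is usually expressed by dividing the loss by the number of accumulation steps and calling the optimizer every few iterations.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean squared error with respect to w for y_hat = w * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [3.0 * x for x in xs]

micro_batch_size = 2      # what fits in memory at once (assumed)
accumulation_steps = 4    # micro-batches per optimizer update
learning_rate = 0.001

w = 0.0
accumulated_grad = 0.0
num_updates = 0

for step, start in enumerate(range(0, len(xs), micro_batch_size), start=1):
    batch_x = xs[start:start + micro_batch_size]
    batch_y = ys[start:start + micro_batch_size]

    # Scale each micro-batch gradient so the sum equals the large-batch mean.
    accumulated_grad += grad_mse(w, batch_x, batch_y) / accumulation_steps

    # Update the weight only after the required number of micro-batches.
    if step % accumulation_steps == 0:
        w -= learning_rate * accumulated_grad
        accumulated_grad = 0.0
        num_updates += 1

# Reference: one step on the full batch of eight samples.
w_full_batch = 0.0 - learning_rate * grad_mse(0.0, xs, ys)
print(w, w_full_batch, num_updates)
```

The accumulated update matches the full-batch update while only two samples are processed at a time, which is the memory saving the technique is after.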

Gradient accumulation (50 XP)

Gradient accumulation with Accelerator (100 XP)

Gradient accumulation with Trainer (100 XP)

Gradient checkpointing and local SGD (50 XP)

Gradient checkpointing with Accelerator (100 XP)

Gradient checkpointing with Trainer (100 XP)

Local SGD with Accelerator (100 XP)

Mixed precision training (50 XP)

Mixed precision training with basic PyTorch (100 XP)

Mixed precision training with Accelerator (100 XP)

Mixed precision training with Trainer (100 XP)
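One reason mixed precision training needs care is that very small gradients underflow to zero in 16-bit floating point. The sketch below illustrates this with Python's standard `struct` module, which can round a value to IEEE 754 half precision. The gradient value and the scale factor are assumptions chosen for illustration; scaling up before the cast and dividing afterwards mirrors the idea of loss scaling, not any particular library's implementation.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


tiny_grad = 1e-8        # assumed gradient, too small for float16
loss_scale = 65536.0    # assumed scale factor (a power of two)

# Casting directly to half precision loses the gradient entirely.
unscaled_fp16 = to_fp16(tiny_grad)

# Scaling first keeps the value in float16's representable range.
scaled_fp16 = to_fp16(tiny_grad * loss_scale)

# Unscale in higher precision before the optimizer would use the gradient.
recovered = scaled_fp16 / loss_scale

relative_error = abs(recovered - tiny_grad) / tiny_grad
print(unscaled_fp16, recovered, relative_error)
```

The unscaled value becomes exactly zero, while the scaled-then-unscaled value stays close to the original gradient.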

4. Training with Efficient Optimizers

You'll focus on optimizers as levers to improve distributed training efficiency, highlighting tradeoffs between AdamW, Adafactor, and 8-bit Adam. Keeping fewer optimizer state values or storing them in lower precision helps to decrease the memory footprint of training.
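The memory tradeoff between these optimizers can be estimated with simple arithmetic. The sketch below is a rough estimate under stated assumptions, not a measurement: AdamW keeps two 32-bit state values per parameter, 8-bit Adam keeps the same two values in one byte each (ignoring the small overhead of quantization constants), and Adafactor is assumed to keep no first moment and to store only row and column statistics for 2-D weights. The parameter shapes and function names are hypothetical.

```python
from math import prod


def adamw_state_bytes(param_shapes, bytes_per_value=4):
    """Two state tensors (first and second moment) per parameter tensor."""
    return sum(2 * prod(shape) * bytes_per_value for shape in param_shapes)


def adam_8bit_state_bytes(param_shapes):
    """Same two state tensors, stored in one byte per value (overhead ignored)."""
    return sum(2 * prod(shape) for shape in param_shapes)


def adafactor_state_bytes(param_shapes, bytes_per_value=4):
    """Factored second moment for 2-D weights, full second moment otherwise.

    Assumes no first-moment state is kept.
    """
    total_values = 0
    for shape in param_shapes:
        if len(shape) == 2:
            rows, cols = shape
            total_values += rows + cols   # one row vector plus one column vector
        else:
            total_values += prod(shape)   # small tensors such as biases
    return total_values * bytes_per_value


# Hypothetical layer: a 1024 x 1024 weight matrix and a 1024-element bias.
param_shapes = [(1024, 1024), (1024,)]

adamw_bytes = adamw_state_bytes(param_shapes)
adam_8bit_bytes = adam_8bit_state_bytes(param_shapes)
adafactor_bytes = adafactor_state_bytes(param_shapes)

for name, size in [("AdamW", adamw_bytes),
                   ("8-bit Adam", adam_8bit_bytes),
                   ("Adafactor", adafactor_bytes)]:
    print(f"{name}: {size / 1024:.1f} KiB of optimizer state")
```

Under these assumptions, 8-bit Adam needs a quarter of AdamW's optimizer state, and Adafactor's state grows with the sum rather than the product of a weight matrix's dimensions.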

Balanced training with AdamW (50 XP)

AdamW with Trainer (100 XP)

AdamW with Accelerator (100 XP)

Compute the optimizer size (100 XP)

Memory-efficient training with Adafactor (50 XP)

Adafactor with Trainer (100 XP)

Adafactor with Accelerator (100 XP)

Mixed precision training with 8-bit Adam (50 XP)

Set up the 8-bit Adam optimizer (100 XP)

8-bit Adam with Trainer (100 XP)

8-bit Adam with Accelerator (100 XP)

Which optimizer is it? (100 XP)

Congratulations! (50 XP)
