
Course Description
Distributed training is an essential skill in large-scale machine learning, helping you reduce the time required to train large language models with trillions of parameters. In this course, you will explore the tools, techniques, and strategies for efficient distributed training using PyTorch, Accelerator, and Trainer.
Preparing Data for Distributed Training
You'll begin by preparing data for distributed training: splitting datasets across multiple devices and deploying a copy of the model to each device. You'll gain hands-on experience preprocessing images, audio, and text for distributed environments.
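As a taste of what that preprocessing step looks like, here is a minimal sketch of tokenizing a text dataset with the Hugging Face datasets and transformers libraries; the dataset ("imdb") and checkpoint ("bert-base-uncased") are illustrative placeholders, not necessarily the ones used in the course.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Illustrative dataset and checkpoint; the course exercises may use different ones.
dataset = load_dataset("imdb", split="train[:1%]")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Pad and truncate so every example has the same length on every device.
    return tokenizer(batch["text"], padding="max_length", truncation=True, max_length=128)

# Tokenize in batches, then keep only the tensor columns the model needs.
tokenized = dataset.map(tokenize, batched=True)
tokenized = tokenized.remove_columns(["text"]).rename_column("label", "labels")
tokenized.set_format("torch")
```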
Exploring Efficiency Techniques
Once your data is ready, you'll explore ways to improve training and optimizer efficiency with both interfaces. You'll see how to ease the strain that large models place on memory, device communication, and compute with techniques like gradient accumulation, gradient checkpointing, local stochastic gradient descent (SGD), and mixed precision training. You'll also weigh the tradeoffs between different optimizers to decrease your model's memory footprint. By the end of this course, you'll be equipped with the knowledge and tools to build distributed AI-powered services.
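As a preview of one of these techniques, below is a minimal sketch of mixed precision training in plain PyTorch using autocast and a gradient scaler; the model, learning rate, and dataloader are placeholders for illustration only.

```python
import torch

model = torch.nn.Linear(128, 2).cuda()                      # placeholder model on a GPU
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # illustrative learning rate
scaler = torch.cuda.amp.GradScaler()                        # rescales the loss so fp16 gradients don't underflow

for features, labels in dataloader:                         # dataloader assumed to yield float features and int labels
    features, labels = features.cuda(), labels.cuda()
    optimizer.zero_grad()
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = torch.nn.functional.cross_entropy(model(features), labels)
    scaler.scale(loss).backward()                           # backward pass on the scaled loss
    scaler.step(optimizer)                                  # unscales gradients, then steps
    scaler.update()                                         # adapts the scale factor for the next step
```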
1. Data Preparation with Accelerator (Free)
You'll prepare data for distributed training by splitting the data across multiple devices and copying the model onto each device. Accelerator provides a convenient interface for data preparation, and you'll learn how to preprocess images, audio, and text as a first step in distributed training.
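The pattern this chapter works toward might look roughly like the following sketch, which assumes the transformers and accelerate packages and reuses the tokenized dataset from the earlier preprocessing sketch; the checkpoint and batch size are illustrative.

```python
from accelerate import Accelerator
from torch.utils.data import DataLoader
from transformers import AutoModelForSequenceClassification

accelerator = Accelerator()  # detects whatever hardware is available: CPU, one GPU, or several

# Illustrative checkpoint; "tokenized" is the dataset from the preprocessing sketch above.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
dataloader = DataLoader(tokenized, batch_size=16, shuffle=True)

# prepare() moves the model to the right device(s) and shards each batch across them.
model, dataloader = accelerator.prepare(model, dataloader)
print(accelerator.device)  # the device this particular process trains on
```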
- Prepare models with AutoModel and Accelerator (50 xp)
- Loading and inspecting pre-trained models (100 xp)
- Automatic device placement with Accelerator (100 xp)
- Preprocess images and audio for training (50 xp)
- Preprocess image datasets (100 xp)
- Preprocess audio datasets (100 xp)
- Prepare datasets for distributed training (100 xp)
- Preprocess text for training (50 xp)
- Preprocess text with AutoTokenizer (100 xp)
- Save and load the state of preprocessed text (100 xp)
2. Distributed Training with Accelerator and Trainer
In distributed training, each device trains on its own share of the data in parallel. You'll investigate two approaches: Accelerator enables custom training loops, while Trainer provides a simpler, higher-level interface for training.
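A minimal Trainer setup along these lines might look like the sketch below; the metric, argument values, and the reuse of the model and dataset from the earlier sketches are illustrative assumptions rather than the course's exact settings.

```python
import numpy as np
import evaluate
from transformers import Trainer, TrainingArguments

accuracy = evaluate.load("accuracy")  # evaluation metric

def compute_metrics(eval_pred):
    # Convert raw logits into class predictions before scoring.
    logits, labels = eval_pred
    return accuracy.compute(predictions=np.argmax(logits, axis=-1), references=labels)

training_args = TrainingArguments(
    output_dir="./results",          # where checkpoints are written
    per_device_train_batch_size=16,  # per device: each GPU sees its own shard of each batch
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,                     # model and tokenized dataset from the earlier sketches
    args=training_args,
    train_dataset=tokenized,
    eval_dataset=tokenized,
    compute_metrics=compute_metrics,
)
trainer.train()
```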
- Fine-tune models with Trainer (50 xp)
- Define evaluation metrics (100 xp)
- Specify the TrainingArguments (100 xp)
- Set up the Trainer (100 xp)
- Train models with Accelerator (50 xp)
- Prepare a model for distributed training (100 xp)
- Training loops before and after Accelerator (100 xp)
- Building a training loop with Accelerator (100 xp)
- Evaluate models with Accelerator (50 xp)
- Setting the model in evaluation mode (100 xp)
- Logging evaluation metrics (100 xp)
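The training-loop and evaluation lessons listed above build toward a pattern roughly like this sketch of a custom loop with Accelerator, again reusing the placeholder model and dataloader from the earlier sketches.

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # illustrative optimizer

# One call wraps everything the loop touches for the current hardware setup.
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

model.train()
for batch in dataloader:
    optimizer.zero_grad()
    outputs = model(**batch)               # transformers models return an object with a .loss attribute
    accelerator.backward(outputs.loss)     # replaces loss.backward() so gradients sync across devices
    optimizer.step()

model.eval()                               # disable dropout and similar layers before evaluation
with torch.no_grad():
    for batch in dataloader:
        predictions = model(**batch).logits.argmax(dim=-1)
```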
3. Improving Training Efficiency
Distributed training strains resources with large models and datasets, but you can address these challenges by improving memory usage, device communication, and computational efficiency. You'll discover the techniques of gradient accumulation, gradient checkpointing, local stochastic gradient descent, and mixed precision training.
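For example, gradient accumulation with Accelerator can be expressed with its accumulate context manager, roughly as in this sketch; the accumulation step count of 4 is an illustrative choice, and the model, optimizer, and dataloader are placeholders from the earlier sketches.

```python
from accelerate import Accelerator

# Accumulate gradients over 4 batches before each update,
# simulating a 4x larger batch size without the extra memory.
accelerator = Accelerator(gradient_accumulation_steps=4)
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

model.train()
for batch in dataloader:
    with accelerator.accumulate(model):
        outputs = model(**batch)
        accelerator.backward(outputs.loss)
        optimizer.step()        # only applies a real update on every 4th batch
        optimizer.zero_grad()
```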
- Gradient accumulation (50 xp)
- Gradient accumulation with Accelerator (100 xp)
- Gradient accumulation with Trainer (100 xp)
- Gradient checkpointing and local SGD (50 xp)
- Gradient checkpointing with Accelerator (100 xp)
- Gradient checkpointing with Trainer (100 xp)
- Local SGD with Accelerator (100 xp)
- Mixed precision training (50 xp)
- Mixed precision training with basic PyTorch (100 xp)
- Mixed precision training with Accelerator (100 xp)
- Mixed precision training with Trainer (100 xp)
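Gradient checkpointing, covered in the lessons above, trades extra forward compute for lower activation memory; with a transformers model it can be switched on with a single call, as in this sketch, which again reuses the placeholder objects from the earlier sketches.

```python
from accelerate import Accelerator

# Recompute activations during the backward pass instead of storing them all,
# trading extra forward compute for a smaller activation memory footprint.
model.gradient_checkpointing_enable()      # available on transformers models

accelerator = Accelerator()
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

model.train()
for batch in dataloader:
    outputs = model(**batch)
    accelerator.backward(outputs.loss)
    optimizer.step()
    optimizer.zero_grad()
```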
4. Training with Efficient Optimizers
You'll focus on optimizers as levers to improve distributed training efficiency, highlighting the tradeoffs between AdamW, Adafactor, and 8-bit Adam. Tracking fewer optimizer parameters or storing them in lower precision helps decrease a model's memory footprint.
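As a rough illustration of that tradeoff, the sketch below sets up AdamW and Adafactor side by side and adds a small helper for estimating optimizer state size; the learning rates are illustrative, and the helper is a hypothetical utility rather than a course function.

```python
import torch
from transformers import Adafactor

# AdamW keeps two full-precision moment tensors per parameter.
adamw = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Adafactor factorizes the second moment, so its state is far smaller.
adafactor = Adafactor(
    model.parameters(),
    lr=1e-3,
    scale_parameter=False,   # use the fixed learning rate above
    relative_step=False,     # instead of Adafactor's internal schedule
)

def optimizer_state_bytes(optimizer):
    # Hypothetical helper: sums the sizes of all tensors the optimizer keeps as state.
    # Note that the state dict is only populated after the first optimizer.step().
    return sum(
        t.numel() * t.element_size()
        for state in optimizer.state.values()
        for t in state.values()
        if torch.is_tensor(t)
    )
```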
- Balanced training with AdamW (50 xp)
- AdamW with Trainer (100 xp)
- AdamW with Accelerator (100 xp)
- Compute the optimizer size (100 xp)
- Memory-efficient training with Adafactor (50 xp)
- Adafactor with Trainer (100 xp)
- Adafactor with Accelerator (100 xp)
- Mixed precision training with 8-bit Adam (50 xp)
- Set up the 8-bit Adam optimizer (100 xp)
- 8-bit Adam with Trainer (100 xp)
- 8-bit Adam with Accelerator (100 xp)
- Which optimizer is it? (100 xp)
- Congratulations! (50 xp)
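The 8-bit Adam lessons above presumably rely on an 8-bit optimizer implementation such as the one in the bitsandbytes package; the sketch below shows one way it could be plugged into Trainer, with the package choice, model, and dataset reuse all being illustrative assumptions.

```python
import bitsandbytes as bnb
from transformers import Trainer, TrainingArguments

# 8-bit Adam keeps its moment estimates in 8-bit precision,
# shrinking optimizer memory by roughly 4x versus 32-bit AdamW.
adam_8bit = bnb.optim.Adam8bit(model.parameters(), lr=2e-5)

# Trainer accepts a custom optimizer through the `optimizers` tuple (optimizer, lr_scheduler).
trainer = Trainer(
    model=model,                                   # model and dataset from the earlier sketches
    args=TrainingArguments(output_dir="./results-8bit"),
    train_dataset=tokenized,
    optimizers=(adam_8bit, None),                  # None lets Trainer create a default scheduler
)
trainer.train()
```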


Dennis, Software Engineer at Amazon
Dennis is passionate about simplifying science and technology for everyone. He is a software engineer at Amazon, optimizing supply chain networks. He has experience across software engineering, data science, and data engineering in various industries from management consulting to operations. He earned his Ph.D. in Electrical and Computer Engineering.
Join over 18 million learners and start Efficient AI Model Training with PyTorch today!