
Course Description
Deep-Dive into the Transformer Architecture
Transformer models have revolutionized text modeling, kickstarting the generative AI boom by enabling today's large language models (LLMs). In this course, you'll look at the key components of this architecture, including positional encoding, attention mechanisms, and feed-forward sublayers. You'll code these components in a modular way to build your own transformer step by step.
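To give a flavor of what you'll build, here is a minimal sketch of a sinusoidal positional encoding module in PyTorch, following the formulation from "Attention Is All You Need". The class name, default sequence length, and tensor shapes are illustrative assumptions rather than the exact code used in the course.

```python
import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    """Sinusoidal positional encoding (sketch; assumes an even d_model)."""
    def __init__(self, d_model, max_seq_length=512):
        super().__init__()
        pe = torch.zeros(max_seq_length, d_model)
        position = torch.arange(0, max_seq_length, dtype=torch.float).unsqueeze(1)
        # Frequencies decrease geometrically across the embedding dimensions
        div_term = torch.exp(
            torch.arange(0, d_model, 2, dtype=torch.float) * (-math.log(10000.0) / d_model)
        )
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        # Stored as a buffer: moves with the model but is not a learnable parameter
        self.register_buffer("pe", pe.unsqueeze(0))

    def forward(self, x):
        # x: (batch_size, seq_length, d_model) token embeddings
        return x + self.pe[:, : x.size(1)]
```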
Implement Attention Mechanisms with PyTorch

The attention mechanism is a key development that helped formalize the transformer architecture. Self-attention allows transformers to better identify relationships between tokens, which improves the quality of generated text. Learn how to create a multi-head attention mechanism class that will form a key building block in your transformer models.
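As a preview of that building block, here is one common way to sketch a multi-head attention class: project queries, keys, and values, split the model dimension across heads, apply scaled dot-product attention, and recombine. Method names, the mask convention, and argument order are assumptions for illustration, not necessarily the course's exact implementation.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model, num_heads):
        super().__init__()
        assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
        self.num_heads = num_heads
        self.head_dim = d_model // num_heads
        self.query_linear = nn.Linear(d_model, d_model)
        self.key_linear = nn.Linear(d_model, d_model)
        self.value_linear = nn.Linear(d_model, d_model)
        self.output_linear = nn.Linear(d_model, d_model)

    def split_heads(self, x, batch_size):
        # (batch, seq, d_model) -> (batch, num_heads, seq, head_dim)
        return x.view(batch_size, -1, self.num_heads, self.head_dim).transpose(1, 2)

    def forward(self, query, key, value, mask=None):
        batch_size = query.size(0)
        q = self.split_heads(self.query_linear(query), batch_size)
        k = self.split_heads(self.key_linear(key), batch_size)
        v = self.split_heads(self.value_linear(value), batch_size)

        # Scaled dot-product attention scores: (batch, num_heads, seq, seq)
        scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(self.head_dim)
        if mask is not None:
            # Positions where mask == 0 are blocked from attending
            scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = F.softmax(scores, dim=-1)

        # Weighted sum of values, then recombine heads into d_model
        context = torch.matmul(weights, v).transpose(1, 2).contiguous()
        context = context.view(batch_size, -1, self.num_heads * self.head_dim)
        return self.output_linear(context)
```

For self-attention, the same tensor is passed as the query, key, and value inputs.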
Build Your Own Transformer Models

Learn to build encoder-only, decoder-only, and encoder-decoder transformer models, and how to choose and code these architectures for different language tasks, including text classification and sentiment analysis, text generation and completion, and sequence-to-sequence translation.
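For instance, an encoder-only model is a natural fit for classification: encode the sequence, pool the token representations, and attach a classification head. A rough sketch using PyTorch's built-in encoder layers follows; the hyperparameter values and mean pooling are illustrative choices.

```python
import torch
import torch.nn as nn

class EncoderClassifier(nn.Module):
    """Encoder-only transformer with a simple classification head (sketch)."""
    def __init__(self, vocab_size, d_model=128, num_heads=4, num_layers=2, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, token_ids):
        x = self.embedding(token_ids)           # (batch, seq, d_model)
        x = self.encoder(x)                     # contextualized token representations
        return self.classifier(x.mean(dim=1))   # pool over the sequence, then classify

# Example: classify a batch of two 8-token sequences
model = EncoderClassifier(vocab_size=1000)
logits = model(torch.randint(0, 1000, (2, 8)))
print(logits.shape)  # torch.Size([2, 2])
```

A decoder-only variant instead uses causally masked self-attention and a language-modeling head for generation, while an encoder-decoder variant adds cross-attention for sequence-to-sequence tasks such as translation.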
1. The Building Blocks of Transformer Models (Free)

Discover what makes the hottest deep learning architecture in AI tick! Learn about the components that make up transformer models, including the famous self-attention mechanisms described in the renowned paper "Attention Is All You Need."
- Transformers with PyTorch (50 xp)
- Breaking down the Transformer (50 xp)
- PyTorch Transformers (100 xp)
- Embedding and positional encoding (50 xp)
- Creating input embeddings (100 xp)
- Creating positional encodings (100 xp)
- Multi-head self-attention (50 xp)
- Implementing multi-head attention (100 xp)
- Starting the MultiHeadAttention class (100 xp)
- Adding methods to the MultiHeadAttention class (100 xp)
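The first chapter's exercises revolve around input embeddings and positional encodings. A minimal sketch of scaled token embeddings, to which the positional encoding sketched earlier can then be added, might look like this; the scaling by the square root of d_model follows the original paper, and the class and variable names are illustrative.

```python
import math
import torch
import torch.nn as nn

class InputEmbeddings(nn.Module):
    """Token embeddings scaled by sqrt(d_model) (sketch)."""
    def __init__(self, vocab_size, d_model):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, d_model)
        self.d_model = d_model

    def forward(self, token_ids):
        # token_ids: (batch_size, seq_length) of integer token indices
        return self.embedding(token_ids) * math.sqrt(self.d_model)

# Example usage, combined with the PositionalEncoding sketched earlier:
# embed = InputEmbeddings(vocab_size=10_000, d_model=512)
# pos_enc = PositionalEncoding(d_model=512)
# x = pos_enc(embed(torch.randint(0, 10_000, (2, 16))))
```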
2. Building Transformer Architectures (Free)

Design transformer encoder and decoder blocks, and combine them with positional encoding, multi-head attention, and position-wise feed-forward networks to build your very own transformer architectures. Along the way, you'll develop a deep understanding of, and appreciation for, how transformers work under the hood.
- Encoder transformers (50 xp)
- Feed-forward sublayers (100 xp)
- The encoder transformer layer (100 xp)
- The encoder transformer body (100 xp)
- Adding the transformer head (100 xp)
- Decoder transformers (50 xp)
- Designing a mask for self-attention (100 xp)
- The decoder layer (100 xp)
- Completing the decoder transformer (100 xp)
- Encoder-decoder transformers (50 xp)
- Adding cross-attention to the decoder layer (100 xp)
- Constructing the encoder-decoder transformer (100 xp)
- Congratulations! (50 xp)
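One of the decoder lessons above involves designing a mask for self-attention. A minimal sketch of a causal (look-ahead) mask, which stops each position from attending to later tokens, could look like the following; the convention used here (1 = attend, 0 = block) matches the MultiHeadAttention sketch earlier and is an assumption, not necessarily the course's.

```python
import torch

def causal_mask(seq_length):
    # Lower-triangular matrix: position i may only attend to positions 0..i
    return torch.tril(torch.ones(seq_length, seq_length, dtype=torch.bool))

print(causal_mask(4).int())
# tensor([[1, 0, 0, 0],
#         [1, 1, 0, 0],
#         [1, 1, 1, 0],
#         [1, 1, 1, 1]], dtype=torch.int32)
```

Passing a mask like this to the attention sketch above fills the blocked positions with negative infinity before the softmax, so each generated token can only depend on earlier tokens.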


Prerequisites

Deep Learning for Text with PyTorch

James, AI Curriculum Manager, DataCamp
James is a Curriculum Manager at DataCamp, where he collaborates with experts from industry and academia to create courses on AI, data science, and analytics. He has led nine DataCamp courses on diverse topics in Python, R, AI developer tooling, and Google Sheets. He has a Master's degree in Physics and Astronomy from Durham University, where he specialized in high-redshift quasar detection. In his spare time, he enjoys restoring retro toys and electronics.
Follow James on LinkedIn
Join over 17 million learners and start Transformer Models with PyTorch today!