Deep Reinforcement Learning in Python

Advanced skill level

Last updated: September 2024

Learn powerful Deep Reinforcement Learning algorithms and put them into practice, including refinement and optimization techniques.

Prerequisites

Intermediate Deep Learning with PyTorch
Reinforcement Learning with Gymnasium in Python

Introduction to Deep Reinforcement Learning

Discover how deep reinforcement learning improves upon traditional reinforcement learning while studying and implementing your first Deep Q-learning algorithm.
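
To give a feel for what a Deep Q-learning loss involves, here is a minimal sketch in plain Python, not the course's own code. It assumes the usual one-step Q-learning target and a squared-error loss. The function names `td_target` and `squared_td_error`, and the example Q-values, are hypothetical. In practice the Q-values would come from a neural network rather than hand-written lists.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    No bootstrapping happens when the episode has ended.
    """
    bootstrap = 0.0 if done else gamma * max(next_q_values)
    return reward + bootstrap


def squared_td_error(q_value, target):
    """Squared difference between the predicted Q(s, a) and the TD target."""
    return (q_value - target) ** 2


# Hypothetical Q-values for an environment with two actions.
q_values = [0.5, 1.0]        # Q(s, .) for the current state
next_q_values = [2.0, 3.0]   # Q(s', .) for the next state
action, reward = 1, 1.0

target = td_target(reward, next_q_values, done=False, gamma=0.5)
loss = squared_td_error(q_values[action], target)
print(target, loss)
```

Minimizing this loss over many transitions nudges the predicted Q-value of the chosen action toward the reward plus the discounted best next-state value.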

Introduction to deep reinforcement learning

50 XP

Environment and neural network setup

100 XP

DRL training loop

100 XP

Introduction to deep Q learning

50 XP

Deep learning and DQN

50 XP

The Q-Network architecture

100 XP

Instantiating the Q-Network

100 XP

The barebone DQN algorithm

50 XP

Barebone DQN action selection

100 XP

Barebone DQN loss function

100 XP

Training the barebone DQN

100 XP

Deep Q-learning

Dive into Deep Q-learning by implementing the original DQN algorithm, featuring Experience Replay, epsilon-greediness and fixed Q-targets. Beyond DQN, you will then explore two fascinating extensions that improve the performance and stability of Deep Q-learning: Double DQN and Prioritized Experience Replay.
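
Two of the ingredients named above, Experience Replay and epsilon-greediness, can be sketched with the standard library alone. This is an illustrative sketch under assumptions, not the course's implementation. The class name `ReplayBuffer`, the helper `epsilon_greedy`, and the transition layout `(state, action, reward, next_state, done)` are all hypothetical choices.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        # A deque with maxlen discards items from the left once it is full.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks the correlation
        # between consecutive transitions.
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


buffer = ReplayBuffer(capacity=3)
for step in range(5):
    # Dummy transitions: the state is just the step index.
    buffer.push(step, 0, 1.0, step + 1, False)

batch = buffer.sample(2)
greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

With `epsilon=0.0` the agent always exploits. A common pattern is to start with a high epsilon and decay it during training, so exploration dominates early and exploitation dominates later.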

DQN with experience replay

50 XP

The double-ended queue

100 XP

Experience replay buffer

100 XP

DQN with experience replay

100 XP

The complete DQN algorithm

50 XP

Epsilon-greediness

100 XP

Fixed Q-targets

100 XP

Implementing the complete DQN algorithm

100 XP

Double DQN

50 XP

Online network and target network in DDQN

100 XP

Training the double DQN

100 XP

Prioritized experience replay

50 XP

Prioritized experience replay buffer

100 XP

Sampling from the PER buffer

100 XP

DQN with prioritized experience replay

100 XP
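
The difference between fixed Q-targets and Double DQN, both covered in this chapter, comes down to which network picks the next action. The sketch below assumes the common formulation in which the online network selects the action and the target network evaluates it. The course may present the roles differently. All function names and Q-values here are hypothetical.

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def dqn_target(reward, target_next_q, done, gamma=0.99):
    """Fixed Q-target: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(target_next_q)


def double_dqn_target(reward, online_next_q, target_next_q, done, gamma=0.99):
    """Double DQN target: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = argmax(online_next_q)
    return reward + gamma * target_next_q[best_action]


# Hypothetical next-state Q-values from the two networks.
online_next_q = [1.0, 2.0]
target_next_q = [4.0, 3.0]

dqn = dqn_target(0.0, target_next_q, done=False, gamma=0.5)
ddqn = double_dqn_target(0.0, online_next_q, target_next_q, done=False, gamma=0.5)
print(dqn, ddqn)
```

Because selection and evaluation are decoupled, the Double DQN target is never larger than the plain maximum over the target network's values, which is how it reduces overestimation.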

Introduction to Policy Gradient Methods

Learn about the foundational concepts of policy gradient methods found in DRL. You will begin with the policy gradient theorem, which forms the basis for these methods. Then, you will implement the REINFORCE algorithm, a powerful approach to learning policies. The chapter will then guide you through Actor-Critic methods, focusing on the Advantage Actor-Critic (A2C) algorithm, which combines the strengths of both policy gradient and value-based methods to enhance learning efficiency and stability.
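
As a rough illustration of the quantities behind REINFORCE and A2C, the sketch below computes discounted returns, a REINFORCE-style loss, and a one-step advantage in plain Python. It assumes the textbook forms of these formulas. The function names and example numbers are hypothetical. A real implementation would use log-probabilities from a policy network and values from a critic network.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, returns):
    """Negative sum of log pi(a_t | s_t) * G_t, to be minimized by gradient descent."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))


def td_advantage(reward, value, next_value, done, gamma=0.99):
    """One-step advantage estimate used by actor-critic: TD target minus V(s)."""
    target = reward if done else reward + gamma * next_value
    return target - value


rewards = [1.0, 1.0, 1.0]
returns = discounted_returns(rewards, gamma=0.5)

# Hypothetical policy that chose each action with probability 0.5.
log_probs = [math.log(0.5)] * 3
loss = reinforce_loss(log_probs, returns)

advantage = td_advantage(reward=1.0, value=2.0, next_value=4.0, done=False, gamma=0.5)
```

REINFORCE weights each log-probability by the full episode return. A2C replaces that weight with the advantage, which the critic can estimate after every step instead of waiting for the episode to end.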

Introduction to policy gradient

50 XP

The policy network architecture

100 XP

Working with discrete distributions

100 XP

Policy gradient and REINFORCE

50 XP

Action selection in REINFORCE

100 XP

Training the REINFORCE algorithm

100 XP

Advantage Actor Critic

50 XP

Critic network

100 XP

Actor Critic loss calculations

100 XP

Training the A2C algorithm

100 XP

Proximal Policy Optimization and DRL Tips

Explore Proximal Policy Optimization (PPO) for robust DRL performance. Next, you will examine using an entropy bonus in PPO, which encourages exploration by preventing premature convergence to deterministic policies. You'll also learn about batch updates in policy gradient methods. Finally, you will learn about hyperparameter optimization with Optuna, a powerful tool for optimizing performance in your DRL models.
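
The clipped objective and entropy bonus mentioned above can be sketched for a single sample as follows. This assumes the standard PPO clipped surrogate and Shannon entropy of a discrete action distribution. The names `clipped_surrogate`, `entropy`, and `ppo_loss`, and the coefficient `c_entropy`, are hypothetical, and the numbers are chosen only for illustration.

```python
import math


def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """PPO objective for one sample: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ppo_loss(ratio, advantage, probs, epsilon=0.2, c_entropy=0.01):
    """Loss to minimize: negative surrogate minus a weighted entropy bonus."""
    return -(clipped_surrogate(ratio, advantage, epsilon) + c_entropy * entropy(probs))


# The ratio is pi_new(a|s) / pi_old(a|s); values here are hypothetical.
gain_capped = clipped_surrogate(ratio=1.5, advantage=2.0)     # clipped at 1.2 * 2.0
penalty_kept = clipped_surrogate(ratio=1.5, advantage=-1.0)   # unclipped, pessimistic

uniform_entropy = entropy([0.5, 0.5])
deterministic_entropy = entropy([1.0, 0.0])

loss = ppo_loss(ratio=1.5, advantage=2.0, probs=[0.5, 0.5])
```

The clip limits how much a single update can profit from moving the policy away from the old one. The entropy term is largest for a uniform policy and zero for a deterministic one, so rewarding it discourages premature convergence.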

Proximal policy optimization

50 XP

The clipped probability ratio

100 XP

The clipped surrogate objective function

100 XP

Entropy bonus and PPO

50 XP

Entropy playground

100 XP

Training the PPO algorithm

100 XP

Batch updates in policy gradient

50 XP

Minibatch and DRL

50 XP

A2C with batch updates

100 XP

Hyperparameter optimization with Optuna

50 XP

Hyperparameter or not?

100 XP

Hands-on with Optuna

100 XP

Congratulations!

50 XP
