Course

Reinforcement Learning with Gymnasium in Python

Skill level: Advanced

Updated September 2024

Start your reinforcement learning journey! Learn how agents learn to solve environments through interaction.

Python, Artificial Intelligence

4 hours

15 videos

52 exercises

4,400 XP

12,915

Statement of Accomplishment

Course Description

Prerequisites

Supervised Learning with scikit-learn, Python Toolbox, Introduction to NumPy

1

Introduction to Reinforcement Learning

Dive into the exciting world of Reinforcement Learning (RL) by exploring its foundational concepts, roles, and applications. Navigate through the RL framework, uncovering the agent-environment interaction. You'll also learn how to use the Gymnasium library to create environments, visualize states, and perform actions, thus gaining a practical foundation in RL concepts and applications.

Fundamentals of reinforcement learning

What is Reinforcement Learning?

RL vs. other ML sub-domains

Scenarios for applying RL

Navigating the RL framework

RL interaction loop

Episodic and continuous RL tasks

Calculating discounted returns for agent strategies

Interacting with Gymnasium environments

Setting up a Mountain Car environment

Visualizing the Mountain Car Environment

Interacting with the Frozen Lake environment
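
As a small illustration of the discounted returns covered in this chapter, the sketch below computes the return of one episode from a list of rewards. The function name `discounted_return` and the sample rewards are assumptions made for this example, not taken from the course exercises.

```python
def discounted_return(rewards, gamma):
    """Return sum over t of gamma**t * rewards[t] for a single episode."""
    total = 0.0
    # Walk backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: rewards received at steps 0, 1 and 2.
rewards = [1.0, 0.0, 2.0]
print(discounted_return(rewards, gamma=0.5))  # 1 + 0.5*0 + 0.25*2 = 1.5
```

A discount factor close to 0 makes the agent favor immediate rewards, while a value close to 1 weights later rewards almost as much as early ones.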

2

Model-Based Learning

Delve deeper into the world of RL, focusing on model-based learning. Unravel the complexities of Markov Decision Processes (MDPs), understanding their essential components. Enhance your skill set by learning about policies and value functions. Gain expertise in policy optimization with policy iteration and value iteration techniques.

Markov Decision Processes

Custom Frozen Lake MDP components

Exploring state and action spaces

Transition probabilities and rewards

Policies and state-value functions

Defining a deterministic policy

Computing state-values for a policy

Comparing policies

Action-value functions

Computing Q-values

Improving a policy

Policy iteration and value iteration

Applying policy iteration for optimal policy

Implementing value iteration
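
To make the value iteration topic from this chapter concrete, here is a minimal sketch on a hand-written three-state MDP. The MDP itself, the helper names `q_value` and `value_iteration`, and the `(probability, next_state, reward, done)` transition layout are assumptions chosen for illustration; the course exercises may structure this differently.

```python
# Hypothetical chain MDP: state 0 -> state 1 -> state 2 (terminal).
# Action 0 = stay, action 1 = move right.
# P[state][action] is a list of (probability, next_state, reward, done) tuples.
P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 0.0, False)]},
    1: {0: [(1.0, 1, 0.0, False)], 1: [(1.0, 2, 1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},
}


def q_value(P, V, state, action, gamma):
    """Expected return of taking `action` in `state`, then following V."""
    return sum(
        prob * (reward + gamma * V[next_state] * (not done))
        for prob, next_state, reward, done in P[state][action]
    )


def value_iteration(P, gamma=0.9, tol=1e-8):
    """Apply the Bellman optimality update until values change by less than tol."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(P, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma)) for s in P}
    return V, policy


V, policy = value_iteration(P)
print(V)       # state 1 is worth about 1.0, state 0 about 0.9
print(policy)  # states 0 and 1 both choose action 1 (move right)
```

Policy iteration reaches the same greedy policy by alternating full policy evaluation with policy improvement, whereas this sketch folds both into a single max over actions at every sweep.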

3

Model-Free Learning

Embark on a journey through the dynamic realm of Model-Free Learning in RL. Get introduced to the foundational Monte Carlo methods, and apply first-visit and every-visit Monte Carlo prediction algorithms. Transition into the world of Temporal Difference Learning, exploring the SARSA algorithm. Finally, dive into the depths of Q-Learning, and analyze its convergence in challenging environments.

Monte Carlo methods

Episode generation for Monte Carlo methods

Implementing first-visit Monte Carlo

Implementing every-visit Monte Carlo

Temporal difference learning

Implementing the SARSA update rule

Solving 8x8 Frozen Lake with SARSA

Implementing Q-learning update rule

Solving 8x8 Frozen Lake with Q-learning

Evaluating policy on a slippery Frozen Lake
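
The two temporal difference update rules named in this chapter differ only in how they bootstrap from the next state. The sketch below shows both on a toy Q-table stored as a list of lists; the function names, argument order, and numbers are assumptions for illustration and need not match the course's own code.

```python
def sarsa_update(Q, state, action, reward, next_state, next_action, alpha, gamma):
    """On-policy TD update: bootstrap from the action actually taken next."""
    target = reward + gamma * Q[next_state][next_action]
    Q[state][action] += alpha * (target - Q[state][action])


def q_learning_update(Q, state, action, reward, next_state, alpha, gamma):
    """Off-policy TD update: bootstrap from the best action in the next state."""
    target = reward + gamma * max(Q[next_state])
    Q[state][action] += alpha * (target - Q[state][action])


# Toy Q-tables with 2 states and 2 actions (values chosen only for illustration).
Q_sarsa = [[0.0, 0.0], [1.0, 3.0]]
Q_qlearn = [[0.0, 0.0], [1.0, 3.0]]

# Same transition for both: state 0, action 1, reward 1.0, next state 1.
sarsa_update(Q_sarsa, 0, 1, 1.0, 1, 0, alpha=0.5, gamma=0.9)  # next action is 0
q_learning_update(Q_qlearn, 0, 1, 1.0, 1, alpha=0.5, gamma=0.9)

# SARSA moves Q[0][1] to about 0.95, Q-learning to about 1.85.
print(Q_sarsa[0][1], Q_qlearn[0][1])
```

For brevity the sketch omits terminal-state handling; in a full training loop the bootstrap term would normally be dropped when the episode ends.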

4

Advanced Strategies in Model-Free RL

Dive into advanced strategies in Model-Free RL, focusing on enhancing decision-making algorithms. Learn about Expected SARSA for more accurate policy updates and Double Q-learning to mitigate overestimation bias. Explore the Exploration-Exploitation Tradeoff, mastering epsilon-greedy and epsilon-decay strategies for optimal action selection. Tackle the Multi-Armed Bandit Problem, applying strategies to solve decision-making challenges under uncertainty.

Expected SARSA

Expected SARSA update rule

Applying Expected SARSA

Double Q-learning

Implementing double Q-learning update rule

Applying double Q-learning

Balancing exploration and exploitation

Defining epsilon-greedy function

Solving CliffWalking with epsilon greedy strategy

Solving CliffWalking with decayed epsilon-greedy strategy

Multi-armed bandits

Creating a multi-armed bandit

Solving a multi-armed bandit

Assessing convergence in a multi-armed bandit

Congratulations!
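
As a brief illustration of two ideas from this chapter, the sketch below pairs epsilon-greedy action selection with an Expected SARSA target. It assumes the expectation is taken under the same epsilon-greedy policy, which is one common formulation and may differ from the variant used in the course; the function names and the `rng` parameter are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def expected_sarsa_target(q_next, reward, gamma, epsilon):
    """TD target using the expected next Q-value under an epsilon-greedy policy."""
    n = len(q_next)
    greedy = max(range(n), key=lambda a: q_next[a])
    # Every action gets epsilon / n; the greedy action gets the remaining mass.
    probs = [epsilon / n + (1.0 - epsilon if a == greedy else 0.0) for a in range(n)]
    expected = sum(p * q for p, q in zip(probs, q_next))
    return reward + gamma * expected


# With epsilon = 0 the choice is always greedy: action 1 has the highest value.
print(epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0))  # 1

# Probabilities are [0.25, 0.75], so the expected next value is 2.5.
print(expected_sarsa_target([1.0, 3.0], reward=0.0, gamma=1.0, epsilon=0.5))  # 2.5
```

A decayed epsilon-greedy strategy reuses the same selection function and simply shrinks `epsilon` between episodes, for example by multiplying it by a constant slightly below 1 down to some minimum.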
