Course

Unsupervised Learning in Python

Intermediate skill level

Updated December 2025

Learn how to cluster, transform, visualize, and extract insights from unlabeled datasets using scikit-learn and SciPy.

Python, Machine Learning

4 hours

13 videos

52 exercises

4,150 XP

170K+

Statement of Accomplishment

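Course Description

Say you have a collection of customers with a variety of characteristics such as age, location, and financial history, and you wish to discover patterns and sort them into clusters. Or perhaps you have a set of texts, such as Wikipedia pages, and you wish to segment them into categories based on their content. This is the world of unsupervised learning, so called because you are not guiding, or supervising, the pattern discovery with a prediction task, but instead uncovering hidden structure from unlabeled data. Unsupervised learning encompasses a variety of machine learning techniques, including clustering, dimension reduction, and matrix factorization. In this course, you'll learn the fundamentals of unsupervised learning and implement the essential algorithms using scikit-learn and SciPy. You will learn how to cluster, transform, visualize, and extract insights from unlabeled datasets, and end the course by building a recommender system to recommend popular musical artists.

The videos contain live transcripts that you can reveal by clicking "Show transcript" at the bottom left of the videos. The course glossary can be found on the right in the resources section.

To obtain CPE credits, you need to complete the course and score at least 70% on the qualified assessment. You can navigate to the assessment by clicking on the CPE credits callout on the right.

Prerequisites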

Supervised Learning with scikit-learn

1

Clustering for Dataset Exploration

Learn how to discover the underlying groups (or "clusters") in a dataset. By the end of this chapter, you'll be clustering companies using their stock market prices, and distinguishing different species by clustering their measurements.
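As a rough illustration of the kind of clustering this chapter covers, here is a minimal sketch using scikit-learn's `KMeans`. The points are synthetic and the variable names (`points`, `centres`) are made up for this example; they are not the course's datasets or solutions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 2D points scattered around three centres
# (illustrative data only, not one of the course datasets).
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centres])

# Fit k-means with three clusters and assign every sample a cluster label.
model = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = model.fit_predict(points)

# Inertia is the sum of squared distances from samples to their nearest
# centroid; lower values mean tighter clusters.
print(model.cluster_centers_)
print(model.inertia_)
```

Refitting with several values of `n_clusters` and comparing `inertia_` is one common way to judge how many clusters a dataset supports.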

Unsupervised Learning

How many clusters?

Clustering 2D points

Inspect your clustering

Evaluating a clustering

How many clusters of grain?

Evaluating the grain clustering

Transforming features for better clusterings

Scaling fish data for clustering

Clustering the fish data

Clustering stocks using KMeans

Which stocks move together?

2

Visualization with Hierarchical Clustering and t-SNE

In this chapter, you'll learn about two unsupervised learning techniques for data visualization: hierarchical clustering and t-SNE. Hierarchical clustering merges the data samples into ever-coarser clusters, yielding a tree visualization of the resulting cluster hierarchy. t-SNE maps the data samples into 2D space so that the proximity of the samples to one another can be visualized.
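The two techniques can be sketched on toy data as follows. This assumes SciPy's `linkage` and `fcluster` for the hierarchy and scikit-learn's `TSNE` for the 2D map; the samples are synthetic, and the `perplexity` and `learning_rate` values are arbitrary choices for a small example, not recommendations from the course.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.manifold import TSNE

# Two synthetic groups of 4-dimensional samples (illustrative data only).
rng = np.random.default_rng(0)
samples = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(20, 4)),
    rng.normal(loc=3.0, scale=0.3, size=(20, 4)),
])

# Agglomerative clustering with complete linkage. Each row of `mergings`
# records one merge, so n samples produce n - 1 rows.
mergings = linkage(samples, method="complete")

# Cut the hierarchy so that at most two flat clusters remain.
hier_labels = fcluster(mergings, t=2, criterion="maxclust")

# Map the samples into 2D with t-SNE; only relative proximity in the
# resulting map is meaningful, not the axes themselves.
tsne = TSNE(n_components=2, perplexity=10, learning_rate=100, random_state=0)
embedded = tsne.fit_transform(samples)

print(mergings.shape, embedded.shape)
```

Passing `mergings` to `scipy.cluster.hierarchy.dendrogram` (with matplotlib installed) would draw the tree, and a scatter plot of `embedded` would show the t-SNE map.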

Visualizing hierarchies

How many merges?

Hierarchical clustering of the grain data

Hierarchies of stocks

Cluster labels in hierarchical clustering

Which clusters are closest?

Different linkage, different hierarchical clustering!

Intermediate clusterings

Extracting the cluster labels

t-SNE for 2-dimensional maps

t-SNE visualization of grain dataset

A t-SNE map of the stock market

3

Decorrelating Your Data and Dimension Reduction

Dimension reduction summarizes a dataset using its commonly occurring patterns. In this chapter, you'll learn about the most fundamental of dimension reduction techniques, "Principal Component Analysis" ("PCA"). PCA is often used before supervised learning to improve model performance and generalization. It can also be useful for unsupervised learning. For example, you'll employ a variant of PCA that will allow you to cluster Wikipedia articles by their content!
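To make the decorrelation and dimension reduction ideas concrete, here is a small sketch with scikit-learn's `PCA`. The two features are synthetic stand-ins (the names `width` and `length` are made up for this example), and the sketch uses plain PCA rather than the variant mentioned above.

```python
import numpy as np
from sklearn.decomposition import PCA

# Two strongly correlated synthetic features (illustrative data only).
rng = np.random.default_rng(0)
width = rng.normal(loc=3.0, scale=0.5, size=200)
length = 2.0 * width + rng.normal(scale=0.1, size=200)
measurements = np.column_stack([width, length])

# Fit PCA and rotate the samples onto the principal components.
pca = PCA()
pca_features = pca.fit_transform(measurements)

# The original features are correlated; the PCA features are not.
corr_before = np.corrcoef(measurements, rowvar=False)[0, 1]
corr_after = np.corrcoef(pca_features, rowvar=False)[0, 1]
print(corr_before, corr_after)

# The share of variance carried by each component hints at the
# intrinsic dimension of the data.
print(pca.explained_variance_ratio_)

# Dimension reduction: keep only the highest-variance component.
reduced = PCA(n_components=1).fit_transform(measurements)
print(reduced.shape)
```

Here nearly all of the variance lies along the first component, so dropping the second loses very little information.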

Visualizing the PCA transformation

Correlated data in nature

Decorrelating the grain measurements with PCA

Principal components

Intrinsic dimension

The first principal component

Variance of the PCA features

Intrinsic dimension of the fish data

Dimension reduction with PCA

Dimension reduction of the fish measurements

A tf-idf word-frequency array

Clustering Wikipedia part I

Clustering Wikipedia part II

4

Discovering Interpretable Features

In this chapter, you'll learn about a dimension reduction technique called "Non-negative matrix factorization" ("NMF") that expresses samples as combinations of interpretable parts. For example, it expresses documents as combinations of topics, and images in terms of commonly occurring visual patterns. You'll also learn to use NMF to build recommender systems that can find you similar articles to read, or musical artists that match your listening history!
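The sketch below shows one way these ideas fit together, assuming scikit-learn's `NMF` and cosine similarity on the resulting features. The word-count matrix and the document names are invented for this example, and the similarity-based lookup is a simplified illustration rather than the course's recommender.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.preprocessing import normalize

# Tiny made-up word-count matrix: rows are documents, columns are words.
# doc_a and doc_b use the first three words; doc_c and doc_d the last three.
doc_names = ["doc_a", "doc_b", "doc_c", "doc_d"]
word_counts = np.array([
    [3.0, 2.0, 4.0, 0.0, 0.0, 0.0],
    [2.0, 3.0, 3.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 6.0, 1.0, 2.0],
    [0.0, 0.0, 0.0, 5.0, 2.0, 1.0],
])

# Factorize into non-negative document features and topic components.
model = NMF(n_components=2, init="nndsvd", max_iter=500, random_state=0)
nmf_features = model.fit_transform(word_counts)  # shape: documents x topics
components = model.components_                   # shape: topics x words

# Multiplying features by components approximately reconstructs the samples.
reconstruction = nmf_features @ components

# Recommend by cosine similarity: normalize each row to unit length,
# then take dot products with the query document's features.
norm_features = normalize(nmf_features)
similarities = norm_features @ norm_features[0]
similarities[0] = -np.inf  # exclude the query document itself
most_similar = doc_names[int(np.argmax(similarities))]
print(most_similar)
```

Because both factors are non-negative, each row of `components` can be read as a "topic" made of word weights, and each document as a mix of those topics.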

Non-negative matrix factorization (NMF)

Non-negative data

NMF applied to Wikipedia articles

NMF features of the Wikipedia articles

NMF reconstructs samples

NMF learns interpretable parts

NMF learns topics of documents

Explore the LED digits dataset

NMF learns the parts of images

PCA doesn't learn parts

Building recommender systems using NMF

Which articles are similar to 'Cristiano Ronaldo'?

Recommend musical artists part I

Recommend musical artists part II

Final thoughts

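Included in the following tracks

Associate Data Scientist in Python (certificate)

Associate AI Engineer for Data Scientists (certificate)

Machine Learning Fundamentals in Python

Machine Learning Scientist in Python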

Instructor

Benjamin Wilson

Director of Research at lateral.io

Course Resources

Company stock price movements (dataset)

Eurovision 2016 (dataset)

Fish measurements (dataset)

Grains (dataset)

LCD digits (dataset)

Musical artists (dataset)

Wikipedia articles (dataset)

Wine (dataset)

Course Glossary (dataset)
