メインコンテンツへスキップ

ホーム Python

コース

Pythonで学ぶ教師なし学習

中級スキルレベル

更新日 2025/12

scikit-learnおよびscipyを用いて、ラベル付けされていないデータセットからクラスタリング、変換、可視化を行い、そこから知見を抽出する方法について学びましょう。

コースを無料で開始

PythonMachine Learning

4時間

13 ビデオ

52 演習

4,150 XP

170K+

修了証明書

何千もの企業の従業員が支持

チームのトレーニングを担当していますか？

Businessをお試しください

コース説明

たとえば、年齢・居住地・与信情報などさまざまな属性をもつ顧客データからパターンを見つけ、クラスタに分けたいとします。あるいは、Wikipediaのページのようなテキスト集合を、内容にもとづいてカテゴリに分割したい場合もあります。これは教師なし学習の世界です。特定の予測タスクで発見を誘導（監督）するのではなく、ラベルのないデータから潜在的な構造を見いだします。教師なし学習には、クラスタリング、次元削減、行列因子分解など、機械学習のさまざまな手法が含まれます。本コースでは、教師なし学習の基礎を学び、scikit-learn と SciPy を使って主要なアルゴリズムを実装します。ラベルなしデータセットのクラスタリング、変換、可視化、洞察の抽出を行い、最後は人気の音楽アーティストを推薦するレコメンダシステムを構築します。動画にはライブ字幕が含まれており、動画左下の"Show transcript"をクリックすると表示できます。コース用語集は、右側のリソースセクションにあります。CPEクレジットを取得するには、コースを修了し、認定アセスメントで70%のスコアに到達する必要があります。右側のCPEクレジットの案内をクリックするとアセスメントに移動できます。

前提条件

Supervised Learning with scikit-learn

1

Clustering for Dataset Exploration

Learn how to discover the underlying groups (or "clusters") in a dataset. By the end of this chapter, you'll be clustering companies using their stock market prices, and distinguishing different species by clustering their measurements.

Unsupervised Learning

How many clusters?

Clustering 2D points

Inspect your clustering

Evaluating a clustering

How many clusters of grain?

Evaluating the grain clustering

Transforming features for better clusterings

Scaling fish data for clustering

Clustering the fish data

Clustering stocks using KMeans

Which stocks move together?

チャプターを開始

2

Visualization with Hierarchical Clustering and t-SNE

In this chapter, you'll learn about two unsupervised learning techniques for data visualization, hierarchical clustering and t-SNE. Hierarchical clustering merges the data samples into ever-coarser clusters, yielding a tree visualization of the resulting cluster hierarchy. t-SNE maps the data samples into 2d space so that the proximity of the samples to one another can be visualized.

Visualizing hierarchies

How many merges?

Hierarchical clustering of the grain data

Hierarchies of stocks

Cluster labels in hierarchical clustering

Which clusters are closest?

Different linkage, different hierarchical clustering!

Intermediate clusterings

Extracting the cluster labels

t-SNE for 2-dimensional maps

t-SNE visualization of grain dataset

A t-SNE map of the stock market

チャプターを開始

3

Decorrelating Your Data and Dimension Reduction

Dimension reduction summarizes a dataset using its common occuring patterns. In this chapter, you'll learn about the most fundamental of dimension reduction techniques, "Principal Component Analysis" ("PCA"). PCA is often used before supervised learning to improve model performance and generalization. It can also be useful for unsupervised learning. For example, you'll employ a variant of PCA will allow you to cluster Wikipedia articles by their content!

Visualizing the PCA transformation

Correlated data in nature

Decorrelating the grain measurements with PCA

Principal components

Intrinsic dimension

The first principal component

Variance of the PCA features

Intrinsic dimension of the fish data

Dimension reduction with PCA

Dimension reduction of the fish measurements

A tf-idf word-frequency array

Clustering Wikipedia part I

Clustering Wikipedia part II

チャプターを開始

4

Discovering Interpretable Features

In this chapter, you'll learn about a dimension reduction technique called "Non-negative matrix factorization" ("NMF") that expresses samples as combinations of interpretable parts. For example, it expresses documents as combinations of topics, and images in terms of commonly occurring visual patterns. You'll also learn to use NMF to build recommender systems that can find you similar articles to read, or musical artists that match your listening history!

Non-negative matrix factorization (NMF)

Non-negative data

NMF applied to Wikipedia articles

NMF features of the Wikipedia articles

NMF reconstructs samples

NMF learns interpretable parts

NMF learns topics of documents

Explore the LED digits dataset

NMF learns the parts of images

PCA doesn't learn parts

Building recommender systems using NMF

Which articles are similar to 'Cristiano Ronaldo'?

Recommend musical artists part I

Recommend musical artists part II

Final thoughts

チャプターを開始

Pythonで学ぶ教師なし学習

コース完了

修了証明書を取得

この修了書をLinkedInや履歴書、CVに追加しましょう
ソーシャルメディアや人事評価で共有しましょう今すぐ登録

ビジネス向け

2名以上のトレーニングをお考えですか？

すべての機能を含む完全なDataCampプラットフォームにチームでアクセス。

以下のトラックに含まれます

アソシエイトデータサイエンティスト Pythonで認定

データサイエンティスト向けアソシエイトAIエンジニア認定

機械学習の基礎 Pythonで

機械学習サイエンティスト Pythonで

講師

Benjamin Wilson

Benjamin Wilson

Director of Research at lateral.io

協力者

コースリソース

Company stock price movementsデータセット

Eurovision 2016データセット

Fish measurementsデータセット

Grainsデータセット

LCD digitsデータセット

Musical artistsデータセット

Wikipedia articlesデータセット

Wineデータセット

Course Glossaryデータセット

19百万人を超える学習者と共にPythonで学ぶ教師なし学習を始めましょう！

DataCamp for Mobileでデータスキルを磨きましょう

モバイルコースと毎日の 5 分間のコーディングチャレンジで、外出先でも進歩できます。