Dimensionality Reduction in Python

Skill Level: Intermediate

Updated 01/2023

Understand the concept of reducing dimensionality in your data, and master the techniques to do so in Python.

Python | Machine Learning | 4 hours | 16 videos | 58 exercises | 4,700 XP | 36,095 Statements of Accomplishment

Course Description

High-dimensional datasets can be overwhelming and leave you not knowing where to start. Typically, you'd explore a new dataset visually first, but when there are too many dimensions the classical approaches fall short. Fortunately, there are visualization techniques designed specifically for high-dimensional data, and you'll be introduced to them in this course. After exploring the data, you'll often find that many features hold little information, either because they show no variance or because they duplicate other features. You'll learn how to detect these features and drop them from the dataset so that you can focus on the informative ones. In a next step, you might want to build a model on these features, and it may turn out that some have no effect on the thing you're trying to predict. You'll learn how to detect and drop these irrelevant features too, in order to reduce dimensionality and, with it, complexity. Finally, you'll learn how feature extraction techniques can reduce dimensionality through the calculation of uncorrelated principal components.

Prerequisites

Supervised Learning with scikit-learn

1

Exploring High Dimensional Data

You'll be introduced to the concept of dimensionality reduction and will learn when and why it is important. You'll learn the difference between feature selection and feature extraction and will apply both techniques for data exploration. The chapter ends with a lesson on t-SNE, a powerful feature extraction technique that will allow you to visualize a high-dimensional dataset.

Introduction

Finding the number of dimensions in a dataset

Removing features without variance

Feature selection vs. feature extraction

Visually detecting redundant features

Advantage of feature selection

t-SNE visualization of high-dimensional data

t-SNE intuition

Fitting t-SNE to the ANSUR data

t-SNE visualisation of dimensionality
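
As a rough illustration of the t-SNE lessons listed above, the sketch below embeds a small synthetic dataset into two dimensions with scikit-learn's `TSNE`. The synthetic data, the perplexity value and the random seed are assumptions made for this example; the ANSUR data used in the course is not reproduced here.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical data: two well-separated groups of 30 samples in 10 dimensions.
group_a = rng.normal(loc=0.0, scale=1.0, size=(30, 10))
group_b = rng.normal(loc=8.0, scale=1.0, size=(30, 10))
X = np.vstack([group_a, group_b])

# t-SNE only offers fit_transform: it embeds the samples it is given
# and cannot project new, unseen samples afterwards.
# perplexity must be smaller than the number of samples (60 here).
tsne = TSNE(n_components=2, perplexity=10, random_state=0)
embedding = tsne.fit_transform(X)

# One 2-D point per original sample, ready for a scatter plot.
print(embedding.shape)
```

Plotting the two embedding columns against each other, colored by a categorical feature, is one way to check whether that feature explains the clusters you see.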

2

Feature Selection I - Selecting for Feature Information

In this first of two chapters on feature selection, you'll learn about the curse of dimensionality and how dimensionality reduction can help you overcome it. You'll be introduced to a number of techniques to detect and remove features that add little value to the dataset, either because they have little variance or too many missing values, or because they are strongly correlated with other features.

The curse of dimensionality

Train - test split

Fitting and testing the model

Accuracy after dimensionality reduction

Features with missing values or little variance

Finding a good variance threshold

Features with low variance

Removing features with many missing values

Pairwise correlation

Correlation intuition

Inspecting the correlation matrix

Visualizing the correlation matrix

Removing highly correlated features

Filtering out highly correlated features

Nuclear energy and pool drownings
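
The three filters covered in this chapter (missing values, low variance, high correlation) can be sketched with pandas alone. The column names, the toy data and the thresholds (50% missing, near-zero variance, absolute correlation above 0.95) are all assumptions chosen for illustration, not values prescribed by the course.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200

# Hypothetical dataset; every column name here is made up.
height = rng.normal(170, 10, n)
df = pd.DataFrame({
    "height_cm": height,
    "height_in": height / 2.54,          # same information in other units
    "weight_kg": rng.normal(70, 12, n),
    "constant": np.ones(n),              # no variance at all
    "mostly_missing": np.where(rng.random(n) < 0.8, np.nan, rng.normal(0, 1, n)),
})

# 1. Drop features where more than half of the values are missing.
missing_ratio = df.isna().mean()
kept = df.loc[:, missing_ratio <= 0.5]

# 2. Drop features with (near) zero variance.
#    With features on very different scales, normalize before comparing variances.
variances = kept.var()
kept = kept.loc[:, variances > 1e-12]

# 3. Drop one feature from each highly correlated pair.
corr = kept.corr().abs()
upper = np.triu(np.ones(corr.shape, dtype=bool))  # upper triangle incl. diagonal
lower = corr.mask(upper)                          # keep each pair only once
# A column is dropped when it correlates strongly with any later column.
to_drop = [col for col in lower.columns if (lower[col] > 0.95).any()]
reduced = kept.drop(columns=to_drop)

print(to_drop)
print(list(reduced.columns))
```

Masking the upper triangle keeps each pair from being counted twice, so only one member of a correlated pair is removed rather than both.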

3

Feature Selection II - Selecting for Model Accuracy

In this second chapter on feature selection, you'll learn how to let models help you find the most important features in a dataset for predicting a particular target feature. In the final lesson of this chapter, you'll combine the advice of several different models to decide which features are worth keeping.

Selecting features for model performance

Building a diabetes classifier

Manual Recursive Feature Elimination

Automatic Recursive Feature Elimination

Tree-based feature selection

Building a random forest model

Random forest for feature selection

Recursive Feature Elimination with random forests

Regularized linear regression

Creating a LASSO regressor

Lasso model results

Adjusting the regularization strength

Combining feature selectors

Creating a LassoCV regressor

Ensemble models for extra votes

Combining 3 feature selectors
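
To make the idea of model-based selection concrete, here is a minimal sketch of Recursive Feature Elimination with scikit-learn's `RFE` wrapped around a logistic regression. The synthetic data, in which only the first two of six features influence the target, is an assumption for this example and is not the diabetes dataset from the lessons.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 300

# Hypothetical data: 6 features, but only features 0 and 1 drive the target.
X = rng.normal(size=(n, 6))
signal = 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.5, size=n)
y = (signal > 0).astype(int)

# Standardize so that coefficient sizes are comparable across features.
X_std = StandardScaler().fit_transform(X)

# RFE refits the model repeatedly, dropping the feature with the
# weakest coefficient each round until 2 features remain.
rfe = RFE(estimator=LogisticRegression(), n_features_to_select=2)
rfe.fit(X_std, y)

print(rfe.support_)   # boolean mask of the features that were kept
print(rfe.ranking_)   # 1 = kept; higher numbers were eliminated earlier
```

The same wrapper works with any estimator that exposes `coef_` or `feature_importances_`, which is how a random forest or a lasso regressor can cast its own vote on the features.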

4

Feature Extraction

This chapter is a deep dive into the most frequently used dimensionality reduction algorithm, Principal Component Analysis (PCA). You'll build intuition for how and why this algorithm is so powerful and will apply it both for data exploration and for data pre-processing in a modeling pipeline. You'll end with a cool image compression use case.

Feature extraction

Manual feature extraction I

Manual feature extraction II

Principal component intuition

Principal component analysis

Calculating Principal Components

PCA on a larger dataset

PCA explained variance

PCA applications

Understanding the components

PCA for feature exploration

PCA in a model pipeline

Principal Component selection

Selecting the proportion of variance to keep

Choosing the number of components

PCA for image compression

Congratulations!
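
The PCA lessons above can be summarized in a short sketch: scale the data, then let scikit-learn's `PCA` keep just enough components to reach a chosen proportion of explained variance. The toy data (five features built from two hidden factors), the mixing weights and the 90% target are assumptions made for this example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500

# Hypothetical data: 5 observed features generated from 2 hidden factors
# plus a little noise, so roughly 2 components should be enough.
latent = rng.normal(size=(n, 2))
mixing = np.array([
    [1.0, 0.8, 0.1, -0.5, 0.3],
    [0.1, -0.4, 1.0, 0.7, -0.9],
])
X = latent @ mixing + rng.normal(scale=0.05, size=(n, 5))

# A float n_components between 0 and 1 asks PCA to keep the smallest
# number of components whose cumulative explained variance exceeds it.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("reducer", PCA(n_components=0.9)),
])
components = pipe.fit_transform(X)

pca = pipe.named_steps["reducer"]
print(pca.n_components_)                       # number of components kept
print(pca.explained_variance_ratio_.cumsum())  # cumulative explained variance
```

Scaling first matters because PCA is driven by variance: without it, a feature measured in large units would dominate the first component regardless of how informative it is.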
