Course

Feature Engineering in R

Intermediate skill level

Updated March 2023

Learn the principles of feature engineering for machine learning models and how to implement them with the R tidymodels framework.

R | Machine Learning | 4 hours | 14 videos | 58 exercises | 4,950 XP | 2,574 | Statement of Accomplishment

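Course Description

In this course, you'll learn about feature engineering, which is at the heart of many types of machine learning models. Because a model's performance depends directly on the features it is given, feature engineering places domain knowledge at the center of the process. You'll learn sound feature engineering principles that help you reduce the number of variables where possible, make learning algorithms run faster, improve interpretability, and prevent overfitting.

You'll also learn how to implement feature engineering techniques using R's tidymodels framework, with an emphasis on the recipes package, to create, extract, transform, and select the features best suited to your models.

When faced with a new dataset, you'll be able to identify and select relevant features and discard less informative ones, so that your models run faster without sacrificing accuracy. You'll also become comfortable applying transformations and creating new features to make your models more efficient, interpretable, and accurate!

Prerequisites

Supervised Learning in R: Classification

Supervised Learning in R: Regression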

1

Introducing Feature Engineering

Raw data does not always come in its best shape for analysis. In this opening chapter, you will get a first look at how to transform and create features that enhance your model's performance and interpretability.

What is feature engineering?

A tentative model

Manually engineering a feature

Creating new features using domain knowledge

Setting up your data for analysis

Building a workflow

Increasing the information content of raw data

Identifying missing values

Imputing missing values and creating dummy variables

Fitting and assessing the model

Predicting hotel bookings
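
As a rough illustration of the imputation and dummy-variable lessons listed above, here is a minimal sketch using the recipes package. The `bookings` data frame and its column names are hypothetical toy data invented for this example, not the course's dataset, and mean imputation is only one of several imputation steps recipes offers.

```r
library(recipes)

# Hypothetical toy data (an assumption, not the course's hotel dataset):
# one numeric predictor with a missing value and one categorical predictor.
bookings <- data.frame(
  is_canceled = factor(c("yes", "no", "no", "yes", "no", "yes")),
  lead_time   = c(10, NA, 35, 120, 7, 60),
  meal        = factor(c("BB", "HB", "BB", "SC", "HB", "BB"))
)

rec <- recipe(is_canceled ~ ., data = bookings) %>%
  # Replace missing lead_time values with the mean seen in the training data
  step_impute_mean(lead_time) %>%
  # Turn each categorical predictor into indicator columns,
  # one per level other than the reference level
  step_dummy(all_nominal_predictors())

# prep() estimates the quantities each step needs (here, the mean);
# bake() applies the trained steps. new_data = NULL returns the training set.
prepped <- prep(rec, training = bookings)
baked   <- bake(prepped, new_data = NULL)

anyNA(baked$lead_time)  # FALSE once the missing value has been imputed
names(baked)            # meal is replaced by meal_HB and meal_SC
```

Keeping the steps inside a recipe, rather than editing the data frame by hand, means the same imputation value and dummy encoding learned from the training data can later be applied unchanged to new data.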

2

Transforming Features

In this chapter, you’ll learn that, beyond manually transforming features, you can leverage tools from the tidyverse to engineer new variables programmatically. You’ll explore how this approach improves your models' reproducibility and is especially useful when handling datasets with many features.

Why transform existing features?

Glancing at your data

Normalizing and log-transforming

Fit and augment

Customize your model assessment

Common feature transformations

Common transformations

Plain recipe

Box-Cox transformation

Yeo-Johnson transformation

Advanced transformations

step_poly()

step_percentile()

Who's staying?
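
The transformation lessons in this chapter could be sketched roughly as follows. This example assumes the built-in `mtcars` data and an arbitrary choice of three predictors; it is not taken from the course exercises. It chains a Yeo-Johnson transformation with normalization.

```r
library(recipes)

# Assumption: mtcars and these three predictors are used purely for illustration.
rec <- recipe(mpg ~ disp + hp + wt, data = mtcars) %>%
  # Estimate a power-transformation parameter (lambda) per predictor to
  # reduce skewness; unlike Box-Cox, Yeo-Johnson also accepts zero and
  # negative values
  step_YeoJohnson(all_numeric_predictors()) %>%
  # Center each predictor to mean 0 and scale it to standard deviation 1
  step_normalize(all_numeric_predictors())

prepped <- prep(rec, training = mtcars)
baked   <- bake(prepped, new_data = NULL)

# Inspect the lambda estimated for each predictor in the first step
tidy(prepped, number = 1)

# After normalization, column means are (numerically) zero and sds are one
colMeans(baked[c("disp", "hp", "wt")])
sapply(baked[c("disp", "hp", "wt")], sd)
```

Swapping `step_YeoJohnson()` for `step_BoxCox()` follows the same pattern, with the caveat that Box-Cox requires strictly positive values.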

3

Extracting Features

You’ll now learn how models often benefit from reducing dimensionality and extracting features from high-dimensional data, including converting text data into numeric values, encoding categorical data, and ranking the predictive power of variables. You’ll explore methods including principal component analysis, kernel principal component analysis, numerical extraction from text, categorical encodings, and variable importance scores.

Reducing dimensionality

Prepping the stage

Digging into the structure

Percent of variance explained

Visualizing variance explained

Feature hashing

Investigating education field

Into the matrix

Exploring the hashing

Visualizing the hashing

Encoding categorical data using supervised learning

Setting up your workflow

Fitting, augmenting, and assessing

Binding models together

Variable Importance

Create a workflow

Fit and augment

Which is the main predictor?
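
To make the dimensionality-reduction lessons more concrete, the following is a minimal sketch of principal component analysis inside a recipe. The use of `mtcars` and the choice of two components are assumptions made for illustration only.

```r
library(recipes)

# Assumption: mtcars stands in for a higher-dimensional dataset.
rec <- recipe(mpg ~ ., data = mtcars) %>%
  # PCA is sensitive to scale, so put all predictors on a common scale first
  step_normalize(all_numeric_predictors()) %>%
  # Replace the ten original predictors with the first two principal components
  step_pca(all_numeric_predictors(), num_comp = 2)

prepped <- prep(rec, training = mtcars)
baked   <- bake(prepped, new_data = NULL)

names(baked)  # the outcome mpg plus PC1 and PC2

# Variance explained by each component, as reported by the PCA step
tidy(prepped, number = 2, type = "variance")
```

The variance table is the usual starting point for deciding how many components to keep, which is what the "Percent of variance explained" and "Visualizing variance explained" lessons refer to.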

4

Selecting Features

You’ll wrap up the course by learning how to select features. You’ll begin with the problems that come from using every available feature in a model and the importance of identifying irrelevant and redundant features, and you’ll learn to remove them using embedded methods. Next, you’ll explore shrinkage methods such as lasso, ridge, and elastic-net, which regularize feature weights; lasso and elastic-net can also select features by setting some coefficients exactly to zero, whereas ridge only shrinks them. Finally, you’ll build an end-to-end feature engineering workflow, reviewing and practicing the concepts and functions from earlier chapters in a small project.

Reducing the model's features

Sifting through variable importance

Assessing model performance using all available predictors

Building a reduced model

Shrinkage methods

Manual regularization with Lasso

Tuning the penalty

Finalizing the model

Putting it all together

Prep and split

Congratulations!
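
The shrinkage lessons in this chapter might look roughly like the sketch below, which fits a lasso through a tidymodels workflow. It assumes the tidymodels and glmnet packages are installed, uses `mtcars` as stand-in data, and fixes the penalty at an arbitrary value instead of tuning it as the course does.

```r
library(tidymodels)  # attaches recipes, parsnip, workflows, and related packages

# Assumption: mtcars is used only as a small illustrative dataset.
rec <- recipe(mpg ~ ., data = mtcars) %>%
  # Penalized regression needs predictors on a common scale
  step_normalize(all_numeric_predictors())

# mixture = 1 gives a pure lasso; mixture = 0 would give ridge, and values
# in between give an elastic-net. penalty = 0.5 is an arbitrary illustrative
# value, not a tuned one.
lasso_spec <- linear_reg(penalty = 0.5, mixture = 1) %>%
  set_engine("glmnet")

lasso_fit <- workflow() %>%
  add_recipe(rec) %>%
  add_model(lasso_spec) %>%
  fit(data = mtcars)

# One row per term (intercept plus predictors) at the chosen penalty
coefs <- tidy(lasso_fit)

# Predictors whose coefficients were shrunk exactly to zero, if any,
# are effectively removed from the model
subset(coefs, estimate == 0)
```

In practice the penalty would be chosen by resampling, for example with `tune()` as a placeholder and a grid search, which is the subject of the "Tuning the penalty" lesson.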
