Interactive Course

Feature Engineering in R

Learn a variety of feature engineering techniques to develop meaningful features that will improve your machine learning models.

  • 4 hours
  • 13 Videos
  • 44 Exercises
  • 690 Participants
  • 3,500 XP

Loved by learners at thousands of top companies:

Deloitte · Whole Foods · Uber · AXA · PayPal · Roche

Course Description

Feature engineering helps you uncover useful insights from your data and build more effective machine learning models. The model-building process is iterative and often requires creating new features from existing variables to make your model perform better. In this course, you will explore different data sets and apply a variety of feature engineering techniques to both continuous and discrete variables.

  1. Creating Features from Categorical Data

    Free

    In this chapter, you will learn how to convert categorical features into numerical representations that models can interpret. You'll learn about one-hot encoding and about binning categorical features (a brief R sketch of these ideas follows the chapter list).

  2. Creating Features from Numeric Data

    In this chapter, you will learn how to manipulate numeric variables to create meaningful features that give better insight into your model. You will also learn how to work with dates in the context of feature engineering (sketched below the list with invented data).

  3. Transforming Numerical Features

    In this chapter, you will learn about transformation techniques, such as Box-Cox and Yeo-Johnson, that address issues with non-normally distributed features. You'll also learn about methods to scale features, including mean centering and z-score standardization (see the recipes-based sketch after this list).

  4. Advanced Methods

    In the final chapter, you will use feature crossing to create features from two or more variables. You will also learn about principal component analysis and methods to explore and visualize its results (illustrated in the last sketch below).
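
To make the chapter summaries above more concrete, here are short R sketches on small hypothetical data frames; none of this is course code, and all variable names are invented. For chapter 1, one-hot encoding via base R's model.matrix() and a simple way to bin rare categories:

```r
# Hypothetical example data (not from the course)
df <- data.frame(
  color = factor(c("red", "green", "blue", "green", "red")),
  price = c(10, 25, 40, 90, 15)
)

# One-hot encode the categorical column; "- 1" drops the intercept so
# every level gets its own indicator column
one_hot <- model.matrix(~ color - 1, data = df)

# Bin the categorical feature: collapse levels seen fewer than 2 times into "other"
counts <- table(df$color)
rare   <- names(counts)[counts < 2]
df$color_binned <- factor(ifelse(as.character(df$color) %in% rare,
                                 "other", as.character(df$color)))

cbind(df, one_hot)
```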
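
A sketch of chapter 2's numeric and date-based features, again on invented data:

```r
# Hypothetical transaction data (invented for illustration)
tx <- data.frame(
  amount = c(120, 80, 230, 45),
  date   = as.Date(c("2020-01-15", "2020-02-03", "2020-02-20", "2020-03-07"))
)

# Derived numeric feature: compress a right-skewed scale
tx$log_amount <- log(tx$amount)

# Date-based features with base R
tx$month            <- as.integer(format(tx$date, "%m"))
tx$weekday          <- weekdays(tx$date)
tx$days_since_first <- as.numeric(tx$date - min(tx$date))

tx
```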
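
For chapter 3's transformations and scaling, one common option (not necessarily the tooling the course uses) is the recipes package, whose step_YeoJohnson() also handles the zero and negative values that Box-Cox cannot:

```r
library(recipes)  # one possible tool; the course's exact stack isn't stated on this page

# Hypothetical right-skewed data (invented for illustration)
set.seed(42)
skewed <- data.frame(
  x = rexp(100, rate = 0.2),
  y = rnorm(100, mean = 50, sd = 10)
)

rec <- recipe(~ ., data = skewed) |>   # R >= 4.1 native pipe
  step_YeoJohnson(x) |>                # Box-Cox-style transform that also allows x <= 0
  step_center(all_numeric()) |>        # mean centering
  step_scale(all_numeric())            # centering + scaling = z-score standardization

transformed <- bake(prep(rec), new_data = NULL)
summary(transformed)
```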
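
Finally, for chapter 4, a feature cross of two categorical variables and principal component analysis on the numeric columns, with invented data:

```r
# Hypothetical data (invented for illustration)
df <- data.frame(
  region  = factor(c("north", "south", "north", "south", "north")),
  channel = factor(c("web", "store", "store", "web", "web")),
  x1 = c(1.2, 3.4, 2.2, 0.5, 1.9),
  x2 = c(10, 14, 9, 11, 13),
  x3 = c(0.3, 0.1, 0.5, 0.7, 0.2)
)

# Feature cross: one categorical feature per combination of region and channel
df$region_x_channel <- interaction(df$region, df$channel, sep = "_")

# Principal component analysis on the standardized numeric columns
pca <- prcomp(df[, c("x1", "x2", "x3")], center = TRUE, scale. = TRUE)
summary(pca)   # proportion of variance explained per component
biplot(pca)    # one way to visualize the result
```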

What do other learners have to say?

“I've used other sites, but DataCamp's been the one that I've stuck with.”

Devon Edwards Joseph

Lloyd's Banking Group

“DataCamp is the top resource I recommend for learning data science.”

Louis Maiden

Harvard Business School

“DataCamp is by far my favorite website to learn from.”

Ronald Bowers

Decision Science Analytics @ USAA

Jose Hernandez

Data Scientist, University of Washington

Jose is a Data Scientist at the University of Washington's eScience Institute. His interests include applying data science methods to sociological and educational data and building open-source tools to facilitate that process. His research combines theory and practice with data science methods to inform education policymaking. Jose earned his doctorate at UW, with a focus on statistics and measurement, and a Master of Education in policy, also from UW.

Collaborators
  • Chester Ismay

  • Amy Peterson
