Feature engineering helps you uncover useful insights from your machine learning models. The model building process is iterative and requires creating new features from existing variables to make your model more efficient. In this course, you will explore different data sets and apply a variety of feature engineering techniques to both continuous and discrete variables.
In this chapter, you will learn how to change categorical features into numerical representations that models can interpret. You'll learn about one-hot encoding and using binning for categorical features.
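As a rough illustration of the two ideas named above, here is a minimal Python sketch using only the standard library. The function names `lump_rare` and `one_hot_encode` and the sample data are made up for this example and are not taken from the course materials. Binning is shown here in one common form, lumping infrequent categories into a single "other" level.

```python
from collections import Counter


def lump_rare(values, min_count, other="other"):
    """Bin categories seen fewer than min_count times into one level."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]


def one_hot_encode(values):
    """Return (categories, rows), where each row is a list of 0/1 indicators."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # exactly one indicator is set per observation
        rows.append(row)
    return categories, rows


# Hypothetical sample data.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)
binned = lump_rare(colors, min_count=2)
```

Each category becomes its own indicator column, so a model that only accepts numbers can still use the variable. Binning first keeps the number of columns manageable when many levels are rare.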
In this chapter, you will learn how to manipulate numerical features to create meaningful features that can give better insights into your model. You will also learn how to work with dates in the context of feature engineering.
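To make the idea of date-based features concrete, the sketch below derives a few numeric features from a calendar date using Python's standard `datetime` module. The helper names `date_features` and `days_between` and the example dates are assumptions chosen for illustration, not part of the course.

```python
from datetime import date


def date_features(d):
    """Derive simple model-ready features from a date."""
    return {
        "year": d.year,
        "month": d.month,
        "weekday": d.weekday(),          # Monday is 0, Sunday is 6
        "is_weekend": d.weekday() >= 5,  # Saturday or Sunday
    }


def days_between(start, end):
    """Elapsed days between two dates, usable as a numeric feature."""
    return (end - start).days


# Hypothetical example dates.
features = date_features(date(2024, 3, 16))
elapsed = days_between(date(2024, 3, 1), date(2024, 3, 16))
```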
In this chapter, you will learn about using transformation techniques, like Box-Cox and Yeo-Johnson, to address issues with non-normally distributed features. You'll also learn about methods to scale features, including mean centering and z-score standardization.
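The sketch below shows, under simplifying assumptions, what these operations compute. Mean centering subtracts the mean, z-score standardization also divides by the standard deviation (the sample standard deviation is assumed here), and the Box-Cox transform is applied with a fixed, hand-picked lambda. In practice lambda is usually estimated from the data, and that step is not shown. The function names and sample values are illustrative only.

```python
import math
from statistics import mean, stdev


def mean_center(values):
    """Subtract the mean so the result has mean zero."""
    m = mean(values)
    return [v - m for v in values]


def z_score(values):
    """Center, then divide by the sample standard deviation (n - 1)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def box_cox(x, lam):
    """Box-Cox transform of one strictly positive value for a given lambda."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam


# Hypothetical sample data; lambda = 0.5 is chosen by hand, not estimated.
heights = [2.0, 4.0, 6.0]
centered = mean_center(heights)
standardized = z_score(heights)
transformed = [box_cox(h, 0.5) for h in heights]
```

Box-Cox only accepts strictly positive inputs, which is one reason Yeo-Johnson is offered as an alternative: it is also defined for zero and negative values.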
In the final chapter, you will use feature crossing to create features from two or more variables. You will also learn about principal component analysis and methods to explore and visualize its results.
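As a small illustration of feature crossing, the sketch below combines two variables into one new feature, once for categorical values and once for numeric ones. The function names, the separator, and the sample data are assumptions made for this example. Principal component analysis is not shown.

```python
def cross_categorical(a, b, sep="_x_"):
    """Cross two categorical variables into one combined category."""
    return [f"{x}{sep}{y}" for x, y in zip(a, b)]


def cross_numeric(a, b):
    """Cross two numeric variables by taking their elementwise product."""
    return [x * y for x, y in zip(a, b)]


# Hypothetical sample data.
region = ["north", "south", "north"]
season = ["winter", "winter", "summer"]
crossed = cross_categorical(region, season)

width = [2.0, 3.0, 4.0]
length = [5.0, 1.0, 0.5]
area = cross_numeric(width, length)
```

A crossed feature lets a model capture an interaction, such as an effect that appears only for one region in one season, which neither variable expresses on its own.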