Feature Selection in R with the Boruta R Package

Tackle feature selection in R: explore the Boruta algorithm, a wrapper built around the Random Forest classification algorithm, and its implementation!

High-dimensional data, meaning data with a large number of features, is increasingly common in machine learning problems. To extract useful information from such data, you have to use statistical techniques to reduce noise and redundancy, because you often do not need every feature at your disposal to train a model. You can improve your model by feeding it only those features that are uncorrelated and non-redundant. This is where feature selection plays an important role. Not only does it help train your model faster, it also reduces the model's complexity, makes it easier to interpret, and can improve accuracy, precision, recall, or whichever performance metric you use.

In this tutorial, you'll tackle the following concepts: