# Ensemble Learning in R with SuperLearner

Boost your machine learning results and discover ensembles in R with the SuperLearner package: learn about the Random Forest algorithm, bagging, and much more!
Feb 2018  · 25 min read

Did you ever want to build a machine learning ensemble, but did not know how to get started? This tutorial will help you on your way with `SuperLearner`. This R package gives you an easy way to create machine learning ensembles with high-level functions: it offers a standardized wrapper for fitting an ensemble using popular R machine learning libraries such as `glmnet`, `knn`, `randomForest` and many more!

In this tutorial, you'll tackle the following topics:

• What are Ensembles? Go over a short definition of ensembles before you start tackling the practical example that this tutorial offers!
• Why `SuperLearner` and what does this package actually do?
• Ensemble Learning in R with `SuperLearner`: in this section, you'll learn how to install the packages you need, prepare the data and create your first ensemble model! You'll also see how you can train the model and make predictions with it. In doing so, you'll cover Kernel Support Vector Machines, Bayes Generalized Linear Models and Bagging. Lastly, you'll see how you can tune the hyperparameters to further improve your model's performance!

When you are finished, you will have fit your first ensemble, predicted new data and tuned parts of the ensemble.

## What are Ensembles?

All this is awesome, but what exactly is an ensemble?

An ensemble is formed when the probability predictions or numerical predictions of multiple machine learning models are combined, either by averaging them, by weighting each model and adding the results together, or by taking the most common prediction across models (a majority vote). This provides a multiple vote scenario that is likely to drive a prediction to the correct class or closer to the correct number in regression models. Ensembles tend to work best when there are disagreements between the models being fit. The concept of combining multiple models also seems to perform well in practice, often above implementations of single algorithms.

Ensembles can be created manually by fitting multiple models, predicting with each of them and then combining them.
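To make the manual route concrete, here is a minimal base R sketch. The simulated data and all object names are made up for illustration, and plain logistic regressions stand in for whatever models you would really combine:

```r
# Simulated data: every name here is illustrative, not part of the tutorial
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y_sim <- rbinom(n, 1, plogis(1.5 * x1 - x2))
dat <- data.frame(y_sim, x1, x2)

# Three deliberately different models
fit_a <- glm(y_sim ~ x1, data = dat, family = binomial())
fit_b <- glm(y_sim ~ x2, data = dat, family = binomial())
fit_c <- glm(y_sim ~ x1 + x2, data = dat, family = binomial())

# Predicted probabilities from each model
p_a <- predict(fit_a, type = "response")
p_b <- predict(fit_b, type = "response")
p_c <- predict(fit_c, type = "response")

# 1) Simple average of two models
p_avg <- (p_a + p_b) / 2

# 2) Weighted average (weights picked by hand here)
p_weighted <- 0.7 * p_a + 0.3 * p_b

# 3) Majority vote over class labels; three voters avoid ties
votes      <- cbind(p_a, p_b, p_c) >= 0.5
vote_class <- as.integer(rowSums(votes) >= 2)

head(data.frame(p_avg, p_weighted, vote_class))
```

Choosing the weights by hand is the tedious part of this approach, and it is the part an automated tool can take over.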

## Why `SuperLearner`?

Now that you have seen what ensembles are, you might ask yourself what the `SuperLearner` library exactly does. Well, simply put, `SuperLearner` is an algorithm that uses cross-validation to estimate the performance of multiple machine learning models, or the same model with different settings. It then creates an optimal weighted average of those models, which is also called an “ensemble”, based on how each model performed on the held-out cross-validation folds.
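The idea can be sketched in base R without the package. This is only an illustration under simplifying assumptions: the data are simulated, the two candidate models are ordinary logistic regressions, and a grid search over a single weight stands in for the weight-fitting step that the package performs for you.

```r
set.seed(2)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
y_sim <- rbinom(n, 1, plogis(x1 + 0.5 * x2^2 - 0.5))
dat <- data.frame(y_sim, x1, x2)

# Assign every row to one of V folds
V     <- 5
folds <- sample(rep(1:V, length.out = n))

# Held-out predictions of each candidate model, filled in fold by fold
cv_pred <- matrix(NA_real_, n, 2,
                  dimnames = list(NULL, c("glm_x1", "glm_both")))
for (v in 1:V) {
  train_v <- dat[folds != v, ]
  test_v  <- dat[folds == v, ]
  m1 <- glm(y_sim ~ x1, data = train_v, family = binomial())
  m2 <- glm(y_sim ~ x1 + I(x2^2), data = train_v, family = binomial())
  cv_pred[folds == v, 1] <- predict(m1, newdata = test_v, type = "response")
  cv_pred[folds == v, 2] <- predict(m2, newdata = test_v, type = "response")
}

# Cross-validated risk (mean squared error) of each candidate
cv_risk <- colMeans((cv_pred - y_sim)^2)

# Search for the convex weight that minimises the cross-validated risk
w_grid   <- seq(0, 1, by = 0.01)
ens_risk <- sapply(w_grid, function(w) {
  mean((w * cv_pred[, 1] + (1 - w) * cv_pred[, 2] - y_sim)^2)
})
best_w  <- w_grid[which.min(ens_risk)]
weights <- c(glm_x1 = best_w, glm_both = 1 - best_w)

cv_risk
weights
```

Because the grid includes the weights 0 and 1, the combined model can never have a higher cross-validated risk than the better of the two candidates.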

But why would you use `SuperLearner`?

Even though you'll learn more about the power of this R package throughout the tutorial, you could already consider this list of advantages:

• `SuperLearner` allows you to fit an ensemble model by simply adding algorithms
• As you already read before, `SuperLearner` uses cross-validation, which is inherently used to estimate risk for all models. This makes `SuperLearner` great for model comparison!
• `SuperLearner` makes ensembling efficient by automatically estimating the weights of the ensemble. This is normally a task that can be very tedious and requires a lot of experimentation.
• `SuperLearner` automatically gives zero weight to models that do not contribute to the ensemble's predictive power; this leaves you free to experiment with numerous algorithms!

Let's take a look at the process to use `SuperLearner`.

## Ensemble Learning in R with `SuperLearner`

### Install the `SuperLearner` Package

`SuperLearner` can be installed from CRAN with the `install.packages()` function and then loaded into your workspace using the `library()` function:

```r
# Install the package
install.packages("SuperLearner")

# Load the package
library("SuperLearner")
```

### Prepare your Data

To illustrate `SuperLearner`, you will use the Pima Indian Women data set from the `MASS` package. The MASS package contains a training set, which is used for training a model and a test set, which is used for assessing the performance of the model on unseen data. The data set provides some descriptive factors about the Pima Indian Women such as number of pregnancies and age and whether or not they have diabetes. The purpose of the data set is to try to predict diabetes.

The `type` column is the column that indicates the presence of diabetes. It is a binary `Yes` or `No` column, which means that it follows a binomial distribution.

Note that, without getting too theoretical, a binomial distribution is a collection of Bernoulli trials, which are a success or failure test in probability. A binomial distribution is easily identified because there are only two possible responses, in this case `Yes` or `No`. Why are you getting into this? Well, `SuperLearner` requires you to define the family of problem your model should belong to. You will see that in more detail when you fit the model later in this tutorial.

```r
# Get the `MASS` library
library(MASS)

# Train and test sets
train <- Pima.tr
test <- Pima.te

# Print out the first lines of `train`
head(train)
```

```
##   npreg glu bp skin  bmi   ped age type
## 1     5  86 68   28 30.2 0.364  24   No
## 2     7 195 70   33 25.1 0.163  55  Yes
## 3     5  77 82   41 35.8 0.156  35   No
## 4     0 165 76   43 47.9 0.259  26   No
## 5     0 107 60   25 26.4 0.133  23   No
## 6     5  97 76   27 35.6 0.378  52  Yes
```

```r
# Get a summary of `train`
summary(train)
```

```
##      npreg            glu              bp              skin
##  Min.   : 0.00   Min.   : 56.0   Min.   : 38.00   Min.   : 7.00
##  1st Qu.: 1.00   1st Qu.:100.0   1st Qu.: 64.00   1st Qu.:20.75
##  Median : 2.00   Median :120.5   Median : 70.00   Median :29.00
##  Mean   : 3.57   Mean   :124.0   Mean   : 71.26   Mean   :29.21
##  3rd Qu.: 6.00   3rd Qu.:144.0   3rd Qu.: 78.00   3rd Qu.:36.00
##  Max.   :14.00   Max.   :199.0   Max.   :110.00   Max.   :99.00
##       bmi             ped              age         type
##  Min.   :18.20   Min.   :0.0850   Min.   :21.00   No :132
##  1st Qu.:27.57   1st Qu.:0.2535   1st Qu.:23.00   Yes: 68
##  Median :32.80   Median :0.3725   Median :28.00
##  Mean   :32.31   Mean   :0.4608   Mean   :32.11
##  3rd Qu.:36.50   3rd Qu.:0.6160   3rd Qu.:39.25
##  Max.   :47.90   Max.   :2.2880   Max.   :63.00
```

Tip: if you want to have more information on the variables of this data set, use the `help()` function, just like here:

```r
help(Pima.tr)
```

By running the above command, you can derive that the `type` column indicates diabetes.

`SuperLearner` also requires the response variable to be encoded if it is a classification problem. Since you are solving a binomial classification problem, you will encode the factor for the variable `type` to 0-1 encoding:

```r
y <- as.numeric(train[,8])-1
ytest <- as.numeric(test[,8])-1
```

Since the `type` column was a factor, R will encode it to 1 and 2, but this is not what you want: ideally, you would like to work with the type encoded as 0 and 1, which are "No" and "Yes", respectively. In the above code chunk, you subtract `1` from the whole set to get your 0-1 encoding. R will also encode this in the factor order.
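If the subtraction trick feels opaque, this small base R example walks through it on a toy factor (the vector `type_demo` is made up for illustration):

```r
# A toy factor with the same two labels as the `type` column
type_demo <- factor(c("No", "Yes", "No", "Yes", "Yes"))

# Levels are sorted alphabetically by default, so "No" comes first
levels(type_demo)

# Internally, the levels are stored as the integer codes 1 and 2
as.numeric(type_demo)

# Subtracting 1 shifts the codes to 0 ("No") and 1 ("Yes")
y_demo <- as.numeric(type_demo) - 1

# A more explicit alternative that does not depend on the level order
y_alt <- as.numeric(type_demo == "Yes")

y_demo
y_alt
```

Both approaches give the same result here; the explicit comparison is a little safer if you are ever unsure how the factor levels are ordered.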

The package also requires the predictors (`X`) and the responses (`Y`) to be in their own data structures. You split out `Y` above, now you need to split out `X`. You will go ahead and split out your test set as well:

```r
x <- data.frame(train[,1:7])
xtest <- data.frame(test[,1:7])
```

Note that some algorithms do not just require a data frame, but would require a model matrix saved as a data frame. An example is the `nnet` algorithm. When solving a regression problem, you will almost always use the model matrix to store your data for SuperLearner. All a model matrix does is split out factor variables into their own columns and recodes them as 0-1 values instead of text values. It does not impact numerical columns. The model matrix will increase the number of columns an algorithm has to deal with, therefore it could increase computational time. For a small data set, such as this, there is minimal impact, but larger data sets could be heavily affected. The moral of the story is to decide which algorithms you will want to try before fitting your model. For this simple example, you will just use the data frame for the existing data structure.
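The Pima data has no factor predictors, so to see what a model matrix does, here is a small sketch on a made-up data frame (`df_demo` and its columns are purely illustrative) using base R's `model.matrix()`:

```r
# A toy data frame with one numeric and one factor column
df_demo <- data.frame(
  age   = c(24, 55, 35, 26),
  group = factor(c("a", "b", "c", "a"))
)

# Build the model matrix, then drop the intercept column
mm <- model.matrix(~ ., data = df_demo)[, -1]

# Save it as a data frame, the form you would hand to a learner
x_mm <- as.data.frame(mm)
x_mm
```

The numeric `age` column passes through unchanged, while `group` is expanded into 0-1 indicator columns for its non-reference levels. That is where the extra columns come from.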

### Your First Ensemble Model with `SuperLearner`

To start creating your first model, you can use the following command to preview what models are available in the package:

```r
listWrappers()
```

```
## All prediction algorithm wrappers in SuperLearner:
##  [1] "SL.bartMachine"      "SL.bayesglm"         "SL.biglasso"
##  [4] "SL.caret"            "SL.caret.rpart"      "SL.cforest"
##  [7] "SL.dbarts"           "SL.earth"            "SL.extraTrees"
## [10] "SL.gam"              "SL.gbm"              "SL.glm"
## [13] "SL.glm.interaction"  "SL.glmnet"           "SL.ipredbagg"
## [16] "SL.kernelKnn"        "SL.knn"              "SL.ksvm"
## [19] "SL.lda"              "SL.leekasso"         "SL.lm"
## [22] "SL.loess"            "SL.logreg"           "SL.mean"
## [25] "SL.nnet"             "SL.nnls"             "SL.polymars"
## [28] "SL.qda"              "SL.randomForest"     "SL.ranger"
## [31] "SL.ridge"            "SL.rpart"            "SL.rpartPrune"
## [34] "SL.speedglm"         "SL.speedlm"          "SL.step"
## [37] "SL.step.forward"     "SL.step.interaction" "SL.stepAIC"
## [40] "SL.svm"              "SL.template"         "SL.xgboost"
##
## All screening algorithm wrappers in SuperLearner:
## [1] "All"
## [1] "screen.corP"           "screen.corRank"        "screen.glmnet"
## [4] "screen.randomForest"   "screen.SIS"            "screen.template"
## [7] "screen.ttest"          "write.screen.template"
```

You will notice there are prediction algorithm wrappers and screening algorithm wrappers. There are some popular libraries in here that can be used for either classification, regression or both. The screening algorithms are used for automated variable selection by `SuperLearner`.

When you want to use an algorithm from the above list, you'll need to have the package installed in your environment. That's because `SuperLearner` is really calling these packages and then fitting the models when the method is used. That also means that if you never use the method `SL.caret`, for example, you do not need to have the `caret` package installed.

Fitting the model is simple, but you'll go through this step-by-step with a single model example.

You will fit the Ranger algorithm, which is a faster implementation of the famous Random Forest.

Remember that a Random Forest is a powerful method which is actually an ensemble of decision trees. Decision trees work by observing your data and repeatedly splitting it on the variables in the model, giving you a pathway to your prediction. Decision trees have a habit of overfitting to their data, which means they do not generalize well to new data. Random Forest addresses this problem by growing multiple decision trees on numerous bootstrap samples of the data and then averaging their predictions. It also considers only a random subset of the features at each split, which is how it differs from tree bagging. This creates a model that is much less prone to overfitting the data. Cool, right?

In this case, you may first need to install the `ranger` library with the `install.packages()` function before you can start fitting the model.

If you have done that, you can continue and use `SL.ranger` in the `SuperLearner()` function.

Since Random Forest, and therefore Ranger, contains random sampling in the algorithm, you will not get the same result if you fit it more than once. Therefore, for this exercise, you will set the seed so you can reproduce the examples and also compare multiple models on the same random seed baseline. R uses `set.seed()` to set the random seed. The seed can be any number; in this case, you will use `150`.

```r
set.seed(150)
single.model <- SuperLearner(y,
                             x,
                             family=binomial(),
                             SL.library=list("SL.ranger"))
```

`SuperLearner` requires a `Y` variable, which is the response or outcome you want, an `X` variable, which holds the predictor variables, the `family` to use, which can be gaussian or binomial, and the library to use in the form of a list. That's `SL.ranger` in this case.

Do you remember the whole binomial distribution discussion that you read about earlier? Now, you see why you needed to know that: using the gaussian model would not have yielded proper predictions in your 0-1 range.
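You can see the difference between the two families with plain `glm()` on simulated 0-1 data. This sketch does not use `SuperLearner` at all, and the data and names are made up for illustration:

```r
set.seed(3)

# Simulated predictor and 0-1 outcome
x_demo <- seq(-4, 4, length.out = 200)
y_demo <- rbinom(200, 1, plogis(1.5 * x_demo))

# Same data, two families
fit_gauss <- glm(y_demo ~ x_demo, family = gaussian())
fit_binom <- glm(y_demo ~ x_demo, family = binomial())

# Compare the range of the fitted values
p_gauss <- predict(fit_gauss, type = "response")
p_binom <- predict(fit_binom, type = "response")
range(p_gauss)
range(p_binom)
```

The binomial fit is constrained to return probabilities between 0 and 1, while nothing stops the gaussian fit from straying below 0 or above 1 at the extremes of the predictor.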

Next, simply printing the model provides the coefficient, which is the weight of the algorithm in the model and the risk factor which is the error the algorithm produces. Behind the scenes, the package fits each algorithm used in the ensemble to produce the risk factor.

```r
single.model
```

```
##
## Call:
## SuperLearner(Y = y, X = x, family = binomial(), SL.library = list("SL.ranger"))
##
##
##
##                    Risk Coef
## SL.ranger_All 0.1759541    1
```

In this case, your risk factor is less than 0.20. Of course, this will need to be tested through external cross validation and in the test set, but it is a good start. The beauty of `SuperLearner` is that it tries to automatically build an ensemble through the use of cross validation. Of course, if there is only one model, then it gets the full weight of the ensemble.

So this single model is great, but you can do this without `SuperLearner`. How can you fit ensemble models?

### Training an Ensemble with R: Kernel Support Vector Machines, Bayes GLM and Bagging

Ensembling with SuperLearner is as simple as selecting the algorithms to use. In this case, let's add Kernel Support Vector Machines (KSVM) from the `kernlab` package, Bayes Generalized Linear Models (GLM) from the `arm` package and bagging from the `ipred` package.

But what are KSVM and Bayes GLM?

• The KSVM uses something called "the kernel trick" to calculate distance between points. Instead of having to draw a map of the features and calculate coordinates, the kernel method calculates the inner products between points. This allows for faster computation. Then the support vector machine is used to learn the non-linear boundary between points in classification. A support vector machine attempts to create a gap between two classes in a machine learning problem that is often nonlinear. It then classifies new points on either side of that gap based on where they are in space.

• The Bayes GLM model is simply an implementation of logistic regression, at least in this case, where you are classifying a 0-1 problem. Bayes GLM differs from KSVM in that it uses an augmented regression algorithm to update the coefficients at each step.

• Bagging is similar to the Random Forest above, but without subsetting the features. This means that you will grow multiple decision trees from random samples and average them together to get your prediction.

Now let's fit your first ensemble!

Tip: don't forget to install these packages if you don't have them yet! Additionally, you might also be prompted to install other required packages.
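One way to check this up front is sketched below with base R's `requireNamespace()`. The package list simply mirrors the learners used in this section; adjust it to your own library:

```r
# Packages behind the learners used in this section
needed <- c("ranger", "kernlab", "arm", "ipred")

# TRUE for every package that is already installed
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
installed

# Report anything that is still missing
missing_pkgs <- needed[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed yet: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
}
```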

```r
# Set the seed
set.seed(150)

# Fit the ensemble model
model <- SuperLearner(y,
                      x,
                      family=binomial(),
                      SL.library=list("SL.ranger",
                                      "SL.ksvm",
                                      "SL.ipredbagg",
                                      "SL.bayesglm"))

# Return the model
model
```

```
##
## Call:
## SuperLearner(Y = y, X = x, family = binomial(), SL.library = list("SL.ranger",
##     "SL.ksvm", "SL.ipredbagg", "SL.bayesglm"))
##
##
##                       Risk     Coef
## SL.ranger_All    0.1756230 0.000000
## SL.ksvm_All      0.1838340 0.000000
## SL.ipredbagg_All 0.1664828 0.524182
## SL.bayesglm_All  0.1677593 0.475818
```

Adding these algorithms improved your model and changed the landscape. Ranger and KSVM have a coefficient of zero, which means that they are no longer weighted as part of the ensemble. Bayes GLM and bagging make up the entire weight of the model. You will notice `SuperLearner` is calculating this risk for you and deciding on the optimal model mix that will reduce the error.

To understand each model's specific contribution to the model and the variation, you can use `SuperLearner`'s internal cross-validation function `CV.SuperLearner()`. To set the number of folds, you can use the `V` argument. In this case, you will set it to `5`:

```r
# Set the seed
set.seed(150)

# Get V-fold cross-validated risk estimate
# (`family` is not set here, so the default `gaussian()` is used,
# as the printed call below shows)
cv.model <- CV.SuperLearner(y,
                            x,
                            V=5,
                            SL.library=list("SL.ranger",
                                            "SL.ksvm",
                                            "SL.ipredbagg",
                                            "SL.bayesglm"))

# Print out the summary statistics
summary(cv.model)
```

```
##
## Call:
## CV.SuperLearner(Y = y, X = x, V = 5, SL.library = list("SL.ranger",
##     "SL.ksvm", "SL.ipredbagg", "SL.bayesglm"))
##
## Risk is based on: Mean Squared Error
##
## All risk estimates are based on V =  5
##
##         Algorithm     Ave       se     Min     Max
##     Super Learner 0.17277 0.014801 0.16250 0.19557
##       Discrete SL 0.17964 0.014761 0.16363 0.19244
##     SL.ranger_All 0.17866 0.015004 0.14811 0.20518
##       SL.ksvm_All 0.19382 0.020301 0.15685 0.26215
##  SL.ipredbagg_All 0.17791 0.015858 0.15831 0.19244
##   SL.bayesglm_All 0.16628 0.014318 0.15322 0.18022
```

The summary of cross validation shows the average risk of the model, the variation of the model and the range of the risk.

Plotting this also produces a nice plot of the models used and their variation:

```r
plot(cv.model)
```

It's easy to see that Bayes GLM performs the best on average while KSVM performs the worst and contains a lot of variation compared to the other models. The beauty of `SuperLearner` is that, if a model does not fit well or contribute much, it is just weighted to zero! There is no need to remove it and retrain unless you plan on retraining the model in the future. Just remember that proper model training involves cross validation of the entire model. In a real-world setting, that is how you would determine the risk of the model before predicting new data.

### Make Predictions with SuperLearner

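With `predict.SuperLearner()` you can easily make predictions on new data sets. It is the `predict` method for `SuperLearner` fits, so calling the generic `predict()` on a fitted model should dispatch to it as well; this tutorial calls it by its full name to make that explicit: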

```r
predictions <- predict.SuperLearner(model, newdata=xtest)
```

The function `predict.SuperLearner()` takes a model argument (a SuperLearner fit model) and new data to predict on. Predictions will first return the overall ensemble predictions:

```r
head(predictions$pred)
```

```
##            [,1]
## [1,] 0.79322181
## [2,] 0.11895658
## [3,] 0.04612200
## [4,] 0.05928159
## [5,] 0.68824522
## [6,] 0.54373451
```

It will also return the individual library predictions:

```r
head(predictions$library.predict)
```

```
##      SL.ranger_All SL.ksvm_All SL.ipredbagg_All SL.bayesglm_All
## [1,]         0.796   0.8089502       0.82086658      0.76276712
## [2,]         0.129   0.1580203       0.18586049      0.04525230
## [3,]         0.016   0.1579566       0.06255427      0.02801949
## [4,]         0.102   0.1885473       0.07238268      0.04484885
## [5,]         0.638   0.7108875       0.58791672      0.79877149
## [6,]         0.550   0.6898737       0.37488066      0.72975132
```

This allows you to see how each model classified each observation. This could be useful in debugging the model or fitting multiple models at once to see which to use further.
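The two outputs are connected: with the weighting used in this fit, the ensemble prediction works out as the weighted sum of the library predictions. The sketch below checks this by hand for the first two test observations, typing in the coefficients and library predictions exactly as they were printed earlier (the object names are made up):

```r
# Coefficients as printed for the four-learner ensemble
coefs <- c(SL.ranger_All    = 0,
           SL.ksvm_All      = 0,
           SL.ipredbagg_All = 0.524182,
           SL.bayesglm_All  = 0.475818)

# First two rows of the printed library predictions
lib_demo <- rbind(
  c(0.796, 0.8089502, 0.82086658, 0.76276712),
  c(0.129, 0.1580203, 0.18586049, 0.04525230)
)
colnames(lib_demo) <- names(coefs)

# Weighted sum of the library predictions
ens_demo <- drop(lib_demo %*% coefs)
ens_demo
```

Up to rounding in the printed coefficients, the result matches the first two ensemble predictions shown above (0.79322181 and 0.11895658). Ranger and KSVM drop out entirely because their weights are zero.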

You may have noticed the prediction quantities being returned. They are in the form of probabilities. That means that you will need a cut off threshold to determine if you should classify a one or zero. This only needs to be done in the binomial classification case, not regression.

Normally, you would determine this in training with cross-validation, but for simplicity, you will use a cut-off of 0.50. Since this is a simple binomial problem, you can recode your probabilities with base R's `ifelse()` function, so no extra package is needed:

```r
# Recode probabilities to 0-1 class labels
conv.preds <- ifelse(predictions$pred>=0.5,1,0)
```
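To see how much the cut-off matters, here is a small illustration on the six ensemble predictions printed earlier, typed in by hand. The alternative threshold of 0.6 is an arbitrary choice for the example:

```r
# The six ensemble predictions shown above
p_demo <- c(0.79322181, 0.11895658, 0.04612200,
            0.05928159, 0.68824522, 0.54373451)

# Cut-off of 0.5: the sixth observation is classified as 1
class_50 <- ifelse(p_demo >= 0.5, 1, 0)

# Cut-off of 0.6: the sixth observation flips to 0
class_60 <- ifelse(p_demo >= 0.6, 1, 0)

rbind(class_50, class_60)
```

Observations with probabilities near the threshold are the ones that change class, which is why the cut-off is worth choosing with some care.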

Now you can build a confusion matrix with `caret` to review the results:

```r
# Load in `caret`
library(caret)

# Create the confusion matrix; recent versions of `caret` expect
# both arguments to be factors with the same levels
cm <- confusionMatrix(factor(conv.preds, levels = c(0, 1)),
                      factor(ytest, levels = c(0, 1)))

# Return the confusion matrix
cm
```

```
## Confusion Matrix and Statistics
##
##           Reference
## Prediction   0   1
##          0 199  45
##          1  24  64
##
##                Accuracy : 0.7922
##                  95% CI : (0.7445, 0.8345)
##     No Information Rate : 0.6717
##     P-Value [Acc > NIR] : 8.166e-07
##
##                   Kappa : 0.5044
##  Mcnemar's Test P-Value : 0.01605
##
##             Sensitivity : 0.8924
##             Specificity : 0.5872
##          Pos Pred Value : 0.8156
##          Neg Pred Value : 0.7273
##              Prevalence : 0.6717
##          Detection Rate : 0.5994
##    Detection Prevalence : 0.7349
##       Balanced Accuracy : 0.7398
##
##        'Positive' Class : 0
##
```

You are getting an accuracy of about 79% on the test set, which is good performance for this data set. Many algorithms have scored higher, but this is good for a quick ensemble. With some proper training with cross-validation and trying some different models, it is easy to see how you can quickly improve this score.
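If you want to see where the headline statistics come from, you can recompute a few of them by hand from the four counts in the confusion matrix. This base R sketch types the counts in directly; note that the positive class in the output above is `0`:

```r
# Counts from the confusion matrix above (rows: prediction, columns: reference)
cm_counts <- matrix(c(199, 24, 45, 64), nrow = 2,
                    dimnames = list(Prediction = c("0", "1"),
                                    Reference  = c("0", "1")))

# Share of all observations that were classified correctly
accuracy <- sum(diag(cm_counts)) / sum(cm_counts)

# Share of true 0s (the positive class here) that were predicted as 0
sensitivity <- cm_counts["0", "0"] / sum(cm_counts[, "0"])

# Share of true 1s that were predicted as 1
specificity <- cm_counts["1", "1"] / sum(cm_counts[, "1"])

# Average of the two class-wise rates
balanced_accuracy <- (sensitivity + specificity) / 2

round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, balanced = balanced_accuracy), 4)
```

The much lower specificity shows that the ensemble misses a fair share of the actual diabetes cases, something the overall accuracy hides.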

### Tuning Hyperparameters

While model performance is not terrible, you can try to improve it by tuning some hyperparameters of the models that you have in the ensemble. Ranger was not weighted heavily in your model, but maybe that is because you need more trees and need to tune the `mtry` parameter. Maybe you can improve bagging as well by increasing the `nbagg` parameter to `250` from the default of `25`.

There are two methods for doing this: either you define a function that calls the learner and modifies a parameter or you use the `create.Learner()` function. In the next sections, you'll learn more about these options.

#### Defining a Function

The first one is with the help of `function()`. Here, you define a function that calls the learner and modifies a parameter. The function uses the ellipsis `...` to pass along additional arguments to the learner. Those three little dots let the wrapper forward whatever arguments it receives without having to list them in its own definition. This means that, no matter how many arguments the learner takes, you only spell out the ones you want to change. It is a generalizable way to write a function.

```r
SL.ranger.tune <- function(...){
  SL.ranger(..., num.trees=1000, mtry=2)
}

SL.ipredbagg.tune <- function(...){
  SL.ipredbagg(..., nbagg=250)
}
```

`SL.ranger.tune` is the name of your modified `ranger` method and `SL.ipredbagg.tune` is the name of your modified `ipredbagg` method. Now that you have some new learner functions created, you can pass these along to the cross-validation function to see if the performance improves.
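The forwarding pattern itself is plain R and is easy to try in isolation. In this sketch, `paste_dash()` is a made-up wrapper around base R's `paste()` that fixes one argument and forwards everything else:

```r
# A wrapper that fixes `sep` and forwards all other arguments
paste_dash <- function(...) {
  paste(..., sep = "-")
}

joined <- paste_dash("a", "b", "c")
joined
```

One thing to keep in mind with this pattern: if a caller also supplies the argument that the wrapper fixes (`sep` here, or `num.trees` in `SL.ranger.tune`), R stops with an error because the same argument is matched twice.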

Note that you will keep the original `SL.ranger` and `SL.ipredbagg` functions in the algorithm to see if performance improves on your tuned versions of the functions.

```r
# Set the seed
set.seed(150)

# Tune the model
cv.model.tune <- CV.SuperLearner(y,
                                 x,
                                 V=5,
                                 SL.library=list("SL.ranger",
                                                 "SL.ksvm",
                                                 "SL.ipredbagg",
                                                 "SL.bayesglm",
                                                 "SL.ranger.tune",
                                                 "SL.ipredbagg.tune"))

# Get summary statistics
summary(cv.model.tune)
```

```
##
## Call:
## CV.SuperLearner(Y = y, X = x, V = 5, SL.library = list("SL.ranger",
##     "SL.ksvm", "SL.ipredbagg", "SL.bayesglm", "SL.ranger.tune", "SL.ipredbagg.tune"))
##
##
## Risk is based on: Mean Squared Error
##
## All risk estimates are based on V =  5
##
##              Algorithm     Ave       se     Min     Max
##          Super Learner 0.17272 0.014969 0.15849 0.19844
##            Discrete SL 0.17250 0.014989 0.15645 0.18430
##          SL.ranger_All 0.17897 0.015084 0.15388 0.19920
##            SL.ksvm_All 0.19573 0.020278 0.16095 0.26304
##       SL.ipredbagg_All 0.17667 0.015629 0.16473 0.18898
##        SL.bayesglm_All 0.16628 0.014318 0.15322 0.18022
##     SL.ranger.tune_All 0.17637 0.014882 0.15218 0.19793
##  SL.ipredbagg.tune_All 0.17813 0.015869 0.16455 0.19260
```
```r
# Plot the tuned model
plot(cv.model.tune)
```

Going by the cross-validated summary, tuning makes little difference here: the tuned Ranger has a slightly lower average risk than the default (0.17637 versus 0.17897), while the tuned `ipredbagg` comes out marginally higher (0.17813 versus 0.17667), and both gaps are well within one standard error. Let's leave all of them in and see which ones `SuperLearner` finds relevant.

Again, the beauty is that `SuperLearner` will just set a model's weight to zero if it is not relevant. Remember that the best ensembles are not composed of the best performing algorithms, but rather of the algorithms that best complement each other to classify a prediction.

Let's fit the new model with tuned parameters and see how they weigh:

```r
# Set the seed
set.seed(150)

# Create the tuned model
# (`family` is not set here, so the default `gaussian()` is used,
# as the printed call below shows)
model.tune <- SuperLearner(y,
                           x,
                           SL.library=list("SL.ranger",
                                           "SL.ksvm",
                                           "SL.ipredbagg",
                                           "SL.bayesglm",
                                           "SL.ranger.tune",
                                           "SL.ipredbagg.tune"))

# Return the tuned model
model.tune
```

```
##
## Call:
## SuperLearner(Y = y, X = x, SL.library = list("SL.ranger", "SL.ksvm",
##     "SL.ipredbagg", "SL.bayesglm", "SL.ranger.tune", "SL.ipredbagg.tune"))
##
##
##
##                            Risk      Coef
## SL.ranger_All         0.1748247 0.0000000
## SL.ksvm_All           0.1974033 0.0000000
## SL.ipredbagg_All      0.1745503 0.0000000
## SL.bayesglm_All       0.1634855 0.7162423
## SL.ranger.tune_All    0.1725514 0.0000000
## SL.ipredbagg.tune_All 0.1711161 0.2837577
```

`SL.bayesglm` and `SL.ipredbagg.tune` are now the only algorithms weighted in the ensemble. Predicting on the test set gives the following result:

```r
# Gather predictions for the tuned model
predictions.tune <- predict.SuperLearner(model.tune, newdata=xtest)

# Recode predictions
conv.preds.tune <- ifelse(predictions.tune$pred>=0.5,1,0)

# Return the confusion matrix; recent versions of `caret` expect
# both arguments to be factors with the same levels
confusionMatrix(factor(conv.preds.tune, levels = c(0, 1)),
                factor(ytest, levels = c(0, 1)))
```

```
## Confusion Matrix and Statistics
##
##           Reference
## Prediction   0   1
##          0 200  43
##          1  23  66
##
##                Accuracy : 0.8012
##                  95% CI : (0.7542, 0.8428)
##     No Information Rate : 0.6717
##     P-Value [Acc > NIR] : 1.116e-07
##
##                   Kappa : 0.5271
##  Mcnemar's Test P-Value : 0.01935
##
##             Sensitivity : 0.8969
##             Specificity : 0.6055
##          Pos Pred Value : 0.8230
##          Neg Pred Value : 0.7416
##              Prevalence : 0.6717
##          Detection Rate : 0.6024
##    Detection Prevalence : 0.7319
##       Balanced Accuracy : 0.7512
##
##        'Positive' Class : 0
##
```

This gives you a little improvement on the test set and illustrates the concepts of using `SuperLearner` for model tuning.

#### `create.Learner()`

The second method for tuning hyperparameters is to use the `create.Learner()` function. This allows you to customize an existing `SuperLearner` wrapper:

```r
learner <- create.Learner("SL.ranger", params=list(num.trees=1000, mtry=2))
learner2 <- create.Learner("SL.ipredbagg", params=list(nbagg=250))
```

The learner character string is the first argument to the `create.Learner()` function. Then you pass a list of the parameters to modify. This will create an object:

```r
learner
```

```
## $grid
## NULL
##
## $names
## [1] "SL.ranger_1"
##
## $base_learner
## [1] "SL.ranger"
##
## $params
## $params$num.trees
## [1] 1000
##
## $params$mtry
## [1] 2
```
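To get a feel for what such a helper has to do, here is a rough base R sketch of the general idea: define a new wrapper function under a generated name that calls a base function with preset parameters. Both `make_wrapper()` and `toy_learner()` are hypothetical names invented for this illustration; this is not how `create.Learner()` is implemented, only the pattern it resembles.

```r
# Hypothetical helper: create a preset wrapper around a named function
make_wrapper <- function(base_name, params, env = parent.frame()) {
  # Look up the base function by name in the caller's environment
  base_fun <- get(base_name, envir = env, mode = "function")
  new_name <- paste0(base_name, "_1")

  # The wrapper forwards its own arguments and appends the presets
  wrapper <- function(...) {
    do.call(base_fun, c(list(...), params))
  }

  # Store the wrapper under the generated name and return that name
  assign(new_name, wrapper, envir = env)
  new_name
}

# A stand-in "learner" with one tunable parameter
toy_learner <- function(x, scale = 1) {
  mean(x) * scale
}

wrapper_name <- make_wrapper("toy_learner", params = list(scale = 10))
wrapper_name

# Call the generated wrapper by name
result <- get(wrapper_name)(c(1, 2, 3))
result
```

Returning the generated name rather than the function mirrors the way the `names` element of the learner object is used: the library is specified as a list of function names.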

Now, when passing the learner to SuperLearner, you use the names object in the learner object:

```r
# Set the seed
set.seed(150)

# Create a second tuned model
cv.model.tune2 <- CV.SuperLearner(y,
                                  x,
                                  V=5,
                                  SL.library=list("SL.ranger",
                                                  "SL.ksvm",
                                                  "SL.ipredbagg",
                                                  "SL.bayesglm",
                                                  learner$names,
                                                  learner2$names))

# Get summary statistics
summary(cv.model.tune2)
```

```
##
## Call:
## CV.SuperLearner(Y = y, X = x, V = 5, SL.library = list("SL.ranger",
##     "SL.ksvm", "SL.ipredbagg", "SL.bayesglm", learner$names, learner2$names))
##
##
## Risk is based on: Mean Squared Error
##
## All risk estimates are based on V =  5
##
##           Algorithm     Ave       se     Min     Max
##       Super Learner 0.17272 0.014969 0.15849 0.19844
##         Discrete SL 0.17250 0.014989 0.15645 0.18430
##       SL.ranger_All 0.17897 0.015084 0.15388 0.19920
##         SL.ksvm_All 0.19573 0.020278 0.16095 0.26304
##    SL.ipredbagg_All 0.17667 0.015629 0.16473 0.18898
##     SL.bayesglm_All 0.16628 0.014318 0.15322 0.18022
##     SL.ranger_1_All 0.17637 0.014882 0.15218 0.19793
##  SL.ipredbagg_1_All 0.17813 0.015869 0.16455 0.19260
```
```r
# Plot `cv.model.tune2`
plot(cv.model.tune2)
```

The end result is the same as if you used the first method. It is up to you to use whatever method you desire.

## More Ensemble Models and Machine Learning in R

Wow, you covered a lot of ground! By now, you should have a good handle on `SuperLearner` and should have successfully fit your first ensemble with it. This package makes it nice and easy to add models really quickly. There are some subtleties with methods and what data form to use. However, when in doubt, a model matrix saved as a data frame almost always works.

As a reminder, you installed and loaded `SuperLearner`, formatted your dataset, fit a single model, fit your first ensemble, predicted with the ensemble and tuned some hyperparameters!

The next steps would be to tackle some more advanced topics with this package, such as parallelization, feature selection and screening, using model matrices, writing your own SuperLearner and ensemble cross validation.

Check out DataCamp's Machine Learning in R for beginners tutorial.
