This course introduces a powerful classifier, the support vector machine (SVM), using an intuitive, visual approach. Support Vector Machines in R will help students develop an understanding of the SVM model as a classifier and gain practical experience using R’s libsvm implementation from the e1071 package. Along the way, students will build intuition for important concepts such as hard and soft margins, the kernel trick, different types of kernels, and how to tune SVM parameters. Get ready to classify data with this impressive model.
Introduction

This chapter introduces some key concepts of support vector machines through a simple 1-dimensional example. Students are also walked through the creation of a linearly separable dataset that is used in the subsequent chapter.

- Sugar content of soft drinks (50 xp)
- Visualizing a sugar content dataset (100 xp)
- Identifying decision boundaries (50 xp)
- Find the maximal margin separator (100 xp)
- Visualize the maximal margin separator (100 xp)
- Generating a linearly separable dataset (50 xp)
- Generate a 2d uniformly distributed dataset (100 xp)
- Create a decision boundary (100 xp)
- Introduce a margin in the dataset (100 xp)
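The dataset-construction steps above can be sketched in R. The variable names, the `x2 = x1` boundary, and the margin width below are illustrative choices, not necessarily the course's exact code:

```r
# Sketch: a 2-d uniformly distributed dataset with a linear
# decision boundary and a margin (all choices are illustrative).
set.seed(42)
n <- 200
df <- data.frame(x1 = runif(n), x2 = runif(n))

# Decision boundary: the line x2 = x1; class depends on which side a point falls
df$y <- factor(ifelse(df$x2 - df$x1 > 0, 1, -1))

# Introduce a margin by removing points that lie too close to the boundary
margin <- 0.05
df <- df[abs(df$x2 - df$x1) > margin, ]
```

Removing the points inside the margin band is what makes the dataset cleanly linearly separable for the next chapter.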
Support Vector Classifiers - Linear Kernels
Introduces students to the basic concepts of support vector machines by applying the SVM algorithm to a dataset that is linearly separable. Key concepts are illustrated through ggplot visualisations built from the outputs of the algorithm, and the role of the cost parameter is highlighted via a simple example. The chapter closes with a section on how the algorithm deals with multiclass problems.

- Linear Support Vector Machines (50 xp)
- Creating training and test datasets (100 xp)
- Building a linear SVM classifier (100 xp)
- Exploring the model and calculating accuracy (100 xp)
- Visualizing Linear SVMs (50 xp)
- Visualizing support vectors using ggplot (100 xp)
- Visualizing decision & margin bounds using `ggplot2` (100 xp)
- Visualizing decision & margin bounds using `plot()` (100 xp)
- Tuning linear SVMs (50 xp)
- Tuning a linear SVM (100 xp)
- Visualizing decision boundaries and margins (100 xp)
- When are soft margin classifiers useful? (50 xp)
- Multiclass problems (50 xp)
- A multiclass classification problem (100 xp)
- Iris redux - a more robust accuracy (100 xp)
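The chapter's core workflow (train/test split, linear-kernel SVM via e1071, accuracy) can be sketched as follows; the toy dataset and variable names are illustrative assumptions:

```r
library(e1071)

# Toy linearly separable dataset (illustrative, not the course's exact data)
set.seed(1)
n <- 200
df <- data.frame(x1 = runif(n), x2 = runif(n))
df$y <- factor(ifelse(df$x2 - df$x1 > 0, 1, -1))
df <- df[abs(df$x2 - df$x1) > 0.05, ]  # enforce a margin

# 80/20 train/test split
train_idx <- sample(nrow(df), round(0.8 * nrow(df)))
trainset <- df[train_idx, ]
testset  <- df[-train_idx, ]

# Linear-kernel SVM; the cost parameter controls how soft the margin is:
# small cost tolerates violations, large cost approaches a hard margin
model <- svm(y ~ ., data = trainset, kernel = "linear",
             cost = 1, scale = FALSE)

# Support vectors found by the algorithm (rows of model$SV)
n_sv <- nrow(model$SV)

# Test-set accuracy
pred <- predict(model, testset)
accuracy <- mean(pred == testset$y)
```

Because the margin band was removed, a linear kernel separates this data almost perfectly; the interesting cases (soft margins, tuning cost) arise when it cannot.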
Polynomial Kernels

Provides an introduction to polynomial kernels via a dataset that is radially separable (i.e., one with a circular decision boundary). After demonstrating the inadequacy of linear kernels for this dataset, students will see how a simple transformation renders the problem linearly separable, motivating an intuitive discussion of the kernel trick. Students will then apply the polynomial kernel to the dataset and tune the resulting classifier.

- Generating a radially separable dataset (50 xp)
- Generating a 2d radially separable dataset (100 xp)
- Visualizing the dataset (100 xp)
- Linear SVMs on radially separable data (50 xp)
- Linear SVM for a radially separable dataset (100 xp)
- Average accuracy for linear SVM (100 xp)
- The kernel trick (50 xp)
- Visualizing transformed radially separable data (100 xp)
- SVM with polynomial kernel (100 xp)
- Tuning SVMs (50 xp)
- Using `tune.svm()` (100 xp)
- Building and visualizing the tuned model (100 xp)
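The chapter's arc can be sketched in R: a radially separable dataset, a degree-2 polynomial kernel, and tuning with `tune.svm()`. The radius and tuning grid below are illustrative assumptions:

```r
library(e1071)

# Radially separable data: class is determined by distance from the origin
set.seed(10)
n <- 400
df <- data.frame(x1 = runif(n, -1, 1), x2 = runif(n, -1, 1))
radius <- 0.7
df$y <- factor(ifelse(df$x1^2 + df$x2^2 < radius^2, -1, 1))

# A degree-2 polynomial kernel can learn the circular boundary: the circle
# is linear in the transformed features (x1^2, x2^2, x1*x2), which is the
# essence of the kernel trick
model_poly <- svm(y ~ ., data = df, kernel = "polynomial", degree = 2)

# Tune cost and gamma over a small grid via cross-validation
# (the ranges are illustrative)
tuned <- tune.svm(y ~ ., data = df, kernel = "polynomial", degree = 2,
                  cost = 10^(-1:2), gamma = c(0.1, 1, 10))
best_model <- tuned$best.model
```

A linear kernel on the same data hovers near chance-level accuracy, which is what motivates the transformation in the first place.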
Radial Basis Function Kernels
Builds on the previous three chapters by introducing the highly flexible Radial Basis Function (RBF) kernel. Students will create a "complex" dataset that exposes the limitations of polynomial kernels. Then, following an intuitive motivation for the RBF kernel, students see how it addresses the shortcomings of the other kernels discussed in this course.

- Generating a complex dataset (50 xp)
- Generating a complex dataset - part 1 (100 xp)
- Generating a complex dataset - part 2 (100 xp)
- Visualizing the dataset (100 xp)
- Motivating the RBF kernel (50 xp)
- Linear SVM for complex dataset (100 xp)
- Quadratic SVM for complex dataset (100 xp)
- The RBF Kernel (50 xp)
- Polynomial SVM on a complex dataset (100 xp)
- RBF SVM on a complex dataset (100 xp)
- Tuning an RBF kernel SVM (100 xp)
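The progression can be sketched as follows; the "complex" dataset here (a wavy boundary) and the tuning grid are illustrative stand-ins for the course's own construction:

```r
library(e1071)

# A "complex" dataset: a wavy decision boundary that low-degree polynomial
# kernels struggle with (this particular boundary is an illustrative choice)
set.seed(1)
n <- 400
df <- data.frame(x1 = runif(n), x2 = runif(n))
df$y <- factor(ifelse(df$x2 > 0.5 + 0.3 * sin(3 * pi * df$x1), 1, -1))

# RBF kernel: each support vector exerts a local, Gaussian-shaped influence,
# so the decision boundary can bend freely to follow the data
model_rbf <- svm(y ~ ., data = df, kernel = "radial")

# Tune cost and gamma; larger gamma makes each influence more local
# and the boundary more flexible (risking overfitting)
tuned <- tune.svm(y ~ ., data = df, kernel = "radial",
                  cost = 10^(-1:2), gamma = c(0.1, 1, 10))
best_rbf <- tuned$best.model
```

The flexibility that makes the RBF kernel succeed here is exactly why tuning `gamma` and `cost` matters: an overly large `gamma` will trace noise rather than the true boundary.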
Prerequisites
Introduction to R
Kailash Awati
Senior Lecturer at the University of Technology Sydney.
Kailash Awati is co-founder and principal of Sensanalytics, a consultancy specializing in sensemaking and analytics. He is also on the academic staff at the University of Technology Sydney, where he teaches in the Master of Data Science and Innovation program. He blogs about analytics, sensemaking and his other professional interests at Eight to Late.