Apply statistical modeling in a real-life setting using logistic regression and decision trees to model credit risk.

4 hours, 16 videos, 52 exercises, 41,501 learners

4000 XP

This hands-on course with real-life credit data will teach you how to model credit risk using logistic regression and decision trees in R.

Modeling credit risk for both personal and company loans is of major importance for banks. The probability that a debtor will default is a key component in arriving at a measure of credit risk. While other models will be introduced in this course as well, you will learn about two model types that are often used in credit scoring: logistic regression and decision trees. You will learn how to use them in this particular context, and how banks evaluate these models.

### 1. Introduction and data preprocessing

**Free.** This chapter begins with a general introduction to credit risk models. We'll explore a real-life data set, then preprocess it so that it's in the appropriate format for the credit risk models.
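To give a feel for the last two topics of this chapter, here is a minimal sketch of splitting a data set and building a confusion matrix in base R. The data frame `loan_data`, its columns `loan_status` and `int_rate`, and the "interest rate above 15" rule are hypothetical stand-ins made up for illustration; they are not the course's data set or method.

```r
# Hypothetical toy data; the course's real data set is not reproduced here.
set.seed(567)
n <- 300
loan_data <- data.frame(
  loan_status = rbinom(n, 1, 0.1),           # 1 = default, 0 = non-default
  int_rate    = round(runif(n, 5, 20), 2)    # made-up interest rates
)

# Use two thirds of the rows for training and hold out the rest for testing.
index_train  <- sample(seq_len(n), size = round(2 / 3 * n))
training_set <- loan_data[index_train, ]
test_set     <- loan_data[-index_train, ]

# Placeholder "model": flag loans with an interest rate above 15 as defaults.
predicted <- as.integer(test_set$int_rate > 15)

# Confusion matrix: rows are the actual status, columns the predicted status.
conf_matrix <- table(
  actual    = factor(test_set$loan_status, levels = c(0, 1)),
  predicted = factor(predicted, levels = c(0, 1))
)
print(conf_matrix)

# Common summary measures derived from the confusion matrix.
accuracy    <- sum(diag(conf_matrix)) / sum(conf_matrix)
sensitivity <- conf_matrix["1", "1"] / sum(conf_matrix["1", ])
specificity <- conf_matrix["0", "0"] / sum(conf_matrix["0", ])
```

Fixing the factor levels to `c(0, 1)` keeps the table 2 x 2 even if one class happens to be absent from the predictions.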

- Introduction and data structure (50 XP)
- Exploring the credit data (100 XP)
- Interpreting a CrossTable() (50 XP)
- Histograms and outliers (50 XP)
- Histograms (100 XP)
- Outliers (100 XP)
- Missing data and coarse classification (50 XP)
- Deleting missing data (100 XP)
- Replacing missing data (100 XP)
- Keeping missing data (100 XP)
- Data splitting and confusion matrices (50 XP)
- Splitting the data set (100 XP)
- Creating a confusion matrix (100 XP)

### 2. Logistic regression

Logistic regression is still a widely used method in credit risk modeling. In this chapter, you will learn how to apply logistic regression models to credit data in R.
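As a rough preview, the sketch below fits a logistic regression with base R's `glm()`, predicts probabilities of default, and applies a cut-off. The simulated data, the column names, the coefficients used to generate defaults, and the cut-off of 0.15 are all assumptions chosen for illustration, not values taken from the course.

```r
# Simulated data (assumption): the log-odds of default rise with the interest rate.
set.seed(123)
n <- 500
int_rate    <- runif(n, 5, 20)
loan_status <- rbinom(n, 1, plogis(-4 + 0.2 * int_rate))  # 1 = default
loan_data   <- data.frame(loan_status, int_rate)

# Fit a logistic regression of default status on the interest rate.
log_model <- glm(loan_status ~ int_rate,
                 family = binomial(link = "logit"),
                 data = loan_data)

# exp(coefficient) is the multiplicative change in the odds of default
# for a one-unit increase in int_rate.
odds_ratio <- exp(coef(log_model)["int_rate"])

# Predicted probabilities of default (type = "response" returns probabilities,
# not log-odds).
pd_hat <- predict(log_model, newdata = loan_data, type = "response")

# Turn probabilities into a default / non-default prediction with a cut-off.
cutoff <- 0.15  # illustrative value only
pred_default <- as.integer(pd_hat > cutoff)
print(table(actual = loan_data$loan_status, predicted = pred_default))
```

Lowering the cut-off flags more loans as defaults, which typically raises sensitivity at the cost of specificity; comparing cut-offs is largely a matter of weighing that trade-off.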

- Logistic regression: introduction (50 XP)
- Basic logistic regression (100 XP)
- Interpreting the odds for a categorical variable (50 XP)
- Multiple variables in a logistic regression model (100 XP)
- Interpreting significance levels (50 XP)
- Logistic regression: predicting the probability of default (50 XP)
- Predicting the probability of default (100 XP)
- Making more discriminative models (100 XP)
- Evaluating the logistic regression model result (50 XP)
- Specifying a cut-off (100 XP)
- Comparing two cut-offs (50 XP)
- Wrap-up and remarks (50 XP)
- Comparing link functions for a given cut-off (100 XP)

### 3. Decision trees

Classification trees are another popular method in the world of credit risk modeling. In this chapter, you will learn how to build classification trees using credit data in R.
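Classification trees choose splits that make the resulting nodes purer, and one common purity measure is the Gini impurity, 2p(1 - p), where p is the proportion of defaults in a node. The sketch below works through the gain of a single split by hand; the node counts are hypothetical numbers picked to keep the arithmetic easy, and the helper `gini()` is defined here for illustration only.

```r
# Gini impurity of a node with the given counts of defaults and non-defaults.
gini <- function(n_default, n_non_default) {
  p <- n_default / (n_default + n_non_default)
  2 * p * (1 - p)
}

# Hypothetical split: a root node of 1000 loans divided into two leaves of 500.
gini_root  <- gini(n_default = 500, n_non_default = 500)  # 2 * 0.5 * 0.5
gini_left  <- gini(n_default = 100, n_non_default = 400)  # 2 * 0.2 * 0.8
gini_right <- gini(n_default = 400, n_non_default = 100)  # 2 * 0.8 * 0.2

# Share of the root's observations that end up in each leaf.
prop_left  <- 500 / 1000
prop_right <- 500 / 1000

# Gain = impurity of the root minus the weighted impurity of the leaves.
gain <- gini_root - prop_left * gini_left - prop_right * gini_right
print(gain)
```

A larger gain means the split separates defaults from non-defaults more cleanly. A tree-building routine evaluates candidate splits in this spirit and keeps the best one at each node.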

- What is a decision tree? (50 XP)
- Computing the gain for a tree (100 XP)
- Changing one Gini... (50 XP)
- Building decision trees using the rpart()-package (50 XP)
- Undersampling the training set (100 XP)
- Changing the prior probabilities (100 XP)
- Including a loss matrix (100 XP)
- Pruning the decision tree (50 XP)
- Pruning the tree with changed prior probabilities (100 XP)
- Pruning the tree with the loss matrix (100 XP)
- Other tree options and the construction of confusion matrices (50 XP)
- One final tree using more options (100 XP)
- Confusion matrices and accuracy of our final trees (100 XP)
- Optimizing the accuracy (50 XP)

### 4. Evaluating a credit risk model

In this chapter, you'll learn how you can evaluate and compare the results obtained through several credit risk models.
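Two of the evaluation ideas covered here can be sketched in a few lines of base R: the bad rate among accepted loans at a fixed acceptance rate, and the area under the ROC curve (AUC). The predicted probabilities `pd_hat`, the outcomes `status`, and the helper functions `bad_rate_at()` and `auc_rank()` are all hypothetical and written for this illustration; the AUC is computed with the rank-based (Mann-Whitney) formula rather than with any particular package.

```r
# Hypothetical predicted probabilities of default and observed outcomes (1 = default).
pd_hat <- c(0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90)
status <- c(0,    0,    0,    1,    0,    0,    1,    0,    1,    1)

# Bad rate when the share `accept_rate` of applicants with the lowest
# predicted probability of default is accepted.
bad_rate_at <- function(accept_rate, pd, status) {
  cutoff   <- quantile(pd, accept_rate)
  accepted <- pd <= cutoff
  sum(status[accepted]) / sum(accepted)
}

bad_rate_80 <- bad_rate_at(0.8, pd_hat, status)

# A small strategy table: one bad rate per acceptance rate.
accept_rates <- c(1, 0.8, 0.6)
strategy <- data.frame(
  accept_rate = accept_rates,
  bad_rate    = sapply(accept_rates, bad_rate_at, pd = pd_hat, status = status)
)
print(strategy)

# AUC via ranks: the probability that a randomly chosen default receives a
# higher predicted probability than a randomly chosen non-default.
auc_rank <- function(pd, status) {
  r  <- rank(pd)            # average ranks handle ties
  n1 <- sum(status == 1)
  n0 <- sum(status == 0)
  (sum(r[status == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc <- auc_rank(pd_hat, status)
print(auc)
```

Plotting `bad_rate` against `accept_rate` gives a strategy curve. Because the AUC summarizes a model's ranking ability in a single number, it is convenient for comparing several candidate models.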

- Finding the right cut-off: the strategy curve (50 XP)
- Computing a bad rate given a fixed acceptance rate (100 XP)
- The strategy table and strategy curve (100 XP)
- To tree or not to tree? (50 XP)
- The ROC-curve (50 XP)
- ROC-curves for comparison of logistic regression models (100 XP)
- ROC-curves for comparison of tree-based models (100 XP)
- Input selection based on the AUC (50 XP)
- Another round of pruning based on AUC (100 XP)
- Best of four (50 XP)
- Further model reduction? (100 XP)
- Course wrap-up (50 XP)

Prerequisites

Intermediate R for Finance

Director of Data Science Education at Flatiron School

Lore is a data scientist with expertise in applied finance. She obtained her PhD in Business Economics and Statistics at KU Leuven, Belgium. During her PhD, she collaborated with several banks on advanced methods for the analysis of credit risk data. Lore formerly worked as a Data Science Curriculum Lead at DataCamp and is now Director of Data Science Education at Flatiron School, a coding school with branches in 8 cities and online programs.

“I've used other sites—Coursera, Udacity, things like that—but DataCamp's been the one that I've stuck with.”

Devon Edwards Joseph

Lloyds Banking Group

“DataCamp is the top resource I recommend for learning data science.”

Louis Maiden

Harvard Business School

“DataCamp is by far my favorite website to learn from.”

Ronald Bowers

Decision Science Analytics, USAA
