Getting Started with Machine Learning in Python

Supervised Learning

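Supervised learning trains a model on labeled examples so that it can predict a target value from a set of features. For example, predicting whether a customer will buy a product (target) based on their location and last five purchases (features).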

The loan dataset used in this walkthrough has the following fields:

Column name	Description
`loan_id`	Unique loan id
`gender`	Gender - `Male` / `Female`
`married`	Marital status - `Yes` / `No`
`dependents`	Number of dependents
`education`	Education - `Graduate` / `Not Graduate`
`self_employed`	Self-employment status - `Yes` / `No`
`applicant_income`	Applicant's income
`coapplicant_income`	Coapplicant's income
`loan_amount`	Loan amount (thousands)
`loan_amount_term`	Term of loan (months)
`credit_history`	Credit history meets guidelines - `1` / `0`
`property_area`	Area of the property - `Urban` / `Semi Urban` / `Rural`
`loan_status`	Loan approval status (target) - `1` / `0`

# Import required libraries

# Read in the dataset


# Preview the data
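The three steps above might look roughly like the sketch below. The original file is not shown here, so the example reads a few made-up rows from an in-memory string; the filename `loans.csv` in the comment and the variable name `loans` are assumptions, not the tutorial's actual names.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: three made-up rows
# using the column names from the table above.
csv_text = """loan_id,gender,married,dependents,education,self_employed,applicant_income,coapplicant_income,loan_amount,loan_amount_term,credit_history,property_area,loan_status
LP001,Male,Yes,1,Graduate,No,4583,1508,128,360,1,Rural,0
LP002,Male,Yes,0,Graduate,Yes,3000,0,66,360,1,Urban,1
LP003,Female,No,0,Not Graduate,No,2583,2358,120,360,1,Urban,1
"""

# Read in the dataset. With a file on disk this would be
# pd.read_csv("loans.csv"), where the filename is an assumption.
loans = pd.read_csv(io.StringIO(csv_text))

# Preview the data
print(loans.head())
```

`head()` shows the first five rows by default, which is usually enough to spot obviously wrong types or missing values.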

We can't just dive straight into machine learning! We need to understand and format our data for modeling. What are we looking for?

If a column is strongly correlated with the target variable, it might be a good feature for predictions!

Do we need to modify any data, e.g., convert it to a different data type (ML models expect numeric data) or extract part of it?
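One common way to turn text columns into numbers is one-hot encoding. The sketch below uses `pd.get_dummies` on a few made-up rows; whether the original walkthrough uses this exact approach is an assumption.

```python
import pandas as pd

# Made-up rows for illustration only
df = pd.DataFrame({
    "education": ["Graduate", "Not Graduate", "Graduate"],
    "property_area": ["Urban", "Rural", "Semi Urban"],
    "applicant_income": [4583, 3000, 2583],
})

# One-hot encode the text columns; the numeric column passes through unchanged
encoded = pd.get_dummies(df, columns=["education", "property_area"], dtype=int)

print(encoded.columns.tolist())
print(encoded.dtypes)
```

Each category becomes its own 0/1 column, so every column in `encoded` is numeric and can be passed to a model.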

# Remove the loan_id to avoid accidentally using it as a feature

# Counts and data types per column

# Distributions and relationships

# Correlation between variables
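A minimal sketch of these exploration steps, assuming a DataFrame named `loans` with a handful of made-up rows. Plots are left out here; `describe()` stands in as a quick numeric look at the distributions.

```python
import pandas as pd

# Made-up rows for illustration only
loans = pd.DataFrame({
    "loan_id": ["LP001", "LP002", "LP003", "LP004", "LP005", "LP006"],
    "applicant_income": [4583, 3000, 2583, 6000, 5417, 2333],
    "loan_amount": [128, 66, 120, 141, 267, 95],
    "credit_history": [1, 1, 1, 1, 0, 0],
    "loan_status": [0, 1, 1, 1, 0, 0],
})

# Remove the loan_id to avoid accidentally using it as a feature
loans = loans.drop(columns=["loan_id"])

# Counts and data types per column
loans.info()

# Summary statistics as a quick look at the distributions
print(loans.describe())

# Correlation between variables (all remaining columns are numeric here)
corr = loans.corr()
print(corr["loan_status"].sort_values(ascending=False))
```

Sorting the `loan_status` column of the correlation matrix gives a quick ranking of which columns move together with the target.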

# Target frequency

# Class frequency by loan_status
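The two frequency checks can be sketched as follows. The rows are made up, and using `credit_history` as the column to break down by `loan_status` is an assumption chosen for illustration.

```python
import pandas as pd

# Made-up rows for illustration only
loans = pd.DataFrame({
    "credit_history": [1, 1, 1, 1, 0, 0],
    "loan_status": [0, 1, 1, 1, 0, 0],
})

# Target frequency: how many loans fall into each class
target_counts = loans["loan_status"].value_counts()
print(target_counts)

# Class frequency by loan_status: rows are credit_history values,
# columns are loan_status values
by_status = pd.crosstab(loans["credit_history"], loans["loan_status"])
print(by_status)
```

A heavily imbalanced target would show up in `target_counts`, which matters later when judging a model by accuracy alone.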

# First model using loan_amount

# Split into training and test sets

# Previewing the training set

# Instantiate a logistic regression model

# Fit to the training data

# Predict test set values

# Check the model's first five predictions
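The modeling steps above could be sketched with scikit-learn as shown below. The data is synthetic (generated from a made-up rule so that the block runs on its own), and the 70/30 split, `random_state`, and variable names are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data for illustration only: smaller loans are approved more often
rng = np.random.default_rng(42)
loan_amount = rng.integers(50, 400, size=200)
loan_status = (loan_amount + rng.normal(0, 60, size=200) < 200).astype(int)
loans = pd.DataFrame({"loan_amount": loan_amount, "loan_status": loan_status})

# First model using loan_amount as the only feature
X = loans[["loan_amount"]]
y = loans["loan_status"]

# Split into training and test sets (assumed 70/30, stratified on the target)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Previewing the training set
print(X_train.head())

# Instantiate a logistic regression model
clf = LogisticRegression()

# Fit to the training data
clf.fit(X_train, y_train)

# Predict test set values
y_pred = clf.predict(X_test)

# Check the model's first five predictions
print(y_pred[:5])
```

The test set is held back during fitting so that the predictions can later be compared against labels the model has never seen.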

True Positive (TP) = number of observations correctly predicted as positive

True Negative (TN) = number of observations correctly predicted as negative

False Positive (FP) = number of observations incorrectly predicted as positive (actually negative)

False Negative (FN) = number of observations incorrectly predicted as negative (actually positive)

	Predicted: Negative	Predicted: Positive
Actual: Negative	True Negative	False Positive
Actual: Positive	False Negative	True Positive
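To make the four counts concrete, the sketch below tallies them by hand for a small set of made-up labels and arranges them in the same layout as the table above. The label lists are illustrative only.

```python
# Made-up labels for illustration: 1 = positive, 0 = negative
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

pairs = list(zip(y_true, y_pred))

# Count each outcome by comparing the actual label with the prediction
tp = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 1)
tn = sum(1 for actual, predicted in pairs if actual == 0 and predicted == 0)
fp = sum(1 for actual, predicted in pairs if actual == 0 and predicted == 1)
fn = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 0)

# Rows are actual classes, columns are predicted classes
confusion = [
    [tn, fp],  # Actual: Negative
    [fn, tp],  # Actual: Positive
]

print(confusion)
```

The four counts always sum to the number of observations, which is a quick sanity check on any confusion matrix.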
