Supervised Learning with scikit-learn
👋 Welcome to your new workspace! Here, you can experiment with the data you used in Supervised Learning with scikit-learn and practice your newly learned skills with a challenge. You can find out more about DataCamp Workspace here.
Below is a code cell that imports the course packages and loads in the course datasets as pandas DataFrames.
🏃 To execute the code, click inside the cell to select it and click "Run" or the ► icon. You can also use Shift-Enter to run a selected cell and automatically switch to the next cell.
# Import the course packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import sklearn
import scipy.stats
# Import the course datasets as DataFrames
auto = pd.read_csv("datasets/auto.csv")
boston = pd.read_csv("datasets/boston.csv")
diabetes = pd.read_csv("datasets/diabetes.csv")
gapminder = pd.read_csv("datasets/gm_2008_region.csv")
votes = pd.read_csv("datasets/votes.csv")
whitewine = pd.read_csv("datasets/white-wine.csv")
# Preview the first DataFrame
auto
Challenge Yourself
Don't know where to start? Add code to the code cell below to try the following challenge:
You work for a used car website where people can list their cars for sale. Although users submit details such as the horsepower, weight, and place of origin, they are often unable to provide the precise miles per gallon (mpg). Your manager has asked you whether it is possible to use these features to predict the mpg of cars so that the website can add the information automatically. You have access to the auto DataFrame to train a model and make predictions.
Try to use all of the relevant techniques you learned in Supervised Learning with scikit-learn. This includes preprocessing the data and fine-tuning your model.
Reminder: To execute the code you add to a cell, click inside the cell to select it and click "Run" or the ► icon. You can also use Shift-Enter to run a selected cell and automatically switch to the next cell.
# Use this cell (and add others as needed) to predict the mpg of cars in the auto DataFrame
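If you would like a starting point, here is one possible sketch of the preprocessing and fine-tuning steps the challenge mentions. It is not the only way to approach the problem. It scales numeric features, one-hot encodes categorical ones, and tunes a Ridge regression with a grid search. The helper name fit_mpg_model, the column names hp, weight, and origin, and the synthetic demo data are all assumptions made for illustration; check auto.columns and substitute the real column names before running it on the auto DataFrame.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def fit_mpg_model(df, numeric_cols, categorical_cols, target="mpg", seed=42):
    """Hypothetical helper: preprocess, tune, and evaluate a Ridge model."""
    X = df[numeric_cols + categorical_cols]
    y = df[target]
    # Hold out a test set so the final score reflects unseen data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    # Scale numeric features and one-hot encode categorical features
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])
    pipeline = Pipeline([("preprocess", preprocess), ("model", Ridge())])
    # Tune the regularization strength with 5-fold cross-validation
    search = GridSearchCV(
        pipeline, {"model__alpha": [0.01, 0.1, 1.0, 10.0]}, cv=5, scoring="r2"
    )
    search.fit(X_train, y_train)
    return search, search.score(X_test, y_test)


# Synthetic stand-in data (an assumption, NOT the real auto dataset)
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    "hp": rng.uniform(50, 220, n),
    "weight": rng.uniform(1600, 5000, n),
    "origin": rng.choice(["US", "Europe", "Asia"], n),
})
demo["mpg"] = (
    50 - 0.05 * demo["hp"] - 0.005 * demo["weight"] + rng.normal(0, 1.5, n)
)

search, test_r2 = fit_mpg_model(demo, ["hp", "weight"], ["origin"])
print("Best parameters:", search.best_params_)
print("Test R^2:", round(test_r2, 3))
```

To try it on the course data, you could pass auto in place of demo along with the matching column lists. Ridge is only one choice here; swapping in Lasso or another regressor only requires changing the "model" step and the parameter grid.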
Continue to Explore
Feeling confident about your skills? Continue to Unsupervised Learning in Python, or check out the other courses in the Machine Learning Scientist with Python Career Track to learn other advanced machine learning techniques.
If you're interested in exploring the remaining course datasets, you can refer to the DataFrames and potential target variables below:
boston: MEDV, the median value of owner-occupied homes in thousands of dollars.
diabetes: diabetes, where 0 indicates that the patient does not have diabetes and 1 indicates that the patient does.
gapminder: life, life expectancy.
votes: party, party affiliation (democrat or republican).
whitewine: quality.
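For a binary target such as the diabetes column, a classification workflow might look like the sketch below. It uses synthetic data from make_classification as a stand-in, since the real column names are not listed here, and it picks logistic regression as one reasonable baseline rather than a recommended model. To adapt it, you would replace X and y with the feature columns and target column of the DataFrame you choose.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a binary classification dataset (an assumption)
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=0
)

# Stratify so both classes keep the same proportions in train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale the features, then fit a logistic regression classifier
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5-fold cross-validated accuracy on the training set
cv_scores = cross_val_score(clf, X_train, y_train, cv=5)

# Fit on the full training set and score on the held-out test set
clf.fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)

print("Mean CV accuracy:", round(cv_scores.mean(), 3))
print("Test accuracy:", round(test_accuracy, 3))
```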