Data Wrangling Practice
Data Wrangling
This is a data wrangling practice workbook with a dataset from the Coursera IBM Data Analyst Course.
Import Basic Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
Ignore Future Warnings
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)
Import the Data from the URL below
filepath = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DA0101EN-SkillsNetwork/labs/Data%20files/auto.csv"
Create DataFrame df and view top 5 rows
df = pd.read_csv(filepath, header=None)
df.head()
Assign Column Headers
headers = ["symboling","normalized-losses","make","fuel-type","aspiration", "num-of-doors","body-style",
"drive-wheels","engine-location","wheel-base", "length","width","height","curb-weight","engine-type",
"num-of-cylinders", "engine-size","fuel-system","bore","stroke","compression-ratio","horsepower",
"peak-rpm","city-mpg","highway-mpg","price"]
df.columns = headers
df.head()
Identify and Handle Missing Values
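As a possible alternative to replacing the placeholder after loading, `pd.read_csv` can assign column names and treat "?" as missing in the same call, through its `names` and `na_values` parameters. The sketch below is only an illustration: the two CSV rows and the shortened `sample_headers` list are made up so that the example runs without downloading the real file.

```python
import io
import pandas as pd

# Made-up rows for illustration only; the real file has 26 columns.
sample_csv = io.StringIO(
    "3,?,alfa-romero,13495\n"
    "2,164,audi,?\n"
)

# Hypothetical shortened header list matching the toy rows above.
sample_headers = ["symboling", "normalized-losses", "make", "price"]

# names= assigns the column headers; na_values= turns "?" into NaN while parsing.
sample_df = pd.read_csv(
    sample_csv, header=None, names=sample_headers, na_values="?"
)

# Count missing values per column, as in the workbook.
print(sample_df.isnull().sum())
```

One side effect of this approach is that numeric columns containing "?" are parsed as floats at load time rather than as text.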
df = df.replace("?", np.nan)
df.head()
missing_data = df.isnull().sum()
missing_data = missing_data[missing_data>0]
missing_data = pd.DataFrame(missing_data, columns=["Missing Values"]).reset_index()
missing_data = missing_data.rename(columns = {"index":"Column Name"})
missing_data
Check Data Types
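Columns that contained "?" were read as text, and replacing "?" with NaN does not by itself make them numeric. The sketch below illustrates this on a made-up two-column frame (the values are invented; only the column names are taken from the headers list) and shows one way to convert such a column with `pd.to_numeric`.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration; column names borrowed from the headers list.
toy = pd.DataFrame({
    "normalized-losses": ["164", "?", "158"],
    "make": ["audi", "bmw", "audi"],
})
toy = toy.replace("?", np.nan)

# The column still holds strings plus NaN, so it is not numeric yet.
dtype_before = toy["normalized-losses"].dtype
print(toy.dtypes)

# pd.to_numeric parses the strings and leaves NaN as NaN.
toy["normalized-losses"] = pd.to_numeric(toy["normalized-losses"])
print(toy.dtypes)
```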