Analyzing CO2 emissions data for Canadian vehicles, and the bicycle market of a bicycle store
  • 💾 The data I

    I have access to seven years of CO2 emissions data for Canadian vehicles (source):

    • "Make" - The company that manufactures the vehicle.
    • "Model" - The vehicle's model.
    • "Vehicle Class" - Vehicle class by utility, capacity, and weight.
    • "Engine Size(L)" - The engine's displacement in liters.
    • "Cylinders" - The number of cylinders.
    • "Transmission" - The transmission type: A = Automatic, AM = Automatic Manual, AS = Automatic with select shift, AV = Continuously variable, M = Manual, 3 - 10 = the number of gears.
    • "Fuel Type" - The fuel type: X = Regular gasoline, Z = Premium gasoline, D = Diesel, E = Ethanol (E85), N = natural gas.
    • "Fuel Consumption Comb (L/100 km)" - Combined city/highway (55%/45%) fuel consumption in liters per 100 km (L/100 km).
    • "CO2 Emissions(g/km)" - The tailpipe carbon dioxide emissions in grams per kilometer for combined city and highway driving.
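The "Transmission" codes described above pack two pieces of information into one string: a letter prefix for the transmission type and an optional gear count. A minimal sketch of how such a code could be split is shown below. It assumes the codes are written as a prefix immediately followed by the number of gears (for example `AS6`), and that some codes may carry no gear count; the helper name `parse_transmission` is hypothetical and not part of the original notebook.

```python
import re

# Prefix meanings copied from the column description above.
TRANSMISSION_TYPES = {
    "A": "Automatic",
    "AM": "Automatic Manual",
    "AS": "Automatic with select shift",
    "AV": "Continuously variable",
    "M": "Manual",
}


def parse_transmission(code):
    """Split a code such as 'AS6' into (type description, number of gears).

    Hypothetical helper: assumes the format is letters followed by
    optional digits. Returns None for the gear count when no digits
    are present.
    """
    match = re.fullmatch(r"([A-Z]+)(\d+)?", code.strip().upper())
    if match is None:
        raise ValueError(f"Unrecognized transmission code: {code!r}")
    prefix, gears = match.groups()
    if prefix not in TRANSMISSION_TYPES:
        raise ValueError(f"Unknown transmission prefix: {prefix!r}")
    gear_count = int(gears) if gears else None
    return TRANSMISSION_TYPES[prefix], gear_count


print(parse_transmission("AS6"))
print(parse_transmission("M5"))
```

If the assumption about the format holds, the same function could be applied to the whole column with `cars["Transmission"].map(parse_transmission)` once the data is loaded.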

    The data comes from the Government of Canada's open data website.
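The combined fuel consumption figure is described as a 55%/45% weighting of city and highway driving. As a worked illustration of that arithmetic, the sketch below computes the weighted value for one vehicle. The city and highway numbers are made up for the example, and the function name `combined_consumption` is hypothetical; the dataset described above only lists the combined column.

```python
def combined_consumption(city, highway, city_weight=0.55):
    """Weighted average of city and highway consumption in L/100 km.

    The default weight follows the 55%/45% split stated in the
    column description; the remainder is applied to highway driving.
    """
    return city_weight * city + (1 - city_weight) * highway


# Made-up example values, not taken from the dataset.
city_l_per_100km = 10.0
highway_l_per_100km = 8.0

combined = combined_consumption(city_l_per_100km, highway_l_per_100km)
print("Combined consumption: {} L/100 km".format(round(combined, 2)))
```

With these inputs the calculation is 0.55 × 10.0 + 0.45 × 8.0 = 9.1 L/100 km.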

    Importing packages

    # Import packages
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns

    Loading Data and preview

    # Load the data
    cars = pd.read_csv('data/co2_emissions_canada.csv')
    # Preview the dataframe
    cars.head()

    Exploring Data

    # Check for null values in each column
    print(cars.isnull().sum())
    # Count duplicated rows
    print(cars.duplicated().sum())
    # Remove duplicates
    cars = cars.drop_duplicates()
    # Normalize string columns to upper case
    cars["Make"] = cars["Make"].str.upper()
    cars["Model"] = cars["Model"].str.upper()
    cars["Vehicle Class"] = cars["Vehicle Class"].str.upper()
    cars["Transmission"] = cars["Transmission"].str.upper()
    # create numpy arrays
    cars_makes = cars['Make'].to_numpy()
    cars_models = cars['Model'].to_numpy()
    cars_classes = cars['Vehicle Class'].to_numpy()
    cars_engine_sizes = cars['Engine Size(L)'].to_numpy()
    cars_cylinders = cars['Cylinders'].to_numpy()
    cars_transmissions = cars['Transmission'].to_numpy()
    cars_fuel_types = cars['Fuel Type'].to_numpy()
    cars_fuel_consumption = cars['Fuel Consumption Comb (L/100 km)'].to_numpy()
    cars_co2_emissions = cars['CO2 Emissions(g/km)'].to_numpy()

    Answering the questions

    • What is the median engine size in liters?
    print("The median engine size is {} L".format(round(np.median(cars_engine_sizes), 2)))
    • What is the average fuel consumption for regular gasoline, premium gasoline, Ethanol, and diesel?
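One way to approach this question with the NumPy arrays created earlier is to build a boolean mask per fuel type and average the matching consumption values. The sketch below is self-contained, so it uses small made-up arrays in place of `cars_fuel_types` and `cars_fuel_consumption`; the numbers are illustrative only and are not results from the dataset.

```python
import numpy as np

# Made-up stand-ins for cars_fuel_types and cars_fuel_consumption.
fuel_types = np.array(["X", "Z", "X", "D", "E", "Z", "E", "D"])
fuel_consumption = np.array([8.0, 10.0, 9.0, 7.5, 15.0, 11.0, 17.0, 8.5])

# Fuel codes as given in the column description.
fuel_labels = {
    "X": "regular gasoline",
    "Z": "premium gasoline",
    "E": "ethanol",
    "D": "diesel",
}

average_by_fuel = {}
for code, label in fuel_labels.items():
    # Boolean mask selecting only the rows with this fuel type.
    mask = fuel_types == code
    average_by_fuel[label] = round(float(fuel_consumption[mask].mean()), 2)
    print("The average fuel consumption for {} is {} L/100 km".format(
        label, average_by_fuel[label]))
```

With the real data, the two sample arrays would be replaced by `cars_fuel_types` and `cars_fuel_consumption`. The same result can also be obtained directly from the dataframe with `cars.groupby("Fuel Type")["Fuel Consumption Comb (L/100 km)"].mean()`.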