Understanding Vehicle CO2 Emissions and the Bicycle Market
  • PART 1: Understanding Vehicle CO2 Emissions

    Image from Natural Resources Canada

    Key findings

    Here are the main findings from my investigation of CO2 emissions data for vehicles in Canada:

    • The median engine size in the dataset is 3 liters, which is relatively large; engines of this size are typically found in bigger vehicles such as SUVs, trucks, and sports cars.

    • The average fuel consumption is highest for ethanol, followed by premium gasoline, regular gasoline, and diesel.

    • There is a strong positive correlation between fuel consumption and CO2 emissions: vehicles that consume more fuel tend to emit more CO2.

    • The MID-SIZE vehicle class has lower average CO2 emissions than the SUV-SMALL class.

    • Average CO2 emissions across all vehicles in the dataset are high. Vehicles with engine sizes of 2 liters or less emit 23% less CO2 on average than larger vehicles.

    • Natural gas vehicles have the lowest average CO2 emissions, while ethanol vehicles have the highest.

    • The car manufacturer Smart has the lowest average CO2 emissions, while Bugatti has the highest.

    1.1 Background

    You volunteer for a public policy advocacy organization in Canada, and your colleague asked you to help her draft recommendations for guidelines on CO2 emissions rules. After researching emissions data for a wide range of Canadian vehicles, she would like you to investigate which vehicles produce lower emissions.

    1.2 Objectives

    The objective of this research is to gain insight into the types of vehicles that have lower CO2 emissions. Specifically, I aim to answer the following questions:

    1. What is the median engine size in liters?
    2. What is the average fuel consumption for regular gasoline (X), premium gasoline (Z), ethanol (E), and diesel (D)?
    3. What is the correlation between fuel consumption and CO2 emissions?
    4. Which vehicle class has lower average CO2 emissions, 'SUV - SMALL' or 'MID-SIZE'?
    5. What are the average CO2 emissions for all vehicles? For vehicles with an engine size of 2.0 liters or smaller?
    6. Any other insights you found during your analysis?
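
    As a rough illustration of how questions 1 to 5 map onto pandas operations, here is a minimal sketch. The `toy` DataFrame and every value in it are hypothetical, invented only so the snippet runs on its own; the column names follow the data description, and the results say nothing about the real dataset.

```python
import pandas as pd

# Hypothetical toy rows, invented for illustration only (not taken from the dataset).
toy = pd.DataFrame({
    "Vehicle Class": ["MID-SIZE", "SUV - SMALL", "MID-SIZE", "SUV - SMALL"],
    "Engine Size(L)": [2.0, 2.5, 1.5, 3.5],
    "Fuel Type": ["X", "Z", "X", "Z"],
    "Fuel Consumption Comb (L/100 km)": [8.0, 10.0, 7.0, 12.0],
    "CO2 Emissions(g/km)": [184, 230, 161, 276],
})

consumption = toy["Fuel Consumption Comb (L/100 km)"]
co2 = toy["CO2 Emissions(g/km)"]

# Q1: median engine size in liters
median_engine = toy["Engine Size(L)"].median()

# Q2: average combined fuel consumption for each fuel type
mean_consumption = toy.groupby("Fuel Type")["Fuel Consumption Comb (L/100 km)"].mean()

# Q3: Pearson correlation between fuel consumption and CO2 emissions
corr = consumption.corr(co2)

# Q4: average CO2 emissions for each vehicle class
class_means = toy.groupby("Vehicle Class")["CO2 Emissions(g/km)"].mean()

# Q5: average CO2 emissions overall and for engines of 2.0 liters or smaller
mean_all = co2.mean()
mean_small = toy.loc[toy["Engine Size(L)"] <= 2.0, "CO2 Emissions(g/km)"].mean()
```

    With the real data loaded, the same calls would be applied to the full DataFrame instead of `toy`.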

    1.3 Introduction

    A Canadian public policy advocacy organization has given me the task of examining CO2 emissions data for a wide range of Canadian vehicles. My colleague wants to know which vehicles produce lower carbon dioxide (CO2) emissions, and I aim to give her useful information. I'll look at the median engine size, the average fuel consumption for each fuel type, and whether fuel consumption is correlated with CO2 emissions. I'll also find out which of two vehicle classes has lower average CO2 emissions and note any other interesting results. This undertaking won't be easy, but I'm up for the challenge and aim to deliver meaningful results to the organization.

    1.4 Data description

    The data comes from the Government of Canada's open data website.

    The following list gives some descriptions of our key variables:

    • "Make" - The company that manufactures the vehicle.
    • "Model" - The vehicle's model.
    • "Vehicle Class" - Vehicle class by utility, capacity, and weight.
    • "Engine Size(L)" - The engine's displacement in liters.
    • "Cylinders" - The number of cylinders.
    • "Transmission" - The transmission type: A = Automatic, AM = Automatic Manual, AS = Automatic with select shift, AV = Continuously variable, M = Manual, 3 - 10 = the number of gears.
    • "Fuel Type" - The fuel type: X = Regular gasoline, Z = Premium gasoline, D = Diesel, E = Ethanol (E85), N = Natural gas.
    • "Fuel Consumption Comb (L/100 km)" - Combined city/highway (55%/45%) fuel consumption in liters per 100 km (L/100 km).
    • "CO2 Emissions(g/km)" - The tailpipe carbon dioxide emissions in grams per kilometer for combined city and highway driving.
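
    Two of the descriptions above can be made concrete with a short sketch: the fuel-type codes and the 55%/45% city/highway weighting. The names `FUEL_TYPE_NAMES`, `describe_fuel`, and `combined_consumption` are hypothetical helpers introduced here for illustration, and the input values in the usage lines are made up. Published combined ratings may be rounded, so they need not match this formula exactly.

```python
# Fuel-type codes as listed in the data description.
FUEL_TYPE_NAMES = {
    "X": "Regular gasoline",
    "Z": "Premium gasoline",
    "D": "Diesel",
    "E": "Ethanol (E85)",
    "N": "Natural gas",
}


def describe_fuel(code):
    """Return the readable name for a fuel-type code, or 'Unknown' if unlisted."""
    return FUEL_TYPE_NAMES.get(code, "Unknown")


def combined_consumption(city_l_per_100km, highway_l_per_100km):
    """Weight city and highway ratings 55%/45%, as the description states."""
    return 0.55 * city_l_per_100km + 0.45 * highway_l_per_100km


# Made-up ratings: 10.0 L/100 km city and 8.0 L/100 km highway.
example_combined = combined_consumption(10.0, 8.0)  # 0.55 * 10.0 + 0.45 * 8.0
example_fuel = describe_fuel("E")
```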

    1.5 Exploratory data analysis

    First, we import the necessary packages to explore our data. Then, we check for issues and missing values and specify key variables for analysis. This helps us understand and control our dataset.

    # Import necessary packages
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    # Load the data into a Pandas DataFrame:
    cars = pd.read_csv('data/co2_emissions_canada.csv')
    # Explore and check the data
    print(cars.head())  # Print the first 5 rows of the data
    cars.info()  # Print information about the data, including the data types and number of non-null values
    print(cars.isnull().sum())  # Print the number of missing values in each column
    print(cars.describe())  # Print basic descriptive statistics for each numerical column
    # Extract key variables
    cars_makes, cars_models, cars_classes, cars_engine_sizes, cars_cylinders, \
        cars_transmissions, cars_fuel_types, cars_fuel_consumption, cars_co2_emissions = \
        [cars[col].to_numpy() for col in ['Make', 'Model', 'Vehicle Class', 'Engine Size(L)', 
                                          'Cylinders', 'Transmission', 'Fuel Type', 
                                          'Fuel Consumption Comb (L/100 km)', 'CO2 Emissions(g/km)']]
    # Overall theme
    plt.style.use('fivethirtyeight')
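
    Once the columns are extracted as NumPy arrays, boolean masks are one way to compare groups of vehicles. The sketch below is an assumption about how such arrays might be used, not the analysis itself: `sample_engine_sizes` and `sample_co2` are hypothetical stand-ins with made-up values, in place of `cars_engine_sizes` and `cars_co2_emissions`.

```python
import numpy as np

# Hypothetical stand-ins for the extracted arrays; values are invented for illustration.
sample_engine_sizes = np.array([1.5, 2.0, 3.0, 5.0])
sample_co2 = np.array([150, 190, 250, 350])

# Median engine size across all sample vehicles
median_engine = float(np.median(sample_engine_sizes))

# Boolean mask: True where the engine is 2.0 liters or smaller
is_small = sample_engine_sizes <= 2.0

# Average CO2 emissions for each group
mean_small = sample_co2[is_small].mean()
mean_large = sample_co2[~is_small].mean()

# How much lower the small-engine average is, as a percentage of the large-engine average
pct_lower = (mean_large - mean_small) / mean_large * 100
```

    The same mask pattern works for any other grouping, for example `cars_fuel_types == 'E'` to select ethanol vehicles.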