Everyone Can Learn Python Scholarship

1️⃣ Python 🐍 - CO2 Emissions

Now let's move on to the competition and challenge.

📖 Background

You volunteer for a public policy advocacy organization in Canada, and your colleague asked you to help her draft recommendations for guidelines on CO2 emissions rules.

After researching emissions data for a wide range of Canadian vehicles, she would like you to investigate which vehicles produce lower emissions.

💾 The data I

You have access to seven years of CO2 emissions data for Canadian vehicles:

  • "Make" - The company that manufactures the vehicle.
  • "Model" - The vehicle's model.
  • "Vehicle Class" - Vehicle class by utility, capacity, and weight.
  • "Engine Size(L)" - The engine's displacement in liters.
  • "Cylinders" - The number of cylinders.
  • "Transmission" - The transmission type: A = Automatic, AM = Automated manual, AS = Automatic with select shift, AV = Continuously variable, M = Manual, 3 - 10 = the number of gears.
  • "Fuel Type" - The fuel type: X = Regular gasoline, Z = Premium gasoline, D = Diesel, E = Ethanol (E85), N = Natural gas.
  • "Fuel Consumption Comb (L/100 km)" - Combined city/highway (55%/45%) fuel consumption in liters per 100 km (L/100 km).
  • "CO2 Emissions(g/km)" - The tailpipe carbon dioxide emissions in grams per kilometer for combined city and highway driving.

The data comes from the Government of Canada's open data website.
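The "Transmission" column packs two pieces of information into one string: the transmission type and the number of gears. A minimal sketch of how such a code could be split apart is shown here. It assumes the codes are written as type letters followed by an optional gear count (for example "AS6" or "AV"), which is an assumption about the file's formatting, and `parse_transmission` is a hypothetical helper, not part of the original notebook.

```python
import re

# Assumed layout: one or more letters (the type), then an optional gear count.
_TRANSMISSION_PATTERN = re.compile(r"^([A-Z]+)(\d+)?$")


def parse_transmission(code):
    """Split a code such as "AS6" into (type, gears).

    gears is None when the code carries no gear count (e.g. "AV").
    """
    match = _TRANSMISSION_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"Unrecognized transmission code: {code!r}")
    kind, gears = match.groups()
    if gears is None:
        return kind, None
    return kind, int(gears)


print(parse_transmission("AS6"))  # ('AS', 6)
print(parse_transmission("M5"))   # ('M', 5)
print(parse_transmission("AV"))   # ('AV', None)
```

On the real data, something like `cars['Transmission'].map(parse_transmission)` would apply the helper to every row, provided the column really follows the assumed layout.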

# Import the pandas and numpy packages
import pandas as pd
import numpy as np

# Load the data
cars = pd.read_csv('data/co2_emissions_canada.csv')

# Create NumPy arrays from the columns
cars_makes = cars['Make'].to_numpy()
cars_models = cars['Model'].to_numpy()
cars_classes = cars['Vehicle Class'].to_numpy()
cars_engine_sizes = cars['Engine Size(L)'].to_numpy()
cars_cylinders = cars['Cylinders'].to_numpy()
cars_transmissions = cars['Transmission'].to_numpy()
cars_fuel_types = cars['Fuel Type'].to_numpy()
cars_fuel_consumption = cars['Fuel Consumption Comb (L/100 km)'].to_numpy()
cars_co2_emissions = cars['CO2 Emissions(g/km)'].to_numpy()

# Preview the dataframe
cars
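The NumPy arrays created above can also be queried directly with boolean masks. The sketch below illustrates the idea on small invented arrays that stand in for `cars_fuel_types`, `cars_fuel_consumption`, and `cars_engine_sizes`; the values are made up for illustration and are not taken from the dataset.

```python
import numpy as np

# Invented stand-ins for cars_fuel_types, cars_fuel_consumption
# and cars_engine_sizes (illustrative values only).
fuel_types = np.array(["X", "Z", "X", "D", "E", "Z"])
fuel_consumption = np.array([8.0, 10.0, 9.0, 7.5, 14.0, 11.0])
engine_sizes = np.array([2.0, 3.5, 1.6, 2.0, 5.3, 3.0])

# Median of a whole array
median_engine_size = np.median(engine_sizes)

# A comparison such as `fuel_types == "X"` yields a boolean mask;
# indexing with it keeps only the matching rows.
mean_consumption_by_fuel = {
    fuel: fuel_consumption[fuel_types == fuel].mean()
    for fuel in ["X", "Z", "E", "D"]
}

print(median_engine_size)
print(mean_consumption_by_fuel)
```

The same pattern should carry over to the real arrays, since they all come from the same dataframe and therefore share one row order.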

💪 Challenge I

Help your colleague gain insights on the type of vehicles that have lower CO2 emissions. Include:

  1. What is the median engine size in liters?
  2. What is the average fuel consumption for regular gasoline (Fuel Type = X), premium gasoline (Z), ethanol (E), and diesel (D)?
  3. What is the correlation between fuel consumption and CO2 emissions?
  4. Which vehicle class has lower average CO2 emissions, 'SUV - SMALL' or 'MID-SIZE'?
  5. What are the average CO2 emissions for all vehicles? For vehicles with an engine size of 2.0 liters or smaller?
  6. Any other insights you found during your analysis?

# What is the median engine size in liters?

engine_median = cars['Engine Size(L)'].median()

print("The median engine size in liters is {}".format(engine_median))

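# What is the average fuel consumption for regular gasoline (X), premium gasoline (Z), ethanol (E), and diesel (D)?

# Mean combined fuel consumption per fuel type
avg_fuel = cars.groupby("Fuel Type")["Fuel Consumption Comb (L/100 km)"].mean()

# Report each fuel type the challenge asks about
fuel_names = {"X": "regular gasoline", "Z": "premium gasoline", "E": "ethanol", "D": "diesel"}
for code, name in fuel_names.items():
    print("The average fuel consumption for {} is {:.2f} L/100 km".format(name, avg_fuel[code]))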
 
# What is the correlation between fuel consumption and CO2 emissions?

import seaborn as sns

# Pearson correlation between the two columns
corr = cars["Fuel Consumption Comb (L/100 km)"].corr(cars["CO2 Emissions(g/km)"])
print("The correlation between fuel consumption and CO2 emissions is {}".format(corr))

# Show the same relationship as an annotated correlation matrix
correlation = cars[["Fuel Consumption Comb (L/100 km)", "CO2 Emissions(g/km)"]].corr()
sns.heatmap(correlation, xticklabels=correlation.columns, yticklabels=correlation.columns, annot=True)

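To make the number returned by `.corr()` less of a black box, here is a small sketch that computes the Pearson correlation coefficient by hand and compares it with `np.corrcoef`. Pearson is the default method of pandas' `Series.corr`. The sample values are invented for illustration and `pearson_r` is a hypothetical helper name.

```python
import numpy as np

# Invented sample values (not taken from the dataset).
fuel_consumption = np.array([6.0, 8.5, 10.0, 12.5, 15.0])   # L/100 km
co2_emissions = np.array([140.0, 198.0, 235.0, 290.0, 350.0])  # g/km


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    x_dev = x - x.mean()
    y_dev = y - y.mean()
    return (x_dev * y_dev).sum() / np.sqrt((x_dev ** 2).sum() * (y_dev ** 2).sum())


r_manual = pearson_r(fuel_consumption, co2_emissions)
r_numpy = np.corrcoef(fuel_consumption, co2_emissions)[0, 1]

print(r_manual, r_numpy)
```

A value close to 1 means the two quantities rise together almost linearly, while a value close to 0 means there is no linear relationship.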
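# Which vehicle class has lower average CO2 emissions, 'SUV - SMALL' or 'MID-SIZE'?

# Mean CO2 emissions per vehicle class, restricted to the two classes of interest
emission = cars.groupby("Vehicle Class")["CO2 Emissions(g/km)"].mean()
class_emission = emission[["SUV - SMALL", "MID-SIZE"]]
print(class_emission)

# Derive the answer from the data (the original analysis found MID-SIZE to be lower)
print("The vehicle class with the lower average CO2 emissions is {}".format(class_emission.idxmin()))
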
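# What are the average CO2 emissions for all vehicles? For vehicles with an engine size of 2.0 liters or smaller?
avg_emission_all = cars["CO2 Emissions(g/km)"].mean()
avg_emission_small = cars.loc[cars["Engine Size(L)"] <= 2.0, "CO2 Emissions(g/km)"].mean()
print("The average CO2 emissions for all vehicles are {:.2f} g/km".format(avg_emission_all))
print("The average CO2 emissions for engines of 2.0 liters or smaller are {:.2f} g/km".format(avg_emission_small))

avg_emission = cars.groupby(["Engine Size(L)"])["CO2 Emissions(g/km)"].mean().reset_index(name="Average co2 emission")  # breakdown per engine size
avg_emission[avg_emission["Engine Size(L)"] <= 2.0]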
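For the open-ended sixth question, one possible starting point is to rank every vehicle class by its average emissions instead of comparing only two. The sketch below shows the pattern on a tiny invented dataframe that reuses the dataset's column names. The rows, including the "COMPACT" class label, are made up for illustration.

```python
import pandas as pd

# Invented rows that reuse the dataset's column names (illustrative only).
toy = pd.DataFrame({
    "Vehicle Class": ["COMPACT", "SUV - SMALL", "MID-SIZE",
                      "COMPACT", "SUV - SMALL", "MID-SIZE"],
    "CO2 Emissions(g/km)": [200, 240, 260, 210, 270, 220],
})

# Mean emissions and number of vehicles per class, lowest emitters first.
class_ranking = (
    toy.groupby("Vehicle Class")["CO2 Emissions(g/km)"]
    .agg(["mean", "count"])
    .sort_values("mean")
)

print(class_ranking)
```

Keeping the count next to the mean helps flag classes whose average rests on only a handful of vehicles. Replacing `toy` with `cars` should give the ranking for the real data.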