Everyone Can Learn Python Scholarship
1️⃣ Python 🐍 - CO2 Emissions
I am excited to take on the challenge of analyzing CO2 emissions for a wide range of Canadian vehicles. I volunteer for a public policy advocacy organization in Canada, and my colleague has asked me to help draft recommendations for guidelines on CO2 emissions rules. Using Python, I will investigate which vehicles produce lower emissions and present my findings to my colleague.
💾 The data I
I have access to seven years of CO2 emissions data for Canadian vehicles (source), with the following columns:
- "Make" - The company that manufactures the vehicle.
- "Model" - The vehicle's model.
- "Vehicle Class" - Vehicle class by utility, capacity, and weight.
- "Engine Size(L)" - The engine's displacement in liters.
- "Cylinders" - The number of cylinders.
- "Transmission" - The transmission type: A = Automatic, AM = Automatic Manual, AS = Automatic with select shift, AV = Continuously variable, M = Manual, 3 - 10 = the number of gears.
- "Fuel Type" - The fuel type: X = Regular gasoline, Z = Premium gasoline, D = Diesel, E = Ethanol (E85), N = Natural gas.
- "Fuel Consumption Comb (L/100 km)" - Combined city/highway (55%/45%) fuel consumption in liters per 100 km (L/100 km).
- "CO2 Emissions(g/km)" - The tailpipe carbon dioxide emissions in grams per kilometer for combined city and highway driving.
The data comes from the Government of Canada's open data website.
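The "Transmission" column packs two pieces of information into one code: the transmission type and the number of gears. The sketch below shows one way such a code could be split. The function name `parse_transmission` and the regular expression are my own assumptions about how the letters and digits combine (for example "AS6"), not something the data source specifies.

```python
import re

# Type labels copied from the column description above
TRANSMISSION_TYPES = {
    "A": "Automatic",
    "AM": "Automatic Manual",
    "AS": "Automatic with select shift",
    "AV": "Continuously variable",
    "M": "Manual",
}

# Assumed layout: a type prefix followed by an optional gear count
_PATTERN = re.compile(r"^(AM|AS|AV|A|M)(\d+)?$")


def parse_transmission(code):
    """Split a code such as 'AS6' into (type label, number of gears or None)."""
    match = _PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"Unrecognized transmission code: {code!r}")
    kind, gears = match.groups()
    return TRANSMISSION_TYPES[kind], int(gears) if gears else None


print(parse_transmission("AS6"))
print(parse_transmission("M5"))
print(parse_transmission("AV"))
```

Codes that do not fit the assumed layout raise a ValueError instead of being silently mislabeled.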
# Import the pandas and numpy packages
import pandas as pd
import numpy as np
# Load the data
cars = pd.read_csv('data/co2_emissions_canada.csv')
# create numpy arrays
cars_makes = cars['Make'].to_numpy()
cars_models = cars['Model'].to_numpy()
cars_classes = cars['Vehicle Class'].to_numpy()
cars_engine_sizes = cars['Engine Size(L)'].to_numpy()
cars_cylinders = cars['Cylinders'].to_numpy()
cars_transmissions = cars['Transmission'].to_numpy()
cars_fuel_types = cars['Fuel Type'].to_numpy()
cars_fuel_consumption = cars['Fuel Consumption Comb (L/100 km)'].to_numpy()
cars_co2_emissions = cars['CO2 Emissions(g/km)'].to_numpy()
# Preview the dataframe
cars
# Look at the first ten items in the CO2 emissions array
cars_co2_emissions[:10]
💪 Challenge I
Help your colleague gain insights on the type of vehicles that have lower CO2 emissions. Include:
- What is the median engine size in liters?
- What is the average fuel consumption for regular gasoline (Fuel Type = X), premium gasoline (Z), ethanol (E), and diesel (D)?
- What is the correlation between fuel consumption and CO2 emissions?
- Which vehicle class has lower average CO2 emissions, 'SUV - SMALL' or 'MID-SIZE'?
- What are the average CO2 emissions for all vehicles? For vehicles with an engine size of 2.0 liters or smaller?
- Any other insights you found during your analysis?
1. To find the median engine size in liters, I used the np.median() function on the cars_engine_sizes numpy array.
# Calculating the median engine size
median_engine_size = np.median(cars_engine_sizes)
# Print the result
print("The median engine size in liters is:", median_engine_size)
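One caveat with np.median() is that a single missing value makes the result nan. I have not checked here whether the dataset contains missing engine sizes, so the sketch below uses made-up values only to show the difference between np.median() and np.nanmedian().

```python
import numpy as np

# Made-up engine sizes in liters; np.nan stands in for a missing value
sample_engine_sizes = np.array([1.6, 2.0, np.nan, 3.5, 2.4])

plain_median = np.median(sample_engine_sizes)        # nan propagates into the result
nan_safe_median = np.nanmedian(sample_engine_sizes)  # missing values are ignored

print("np.median:", plain_median)
print("np.nanmedian:", nan_safe_median)
```

If the real column has no missing values, both functions return the same number.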
2. To find the average fuel consumption for each fuel type, I created a boolean mask per fuel type from the cars_fuel_types numpy array, used each mask to filter the cars_fuel_consumption array, and then calculated the mean of each filtered array.
# create boolean masks for each fuel type
regular_gas_mask = cars_fuel_types == 'X'
premium_gas_mask = cars_fuel_types == 'Z'
ethanol_mask = cars_fuel_types == 'E'
diesel_mask = cars_fuel_types == 'D'
# use the masks to filter the fuel consumption array
regular_gas_consumption = cars_fuel_consumption[regular_gas_mask]
premium_gas_consumption = cars_fuel_consumption[premium_gas_mask]
ethanol_consumption = cars_fuel_consumption[ethanol_mask]
diesel_consumption = cars_fuel_consumption[diesel_mask]
# calculate the mean fuel consumption for each fuel type
mean_regular_gas_consumption = np.mean(regular_gas_consumption)
mean_premium_gas_consumption = np.mean(premium_gas_consumption)
mean_ethanol_consumption = np.mean(ethanol_consumption)
mean_diesel_consumption = np.mean(diesel_consumption)
# print the results
print("The average fuel consumption for regular gasoline is:", mean_regular_gas_consumption)
print("The average fuel consumption for premium gasoline is:", mean_premium_gas_consumption)
print("The average fuel consumption for ethanol is:", mean_ethanol_consumption)
print("The average fuel consumption for diesel is:", mean_diesel_consumption)
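The same per-fuel-type averages could also be computed in one step with a pandas groupby instead of separate masks. This is an alternative sketch, not the approach used above, and the rows in `sample` are made up for illustration rather than taken from the real dataset.

```python
import pandas as pd

# Made-up rows for illustration only; column names match the dataset
sample = pd.DataFrame({
    "Fuel Type": ["X", "X", "Z", "Z", "E", "D"],
    "Fuel Consumption Comb (L/100 km)": [8.0, 10.0, 11.0, 13.0, 16.0, 9.0],
})

# Group by fuel type, average the consumption, and keep the four types of interest
mean_by_fuel = (
    sample.groupby("Fuel Type")["Fuel Consumption Comb (L/100 km)"]
    .mean()
    .loc[["X", "Z", "E", "D"]]
)
print(mean_by_fuel)
```

On the real data, replacing `sample` with the `cars` dataframe should give the same means as the mask-based code, with less repetition.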
3. To find the correlation between fuel consumption and CO2 emissions, I used the np.corrcoef() function, which returns the correlation matrix of two arrays. I passed in the cars_fuel_consumption and cars_co2_emissions arrays and read the off-diagonal element [0, 1], which is the correlation between the two.
correlation = np.corrcoef(cars_fuel_consumption, cars_co2_emissions)
print("The correlation between fuel consumption and CO2 emissions is:", correlation[0,1])
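To make the indexing with [0, 1] clearer, here is a small sketch of what np.corrcoef() returns for two arrays. The numbers are made up, and the factor of 23 is an arbitrary choice that makes the relationship perfectly linear.

```python
import numpy as np

# Made-up values: CO2 is an exact multiple of fuel consumption (arbitrary factor)
sample_fuel = np.array([6.0, 8.0, 10.0, 12.0])
sample_co2 = 23.0 * sample_fuel

# The result is a 2x2 matrix: the diagonal holds each array's correlation with itself,
# and the two off-diagonal entries both hold the correlation between the arrays
matrix = np.corrcoef(sample_fuel, sample_co2)
print(matrix.shape)
print(matrix[0, 1])
```

Because the matrix is symmetric, matrix[0, 1] and matrix[1, 0] give the same value.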
4. To find which vehicle class has lower average CO2 emissions, 'SUV - SMALL' or 'MID-SIZE', I created a boolean mask for each vehicle class from the cars_classes numpy array, used each mask to filter the cars_co2_emissions array, and then calculated the mean of the filtered CO2 emissions for each vehicle class.
# create boolean masks for each vehicle class
SUV_small_mask = cars_classes == 'SUV - SMALL'
MID_SIZE_mask = cars_classes == 'MID-SIZE'
# use the masks to filter the CO2 emissions array
SUV_small_co2_emissions = cars_co2_emissions[SUV_small_mask]
MID_SIZE_co2_emissions = cars_co2_emissions[MID_SIZE_mask]
# calculate the mean CO2 emissions for each vehicle class
mean_SUV_small_co2_emissions = np.mean(SUV_small_co2_emissions)
mean_MID_SIZE_co2_emissions = np.mean(MID_SIZE_co2_emissions)
# compare the mean CO2 emissions for each vehicle class
if mean_SUV_small_co2_emissions < mean_MID_SIZE_co2_emissions:
    print("SUV - SMALL vehicles have lower average CO2 emissions")
else:
    print("MID-SIZE vehicles have lower average CO2 emissions")
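The mask-then-mean pattern used above can be wrapped in a small helper. The sketch below is one possible version: the name `mean_for_label` is hypothetical and the sample arrays are made up. It also guards against a misspelled class label, which would otherwise produce an empty selection and a nan mean.

```python
import numpy as np


def mean_for_label(values, labels, label):
    """Return the mean of `values` where `labels == label`; raise if no row matches."""
    mask = labels == label
    if not mask.any():
        raise ValueError(f"No rows with label {label!r}")
    return values[mask].mean()


# Made-up data for illustration only
sample_classes = np.array(["SUV - SMALL", "MID-SIZE", "SUV - SMALL", "MID-SIZE"])
sample_co2 = np.array([240.0, 220.0, 230.0, 210.0])

suv_mean = mean_for_label(sample_co2, sample_classes, "SUV - SMALL")
mid_mean = mean_for_label(sample_co2, sample_classes, "MID-SIZE")

lower = "SUV - SMALL" if suv_mean < mid_mean else "MID-SIZE"
print(f"{lower} has the lower average: {suv_mean:.1f} vs {mid_mean:.1f} g/km")
```

The guard matters because any comparison with nan is False, so an if/else like the one above would fall through to the else branch and report MID-SIZE even if a label were mistyped.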
5.(a) To find the average CO2 emissions for all vehicles, I used the np.mean() function on the cars_co2_emissions numpy array.