Visualization of CO2 Emissions (Python)
We will analyze various features of vehicles and their impact on CO2 emissions through data analysis. Some of the key insights and findings include:
Engine size and number of cylinders have a positive correlation with CO2 emissions. This means that larger engines with more cylinders tend to emit more CO2.
Fuel type is an important factor in determining CO2 emissions, with ethanol (E85) and premium gasoline having the highest average emissions, and diesel and natural gas among the lowest.
Vehicle class also plays a significant role in CO2 emissions, with SUVs and trucks emitting more CO2 than smaller vehicles like cars and wagons.
Fuel efficiency is negatively correlated with CO2 emissions, which means that vehicles with higher fuel efficiency tend to emit less CO2.
The distribution of CO2 emissions for different vehicle features, such as engine size and fuel type, can be visualized effectively using violin plots, which provide a clear view of the distribution and density of data points.
Interactive plots, such as scatter plots with hover text and faceted bubble charts, can help identify trends and patterns across different variables and subsets of the data.
Overall, these findings can help inform decisions around vehicle design, fuel choices, and policy measures aimed at reducing CO2 emissions from transportation.
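The correlation claims above can be checked in one step with a correlation matrix. The following is a minimal sketch on a small, made-up sample (the values are illustrative and are not taken from the dataset); only the column names match the ones used in this analysis.

```python
import pandas as pd

# Made-up illustrative rows (NOT real data); swap in the real cars DataFrame.
sample = pd.DataFrame({
    'Engine Size(L)': [1.6, 2.0, 2.5, 3.5, 5.0, 6.2],
    'Cylinders': [4, 4, 4, 6, 8, 8],
    'Fuel Consumption Comb (L/100 km)': [7.0, 8.1, 9.0, 11.2, 13.9, 15.5],
    'CO2 Emissions(g/km)': [163, 190, 210, 262, 325, 362],
})

# Pearson correlation of every numeric column with CO2 emissions,
# strongest first
co2_corr = sample.corr()['CO2 Emissions(g/km)'].sort_values(ascending=False)
print(co2_corr)
```

Note that the dataset records fuel consumption in L/100 km, which is the inverse of fuel efficiency, so a positive correlation between consumption and CO2 corresponds to the negative correlation between efficiency and CO2 described above.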
From the cars data, we first extract each column (make, model, number of cylinders, and so on) into its own array, which makes it easier to analyze individual properties of the cars.
# Import the pandas and numpy packages
import pandas as pd
import numpy as np
# Load the data
cars = pd.read_csv('data/co2_emissions_canada.csv')
# create numpy arrays
cars_makes = cars['Make'].to_numpy()
cars_models = cars['Model'].to_numpy()
cars_classes = cars['Vehicle Class'].to_numpy()
cars_engine_sizes = cars['Engine Size(L)'].to_numpy()
cars_cylinders = cars['Cylinders'].to_numpy()
cars_transmissions = cars['Transmission'].to_numpy()
cars_fuel_types = cars['Fuel Type'].to_numpy()
cars_fuel_consumption = cars['Fuel Consumption Comb (L/100 km)'].to_numpy()
cars_co2_emissions = cars['CO2 Emissions(g/km)'].to_numpy()
# Preview the dataframe
cars
We print the first 10 values from the CO2 emissions array.
# Look at the first ten items in the CO2 emissions array
cars_co2_emissions[:10]
Engine size is correlated with CO2 emissions. Larger engines typically produce more power and burn more fuel to do so, which leads to higher CO2 emissions. Engine size can therefore be an important feature when predicting a car's CO2 emissions, and including it in a predictive model may improve the accuracy of the predictions. We calculate the median engine size below:
# calculate the median engine size in liters
median_engine_size = cars['Engine Size(L)'].median()
# print the result
print("The median engine size in liters is:", median_engine_size)
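To illustrate the idea of using engine size as a predictive feature, here is a minimal sketch of a one-variable least-squares fit with `np.polyfit`. The values are made up for illustration and are not taken from the dataset, and `predict_co2` is a hypothetical helper name; a real model would be fitted on the full data and evaluated on held-out rows.

```python
import numpy as np

# Made-up illustrative values (NOT real data)
engine_size = np.array([1.6, 2.0, 2.5, 3.5, 5.0, 6.2])        # litres
co2 = np.array([163.0, 190.0, 210.0, 262.0, 325.0, 362.0])    # g/km

# Least-squares line: co2 is approximated by slope * engine_size + intercept
slope, intercept = np.polyfit(engine_size, co2, deg=1)

def predict_co2(size_l):
    """Predict CO2 (g/km) from engine size (L) using the fitted line."""
    return slope * size_l + intercept

print(f"slope: {slope:.1f} g/km per litre, intercept: {intercept:.1f} g/km")
print(f"predicted CO2 for a 3.0 L engine: {predict_co2(3.0):.0f} g/km")
```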
Plotting a histogram of engine size and then grouping the data by vehicle class to calculate the median engine size gives useful insight into how engine sizes differ between classes.
The median engine size can vary significantly between vehicle classes. For example, the median engine size of a mid-size car might be larger than that of a compact car.
The histogram shows the overall distribution of engine sizes, while the per-class medians help us identify patterns across vehicle classes. For instance, we might observe that larger vehicles tend to have larger engines.
Comparing the median engine size of each vehicle class shows how closely engine size is tied to vehicle class. This information can be valuable for predicting other features of a vehicle, such as fuel efficiency or CO2 emissions.
import matplotlib.pyplot as plt
import seaborn as sns
# summary statistics (count, mean, quartiles, ...) for engine size
engine_size_stats = cars['Engine Size(L)'].describe()
print(engine_size_stats)
# plot a histogram of engine size
cars['Engine Size(L)'].plot(kind='hist', bins=20)
plt.title('Distribution of Engine Size')
plt.xlabel('Engine Size (L)')
plt.show()
# group the data by vehicle class and calculate median engine size
class_engine_size = cars.groupby('Vehicle Class')['Engine Size(L)'].median()
print(class_engine_size)
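The per-class medians are easier to compare when they are sorted and shown next to the number of vehicles in each class. A minimal sketch, using a few made-up rows (not real data) with the same column names:

```python
import pandas as pd

# Made-up illustrative rows (NOT real data)
sample = pd.DataFrame({
    'Vehicle Class': ['COMPACT', 'COMPACT', 'MID-SIZE', 'MID-SIZE',
                      'SUV - SMALL', 'SUV - SMALL'],
    'Engine Size(L)': [1.6, 2.0, 2.5, 3.5, 2.0, 3.6],
})

# Median engine size and vehicle count per class, largest median first
class_summary = (
    sample.groupby('Vehicle Class')['Engine Size(L)']
    .agg(['median', 'count'])
    .sort_values('median', ascending=False)
)
print(class_summary)
```

The `count` column is a quick guard against over-reading a median that rests on only a handful of vehicles.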
To check our hypothesis that larger engine sizes result in higher CO2 emissions, we plot engine size against CO2 emissions. The observations are:

There appears to be a positive correlation between engine size and CO2 emissions, with larger engines generally producing more emissions. This trend is visible in the scatter plot as a general upward trend from left to right.

The spread of CO2 emissions increases with larger engine sizes, with more variability in emissions for larger engines. This can be seen in the scatter plot as a widening spread of data points for higher engine sizes.
We also plot a bar chart of mean CO2 emissions by fuel type to get insights on the fuel type impact on the CO2 emissions.

The bar chart shows that vehicles running on ethanol (E85) and premium gasoline produce the highest average CO2 emissions, with regular gasoline and diesel lower. This is consistent with the fuel type finding in the summary at the top.

One interesting observation from the bar chart is that vehicles running on natural gas have the lowest average CO2 emissions among all fuel types. This suggests that natural gas could be a lower-emission fuel option, although the result should be read with caution if only a few natural gas vehicles are present in the data.
# plot a scatter plot of engine size vs. CO2 emissions
sns.scatterplot(data=cars, x='Engine Size(L)', y='CO2 Emissions(g/km)')
plt.title('Engine Size vs. CO2 Emissions')
plt.show()
# calculate the correlation between engine size and CO2 emissions
correlation = cars['Engine Size(L)'].corr(cars['CO2 Emissions(g/km)'])
print("The correlation between engine size and CO2 emissions is:", correlation)
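The observation that the spread of emissions widens for larger engines can also be checked numerically by binning engine size and comparing the standard deviation of CO2 emissions in each bin. A minimal sketch on made-up values (not real data); the 3 L cut-off is an arbitrary choice for illustration:

```python
import pandas as pd

# Made-up illustrative rows (NOT real data)
sample = pd.DataFrame({
    'Engine Size(L)': [1.5, 1.8, 2.0, 2.4, 3.5, 3.6, 5.0, 6.2],
    'CO2 Emissions(g/km)': [160, 170, 185, 195, 240, 290, 300, 400],
})

# Two bins: engines up to and including 3 L, and engines above 3 L
bins = pd.cut(sample['Engine Size(L)'], bins=[0, 3, 10],
              labels=['<=3 L', '>3 L'])

# Standard deviation of CO2 emissions within each engine-size bin
spread = sample.groupby(bins, observed=True)['CO2 Emissions(g/km)'].std()
print(spread)
```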
# plot a bar chart of mean CO2 emissions by fuel type
cars.groupby('Fuel Type')['CO2 Emissions(g/km)'].mean().plot(kind='bar')
plt.title('Mean CO2 Emissions by Fuel Type')
plt.xlabel('Fuel Type')
plt.ylabel('Mean CO2 Emissions (g/km)')
plt.show()
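A bar chart of means hides how many vehicles stand behind each bar. The sketch below prints the mean next to the vehicle count per fuel type and flags small groups. The rows are made up for illustration (not real data), and `MIN_VEHICLES` is a hypothetical threshold chosen for this tiny sample.

```python
import pandas as pd

# Made-up illustrative rows (NOT real data)
sample = pd.DataFrame({
    'Fuel Type': ['X', 'X', 'Z', 'Z', 'D', 'E', 'N'],
    'CO2 Emissions(g/km)': [200, 230, 250, 280, 220, 270, 210],
})

# Mean CO2 emissions and number of vehicles per fuel type, highest mean first
fuel_summary = (
    sample.groupby('Fuel Type')['CO2 Emissions(g/km)']
    .agg(['mean', 'count'])
    .sort_values('mean', ascending=False)
)
print(fuel_summary)

# Fuel types whose mean rests on fewer than MIN_VEHICLES vehicles
MIN_VEHICLES = 2  # hypothetical threshold for this tiny sample
small_groups = fuel_summary[fuel_summary['count'] < MIN_VEHICLES].index.tolist()
print("Fuel types with few vehicles:", small_groups)
```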
On average, vehicles that run on ethanol (E85) have the highest fuel consumption, followed by those that run on premium gasoline and regular gasoline; diesel vehicles have the lowest. The calculation below covers these four fuel types and does not include natural gas.
Fuel consumption for vehicles that run on diesel, ethanol (E85), and natural gas varies less than for those running on regular or premium gasoline, which span a wider range of fuel consumption values.
# calculate the average fuel consumption for each fuel type
# (fuel type codes: X = regular gasoline, Z = premium gasoline, E = ethanol (E85), D = diesel)
regular_gas_avg = cars.loc[cars['Fuel Type'] == 'X', 'Fuel Consumption Comb (L/100 km)'].mean()
premium_gas_avg = cars.loc[cars['Fuel Type'] == 'Z', 'Fuel Consumption Comb (L/100 km)'].mean()
ethanol_avg = cars.loc[cars['Fuel Type'] == 'E', 'Fuel Consumption Comb (L/100 km)'].mean()
diesel_avg = cars.loc[cars['Fuel Type'] == 'D', 'Fuel Consumption Comb (L/100 km)'].mean()
# print the results
print(f"Average fuel consumption for regular gasoline: {regular_gas_avg:.2f} L/100 km")
print(f"Average fuel consumption for premium gasoline: {premium_gas_avg:.2f} L/100 km")
print(f"Average fuel consumption for ethanol: {ethanol_avg:.2f} L/100 km")
print(f"Average fuel consumption for diesel: {diesel_avg:.2f} L/100 km")
These results show that ethanol has the highest average fuel consumption at 16.86 L/100 km, followed by premium gasoline at 11.42 L/100 km and regular gasoline at 10.08 L/100 km. Diesel has the lowest average fuel consumption at 8.84 L/100 km. This information can be useful for making decisions related to fuel efficiency and cost.
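The four separate `.loc` lookups can be collapsed into a single `groupby`, which also makes it easy to add the standard deviation and so check how much consumption varies within each fuel type. A minimal sketch on made-up rows (not real data); the `FUEL_NAMES` mapping is a helper introduced here, and the code `N` for natural gas is an assumption based on the dataset's documentation rather than on the code above.

```python
import pandas as pd

# Helper mapping from fuel type codes to readable names
# ('N' = natural gas is an assumption; the other codes appear in the code above)
FUEL_NAMES = {
    'X': 'regular gasoline',
    'Z': 'premium gasoline',
    'E': 'ethanol (E85)',
    'D': 'diesel',
    'N': 'natural gas',
}

# Made-up illustrative rows (NOT real data)
sample = pd.DataFrame({
    'Fuel Type': ['X', 'X', 'Z', 'Z', 'E', 'E', 'D', 'D'],
    'Fuel Consumption Comb (L/100 km)': [9.0, 11.0, 10.5, 12.5,
                                         16.0, 18.0, 8.0, 9.0],
})

# Mean and standard deviation of fuel consumption per fuel type,
# highest mean first
consumption = (
    sample.assign(Fuel=sample['Fuel Type'].map(FUEL_NAMES))
    .groupby('Fuel')['Fuel Consumption Comb (L/100 km)']
    .agg(['mean', 'std'])
    .sort_values('mean', ascending=False)
)
print(consumption.round(2))
```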