Analysis with Python and SQL: CO2 emission evaluation and bicycle market analysis (Bankole Moses)

CO2 EMISSION EVALUATION AND BICYCLE MARKET ANALYSIS

# Import the pandas and numpy packages
import pandas as pd
import numpy as np

# Load the data
cars = pd.read_csv('data/co2_emissions_canada.csv')

# create numpy arrays
cars_makes = cars['Make'].to_numpy()
cars_models = cars['Model'].to_numpy()
cars_classes = cars['Vehicle Class'].to_numpy()
cars_engine_sizes = cars['Engine Size(L)'].to_numpy()
cars_cylinders = cars['Cylinders'].to_numpy()
cars_transmissions = cars['Transmission'].to_numpy()
cars_fuel_types = cars['Fuel Type'].to_numpy()
cars_fuel_consumption = cars['Fuel Consumption Comb (L/100 km)'].to_numpy()
cars_co2_emissions = cars['CO2 Emissions(g/km)'].to_numpy()

# Preview the dataframe
cars

First, let's check whether our data is clean and do some exploratory data analysis.

# Refer to the cars DataFrame as df (this is an alias, not a copy)
df = cars
df.head()
df.nunique()
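
Note that `df = cars` only gives the same DataFrame a second name. The sketch below illustrates the difference between an alias and an independent copy made with `.copy()`. The two-row table and the names `cars_demo`, `alias`, and `independent` are hypothetical stand-ins, not part of the dataset.

```python
import pandas as pd

# Hypothetical two-row stand-in for the cars table (not the real dataset)
cars_demo = pd.DataFrame({"Make": ["ACURA", "BMW"], "Cylinders": [4, 6]})

alias = cars_demo               # same object under a second name
independent = cars_demo.copy()  # separate object with its own data

alias.loc[0, "Cylinders"] = 8         # also changes cars_demo
independent.loc[1, "Cylinders"] = 12  # leaves cars_demo untouched

print(alias is cars_demo)
print(cars_demo["Cylinders"].tolist())
print(independent["Cylinders"].tolist())
```

If the analysis later modifies `df` and the original `cars` should stay unchanged, `df = cars.copy()` is probably the safer choice.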

From our summary, the DataFrame contains:

  • 41 distinct makes
  • 2053 distinct models
  • 16 distinct vehicle classes
  • 51 distinct engine sizes (L)
  • 27 distinct transmissions
  • 5 distinct fuel types
#checking for missing data in our dataset
missing_values = df.isnull()
for column in missing_values.columns.values.tolist():
    print(column)
    print(missing_values[column].value_counts())
    print("")

A value of True marks a missing entry; False marks a value that is present.

Our data has no missing values, so there is less cleaning work to do on this dataset.
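
A more compact way to run the same check is to sum the boolean mask per column, since True counts as 1. The sketch below uses a small hypothetical sample (`sample`) with one deliberately missing value so the result is visible; on the real data, `df.isnull().sum()` would be the equivalent call.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one deliberately missing value
# (the real dataset has none)
sample = pd.DataFrame({
    "Make": ["ACURA", "BMW", "FORD"],
    "Engine Size(L)": [2.0, np.nan, 3.5],
})

# Count missing entries per column, then overall
missing_per_column = sample.isnull().sum()
total_missing = int(missing_per_column.sum())

print(missing_per_column)
print("Total missing values:", total_missing)
```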

df.columns
df["Fuel Type"].value_counts()

From our summary, we have:

  • 3637 cars with fuel type X
  • 3202 cars with fuel type Z
  • 370 cars with fuel type E
  • 175 cars with fuel type D
  • 1 car with fuel type N
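
The counts above are easier to compare as shares of the whole. As a rough sketch, the block below rebuilds the counts by hand (the names `fuel_counts` and `fuel_share_pct` are illustrative) and converts them to percentages. On the real DataFrame, `df["Fuel Type"].value_counts(normalize=True)` should give the same proportions directly.

```python
import pandas as pd

# Counts copied from the summary above
fuel_counts = pd.Series({"X": 3637, "Z": 3202, "E": 370, "D": 175, "N": 1})

# Share of each fuel type as a percentage of all cars
fuel_share_pct = (fuel_counts / fuel_counts.sum() * 100).round(2)

print("Total cars:", fuel_counts.sum())
print(fuel_share_pct)
```
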
# Check the data types
df.dtypes
df.describe()

To find the median engine size:

engine_size = df[["Engine Size(L)"]]