
    Exploring CO2 Emissions and the Bike Market: A Data Analysis

    By: Ángel Daniel Gil Contreras

    Image source: Bing Image Creator

    Before starting this project...

    To all my readers:

    I would like to express my sincere gratitude to all those who have taken the time to read this report. Your interest and support are greatly appreciated. I put a lot of effort into creating this report, and I hope that it will be well received. I believe that the insights and recommendations contained within will be valuable and informative. Thank you again for your support and for taking the time to read this report.

    Reducing Vehicle Emissions in Canada: A Data-Driven Approach

    First, some declarations:

    As a starting point, I believe it is important to make the code and analysis accessible to those new to the field of Data Science. To that end, I will rely on commonly used libraries and functions to make the project more approachable and easier to understand.

    Please note that while some basic knowledge of statistics is required, I will provide clear explanations of any more advanced statistical concepts that may be used in the project, whenever necessary.

    This report is being presented to the leaders of the organization, so it is important to be clear, concise, and simple in the analysis, explanations, and results. In addition, I will make some comments by breaking the fourth wall and addressing the audience directly.

    • Normal writing: I am fulfilling my role as a volunteer.
    • Italicized writing: comments to the audience and answers to the required questions (breaking the fourth wall).

    Note: All images without a source were designed by me in

    According to the Government of Canada's 2019 Inventory of Greenhouse Gas Emissions, the transportation sector was the second largest source of greenhouse gas emissions in Canada in 2017, accounting for 26% of total emissions. Within the transportation sector, the largest source of emissions was from on-road vehicles (including cars, trucks, buses, and motorcycles), which accounted for 78% of transportation sector emissions.

    Here is the link to the full report:


    For a public policy advocacy organization in Canada, it is important to understand the CO2 emissions of vehicles in the country. This information can be used to develop guidelines and policies that reduce emissions and protect the environment. To help with this effort, I analyzed 7 years of CO2 emissions data for Canadian vehicles as follows.

    Analysis Path

    Note: I will treat CO2 Emissions as the dependent variable throughout the analysis; the other features are the explanatory variables. The approach is therefore to see how the other variables affect this one.

    1. Clean and prepare the data:

    • Check for missing values and decide how to handle them (e.g. drop rows with missing values, impute missing values).
    • Make sure all the variables are in the correct format (e.g., the "Engine Size" variable is in numeric format).
    • Check for any inconsistencies or errors in the data (e.g. negative values for CO2 emissions).
    • Check how well each category is represented in the dataset.
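
    A minimal sketch of these checks, using only the Python standard library. The rows below are hypothetical values made up for illustration (only the column names come from the dataset description), and dropping invalid rows is just one of the handling options listed above.

```python
from collections import Counter

# Hypothetical sample rows; column names follow the dataset description,
# but the values are invented for illustration only.
rows = [
    {"Make": "ACURA", "Engine Size(L)": "2.0", "Fuel Type": "Z", "CO2 Emissions(g/km)": "196"},
    {"Make": "FORD", "Engine Size(L)": "", "Fuel Type": "X", "CO2 Emissions(g/km)": "255"},
    {"Make": "BMW", "Engine Size(L)": "3.0", "Fuel Type": "Z", "CO2 Emissions(g/km)": "-10"},
    {"Make": "FORD", "Engine Size(L)": "3.5", "Fuel Type": "X", "CO2 Emissions(g/km)": "280"},
]

NUMERIC_COLUMNS = ["Engine Size(L)", "CO2 Emissions(g/km)"]


def clean(rows):
    """Drop rows with missing or invalid numeric values; convert the rest to float."""
    cleaned = []
    for row in rows:
        try:
            converted = {col: float(row[col]) for col in NUMERIC_COLUMNS}
        except ValueError:
            continue  # missing or non-numeric value: drop the row
        if converted["CO2 Emissions(g/km)"] < 0:
            continue  # negative emissions are not physically possible
        cleaned.append({**row, **converted})
    return cleaned


cleaned_rows = clean(rows)

# Category representation: how many rows each fuel type contributes.
fuel_counts = Counter(row["Fuel Type"] for row in cleaned_rows)
print(len(cleaned_rows), dict(fuel_counts))
```

    Imputing missing values instead of dropping rows would be an equally valid choice; the sketch drops them only because it is the simplest option to show.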

    2. Exploratory Analysis:

    • Calculate summary statistics such as mean, median, minimum, and maximum values for CO2 emissions.
    • Create visualizations to explore the distribution of CO2 emissions and identify potential outliers.
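
    As a rough illustration of this step, the sketch below computes the summary statistics and flags potential outliers with the 1.5 × IQR rule. The emission values are hypothetical, and the IQR rule is my assumption here; it is one common convention, not the only way to define an outlier.

```python
import statistics

# Hypothetical CO2 emission values in g/km, invented for illustration.
co2 = [196, 221, 136, 255, 244, 230, 232, 522]

summary = {
    "mean": statistics.mean(co2),
    "median": statistics.median(co2),
    "min": min(co2),
    "max": max(co2),
}

# Quartiles, then the conventional 1.5 * IQR fences for outlier detection.
q1, _, q3 = statistics.quantiles(co2, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [value for value in co2 if value < lower_fence or value > upper_fence]

print(summary)
print(outliers)
```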

    3. Correlation Analysis

    • Explore the relationship between numeric variables.
    • Analyze the effects of categorical variables on CO2 emissions. This includes identifying patterns and trends between the categories and the dependent variable.
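
    The two bullets above can be sketched as follows: a Pearson correlation for a pair of numeric variables, and a per-category mean of the dependent variable for a categorical one. The helper names `pearson` and `mean_by_category` are hypothetical, and the data is made up for illustration.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mean_by_category(categories, values):
    """Average of `values` within each category."""
    groups = defaultdict(list)
    for category, value in zip(categories, values):
        groups[category].append(value)
    return {category: sum(vs) / len(vs) for category, vs in groups.items()}


# Hypothetical data, invented for illustration.
engine_size = [1.5, 2.0, 2.4, 3.5, 5.0]
co2 = [150, 196, 221, 255, 350]
fuel_type = ["X", "Z", "X", "Z", "Z"]

r = pearson(engine_size, co2)
means = mean_by_category(fuel_type, co2)
print(round(r, 3), means)
```

    A correlation close to 1 only indicates a strong linear association; it does not by itself show that one variable causes the other.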

    4. Multivariable Analysis

    • Build a multivariable model to see how CO2 emissions can be explained by a set of different numeric and categorical variables.
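
    To make the idea concrete, here is a minimal sketch of one possible multivariable model: ordinary least squares with one numeric variable and one dummy-coded categorical variable. The choice of OLS is my assumption for illustration, the helper names are hypothetical, and the data is synthetic (generated from known coefficients so the fit can be checked).

```python
def one_hot(value, levels):
    """Dummy-code a category, dropping the first level as the baseline."""
    return [1.0 if value == level else 0.0 for level in levels[1:]]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_ols(features, targets):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + row for row in features]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic data: CO2 = 100 + 40 * engine size + 20 * (fuel type is Z).
engine_size = [1.5, 2.0, 2.4, 3.5, 5.0, 3.0]
fuel_type = ["X", "Z", "X", "Z", "Z", "X"]
levels = ["X", "Z"]  # "X" is the baseline level

features = [[size] + one_hot(fuel, levels) for size, fuel in zip(engine_size, fuel_type)]
co2 = [100 + 40 * size + 20 * premium for size, premium in features]

coefficients = fit_ols(features, co2)
print([round(c, 3) for c in coefficients])
```

    Each coefficient reads as the expected change in CO2 emissions for a one-unit change in that variable with the others held fixed; for the dummy variable, it is the difference from the baseline category.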

    5. Conclusion and Recommendations

    1. Clean and prepare the data:

    a. The data

    I had access to CO2 emissions data for Canadian vehicles (source), which has the following variables:

    • "Make" - The company that manufactures the vehicle.
    • "Model" - The vehicle's model.
    • "Vehicle Class" - Vehicle class by utility, capacity, and weight.
    • "Engine Size(L)" - The engine's displacement in liters.
    • "Cylinders" - The number of cylinders.
    • "Transmission" - The transmission type, followed by the number of gears:
        • A = Automatic,
        • AM = Automated manual,
        • AS = Automatic with select shift,
        • AV = Continuously variable,
        • M = Manual,
        • 3 - 10 = the number of gears.
    • "Fuel Type" - The fuel type:
        • X = Regular gasoline,
        • Z = Premium gasoline,
        • D = Diesel,
        • E = Ethanol (E85),
        • N = Natural gas.
    • "Fuel Consumption Comb (L/100 km)" - Combined city/highway (55%/45%) fuel consumption in liters per 100 km (L/100 km).
    • "CO2 Emissions(g/km)" - The tailpipe carbon dioxide emissions in grams per kilometer for combined city and highway driving.
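
    Because the "Transmission" variable combines a type code with a gear count, it can help to split it before analysis. The sketch below assumes values shaped like "AS6" or "AV" (a type code optionally followed by the number of gears); the function name `parse_transmission` is hypothetical.

```python
import re

TRANSMISSION_TYPES = {
    "A": "Automatic",
    "AM": "Automated manual",
    "AS": "Automatic with select shift",
    "AV": "Continuously variable",
    "M": "Manual",
}

# Two-letter codes come first so that "AS6" is not read as "A" plus leftovers.
_PATTERN = re.compile(r"(AM|AS|AV|A|M)(\d+)?")


def parse_transmission(code):
    """Split a code such as 'AS6' into (type description, gears or None)."""
    match = _PATTERN.fullmatch(code.strip().upper())
    if match is None:
        raise ValueError(f"Unrecognized transmission code: {code!r}")
    kind, gears = match.groups()
    return TRANSMISSION_TYPES[kind], int(gears) if gears else None


print(parse_transmission("AS6"))
print(parse_transmission("AV"))
```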

    The data comes from the Government of Canada's open data website.