Climate Change and Impacts in Africa

    According to the United Nations, climate change refers to long-term shifts in temperatures and weather patterns. Such shifts can be natural, due to changes in the sun’s activity or large volcanic eruptions. But since the 1800s, human activities have been the main driver of climate change, primarily due to the burning of fossil fuels like coal, oil, and gas.

    The consequences of climate change now include, among others, intense droughts, water scarcity, severe fires, rising sea levels, flooding, melting polar ice, catastrophic storms, and declining biodiversity.

    You work for a non-governmental organization tasked with reporting the state of climate change in Africa at the upcoming African Union Summit. The head of analytics has provided you with the IEA-EDGAR CO2 dataset, which you will clean, combine, and analyze to create a report on the state of climate change in Africa. You will also provide insights on the impact of climate change on African regions (with four countries, one from each African region, as case studies).


    The dataset, IEA-EDGAR CO2, is a component of the EDGAR (Emissions Database for Global Atmospheric Research) Community GHG database version 7.0 (2022), including or based on data from IEA (2021) Greenhouse Gas Emissions from Energy, as modified by the Joint Research Centre. The data source was the EDGARv7.0_GHG website provided by Crippa et al. (2022) and with DOI.

    The dataset contains three sheets (IPCC 2006, IPCC 1996, and TOTALS BY COUNTRY) on the amount of CO2 (a greenhouse gas) generated by countries between 1970 and 2021. You can download the dataset from your workspace or inspect the dataset directly here.


    TOTALS BY COUNTRY

    This sheet contains the annual CO2 (kt) produced in each country between 1970 and 2021. The relevant columns in this sheet are:

    C_group_IM24_sh: The region of the world
    Country_code_A3: The country code
    Name: The name of the country
    Y_1970 - Y_2021: The amount of CO2 (kt) in each year from 1970 to 2021

    IPCC 2006

    These sheets contain the amount of CO2 by country and the industry responsible.

    C_group_IM24_sh: The region of the world
    Country_code_A3: The country code
    Name: The name of the country
    Y_1970 - Y_2021: The amount of CO2 (kt) in each year from 1970 to 2021
    ipcc_code_2006_for_standard_report_name: The industry responsible for generating CO2

    # Setup
    import pandas as pd
    import numpy as np
    import pingouin
    from sklearn.linear_model import LinearRegression
    from statsmodels.regression.linear_model import OLS
    import seaborn as sns
    import matplotlib.pyplot as plt
    import inspect

    plt.style.use('ggplot')
    # The sheet names containing our datasets
    sheet_names = ['IPCC 2006', 'TOTALS BY COUNTRY']
    # The column headers are on row 11 of each sheet,
    # so skip the first 10 rows
    datasets = pd.read_excel('IEA_EDGAR_CO2_1970-2021.xlsx', sheet_name=sheet_names, skiprows=10)
    # we need only the African regions
    african_regions = ['Eastern_Africa', 'Western_Africa', 'Southern_Africa', 'Northern_Africa']
    ipcc_2006_africa = datasets['IPCC 2006'].query('C_group_IM24_sh in @african_regions')
    totals_by_country_africa = datasets['TOTALS BY COUNTRY'].query('C_group_IM24_sh in @african_regions')
    # Read the temperatures dataset covering four African countries,
    # one from each African region:
    # Nigeria:    Western Africa
    # Ethiopia:   Eastern Africa
    # Tunisia:    Northern Africa
    # Mozambique: Southern Africa
    temperatures = pd.read_csv('temperatures.csv')
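
    The region filter above relies on `DataFrame.query` resolving `@african_regions` to the local Python list. As a minimal sketch on a made-up three-row frame (the values and the `Not_Africa` label are invented for illustration), the same filter can be written with a boolean mask and `isin`, and both forms select the same rows:

```python
import pandas as pd

# Toy stand-in for one sheet; all values here are invented for illustration.
toy = pd.DataFrame({
    "C_group_IM24_sh": ["Eastern_Africa", "Not_Africa", "Western_Africa"],
    "Name": ["Ethiopia", "Elsewhere", "Nigeria"],
    "Y_2021": [1.0, 2.0, 3.0],
})
african_regions = ["Eastern_Africa", "Western_Africa", "Southern_Africa", "Northern_Africa"]

# Inside the query string, `@african_regions` refers to the Python variable.
via_query = toy.query("C_group_IM24_sh in @african_regions")

# Equivalent boolean-mask form.
via_isin = toy[toy["C_group_IM24_sh"].isin(african_regions)]

print(via_query["Name"].tolist())
```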

    1 - Clean and tidy the datasets

    ipcc_2006_africa = ipcc_2006_africa.rename(columns={'C_group_IM24_sh': 'Region', 'Country_code_A3': 'Code','ipcc_code_2006_for_standard_report_name': 'Industry'})
    totals_by_country_africa = totals_by_country_africa.rename(columns={'C_group_IM24_sh':'Region','Country_code_A3': 'Code'})
    # drop columns
    # (the original column list is cut off after these two names)
    ipcc_2006_africa = ipcc_2006_africa.drop(['IPCC_annex', 'ipcc_code_2006_for_standard_report'], axis=1)
    totals_by_country_africa = totals_by_country_africa.drop(['IPCC_annex', 'Substance'], axis=1)
    # Melt and clean Year column
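    def melt_clean(df):
        # year columns (Y_1970 ... Y_2021) become values; everything else identifies a row
        value_vars = [col for col in df.columns if col.startswith('Y_')]
        id_vars = [col for col in df.columns if col not in value_vars]
        # melt: one row per (id columns, year)
        long = df.melt(id_vars=id_vars, value_vars=value_vars,
                       var_name='Year', value_name='CO2')
        # drop rows where CO2 is missing
        long = long.dropna(subset=['CO2'])
        # convert year labels such as 'Y_1970' to integers
        long = long.assign(Year=long['Year'].str.replace('Y_', '', regex=False).astype(int))
        return long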
    ipcc_2006_africa = melt_clean(ipcc_2006_africa)
    totals_by_country_africa = melt_clean(totals_by_country_africa)
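
    To make the wide-to-long reshaping in this step concrete, here is a small sketch on an invented two-country frame. The column names `Year` and `CO2` match the ones used in the later steps; the numbers are placeholders, not values from the dataset.

```python
import pandas as pd

# Invented wide-format data: one column per year, one missing value.
wide = pd.DataFrame({
    "Region": ["Western_Africa", "Eastern_Africa"],
    "Name": ["Nigeria", "Ethiopia"],
    "Y_1970": [10.0, None],
    "Y_1971": [12.0, 3.0],
})

year_cols = [c for c in wide.columns if c.startswith("Y_")]
id_cols = [c for c in wide.columns if c not in year_cols]

# One row per (country, year) instead of one column per year.
long = wide.melt(id_vars=id_cols, value_vars=year_cols,
                 var_name="Year", value_name="CO2")
# Drop the missing observation and turn 'Y_1970' into the integer 1970.
long = long.dropna(subset=["CO2"])
long = long.assign(Year=long["Year"].str.replace("Y_", "", regex=False).astype(int))

print(long)
```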

    2 - Show the trend of CO2 levels across the African regions

    sns.lineplot(data=totals_by_country_africa, x='Year', y='CO2', hue='Region', ci=None)
    plt.ylabel('CO2 (kt)')
    plt.title('CO2 levels across the African Regions between 1970 and 2021')

    3 - Determine the relationship between time (Year) and CO2 levels across the African regions

    relationship_btw_time_CO2 = totals_by_country_africa.groupby('Region')[['Year', 'CO2']].corr(method='spearman')
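
    The grouped `corr` call returns a full 2x2 correlation matrix per region, so the Year-vs-CO2 coefficient still has to be picked out of it. A minimal sketch of one way to do that, using invented data for two placeholder regions `A` and `B`:

```python
import pandas as pd

# Invented data: CO2 rises monotonically in region A and falls in region B.
toy = pd.DataFrame({
    "Region": ["A"] * 4 + ["B"] * 4,
    "Year": [2000, 2001, 2002, 2003] * 2,
    "CO2": [1.0, 2.0, 4.0, 8.0, 9.0, 7.0, 5.0, 1.0],
})

corr = toy.groupby("Region")[["Year", "CO2"]].corr(method="spearman")

# Keep only the 'Year' row of each per-region matrix, then its 'CO2' column.
year_vs_co2 = corr.xs("Year", level=1)["CO2"]
print(year_vs_co2)
```

    A Spearman coefficient near +1 indicates that CO2 increases monotonically with time in that region, and a value near -1 indicates a monotonic decrease.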

    4 - Determine if there is a significant difference in the CO2 levels among the African Regions

    aov_results = pingouin.anova(data=totals_by_country_africa, dv='CO2', between='Region')
    pw_ttest_result = pingouin.pairwise_tests(data=totals_by_country_africa, dv='CO2',
                                              between='Region', padjust='bonf').round(3)
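
    For intuition about what the one-way ANOVA above tests, the F statistic can be computed by hand as the ratio of between-group to within-group variance. This is only an illustrative sketch with plain pandas on invented numbers for three placeholder groups; it is not how pingouin is implemented.

```python
import pandas as pd

# Invented data: three groups of three observations each.
toy = pd.DataFrame({
    "Region": ["A"] * 3 + ["B"] * 3 + ["C"] * 3,
    "CO2": [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
})

groups = toy.groupby("Region")["CO2"]
k = groups.ngroups            # number of groups
n_total = len(toy)            # total number of observations
grand_mean = toy["CO2"].mean()

# Between-group sum of squares: group sizes times squared mean offsets.
ss_between = (groups.size() * (groups.mean() - grand_mean) ** 2).sum()
# Within-group sum of squares: squared deviations from each group's own mean.
ss_within = ((toy["CO2"] - groups.transform("mean")) ** 2).sum()

f_stat = (ss_between / (k - 1)) / (ss_within / (n_total - k))
print(round(f_stat, 3))  # 13.0
```

    A large F value means the regional means differ by more than the spread inside each region would suggest; the pairwise tests with Bonferroni adjustment then indicate which pairs of regions differ.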

    5 - Determine the most common (top 5) industries in each African region.

    # Group the data by Region and Industry and count the occurrences
    ipcc_grouped = ipcc_2006_africa.groupby(['Region', 'Industry']).size().reset_index(name='Count')
    # Sort the data within each region group by Count in descending order
    ipcc_grouped = ipcc_grouped.sort_values(['Region', 'Count'], ascending=[True, False])
    # Get the top 5 industries for each region
    top_5_industries = ipcc_grouped.groupby('Region').head(5).reset_index(drop=True)
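
    The sort-then-`head` pattern above can be checked on a small invented frame. The sketch keeps only the top industry per region (rather than the top five) so the result is easy to verify by eye; the industry names are placeholders.

```python
import pandas as pd

# Invented records: each row is one (region, industry) observation.
toy = pd.DataFrame({
    "Region": ["A", "A", "A", "A", "B", "B", "B"],
    "Industry": ["Power", "Power", "Cement", "Transport",
                 "Transport", "Transport", "Power"],
})

counts = toy.groupby(["Region", "Industry"]).size().reset_index(name="Count")
counts = counts.sort_values(["Region", "Count"], ascending=[True, False])

# After sorting, head(n) keeps the n most frequent industries in each region.
top_1 = counts.groupby("Region").head(1).reset_index(drop=True)
print(top_1)
```

    Because `size()` counts rows, `Count` on the melted data is the number of records with a reported value for that industry in the region. It measures how often an industry appears, not how much it emits.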

    6 - Determine the industry responsible for the largest amount of CO2 (on average) in each African region

    # Emissions per group
    group_emissions = ipcc_2006_africa.groupby(['Region', 'Industry'])['CO2'].mean().reset_index()
    # Industry with the maximum mean CO2 emission in each region
    max_co2_industries = group_emissions.loc[group_emissions.groupby('Region')['CO2'].idxmax()].reset_index(drop=True)
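
    The `idxmax` lookup above returns, for each region, the row label of the largest mean, which `.loc` then uses to pull the matching rows. A minimal sketch on invented numbers with placeholder industry names:

```python
import pandas as pd

# Invented emissions records.
toy = pd.DataFrame({
    "Region": ["A", "A", "A", "B", "B"],
    "Industry": ["Power", "Power", "Cement", "Transport", "Power"],
    "CO2": [10.0, 30.0, 5.0, 8.0, 2.0],
})

mean_co2 = toy.groupby(["Region", "Industry"])["CO2"].mean().reset_index()

# idxmax gives one row label per region; .loc selects those rows.
top = mean_co2.loc[mean_co2.groupby("Region")["CO2"].idxmax()].reset_index(drop=True)
print(top)
```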

    7 - Predict the CO2 levels (at each African region) in the year 2025
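
    The code for this step is not included above. As a hedged sketch of one possible approach, a straight-line trend can be fitted per region and extrapolated to 2025. The setup imports scikit-learn's `LinearRegression`, but this sketch swaps in `numpy.polyfit` to stay small; the linear-trend assumption, the helper `predict_linear`, and the toy numbers are all illustrative and not taken from the original analysis.

```python
import numpy as np
import pandas as pd

# Invented data: region A grows by 2 kt per year, region B shrinks by 1 kt per year.
years = np.arange(2000, 2005)
toy = pd.DataFrame({
    "Region": ["A"] * 5 + ["B"] * 5,
    "Year": np.tile(years, 2),
    "CO2": np.concatenate([10 + 2.0 * (years - 2000),
                           100 - 1.0 * (years - 2000)]),
})

def predict_linear(group, target_year):
    """Fit CO2 = slope * year + intercept and evaluate it at target_year.

    Hypothetical helper: years are centred on their mean before fitting
    to keep the least-squares problem well conditioned.
    """
    x0 = group["Year"].mean()
    slope, intercept = np.polyfit(group["Year"] - x0, group["CO2"], deg=1)
    return slope * (target_year - x0) + intercept

predictions = {region: predict_linear(g, 2025)
               for region, g in toy.groupby("Region")}
print(predictions)
```

    On the real data the same helper could be applied to `totals_by_country_africa.groupby('Region')`, with the caveat that a straight line fitted to pooled country-level rows extrapolates the average country trend for the region rather than the regional total.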