Climate Change and Impacts in Africa
According to the United Nations, climate change refers to long-term shifts in temperatures and weather patterns. Such shifts can be natural, due to changes in the sun’s activity or large volcanic eruptions. But since the 1800s, human activities have been the main driver of climate change, primarily due to the burning of fossil fuels like coal, oil, and gas.
The consequences of climate change now include, among others, intense droughts, water scarcity, severe fires, rising sea levels, flooding, melting polar ice, catastrophic storms, and declining biodiversity.
You work for a non-governmental organization tasked with reporting on the state of climate change in Africa at the upcoming African Union Summit. The head of analytics has provided you with the IEA-EDGAR CO2 dataset, which you will clean, combine, and analyze to create a report on the state of climate change in Africa. You will also provide insights on the impact of climate change on African regions (with four countries, one from each African region, as case studies).
Dataset
The dataset, IEA-EDGAR CO2, is a component of the EDGAR (Emissions Database for Global Atmospheric Research) Community GHG database version 7.0 (2022), including or based on data from IEA (2021) Greenhouse Gas Emissions from Energy, www.iea.org/statistics, as modified by the Joint Research Centre. The data source was the EDGARv7.0_GHG website provided by Crippa et al. (2022) and with DOI.
The dataset contains three sheets (IPCC 2006, IPCC 1996, and TOTALS BY COUNTRY) on the amount of CO2 (a greenhouse gas) generated by countries between 1970 and 2021. You can download the dataset from your workspace or inspect the dataset directly here.
TOTALS BY COUNTRY SHEET
This sheet contains the annual CO2 (kt) produced between 1970 - 2021 in each country. The relevant columns in this sheet are:
| Columns | Description |
|---|---|
| C_group_IM24_sh | The region of the world |
| Country_code_A3 | The country code |
| Name | The name of the country |
| Y_1970 - Y_2021 | The amount of CO2 (kt) from 1970 - 2021 |
IPCC 2006
These sheets contain the amount of CO2 by country and the industry responsible.
| Columns | Description |
|---|---|
| C_group_IM24_sh | The region of the world |
| Country_code_A3 | The country code |
| Name | The name of the country |
| Y_1970 - Y_2021 | The amount of CO2 (kt) from 1970 - 2021 |
| ipcc_code_2006_for_standard_report_name | The industry responsible for generating CO2 |
Instructions
The head of analytics in your organization has specifically asked you to do the following:
- Clean and tidy the datasets.
- Create a line plot to show the trend of CO2 levels across the African regions.
- Determine the relationship between time (Year) and CO2 levels across the African regions.
- Determine if there is a significant difference in the CO2 levels among the African Regions.
- Determine the most common (top 5) industries in each African region.
- Determine the industry responsible for the most amount of CO2 (on average) in each African Region.
- Predict the CO2 levels (at each African region) in the year 2025.
- Determine if CO2 levels affect annual temperature in the selected African countries.
IMPORTANT
- Make a copy of this workspace.
- Write your code within the cells provided for you. Each of those cells contains the comment "#Your code here".
- Next, run the cells containing the checks. We've asked you not to modify these cells. To pass a check, make sure you create the variables mentioned in the instruction tasks. They (the variables) will be verified for correctness; if the cell outputs nothing, your solution passes; otherwise the cell will throw an error. We included messages to help you fix these errors.
- If you're stuck (even after reviewing related DataCamp courses), then uncomment and run the cell which contains the source code of the solution. For example, print(inspect.getsource(solutions.solution_one)) will display the solution for instruction 1. We advise you to only look at the solution to your current problem.
- Note that workspaces created inside the "I4G 23/24" group are always private to the group and cannot be made public.
- If after completion you want to showcase your work on your DataCamp portfolio, use "File > Make a copy" to copy over the workspace to your personal account. Then make it public so it shows up on your DataCamp portfolio.
- We hope you enjoy working on this project as we enjoyed creating it. Cheers!
# Setup
import pandas as pd
import numpy as np
import pingouin
from sklearn.linear_model import LinearRegression
from statsmodels.regression.linear_model import OLS
import seaborn as sns
import matplotlib.pyplot as plt
plt.style.use('ggplot')
# The sheet names containing our datasets
sheet_names = ['IPCC 2006', 'TOTALS BY COUNTRY']
# The column names of the dataset start at row 11,
# so skip the first 10 rows
datasets = pd.read_excel('IEA_EDGAR_CO2_1970-2021.xlsx', sheet_name = sheet_names, skiprows = 10)
# we need only the African regions
african_regions = ['Eastern_Africa', 'Western_Africa', 'Southern_Africa', 'Northern_Africa']
ipcc_2006_africa = datasets['IPCC 2006'].query('C_group_IM24_sh in @african_regions')
totals_by_country_africa = datasets['TOTALS BY COUNTRY'].query('C_group_IM24_sh in @african_regions')
# Read the temperatures datasets containing four African countries
# One from each African Region:
# Nigeria: West Africa
# Ethiopia: East Africa
# Tunisia: North Africa
# Mozambique: Southern Africa
temperatures = pd.read_csv('temperatures.csv')
1: Clean and tidy the datasets
- Rename C_group_IM24_sh to Region, Country_code_A3 to Code, and ipcc_code_2006_for_standard_report_name to Industry in the corresponding African datasets.
- Drop IPCC_annex, ipcc_code_2006_for_standard_report, and Substance from the corresponding datasets.
- Melt Y_1970 to Y_2021 into two columns, Year and CO2. Drop rows where CO2 is missing.
- Convert Year to int type.
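The cleaning steps listed above can be tried on a toy table first. This is a minimal sketch, assuming a hypothetical two-country frame with made-up values and only two year columns; only the column names are taken from the dataset description, and the variable names `wide` and `tidy` are illustrative.

```python
import pandas as pd

# Hypothetical wide-format frame that mimics the layout of the sheets.
# The CO2 values are made up; one is missing on purpose.
wide = pd.DataFrame({
    'C_group_IM24_sh': ['Eastern_Africa', 'Western_Africa'],
    'Country_code_A3': ['ETH', 'NGA'],
    'Name': ['Ethiopia', 'Nigeria'],
    'Y_1970': [1.0, 2.0],
    'Y_1971': [None, 3.0],
})

tidy = (
    wide
    # rename the cryptic column names
    .rename(columns={'C_group_IM24_sh': 'Region', 'Country_code_A3': 'Code'})
    # every column that is not an id column becomes a (Year, CO2) pair
    .melt(id_vars=['Region', 'Code', 'Name'], var_name='Year', value_name='CO2')
    # drop rows where CO2 is missing
    .dropna(subset=['CO2'])
    # strip the 'Y_' prefix and convert Year to an integer
    .assign(Year=lambda d: d['Year'].str.replace('Y_', '', regex=False).astype(int))
    .reset_index(drop=True)
)
print(tidy)
```

Chaining the steps returns a new frame at each stage, which avoids modifying a filtered slice in place.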
#check state of datasets
print(ipcc_2006_africa.head(2))
print(totals_by_country_africa.head(2))
# Your code here (for the learner)
#renaming the required columns
#dictionaries of columns to rename in each dataset
columns_rename = {
'C_group_IM24_sh':'Region',
'Country_code_A3':'Code',
'ipcc_code_2006_for_standard_report_name':'Industry'
}
columns_rename2 = {
'C_group_IM24_sh':'Region',
'Country_code_A3':'Code'
}
#rename columns (reassign instead of inplace: the frames are filtered slices, and inplace edits can trigger SettingWithCopyWarning)
ipcc_2006_africa = ipcc_2006_africa.rename(columns = columns_rename)
totals_by_country_africa = totals_by_country_africa.rename(columns = columns_rename2)
#drop required columns
ipcc_2006_africa = ipcc_2006_africa.drop(columns = ['IPCC_annex','ipcc_code_2006_for_standard_report','Substance'])
totals_by_country_africa = totals_by_country_africa.drop(columns = ['IPCC_annex','Substance'])
#confirm changes
print(ipcc_2006_africa.columns)
print(totals_by_country_africa.columns)
#melt the yearly columns Y_1970 ... Y_2021 into two columns, Year and CO2
year_columns = [f'Y_{year}' for year in range(1970, 2022)]
ipcc_2006_africa = pd.melt(ipcc_2006_africa, id_vars=['Region','Code','Name','Industry','fossil_bio'], value_vars=year_columns, var_name='Year', value_name='CO2')
totals_by_country_africa = pd.melt(totals_by_country_africa, id_vars=['Region','Code','Name'], value_vars=year_columns, var_name='Year', value_name='CO2')
#drop rows where CO2 is missing
ipcc_2006_africa.dropna(subset=['CO2'], inplace=True)
totals_by_country_africa.dropna(subset=['CO2'],inplace=True)
#strip the non-digit 'Y_' prefix from the Year values to allow conversion to integer
#(raw string so that '\D' is not treated as an invalid escape sequence)
ipcc_2006_africa['Year'] = ipcc_2006_africa['Year'].str.replace(r'\D','',regex=True)
totals_by_country_africa['Year'] = totals_by_country_africa['Year'].str.replace(r'\D','',regex=True)
#convert Year column to datatype of integer
ipcc_2006_africa['Year'] = ipcc_2006_africa['Year'].astype('int')
totals_by_country_africa['Year'] = totals_by_country_africa['Year'].astype('int')
#confirm changes
print(ipcc_2006_africa['Year'].dtype)
print(totals_by_country_africa['Year'].dtype)
#checkpoint
ipcc_2006_africa1 = ipcc_2006_africa.copy()
totals_by_country_africa1 = totals_by_country_africa.copy()
2: Trend of CO2 levels across the African regions
# Your code here
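#errorbar=None replaces the deprecated ci=None (seaborn >= 0.12; keep ci=None on older versions)
co2_year_plot = sns.lineplot(data=totals_by_country_africa, x='Year',y='CO2',hue='Region', errorbar=None)
co2_year_plot.set_xlabel('Year')
co2_year_plot.set_ylabel('CO2 (kt)')
co2_year_plot.set_title('CO2 levels across the African Regions between 1970 and 2021', fontdict ={ 'weight':'bold'})
CO2_levels_across_Africa = co2_year_plot
#save before showing: plt.show() may clear the current figure, which would leave an empty image on disk
plt.savefig('CO2_levels_across_Africa')
plt.show()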
3: Determine the relationship between time (Year) and CO2 levels across the African regions
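Spearman correlation measures how consistently one variable rises or falls with another, using ranks rather than raw values. The following is a minimal sketch on a hypothetical two-region table with made-up numbers (the regions `A` and `B` and the name `toy` are illustrative only), showing how a per-region correlation between Year and CO2 can be read off a grouped result.

```python
import pandas as pd

# Hypothetical data: CO2 rises every year in region A and falls in region B
toy = pd.DataFrame({
    'Region': ['A'] * 4 + ['B'] * 4,
    'Year': [2000, 2001, 2002, 2003] * 2,
    'CO2': [1.0, 2.0, 4.0, 8.0, 9.0, 7.0, 4.0, 1.0],
})

# One 2x2 correlation matrix per region, stacked under a (Region, column) index
corr = toy.groupby('Region')[['Year', 'CO2']].corr(method='spearman')

# Pick the Year-vs-CO2 entry for each region
rho_a = corr.loc[('A', 'Year'), 'CO2']
rho_b = corr.loc[('B', 'Year'), 'CO2']
print(rho_a, rho_b)
```

A value near 1 indicates CO2 that increases steadily with time, a value near -1 a steady decrease, and a value near 0 no monotonic trend.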
# Your code here
#group by Region and subset Year and CO2 columns
#(a list of columns needs double brackets; tuple-style selection on a groupby fails in pandas 2.x)
relationship_btw_time_CO2 = totals_by_country_africa.groupby('Region')[['Year','CO2']].corr(method='spearman')
#view result
relationship_btw_time_CO2
4: Determine if there is a significant difference in the CO2 levels among the African Regions
# Your code here
totals_by_country_africa.head(3)
#check difference in level of CO2 between the regions
aov_results = pingouin.anova(totals_by_country_africa, dv='CO2', between='Region')
pw_ttest_result = pingouin.pairwise_tests(totals_by_country_africa, dv='CO2',between='Region',padjust='bonf')
#view result
print('anova test: ', aov_results)
print('pairwise_test: ')
pw_ttest_result
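To make the ANOVA output easier to interpret, the F statistic of a one-way ANOVA can be reproduced by hand as the ratio of between-group to within-group variance. This is a minimal sketch on a hypothetical three-region table with made-up values; it uses plain pandas rather than pingouin and is not meant to replace the test above.

```python
import pandas as pd

# Hypothetical data: three regions with three made-up observations each
toy = pd.DataFrame({
    'Region': ['A'] * 3 + ['B'] * 3 + ['C'] * 3,
    'CO2': [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
})

grand_mean = toy['CO2'].mean()
groups = toy.groupby('Region')['CO2']

# Between-group sum of squares: group size times squared distance of each group mean from the grand mean
ss_between = (groups.size() * (groups.mean() - grand_mean) ** 2).sum()
# Within-group sum of squares: squared distance of each observation from its own group mean
ss_within = ((toy['CO2'] - groups.transform('mean')) ** 2).sum()

df_between = groups.ngroups - 1          # k - 1
df_within = len(toy) - groups.ngroups    # N - k

f_stat = (ss_between / df_between) / (ss_within / df_within)
print(f_stat)
```

A large F means the region means differ by more than the spread within regions would explain. The pairwise tests then indicate which pairs of regions differ, and the Bonferroni adjustment (padjust='bonf') multiplies each p-value by the number of comparisons, capped at 1.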