World Economic Indicators (copy)
Objective 1
Which countries have experienced the highest growth in population and GDP? Is there overlap?
# Restrict the data to 1990-2021; .copy() avoids SettingWithCopyWarning when columns are added later
year_to_access = wb[(wb['Year'] >= 1990) & (wb['Year'] <= 2021)].copy()
# Convert columns to numeric, coercing unparseable values to NaN (dropped further down)
year_to_access['Population density (people per sq. km of land area)'] = pd.to_numeric(
    year_to_access['Population density (people per sq. km of land area)'], errors='coerce'
)
year_to_access['GDP (USD)'] = pd.to_numeric(year_to_access['GDP (USD)'], errors='coerce')
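# pct_change compares each row with the previous one, so rows must be in
# chronological order within each country before computing growth rates
year_to_access = year_to_access.sort_values(['Country Name', 'Year'])
# Calculate YoY growth rates. Population density stands in for population here:
# with a constant land area, its percentage change equals the population's.
year_to_access['Population Growth YoY (%)'] = (
    year_to_access.groupby('Country Name')['Population density (people per sq. km of land area)']
    .pct_change() * 100
)
year_to_access['GDP Growth YoY (%)'] = (
    year_to_access.groupby('Country Name')['GDP (USD)']
    .pct_change() * 100
)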
# Drop rows with NaN values resulting from pct_change (the first year of each country) or from coercion
year_to_access = year_to_access.dropna(subset=['Population Growth YoY (%)', 'GDP Growth YoY (%)'])
# Group by country and calculate the mean of the YoY growth rates
average_growth = year_to_access.groupby('Country Name').agg({
    'Population Growth YoY (%)': 'mean',
    'GDP Growth YoY (%)': 'mean'
}).reset_index()
# Identify Top 10 Countries with Highest Average Population Growth
top_population_growth = average_growth.nlargest(10, 'Population Growth YoY (%)')[['Country Name', 'Population Growth YoY (%)']]
# Identify Top 10 Countries with Highest Average GDP Growth
top_gdp_growth = average_growth.nlargest(10, 'GDP Growth YoY (%)')[['Country Name', 'GDP Growth YoY (%)']]
# Check for overlap
overlap_countries = top_population_growth[top_population_growth['Country Name'].isin(top_gdp_growth['Country Name'])]
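The grouped growth calculation above can be checked on a tiny table. The following is a minimal sketch, assuming made-up values and a toy frame named `toy` (neither comes from the World Bank data); it only illustrates why the rows are sorted by year before calling `pct_change`.

```python
import pandas as pd

# Toy data (invented for illustration), deliberately stored newest-first
toy = pd.DataFrame({
    "Country Name": ["A", "A", "A", "B", "B", "B"],
    "Year": [2002, 2001, 2000, 2002, 2001, 2000],
    "GDP (USD)": [121.0, 110.0, 100.0, 81.0, 90.0, 100.0],
})

# Sort chronologically within each country so pct_change compares
# each year with the year before it, not the year after it
toy = toy.sort_values(["Country Name", "Year"]).reset_index(drop=True)

# Year-over-year growth per country, in percent
toy["GDP Growth YoY (%)"] = toy.groupby("Country Name")["GDP (USD)"].pct_change() * 100

# Average growth per country; the NaN for each first year is skipped by mean()
avg_growth = toy.groupby("Country Name")["GDP Growth YoY (%)"].mean()
print(avg_growth.round(2))
```

Without the sort, country A's shrinking-looking sequence (121, 110, 100) would be reported as negative growth even though its GDP rose every year.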
# Top ten countries by average population growth
plt.figure(figsize=(6, 5))
sns.barplot(data=top_population_growth, x='Population Growth YoY (%)', y='Country Name', palette='viridis')
plt.xlabel('Avg. growth rate (%)')
plt.ylabel('Country Name')
plt.title('Top ten countries by avg. population growth rate (%)')
# Show the underlying table alongside the chart
top_population_growth.reset_index(drop=True)
# Top ten countries by average GDP growth
plt.figure(figsize=(6, 5))
sns.barplot(data=top_gdp_growth, x='GDP Growth YoY (%)', y='Country Name', palette='cividis')
plt.xlabel('Avg. GDP growth rate (%)')
plt.ylabel('Country Name')
plt.title('Top ten countries by avg. YoY GDP growth rate (%)')
# Show the underlying table alongside the chart
top_gdp_growth.reset_index(drop=True)
# Importing the necessary libraries
from matplotlib_venn import venn2
import matplotlib.pyplot as plt
# Plot Venn Diagram for Overlap
plt.figure(figsize=(8, 8))
venn2(
[set(top_population_growth['Country Name']), set(top_gdp_growth['Country Name'])],
set_labels=('Top Population Growth', 'Top GDP Growth')
)
plt.title('Overlap of Top Population and GDP Growth Countries')
plt.show()
From our analysis, the top ten countries by average year-over-year population growth (measured through population density) over 1990-2021 are:
Latvia; Bosnia and Herzegovina; Lithuania; Georgia; Estonia; Bulgaria; Armenia; Romania; Croatia; Greenland
The top ten countries by average year-over-year GDP growth over the same period are:
Iraq; Libya; Serbia; Ukraine; Congo, Dem. Rep.; Curacao; New Caledonia; Northern Mariana Islands; Virgin Islands (U.S.); French Polynesia
There is no overlap between the two lists.
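The no-overlap conclusion can be double-checked without the Venn diagram. This is a small sketch that copies the two lists reported above into plain Python sets; the variable names `top_population` and `top_gdp` are illustrative and not taken from the notebook.

```python
# Country names copied from the two top-ten lists reported above
top_population = {
    "Latvia", "Bosnia and Herzegovina", "Lithuania", "Georgia", "Estonia",
    "Bulgaria", "Armenia", "Romania", "Croatia", "Greenland",
}
top_gdp = {
    "Iraq", "Libya", "Serbia", "Ukraine", "Congo, Dem. Rep.", "Curacao",
    "New Caledonia", "Northern Mariana Islands", "Virgin Islands (U.S.)",
    "French Polynesia",
}

# Set intersection returns the countries that appear in both lists
overlap = top_population & top_gdp
print(sorted(overlap))  # prints []
```

In the notebook itself the same check is `overlap_countries.empty`, which is True when the `isin` filter matches no rows.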
Objective 2
What regions saw the most growth in HDI in the 21st century?
# Keep only the two endpoint years used to measure 21st-century change (2001 and 2021)
twenty_first_cent = hn[(hn['year'] == 2001) | (hn['year'] == 2021)]
# Drop every row whose 'rank' column holds the value 'hdi_rank'
rows_to_drop = twenty_first_cent[twenty_first_cent['rank'] == 'hdi_rank']
twenty_first_cent = twenty_first_cent.drop(rows_to_drop.index)