Gross Domestic Product Data

This dataset consists of the yearly gross domestic product (GDP) in current USD of countries and regions worldwide since the 1960s.

Not sure where to begin? Scroll to the bottom to find challenges!

import pandas as pd

gdp_data = pd.read_csv("gdp_data.csv", index_col=None)

To analyze only the GDP of countries, you can use country_codes.csv to extract these rows from the dataset:

country_codes = pd.read_csv("country_codes.csv", index_col=0)
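One way the filtering step might look is sketched below. It assumes that gdp_data has a "Country Code" column and that the index of country_codes (from index_col=0) holds matching codes; neither is confirmed here, so check the column names in your copy of the files. The small frames stand in for the CSV files and contain placeholder values, not real GDP figures.

```python
import pandas as pd

# Toy stand-ins for gdp_data.csv and country_codes.csv.
# All values are placeholders for illustration, not real GDP figures.
gdp_data = pd.DataFrame({
    "Country Name": ["Colombia", "World", "Colombia", "World"],
    "Country Code": ["COL", "WLD", "COL", "WLD"],  # assumed column name
    "Year": [2015, 2015, 2016, 2016],
    "Value": [1.0, 10.0, 2.0, 20.0],
})

# Assumption: the index holds country codes, as index_col=0 would produce.
country_codes = pd.DataFrame(
    {"name": ["Colombia"]},  # hypothetical column
    index=pd.Index(["COL"], name="code"),
)

# Keep only rows whose code appears in the country list;
# aggregates such as "World" drop out.
is_country = gdp_data["Country Code"].isin(country_codes.index)
countries_only = gdp_data[is_country].reset_index(drop=True)

print(countries_only)
```

With the real files, the two pd.DataFrame constructions would be replaced by the read_csv calls above.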

Source and license of dataset.

Don't know where to start?

Challenges are brief tasks designed to help you practice specific skills:

  • 🗺️ Explore: Try to identify recessions in different countries.
  • 📊 Visualize: Create a plot to visualize the change in GDP in your country over the past decade.
  • 🔎 Analyze: Which country had the highest percentage growth in GDP over the past decade?
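As a starting point for the Explore and Analyze challenges, the sketch below computes year-over-year growth per country and growth between two chosen years. The column names ("Country Name", "Year", "GDP") follow the cells further down; the helper period_growth and the toy values are hypothetical. Note that a year-over-year fall in current-USD GDP is only a rough proxy for a recession, since the figures are not adjusted for inflation or exchange rates.

```python
import pandas as pd

# Placeholder values for illustration only, not real GDP figures.
gdp = pd.DataFrame({
    "Country Name": ["A", "A", "A", "B", "B", "B"],
    "Year": [2014, 2015, 2016, 2014, 2015, 2016],
    "GDP": [100.0, 90.0, 120.0, 200.0, 210.0, 220.0],
})

# Year-over-year growth within each country (first year of each country is NaN).
gdp = gdp.sort_values(["Country Name", "Year"])
previous = gdp.groupby("Country Name")["GDP"].shift()
gdp["Growth %"] = (gdp["GDP"] / previous - 1) * 100

# Years in which GDP fell: candidate recession years.
declines = gdp[gdp["Growth %"] < 0]


def period_growth(df, start, end):
    """Percentage growth in GDP between two years, highest first (hypothetical helper)."""
    wide = df.pivot(index="Country Name", columns="Year", values="GDP")
    return ((wide[end] / wide[start] - 1) * 100).sort_values(ascending=False)


growth = period_growth(gdp, 2014, 2016)

print(declines[["Country Name", "Year", "Growth %"]])
print(growth)
```

For the decade question, period_growth(gdp_data, 2006, 2016) would be the equivalent call on the full dataset, assuming both years are present for every country of interest.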

Scenarios are broader questions to help you develop an end-to-end project for your portfolio:

You have been hired by an NGO to learn about the economic development of countries in Central America: Belize, Costa Rica, El Salvador, Guatemala, Honduras, Nicaragua, and Panama.

Using data from 1960 through 2016, you have been asked to provide a deep dive into the GDP growth of each country by year and by decade. Your manager is also interested in how each country compares to the regional average.

You will need to prepare a report that is accessible to a broad audience. It should outline your motivation, steps, findings, and conclusions.
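A minimal sketch of the calculations behind such a report is shown below, again assuming "Country Name", "Year", and "GDP" columns. It uses placeholder figures for two of the seven countries. The regional average here is taken to be the unweighted mean of the countries' growth rates, which is one possible reading; the growth of the region's summed GDP would be another.

```python
import pandas as pd

# Placeholder figures for illustration only, not real GDP data.
gdp = pd.DataFrame({
    "Country Name": ["Belize"] * 4 + ["Panama"] * 4,
    "Year": [1968, 1969, 1970, 1971] * 2,
    "GDP": [10.0, 11.0, 12.1, 13.31, 100.0, 120.0, 132.0, 145.2],
})

# Yearly growth per country.
gdp = gdp.sort_values(["Country Name", "Year"])
previous = gdp.groupby("Country Name")["GDP"].shift()
gdp["Growth %"] = (gdp["GDP"] / previous - 1) * 100

# Average yearly growth per country and decade (1960, 1970, ...).
gdp["Decade"] = gdp["Year"] // 10 * 10
decade_growth = (
    gdp.groupby(["Country Name", "Decade"])["Growth %"].mean().unstack("Decade")
)

# Assumption: regional average = unweighted mean of country growth rates per year.
regional_avg = gdp.groupby("Year")["Growth %"].mean()
gdp["vs Region"] = gdp["Growth %"] - gdp["Year"].map(regional_avg)

print(decade_growth)
print(gdp[["Country Name", "Year", "Growth %", "vs Region"]])
```

On the real data, the frame would first be filtered to the seven countries, for example with gdp_data["Country Name"].isin([...]).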



print(gdp_data)
gdp_data.rename(columns={'Value':'GDP'}, inplace=True)
US = gdp_data[gdp_data['Country Name'] == 'United States']
philippines = gdp_data[gdp_data['Country Name'] == 'Philippines']
colombia = gdp_data[gdp_data['Country Name'] == 'Colombia']
indonesia = gdp_data[gdp_data['Country Name'] == 'Indonesia']
hong_kong = gdp_data[gdp_data['Country Name'] == 'Hong Kong SAR, China']
print(US.tail())
print(colombia.tail())
print(philippines.tail())
print(indonesia.tail())
print(hong_kong.tail())
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(16,8))

ax.set_xlabel('Year')
ax.set_ylabel('GDP (in 100 billions)')
ax.set_title('GDP between 1960-2016')
ax.plot(colombia['Year'], colombia['GDP'], label='Colombia')
ax.plot(philippines['Year'], philippines['GDP'], label='Philippines')
ax.plot(indonesia['Year'], indonesia['GDP'], label='Indonesia')
ax.plot(hong_kong['Year'], hong_kong['GDP'], label='Hong Kong (SAR)')
ax.legend()
plt.show()
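import seaborn as sns

# sns.relplot is figure-level and ignores ax, so draw each line with the
# axes-level sns.lineplot on one shared Axes and show the figure once.
fig, ax = plt.subplots()
ex_countries = {'United States': US, 'Philippines': philippines, 'Colombia': colombia,
                'Indonesia': indonesia, 'Hong Kong (SAR)': hong_kong}
for name, country in ex_countries.items():
    sns.lineplot(data=country, x='Year', y='GDP', label=name, ax=ax)
ax.set(xlabel='Year', ylabel='GDP (current USD)', title='GDP by country')
plt.show()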