I was watching a video about what makes the U.S. such an economic powerhouse. There were a lot of factors, but one they touched on was the grand expanse of the Mississippi River system and how navigable it is. That made me wonder: what is the relationship between the length of a country's navigable waterways and its economy over the years?

I downloaded two datasets to study this. One is the gdp_land dataset, which contains each country's land area and GDP per capita per year. The other is Water_Ways, which contains the length of navigable waterways per country over the years. The datasets originally came from the UNECE Transport Statistics Database, https://w3.unece.org/PXWeb/en/Table?IndicatorCode=58, and the World Bank database, https://databank.worldbank.org/source/world-development-indicators#.

Because the data on navigable waterways was limited, our sample size is small. According to the data, the ratio of waterway length to land area barely affects the growth of GDP per capita. There is a slight upward trend, and the single outlying country on the right doesn't change this correlation much.
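
To make the comparison concrete, here is a minimal sketch of how a waterway-to-land ratio and its correlation with GDP-per-capita growth could be computed. The toy numbers and the column names `waterway_km`, `gdp_growth`, and `water_ratio` are made up for illustration and are not taken from the real datasets.

```python
import pandas as pd

# Hypothetical toy values, NOT from the real datasets.
toy = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "waterway_km": [100.0, 500.0, 50.0, 2000.0],      # navigable waterway length
    "land": [10_000.0, 20_000.0, 50_000.0, 40_000.0],  # land area in sq. km
    "gdp_growth": [0.02, 0.03, 0.01, 0.025],           # growth of GDP per capita
})

# Kilometres of navigable waterway per square kilometre of land
toy["water_ratio"] = toy["waterway_km"] / toy["land"]

# Pearson correlation between the ratio and GDP-per-capita growth
r = toy["water_ratio"].corr(toy["gdp_growth"])
print(round(r, 3))
```

With only a handful of countries, a single point far to the right can move this number a lot, so it is worth checking the correlation with and without it.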

The number of countries was small, but the data runs from 2005 to 2020, which increases our sample size. Note that in many years the change is zero, which added almost 200 units to the height of the tallest bar. This means the two sides closest to zero are about equal, though the distribution slightly favors growth.
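
As a rough sketch of where those per-year changes could come from, the snippet below computes a year-over-year difference within each country and the share of changes that are exactly zero. It assumes the "change" is the change in waterway length, and the toy values and the column name `waterway_km` are hypothetical.

```python
import pandas as pd

# Hypothetical toy values, NOT from the real datasets.
toy = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B", "B"],
    "year": [2005, 2006, 2007, 2005, 2006, 2007],
    "waterway_km": [100.0, 100.0, 120.0, 500.0, 480.0, 480.0],
})

# Sort so that diff() compares each year with the previous year of the same country
toy = toy.sort_values(["country", "year"])
toy["change"] = toy.groupby("country")["waterway_km"].diff()

# The first year of each country has no previous year, so drop those NaNs
changes = toy["change"].dropna()

# Share of year-over-year changes that are exactly zero
zero_share = (changes == 0).mean()
print(changes.tolist())
print(zero_share)
```

A large zero share like this is what produces one very tall bar at zero in a histogram of the changes.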

Even with the increased number of samples, we still do not see a strong correlation. Even when the room for ships to transport materials changes, that doesn't mean the change is significant enough for the supporting infrastructure to be built or used.

With so little data, we have to say there is no strong correlation. The logic for why navigable waterways matter to countries does make sense. We should get a bigger dataset if possible and investigate more factors, such as the infrastructure along the waterways, the population density near them, and whether a waterway connects to the ocean. These factors can change how useful a waterway is to GDP per capita.

For now, we can't see how navigable waterways change GDP per capita. There are other factors that are better documented and studied that we can use instead.

Here is how I cleaned and analysed the datasets.

First off, we should import some useful packages and load the datasets.

# Importing my most-used packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Loading the Water_Ways dataset
water_ways = pd.read_csv("Water_Ways.csv")

# We only need the CountryName, PeriodCode, and Value columns for our investigation
water_ways = water_ways[["CountryName", "PeriodCode", "Value"]]
print(water_ways.head())
# Loading the gdp_land dataset
gdp_land = pd.read_csv('gdp_land.csv')
print(gdp_land.head())

# This data is in wide format (one column per year), whereas the water_ways data is in long format.
# Melt the year columns into rows so that both datasets have the same shape.
gdp_land = pd.melt(gdp_land, id_vars=['Country Name', "Country Code",
                "Series Name", "Series Code"], var_name='year', value_name='value')
gdp_land = gdp_land[["Country Name", "Series Name", "year", "value"]]
print(gdp_land.head())
# The Series Name column stacks both measures in long format.
# Pivot it wider so each measure gets its own column, which makes the data easier to join.

gdp_land_pivot = gdp_land.pivot(index=['Country Name', 'year'], columns='Series Name', values='value')
gdp_land_pivot = gdp_land_pivot.reset_index()
print(gdp_land_pivot)
print(gdp_land_pivot.columns)
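
To see what the melt and pivot steps do, here is a small self-contained sketch on made-up data. The year headers in the form "2005 [YR2005]" and the series names "gdp" and "land" are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical wide table: one row per country and series, one column per year.
wide = pd.DataFrame({
    "Country Name": ["A", "A", "B", "B"],
    "Series Name": ["gdp", "land", "gdp", "land"],
    "2005 [YR2005]": [1.0, 10.0, 2.0, 20.0],
    "2006 [YR2006]": [1.5, 10.0, 2.5, 20.0],
})

# Step 1: melt the year columns into rows (wide -> long)
long = wide.melt(id_vars=["Country Name", "Series Name"],
                 var_name="year", value_name="value")

# Step 2: pivot the series names into columns (one row per country and year)
tidy = long.pivot(index=["Country Name", "year"],
                  columns="Series Name", values="value").reset_index()
tidy.columns.name = None  # drop the leftover "Series Name" label on the column axis

print(long.shape)
print(tidy)
```

The result has one row per country and year, which is the same grain as the water_ways data.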

Now that we have the datasets loaded, we should start investigating each one. We begin with the gdp_land dataset, as it seems to need more work to fit.

First off, we need the year column to hold a proper numeric year instead of a label string, and we need to rename Country Name and the new columns we just made from Series Name.

# Keeping only the first 4 characters of each year label and storing the year as an integer
gdp_land_pivot['year'] = gdp_land_pivot['year'].str[:4].astype(int)
print(gdp_land_pivot.head())

# Renaming some of the columns
gdp_land2 = gdp_land_pivot.rename(columns={
    "Country Name": "country",
    'GDP per capita, PPP (constant 2017 international $)': "gdp_per_cap",
    'Land area (sq. km)': "land",
})
print(gdp_land2.head())