Competition - Analyzing global internet patterns

Analyzing global internet patterns

📖 Background

In this competition, you'll explore a dataset that tracks internet usage in different countries from 2000 to 2023. Your goal is to import, clean, analyze, and visualize the data in your preferred tool.

The end goal is a clean, self-explanatory, and interactive visualization. Through a thorough analysis, you'll dig into how internet usage has changed over time and which countries are still widely affected by a lack of internet availability.

💾 Data

You have access to the following file, but you can supplement your data with other sources to enrich your analysis.

Internet Usage (`internet_usage.csv`)

Column name	Description
Country Name	Name of the country
Country Code	The country's 3-character code
2000	Contains the % of population of individuals using the internet in 2000
2001	Contains the % of population of individuals using the internet in 2001
2002	Contains the % of population of individuals using the internet in 2002
2003	Contains the % of population of individuals using the internet in 2003
....	...
2023	Contains the % of population of individuals using the internet in 2023

The data can be downloaded from the Files section (File > Show workbook files).

import pandas as pd
internet_usage = pd.read_csv('data/internet_usage.csv')
internet_usage

💪 Challenge

Use a tool of your choice to create an interesting visual or dashboard that summarizes your analysis!

Things to consider:

Use this Workspace to prepare your data (optional).
Stuck on where to start? Here are some ideas to get you started:
- Visualize internet usage over time, by country
- How has internet usage changed over time? Are there any patterns emerging?
- Consider bringing in other data to supplement your analysis
Create a screenshot of your main dashboard / visuals, and paste it into the designated field.
Summarize your findings in an executive summary.
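
Most plotting tools expect one row per country and year rather than one column per year, so the "usage over time, by country" idea usually starts with a reshape. The following is a minimal sketch on made-up data laid out like `internet_usage.csv`; the helper name `to_long` and the output column names `Year` and `Usage` are assumptions for illustration, not part of the competition material.

```python
import pandas as pd

# Toy wide table in the same layout as internet_usage.csv (values are made up).
wide = pd.DataFrame({
    "Country Name": ["Country A", "Country B"],
    "Country Code": ["AAA", "BBB"],
    "2000": [1.0, 5.0],
    "2001": [2.5, 7.5],
})

def to_long(df):
    """Reshape one-column-per-year data to one row per country and year."""
    long_df = df.melt(
        id_vars=["Country Name", "Country Code"],
        var_name="Year",      # hypothetical column name
        value_name="Usage",   # hypothetical column name
    )
    # Year labels arrive as strings (the CSV headers), so convert them to int.
    long_df["Year"] = long_df["Year"].astype(int)
    return long_df.sort_values(["Country Code", "Year"]).reset_index(drop=True)

usage_long = to_long(wide)
```

The long table can then be passed to a line plot with `Year` on the x-axis and one line per `Country Code`.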

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
internet_usage = pd.read_csv("data/internet_usage.csv") 
internet_usage.head(2)


# Create a copy so the raw data stays untouched
internet_usage_clean = internet_usage.copy()

# Replace placeholder strings ('..' and 'O') with NaN
internet_usage_clean = internet_usage_clean.replace('..', np.nan)
internet_usage_clean = internet_usage_clean.replace('O', np.nan)

# Convert the year columns from object to float
internet_usage_clean[internet_usage_clean.columns[2:]] = internet_usage_clean.iloc[:, 2:].astype(float)

# Replace 0 values with NaN (0 and 0.0 compare equal, so one call covers both)
internet_usage_clean = internet_usage_clean.replace(0, np.nan)

# Number of null values per row, showing the 15 rows with the most
internet_usage_clean.isnull().sum(axis=1).nlargest(15)


# Drop rows that have more than 17 null values
rows_to_drop = internet_usage_clean[internet_usage_clean.isnull().sum(axis = 1) > 17].index
internet_usage_clean = internet_usage_clean.drop(rows_to_drop)
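
The cleaning steps above can also be sketched as one reusable function. This variant is an assumption, not the notebook's method: it uses `pd.to_numeric(..., errors="coerce")`, which turns every non-numeric string into NaN rather than only the listed placeholders. The function name `clean_usage`, its `max_missing` parameter, and the sample values are hypothetical.

```python
import numpy as np
import pandas as pd

def clean_usage(df, max_missing=17):
    """Coerce year columns to float, treat 0 as missing, drop sparse rows."""
    out = df.copy()
    year_cols = out.columns[2:]  # assumes the first two columns are name and code
    # Any value that cannot be parsed as a number (e.g. '..') becomes NaN.
    out[year_cols] = out[year_cols].apply(pd.to_numeric, errors="coerce")
    # Treat zeros as missing, mirroring the step in the notebook.
    out[year_cols] = out[year_cols].replace(0, np.nan)
    # Keep rows with at most `max_missing` missing year values.
    keep = out[year_cols].isna().sum(axis=1) <= max_missing
    return out[keep]

# Made-up sample in the same layout as internet_usage.csv.
sample = pd.DataFrame({
    "Country Name": ["Country A", "Country B"],
    "Country Code": ["AAA", "BBB"],
    "2000": ["..", ".."],
    "2001": ["0", ".."],
    "2002": ["12.5", ".."],
})
cleaned = clean_usage(sample, max_missing=2)
```

In this sample, Country B has three missing values and is dropped, while Country A has two and is kept.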

Load Country geo & GDP data

Enrich the data by adding the continent, sub-region, and GDP for each country.

gdp = pd.read_csv('data/gdp_2000_2023.csv')

# Make a copy
gdp_clean = gdp.copy()
# Drop rows that have more than 17 null values
gdp_rows_to_drop = gdp_clean[gdp_clean.isnull().sum(axis = 1) > 17].index
gdp_clean = gdp_clean.drop(gdp_rows_to_drop)

# Fill missing values by linear interpolation along each country's row (across years)
gdp_clean.set_index(['Country Name', 'Country Code'], inplace=True)
gdp_clean = gdp_clean.apply(lambda row: row.interpolate(method='linear'), axis=1)
# Cells with no earlier value in the row are not filled by the interpolation above,
# so fill them with a placeholder of 1M (1e6)
gdp_clean = gdp_clean.fillna(1e6)
gdp_clean.reset_index(inplace=True)
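
To see why a fallback fill is needed after interpolation, here is a small sketch on a made-up row. With pandas defaults, linear interpolation fills gaps in the forward direction only, so a leading gap stays empty while a trailing gap takes the last known value. The second call, with `limit_direction="both"`, is shown only as a possible alternative to a fixed placeholder; it is not what the notebook does.

```python
import numpy as np
import pandas as pd

# Made-up GDP-like row with a leading, an interior, and a trailing gap.
row = pd.Series(
    [np.nan, 10.0, np.nan, 30.0, np.nan],
    index=["2000", "2001", "2002", "2003", "2004"],
)

# Default behavior: the interior gap is interpolated, the trailing gap is
# carried forward, and the leading gap is left as NaN.
filled = row.interpolate(method="linear")

# Alternative (an assumption, not the notebook's choice): also fill leading
# gaps with the first known value.
filled_both = row.interpolate(method="linear", limit_direction="both")
```

A fixed placeholder such as 1e6 puts an arbitrary value in the earliest years, which can distort any year-over-year comparison that includes them.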

country = pd.read_csv('data/country_and_continents.csv')
country.head()


country.info()


merged_df = pd.merge(internet_usage_clean, country, how="left", on=['Country Code'])
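
A left merge silently leaves NaN in the added columns for any country code missing from the lookup table, so it can be worth checking for unmatched rows. The sketch below uses made-up frames; the `Continent` column name is an assumption, since the columns of `country_and_continents.csv` are not shown here.

```python
import pandas as pd

# Made-up stand-ins for internet_usage_clean and country.
usage = pd.DataFrame({
    "Country Code": ["AAA", "BBB", "ZZZ"],
    "2000": [1.0, 2.0, 3.0],
})
lookup = pd.DataFrame({
    "Country Code": ["AAA", "BBB"],
    "Continent": ["X", "Y"],  # hypothetical column name
})

# indicator=True adds a "_merge" column that records where each row matched.
# validate="m:1" raises an error if a code appears more than once in the lookup.
checked = pd.merge(
    usage, lookup, how="left", on="Country Code",
    indicator=True, validate="m:1",
)

# Codes present in the usage data but absent from the lookup table.
unmatched = checked.loc[checked["_merge"] == "left_only", "Country Code"].tolist()
```

Any codes listed in `unmatched` would need a manual mapping or an extra data source before grouping by continent.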

