How Much of the World Has Access to the Internet?
Author
Samvel Kocharyan, [email protected]
https://www.linkedin.com/in/samvelkoch/
2022
💾 The data
The research team compiled the following tables (source):
internet
- "Entity" - The name of the country, region, or group.
- "Code" - Unique id for the country (null for other entities).
- "Year" - Year from 1990 to 2019.
- "Internet_Usage" - The share of the entity's population who have used the internet in the last three months.
people
- "Entity" - The name of the country, region, or group.
- "Code" - Unique id for the country (null for other entities).
- "Year" - Year from 1990 to 2020.
- "Users" - The number of people who have used the internet in the last three months for that country, region, or group.
broadband
- "Entity" - The name of the country, region, or group.
- "Code" - Unique id for the country (null for other entities).
- "Year" - Year from 1998 to 2020.
- "Broadband_Subscriptions" - The number of fixed subscriptions to high-speed internet at downstream speeds >= 256 kbit/s for that country, region, or group.
Acknowledgments: Max Roser, Hannah Ritchie, and Esteban Ortiz-Ospina (2015) - "Internet." OurWorldInData.org.
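Since every table mixes countries with regions and groups, it can help to separate the two before ranking anything. Here is a minimal sketch, assuming "Code" really is null for every non-country row as described above. The rows and values below are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Toy rows that mimic the documented schema; all values are invented.
sample = pd.DataFrame({
    "Entity": ["Denmark", "World", "Qatar", "Europe"],
    "Code": ["DNK", None, "QAT", None],
    "Year": [2019, 2019, 2019, 2019],
    "Internet_Usage": [98.0, 50.0, 99.0, 80.0],
})

# Rows with a Code are countries; rows without one are regions or groups.
countries = sample[sample["Code"].notna()]
aggregates = sample[sample["Code"].isna()]

print(countries["Entity"].tolist())
print(aggregates["Entity"].tolist())
```

If the real files encode aggregates differently, the `notna()` filter would need to be adjusted accordingly.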
# Import the libraries used in this analysis
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
# Read the internet table
internet = pd.read_csv('data/internet.csv')
1. What are the top 5 countries with the highest internet use (by population share)?
# Let's explore the internet table with summary statistics
internet.describe()
# Keep only the 2019 rows (the latest year in the table), sort by usage, and show the top 5
top5_internet_2019 = internet[internet['Year'] == 2019]
top5_internet_2019_sorted = top5_internet_2019.sort_values(by=['Internet_Usage'], ascending=False)
top5_internet_2019_sorted.head(5)
Top 5 countries with the highest internet use (by population share in 2019)
- 🇧🇭 Bahrain
- 🇶🇦 Qatar
- 🇰🇼 Kuwait
- 🇦🇪 UAE
- 🇩🇰 Denmark
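The same "filter one year, then take the largest values" step can be wrapped in a small helper. This is only a sketch: `top_n_by_year` is a hypothetical name, and the toy frame below uses invented values rather than the real table.

```python
import pandas as pd

def top_n_by_year(df, year, n=5, value_col="Internet_Usage"):
    """Return the n rows with the largest value_col for the given year."""
    in_year = df[df["Year"] == year]
    # nlargest sorts in descending order and keeps the first n rows.
    return in_year.nlargest(n, value_col)[["Entity", value_col]]

# Invented values, only to show the call.
toy = pd.DataFrame({
    "Entity": ["A", "B", "C", "A", "B", "C"],
    "Year": [2018, 2018, 2018, 2019, 2019, 2019],
    "Internet_Usage": [10.0, 20.0, 30.0, 40.0, 60.0, 50.0],
})

result = top_n_by_year(toy, 2019, n=2)
print(result)
```

On the real data the call would look like `top_n_by_year(internet, 2019)`, assuming the column names match those used in the cells above.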
# Over the whole period covered by the dataset (1990-2019),
# the top 5 by internet usage is expected to look quite different.
# Rank by the median of Internet_Usage rather than the mean or sum,
# so that outliers do not distort the ranking.
# String names replace np.sum / np.mean / np.median, which newer pandas versions warn about.
top5_internet = (
    internet.groupby('Entity')['Internet_Usage']
    .agg(['sum', 'mean', 'median'])
    .sort_values(by='median', ascending=False)
    .head(6)
)
top5_internet
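To see why the median is less sensitive to outliers than the mean, here is a small sketch on invented numbers (not from the dataset). Entity "A" has mostly low values and a single spike, while "B" is steady.

```python
import pandas as pd

# Invented values: A has one large outlier, B grows steadily.
toy = pd.DataFrame({
    "Entity": ["A"] * 5 + ["B"] * 5,
    "Internet_Usage": [1.0, 1.0, 2.0, 2.0, 94.0,
                       10.0, 12.0, 14.0, 16.0, 18.0],
})

stats = toy.groupby("Entity")["Internet_Usage"].agg(["mean", "median"])
print(stats)
# A: mean 20.0, median 2.0
# B: mean 14.0, median 14.0
```

Ranked by mean, "A" comes first because of its single spike; ranked by median, "B" comes first. One consequence worth keeping in mind is that a median over all years favors entities whose usage was high for at least half of their recorded years.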
# Hmm... Kosovo is at the top, but it became independent only in 2008.
# That makes it an unfair entry in this ranking.
# Let's check the year of Kosovo's first data point
# in the 'internet' table.
kosovo = internet[internet['Entity'] == 'Kosovo']
kosovo
# OK. Kosovo's data starts in 2017, so it does not fit a top-5 ranking
# that aggregates Internet_Usage from 1990 onward.
# Let's set Kosovo aside for now and explore the next leading countries.
leaders = ['Iceland', 'Sweden', 'Denmark', 'Norway', 'Netherlands']
internet[(internet['Entity'].isin(leaders)) & (internet['Year'] == 1990)].head(6)
# Iceland, the leader of the ranking, has no data for 1990,
# but in 1991 its internet usage was 0.5%.
# It stayed near the top for the next 28 years. Fair enough for a leader.
# First year with data for Iceland:
internet[internet['Code'] == 'ISL']['Year'].agg('min')
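Rather than checking one entity at a time, the first year and the number of observations can be computed for all entities in one pass. A minimal sketch on a toy frame: the Iceland 1991 value and the Kosovo 2017 start year follow the cells above, and every other number is invented.

```python
import pandas as pd

toy = pd.DataFrame({
    "Entity": ["Iceland", "Iceland", "Iceland", "Kosovo", "Kosovo"],
    "Year": [1991, 1992, 1993, 2017, 2018],
    # Only Iceland's 0.5 in 1991 comes from the text; the rest is made up.
    "Internet_Usage": [0.5, 1.5, 2.7, 89.0, 89.4],
})

# Named aggregation: one output column per (name=function) pair.
coverage = toy.groupby("Entity")["Year"].agg(
    first_year="min", last_year="max", n_years="count"
)
print(coverage)
```

Running the same aggregation on `internet` would show at a glance which entities have short histories.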
# Median usage per entity, excluding Kosovo. 'Entity' is the group key, so only Internet_Usage is aggregated.
top5_internet_leaders = internet[internet['Entity'] != 'Kosovo'].groupby('Entity')[['Internet_Usage']].median().sort_values(by='Internet_Usage', ascending=False).head(5)
top5_internet_leaders
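Excluding Kosovo by name works here, but a more general option might be to require a minimum number of years of data before an entity can enter the ranking. The sketch below is one possible approach, not the method used above: `median_ranking` and its `min_years` threshold are hypothetical, and the toy values are invented.

```python
import pandas as pd

def median_ranking(df, min_years=10, n=5):
    """Rank entities by median Internet_Usage, skipping short series."""
    grouped = df.groupby("Entity")["Internet_Usage"]
    summary = grouped.agg(median="median", n_years="count")
    # Keep only entities with at least min_years non-null observations.
    eligible = summary[summary["n_years"] >= min_years]
    return eligible.sort_values("median", ascending=False).head(n)

# Invented values: "Short" has high usage but only two years of data.
toy = pd.DataFrame({
    "Entity": ["Long"] * 4 + ["Short"] * 2,
    "Year": [2000, 2001, 2002, 2003, 2002, 2003],
    "Internet_Usage": [10.0, 20.0, 30.0, 40.0, 90.0, 95.0],
})

ranking = median_ranking(toy, min_years=3)
print(ranking)
```

The right threshold is a judgment call. A value that is too high would also drop countries that simply started reporting late.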