Competition - XP Competition 2022
How Much of the World Has Access to the Internet?
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
# Read the data
broadband = pd.read_csv('data/broadband.csv')
# Read the internet table
internet = pd.read_csv('data/internet.csv')
# Read the people table
people = pd.read_csv('data/people.csv')
# Take a look at the first rows
broadband
💾 The data
The research team compiled the following tables (source):
internet
- "Entity" - The name of the country, region, or group.
- "Code" - Unique id for the country (null for other entities).
- "Year" - Year from 1990 to 2019.
- "Internet_usage" - The share of the entity's population who have used the internet in the last three months.
people
- "Entity" - The name of the country, region, or group.
- "Code" - Unique id for the country (null for other entities).
- "Year" - Year from 1990 to 2020.
- "Users" - The number of people who have used the internet in the last three months for that country, region, or group.
broadband
- "Entity" - The name of the country, region, or group.
- "Code" - Unique id for the country (null for other entities).
- "Year" - Year from 1998 to 2020.
- "Broadband_Subscriptions" - The number of fixed subscriptions to high-speed internet at downstream speeds >= 256 kbit/s for that country, region, or group.
Acknowledgments: Max Roser, Hannah Ritchie, and Esteban Ortiz-Ospina (2015) - "Internet." OurWorldInData.org.
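Before answering the questions, it can help to confirm that each table matches the description above. The following is a minimal sketch, not part of the original notebook: `internet_demo` holds made-up rows standing in for `data/internet.csv`, and `describe_table` is a hypothetical helper. It assumes the usage column is spelled `Internet_Usage`, as in the code further down.

```python
import pandas as pd

# Made-up rows standing in for data/internet.csv (values are illustrative only).
internet_demo = pd.DataFrame({
    "Entity": ["Denmark", "Denmark", "North America", "North America"],
    "Code": ["DNK", "DNK", None, None],  # null Code marks a region or group
    "Year": [2018, 2019, 2018, 2019],
    "Internet_Usage": [97.0, 98.0, 88.0, 89.0],
})

def describe_table(df):
    """Hypothetical helper: year span plus counts of aggregate rows and countries."""
    return {
        "first_year": int(df["Year"].min()),
        "last_year": int(df["Year"].max()),
        "aggregate_rows": int(df["Code"].isna().sum()),
        "countries": int(df.loc[df["Code"].notna(), "Entity"].nunique()),
    }

summary = describe_table(internet_demo)
print(summary)
```

Calling the same helper on `internet`, `people`, and `broadband` should show whether the year ranges and the null `Code` rows agree with the table descriptions.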
💪 Challenge
Create a report to answer the principal's questions. Include:
- What are the top 5 countries with the highest internet use (by population share)?
- How many people had internet access in those countries in 2019?
- What are the top 5 countries with the highest internet use for each of the following regions: 'Middle East & North Africa', 'Latin America & Caribbean', 'East Asia & Pacific', 'South Asia', 'North America', 'Europe & Central Asia'?
- Create a visualization for those regions' internet usage over time.
- What are the 5 countries with the most internet users?
- What is the correlation between internet usage (population share) and broadband subscriptions for 2019?
- Summarize your findings.
Note: This is how the World Bank defines the different regions.
Question 1: What are the top 5 countries with the highest internet use (by population share)?
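# Sort by year, then by usage share, both descending, so the latest year's highest shares come first
internet_sorted = internet.sort_values(by=['Year', 'Internet_Usage'], ascending=False)
# Keep the first five rows: the top 5 entities in the most recent year
internet_top_5 = internet_sorted.iloc[:5, :]
top_5_list = list(internet_top_5['Entity'])
print(internet_top_5)
print(top_5_list)
The top 5 countries with the highest internet usage share were Bahrain, Qatar, Kuwait, the United Arab Emirates, and Denmark. This ranking covers only the most recent year in the data, 2019.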
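Because the table also contains regions and groups (rows with a null "Code"), a slightly more explicit variant is to filter to the latest year and to rows with a country code before ranking. This is a sketch on made-up rows, not the notebook's actual data; `internet_demo` and the `*_demo` names are hypothetical.

```python
import pandas as pd

# Made-up rows standing in for data/internet.csv (values are illustrative only).
internet_demo = pd.DataFrame({
    "Entity": ["A", "B", "C", "D", "E", "F", "Region X", "A"],
    "Code": ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF", None, "AAA"],
    "Year": [2019, 2019, 2019, 2019, 2019, 2019, 2019, 2018],
    "Internet_Usage": [99.0, 95.0, 97.0, 60.0, 98.0, 96.0, 99.5, 99.9],
})

# Restrict to the most recent year and to rows that have a country code,
# so regions and other aggregates cannot appear in the ranking.
latest_year = internet_demo["Year"].max()
is_latest = internet_demo["Year"] == latest_year
is_country = internet_demo["Code"].notna()
countries_latest = internet_demo[is_latest & is_country]

# nlargest returns the rows with the five highest usage shares, in descending order.
top_5_demo = countries_latest.nlargest(5, "Internet_Usage")
top_5_demo_list = top_5_demo["Entity"].tolist()
print(top_5_demo_list)
```

One caveat with this approach: a country that has no row for the latest year is left out of the ranking entirely.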
Question 2: How many people had internet access in those countries in 2019?
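# Keep only the 2019 rows of the people table
people_w_access_2019 = people[people['Year']==2019]
# Restrict to the top 5 countries from Question 1, ordered by number of users
people_w_access_2019_top_5 = people_w_access_2019[people_w_access_2019['Entity'].isin(top_5_list)].sort_values(by=['Users'], ascending=False)
print(people_w_access_2019_top_5)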
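The challenge also asks for the correlation between internet usage (population share) and broadband subscriptions for 2019. One possible approach, sketched here on made-up rows rather than the real tables, is to filter both tables to 2019, join them on the shared key columns, and compute a Pearson correlation. The `*_demo` names are hypothetical, and the column spelling `Internet_Usage` is assumed from the code above.

```python
import pandas as pd

# Made-up rows standing in for data/internet.csv and data/broadband.csv.
usage_demo = pd.DataFrame({
    "Entity": ["A", "B", "C", "D", "A"],
    "Code": ["AAA", "BBB", "CCC", "DDD", "AAA"],
    "Year": [2019, 2019, 2019, 2019, 2018],
    "Internet_Usage": [10.0, 20.0, 30.0, 40.0, 5.0],
})
broadband_demo = pd.DataFrame({
    "Entity": ["A", "B", "C", "D"],
    "Code": ["AAA", "BBB", "CCC", "DDD"],
    "Year": [2019, 2019, 2019, 2019],
    "Broadband_Subscriptions": [100, 200, 300, 400],
})

# Keep 2019 only, then match rows that describe the same entity and year.
usage_2019 = usage_demo[usage_demo["Year"] == 2019]
broadband_2019 = broadband_demo[broadband_demo["Year"] == 2019]
merged = usage_2019.merge(broadband_2019, on=["Entity", "Code", "Year"], how="inner")

# Series.corr defaults to the Pearson correlation coefficient.
corr_2019 = merged["Internet_Usage"].corr(merged["Broadband_Subscriptions"])
print(round(corr_2019, 3))
```

Since "Broadband_Subscriptions" is a raw count while usage is a share, population size can dominate the result on real data; passing `method="spearman"` to `corr` is one alternative worth comparing.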
🧑⚖️ Judging criteria
| CATEGORY | WEIGHTING | DETAILS |
|---|---|---|
| Response quality | 85% | |
| Presentation | 15% | |
In the event of a tie, earlier submission time will be used as a tie-breaker.
✅ Checklist before submitting your workspace
- Rename your workspace to make it descriptive of your work. N.B., you should leave the notebook name as notebook.ipynb.
- Remove redundant cells like the introduction to data science notebooks, so the workbook is focused on your story.
- Check that all the cells run without error.