Competition - XP Competition 2022

    How Much of the World Has Access to the Internet?

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    # Read the broadband table
    broadband = pd.read_csv('data/broadband.csv')
    # Read the internet table
    internet = pd.read_csv('data/internet.csv')
    # Read the people table
    people = pd.read_csv('data/people.csv')
    
    # Take a look at the first rows
    broadband.head()

    💾 The data

    The research team compiled the following tables (source):
    internet
    • "Entity" - The name of the country, region, or group.
    • "Code" - Unique id for the country (null for other entities).
    • "Year" - Year from 1990 to 2019.
    • "Internet_Usage" - The share of the entity's population who have used the internet in the last three months.
    people
    • "Entity" - The name of the country, region, or group.
    • "Code" - Unique id for the country (null for other entities).
    • "Year" - Year from 1990 to 2020.
    • "Users" - The number of people who have used the internet in the last three months for that country, region, or group.
    broadband
    • "Entity" - The name of the country, region, or group.
    • "Code" - Unique id for the country (null for other entities).
    • "Year" - Year from 1998 to 2020.
    • "Broadband_Subscriptions" - The number of fixed subscriptions to high-speed internet at downstream speeds >= 256 kbit/s for that country, region, or group.

    Acknowledgments: Max Roser, Hannah Ritchie, and Esteban Ortiz-Ospina (2015) - "Internet." OurWorldInData.org.

    💪 Challenge

    Create a report to answer the principal's questions. Include:

    1. What are the top 5 countries with the highest internet use (by population share)?
    2. How many people had internet access in those countries in 2019?
    3. What are the top 5 countries with the highest internet use for each of the following regions: 'Middle East & North Africa', 'Latin America & Caribbean', 'East Asia & Pacific', 'South Asia', 'North America', 'Europe & Central Asia'?
    4. Create a visualization for those regions' internet usage over time.
    5. What are the 5 countries with the most internet users?
    6. What is the correlation between internet usage (population share) and broadband subscriptions for 2019?
    7. Summarize your findings.

    Note: This is how the World Bank defines the different regions.
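Question 6 asks for a correlation between two columns that live in different tables, so the tables have to be aligned on entity and year first. The following is a minimal sketch of one way to do that, not the notebook author's method: it uses small made-up frames (`toy_usage`, `toy_broadband`) in place of the real CSV files, and assumes the column names "Internet_Usage" and "Broadband_Subscriptions".

```python
import pandas as pd

# Toy stand-ins for the 2019 rows of the internet and broadband tables.
# All values are invented for illustration only.
toy_usage = pd.DataFrame({
    "Entity": ["Country A", "Country B", "Country C", "Country D"],
    "Year": [2019, 2019, 2019, 2019],
    "Internet_Usage": [20.0, 40.0, 60.0, 80.0],
})
toy_broadband = pd.DataFrame({
    "Entity": ["Country A", "Country B", "Country C", "Country D"],
    "Year": [2019, 2019, 2019, 2019],
    "Broadband_Subscriptions": [100, 200, 300, 400],
})

# An inner join keeps only entities present in both tables for that year.
merged_2019 = toy_usage.merge(toy_broadband, on=["Entity", "Year"], how="inner")

# Pearson correlation between usage share and subscription count.
corr_2019 = merged_2019["Internet_Usage"].corr(
    merged_2019["Broadband_Subscriptions"]
)
print(round(corr_2019, 3))
```

Note that the usage column is a population share while the subscription column is an absolute count, so the result may be easier to interpret if subscriptions are first converted to a per-capita figure.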

    Question 1: What are the top 5 countries with the highest internet use (by population share)?

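    # Sort so the most recent year comes first, then by usage share (both descending)
    internet_sorted = internet.sort_values(by=['Year', 'Internet_Usage'], ascending=False)
    # The first five rows are the highest usage shares in the latest year
    internet_top_5 = internet_sorted.head(5)
    top_5_list = list(internet_top_5['Entity'])
    print(internet_top_5)
    print(top_5_list)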

    The top 5 countries with the highest internet usage share were Bahrain, Qatar, Kuwait, the United Arab Emirates, and Denmark. This ranking covers only the most recent year in the data, 2019.
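The `internet` table mixes countries with regions and groups, so a ranking of countries may be safer if the aggregate rows are dropped first. A minimal sketch, assuming (as the data description states) that "Code" is null for non-country entities; the helper name `top_countries` and the toy frame are hypothetical and stand in for data/internet.csv:

```python
import pandas as pd

# Toy stand-in for data/internet.csv; values are invented for illustration only.
toy_internet = pd.DataFrame({
    "Entity": ["Country A", "Country B", "Region X", "Country C", "Country A"],
    "Code": ["AAA", "BBB", None, "CCC", "AAA"],
    "Year": [2019, 2019, 2019, 2019, 2018],
    "Internet_Usage": [90.0, 95.0, 99.0, 80.0, 85.0],
})

def top_countries(df, year, n=5):
    """Return the n countries with the highest usage share in a given year."""
    countries = df[df["Code"].notna()]              # drop regions and groups
    in_year = countries[countries["Year"] == year]  # keep a single year
    return in_year.nlargest(n, "Internet_Usage")

top_2 = top_countries(toy_internet, year=2019, n=2)
print(top_2[["Entity", "Internet_Usage"]])
```

Filtering on an explicit year, rather than relying on sort order, also makes it clear which year the ranking describes.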

    Question 2: How many people had internet access in those countries in 2019?

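    # Keep only the 2019 rows of the people table
    people_w_access_2019 = people[people['Year'] == 2019]
    # Restrict to the top 5 countries from Question 1, largest user count first
    people_w_access_2019_top_5 = people_w_access_2019[people_w_access_2019['Entity'].isin(top_5_list)].sort_values(by=['Users'], ascending=False)
    print(people_w_access_2019_top_5)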
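To read the user counts side by side with the usage shares, the two tables can be joined on "Entity" and "Year". This is a sketch under stated assumptions rather than part of the original notebook: the frames `toy_internet` and `toy_people` are made up and only mimic the column layout of the real tables.

```python
import pandas as pd

# Toy stand-ins for the internet and people tables.
# All values are invented for illustration only.
toy_internet = pd.DataFrame({
    "Entity": ["Country A", "Country B", "Country C"],
    "Year": [2019, 2019, 2019],
    "Internet_Usage": [99.0, 98.0, 97.0],
})
toy_people = pd.DataFrame({
    "Entity": ["Country A", "Country B", "Country C", "Country A"],
    "Year": [2019, 2019, 2019, 2018],
    "Users": [1_000, 5_000, 3_000, 900],
})

# A left join keeps every row of the usage table and attaches the user count
# for the same entity and year; rows without a match would get NaN in "Users".
share_and_users = (
    toy_internet.merge(toy_people, on=["Entity", "Year"], how="left")
    .sort_values("Users", ascending=False)
    .reset_index(drop=True)
)
print(share_and_users)
```

Joining on both keys avoids accidentally pairing a 2019 share with a user count from another year.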
    

    🧑‍⚖️ Judging criteria

    Category: Response quality (weighting: 85%)
    • Accuracy (30%) - The response must be representative of the original data and free from errors.
    • Clarity (25%) - The response must be easy to understand and clearly expressed.
    • Completeness (30%) - The response must be a full report that responds to the question posed.
    Category: Presentation (weighting: 15%)
    • How legible/understandable the response is.
    • How well-formatted the response is.
    • Spelling and grammar.

    In the event of a tie, earlier submission time will be used as a tie-breaker.

    ✅ Checklist before submitting your workspace

    • Rename your workspace to make it descriptive of your work. N.B., you should leave the notebook name as notebook.ipynb.
    • Remove redundant cells like the introduction to data science notebooks, so the workbook is focused on your story.
    • Check that all the cells run without error.