
Analyzing global internet patterns

📖 Background

In this competition, you'll explore a dataset that records internet usage for different countries from 2000 to 2023. Your goal is to import, clean, analyze, and visualize the data in your preferred tool.

The end goal is a clean, self-explanatory, and interactive visualization. Through a thorough analysis, you'll dig into how internet usage has changed over time and which countries are still widely affected by a lack of internet availability.

💾 Data

You have access to the following file, but you can supplement your data with other sources to enrich your analysis.

Internet Usage (internet_usage.csv)

Column name | Description
--- | ---
Country Name | Name of the country
Country Code | The country's 3-character code
2000 | Percentage of the population using the internet in 2000
2001 | Percentage of the population using the internet in 2001
2002 | Percentage of the population using the internet in 2002
2003 | Percentage of the population using the internet in 2003
... | ...
2023 | Percentage of the population using the internet in 2023

The data can be downloaded from the Files section (File > Show workbook files).

💪 Challenge

Use a tool of your choice to create an interesting visual or dashboard that summarizes your analysis!

Things to consider:

  1. Use this Workspace to prepare your data (optional).
  2. Stuck on where to start? Here are some ideas to get you started:
    • Visualize internet usage over time, by country
    • How has internet usage changed over time? Are any patterns emerging?
    • Consider bringing in other data to supplement your analysis
  3. Create a screenshot of your main dashboard / visuals, and paste in the designated field.
  4. Summarize your findings in an executive summary.
import pandas as pd
data = pd.read_csv("data/internet_usage.csv") 
data.head(10)

data.isna().sum()
# Reshape from wide (one column per year) to long (one row per country and year)
df = data.melt(id_vars=['Country Name', 'Country Code'],
               var_name='Year', value_name='Consumption')
df.tail(10)
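# Treat the ".." placeholder as missing rather than 0, so gaps are not counted as 0% usage
df["Consumption"] = df["Consumption"].replace("..", float("nan"))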
df
df.info()
# Convert 'Year' to integer
df['Year'] = pd.to_numeric(df['Year'], errors='coerce')

# Convert 'Consumption' to numeric (if possible), handling non-numeric values gracefully
df['Consumption'] = pd.to_numeric(df['Consumption'], errors='coerce')
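How the `..` placeholders are handled affects every average computed later. The sketch below uses made-up numbers (not values from internet_usage.csv) to compare treating a placeholder as 0 with treating it as missing. The `toy` frame and its country labels are hypothetical and exist only for illustration.

```python
import pandas as pd

# Made-up values for illustration; ".." marks a year with no reported figure.
toy = pd.DataFrame({
    "Country Name": ["A", "B", "C"],
    "Consumption": ["10", "..", "50"],
})

# Option 1: treat ".." as 0, which counts the gap as "nobody online".
as_zero = pd.to_numeric(toy["Consumption"].replace("..", "0"))

# Option 2: coerce ".." to NaN so that mean() skips it.
as_missing = pd.to_numeric(toy["Consumption"], errors="coerce")

mean_with_zero = as_zero.mean()        # (10 + 0 + 50) / 3
mean_with_missing = as_missing.mean()  # (10 + 50) / 2
print(mean_with_zero, mean_with_missing)
```

With many gaps in the early years, the first option would pull yearly averages toward zero, so it is worth deciding deliberately which behavior the analysis needs.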
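# Save the long-format table; index=False avoids writing the row index as an extra column
df.to_csv("internetSeries.csv", index=False)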
df.describe()
df["Year"].nunique()
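Beyond `describe()` and `nunique()`, a few explicit checks can confirm that the long table looks as expected: years inside 2000 to 2023 and values that are valid percentages. This is a minimal sketch; `check_long_table` is a hypothetical helper, and the sample frame holds made-up numbers rather than rows from the dataset.

```python
import pandas as pd

def check_long_table(long_df, first_year=2000, last_year=2023):
    """Return a dict of simple sanity checks on a long-format usage table."""
    years = long_df["Year"].dropna()
    values = long_df["Consumption"].dropna()
    return {
        "n_years": int(years.nunique()),
        "years_in_range": bool(years.between(first_year, last_year).all()),
        "values_are_percentages": bool(values.between(0, 100).all()),
        "n_missing": int(long_df["Consumption"].isna().sum()),
    }

# Made-up sample with the same column names as the melted table.
sample = pd.DataFrame({
    "Country Name": ["A", "A", "B", "B"],
    "Year": [2000, 2023, 2000, 2023],
    "Consumption": [1.5, 80.0, None, 42.0],
})
report = check_long_table(sample)
print(report)
```

Calling `check_long_table(df)` on the notebook's table should report 24 distinct years if every column from 2000 to 2023 was melted.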
import matplotlib.pyplot as plt
# Plot the usage series for a single country
country_data = df[df['Country Name'] == 'Afghanistan']
plt.plot(country_data['Year'], country_data['Consumption'])
plt.xlabel('Year')
plt.ylabel('Internet usage (% of population)')
plt.title('Internet Usage Over Time for Afghanistan')
plt.show()
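The single-country plot extends naturally to a by-country comparison. The sketch below wraps the same matplotlib calls in a hypothetical helper, `plot_countries`, and draws one line per requested country. The sample frame is made up for illustration; the column names match the melted table.

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_countries(long_df, countries):
    """Draw one usage line per country and return the figure and axes."""
    fig, ax = plt.subplots()
    for name in countries:
        subset = long_df[long_df["Country Name"] == name].sort_values("Year")
        ax.plot(subset["Year"], subset["Consumption"], label=name)
    ax.set_xlabel("Year")
    ax.set_ylabel("Internet usage (% of population)")
    ax.set_title("Internet usage over time")
    ax.legend()
    return fig, ax

# Made-up sample; real calls would pass the notebook's long table instead.
sample = pd.DataFrame({
    "Country Name": ["A", "A", "B", "B"],
    "Year": [2000, 2023, 2000, 2023],
    "Consumption": [1.0, 60.0, 5.0, 90.0],
})
fig, ax = plot_countries(sample, ["A", "B"])
```

With the notebook's data, any names present in the `Country Name` column can be passed, for example `plot_countries(df, ['Afghanistan'])`, followed by `plt.show()` to display the figure.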
import plotly.graph_objects as go

# Group by 'Year' and take the unweighted mean of 'Consumption' across countries (not population-weighted)
global_trend = df.groupby('Year')['Consumption'].mean().reset_index()

# Create the plot
fig = go.Figure()

# Add line plot for the global trend
fig.add_trace(go.Scatter(
    x=global_trend['Year'],
    y=global_trend['Consumption'],
    mode='lines',
    name='Global Trend',
    line=dict(color='green')
))

# Add titles and labels
fig.update_layout(
    title='Global Internet Usage Trend (2000-2023)',
    xaxis_title='Year',
    yaxis_title='Global Average Internet Usage (%)',
    template='plotly_white'
)

# Show the plot
fig.show()
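To look at which countries still have limited internet availability, one option is to rank countries by their most recent reported value. The sketch below is one possible approach, not the only one: `latest_usage` is a hypothetical helper, the sample numbers are made up, and it assumes missing years are stored as NaN rather than 0.

```python
import pandas as pd

def latest_usage(long_df):
    """Return each country's most recent non-missing value, lowest first."""
    known = long_df.dropna(subset=["Consumption"])
    # idxmax on Year picks the row of the latest reported year per country.
    latest_rows = known.loc[known.groupby("Country Name")["Year"].idxmax()]
    return (latest_rows[["Country Name", "Year", "Consumption"]]
            .sort_values("Consumption")
            .reset_index(drop=True))

# Made-up sample; country B has no figure for 2023, so 2022 is used.
sample = pd.DataFrame({
    "Country Name": ["A", "A", "B", "B", "C", "C"],
    "Year": [2022, 2023, 2022, 2023, 2022, 2023],
    "Consumption": [70.0, 75.0, 12.0, None, 30.0, 35.0],
})
ranking = latest_usage(sample)
print(ranking)
```

Keeping the `Year` column in the result makes it visible when a country's latest figure is older than the others, which matters when comparing rankings.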