Executive Summary:

This data analysis investigates the trends and patterns in unemployment rates across different demographic groups in the United States from 2010 to 2019. The dataset, sourced from the Unemployment Kaggle Dataset (2010 - 2020), provides comprehensive insights into monthly unemployment rates based on education level, race, and gender.

Key Findings:

The analysis reveals significant differences in unemployment rates based on education level. Individuals with higher qualifications, such as associate or professional degrees, experience lower and less volatile unemployment rates than those with only primary or high school education. This underscores the importance of education in accessing economic opportunities.

There are notable disparities in unemployment rates among racial groups. Black and Hispanic populations tend to experience higher unemployment rates than White and Asian populations. Further investigation is warranted to understand the underlying factors contributing to these disparities.

Unemployment rates for men and women differ only slightly, with women tending to have marginally lower rates. However, the seasonal patterns in unemployment differ distinctly between men and women, suggesting that gender-specific factors influence employment trends.

The unemployment rate for men is forecast to continue decreasing, in line with the overall downward trend in the mean unemployment rate from 2010 to 2019.

Introduction:

In this data analysis, I explore the trends and patterns in unemployment rates across various demographic groups in the United States from 2010 to 2020. The dataset provides insights into how unemployment rates vary based on factors such as education level, race, and gender.

The Unemployment Kaggle Dataset (2010 - 2020), uploaded by Aniruddha Shirahatti, makes it possible to examine monthly unemployment rates and how they fluctuate over the years. The dataset covers unemployment rates by education level, ranging from primary school to professional degrees, and by racial group (White, Black, Asian, and Hispanic). It also provides gender-specific unemployment rates for men and women.

Questions:

  1. What are the trends in unemployment rates in the United States from 2010 to 2019?
  2. How do unemployment rates vary across different demographic groups such as education level, race, and gender?
  3. Are there significant disparities in unemployment rates between the different demographic groups?
  4. How does educational attainment influence unemployment rates?
  5. Are there seasonal patterns in unemployment rates, and if yes, how do they differ among demographic groups?
  6. How will the men's unemployment rate change in the near future? What can be expected for April 2021?

Data:

The data is imported from the Unemployment Kaggle Dataset (2010 - 2020). This dataset, uploaded by Aniruddha Shirahatti, contains time series data on the unemployment rate in the US from January 2010 to 2020. It includes records of the unemployment rate for adults by education, race, and gender.

The query result is stored as a DataFrame in the variable df.
SELECT * FROM 'Unemploiment rate in US.csv';
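
The plotting helpers defined later in this notebook call `.to_period`, `.year`, and `.month` on the DataFrame index, so the index has to be a DatetimeIndex. The following is a minimal sketch of how a loaded table could be prepared with pandas. The `Date`, `Men`, and `Women` column names, the sample values, and the `load_unemployment` helper are all assumptions made for illustration; they are not confirmed by the dataset.

```python
import io

import pandas as pd

# Tiny stand-in for the CSV file. Column names and values are made up
# for illustration and are NOT taken from the real dataset.
sample_csv = io.StringIO(
    "Date,Men,Women\n"
    "2010-03-01,9.0,7.0\n"
    "2010-01-01,10.0,8.0\n"
    "2010-02-01,9.5,7.5\n"
)


def load_unemployment(source, date_column="Date"):
    """Hypothetical helper: read the table and index it by parsed dates."""
    frame = pd.read_csv(source, parse_dates=[date_column])
    # A sorted DatetimeIndex is what the date-based grouping relies on
    return frame.set_index(date_column).sort_index()


df_indexed = load_unemployment(sample_csv)
print(type(df_indexed.index).__name__)
```

With the real file, the path to the CSV would be passed in place of `sample_csv`, and `date_column` would need to match whatever the date column is actually called.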

Importing necessary libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
import pingouin as pg
from statsmodels.graphics import tsaplots
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.statespace.sarimax import SARIMAX
# Set Seaborn style
sns.set_style("white")

Automating repetitive tasks by creating functions.

def plot_mean_unemployment_by_year_month(dataframe):
    """
    Plot the mean unemployment rate for each year-month.

    Parameters:
    dataframe (DataFrame): The DataFrame containing the unemployment rate data, indexed by date (DatetimeIndex).

    Returns:
    None

    Example:
    plot_mean_unemployment_by_year_month(gender_subset)
    """
    # Extract the year and month from the index of the DataFrame
    index_year_month = dataframe.index.to_period('M')

    # Compute the mean unemployment rate for each year-month
    unemployment_by_year_month = dataframe.groupby(index_year_month).mean()
    
    # Set Seaborn style
    sns.set_style("white")

    # Plot the mean unemployment rate for each year-month
    ax = unemployment_by_year_month.plot(fontsize=6, linewidth=1)

    # Set axis labels and legend
    ax.set_xlabel('Year-Month', fontsize=10)
    ax.set_ylabel('Mean unemployment rate', fontsize=10)
    ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left', fontsize=10)
    plt.show()

def plot_mean_unemployment_by_year(dataframe):
    """
    Plot the mean unemployment rate for each year.

    Parameters:
    dataframe (DataFrame): The DataFrame containing the unemployment rate data, indexed by date (DatetimeIndex).

    Returns:
    None

    Example:
    plot_mean_unemployment_by_year(gender_subset)
    """
    # Extract the year from the index of the DataFrame
    index_year = dataframe.index.year

    # Compute the mean unemployment rate for each year, per column
    unemployment_by_year = dataframe.groupby(index_year).mean()

    # Plot the mean unemployment rate for each year
    ax = unemployment_by_year.plot(fontsize=6, linewidth=1)

    # Set axis labels and legend
    ax.set_xlabel('Year', fontsize=10)
    ax.set_ylabel('Mean unemployment rate', fontsize=10)
    ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left', fontsize=10)
    plt.show()

def plot_mean_unemployment_by_month(dataframe):
    """
    Plot the mean unemployment rate for each month.

    Parameters:
    dataframe (DataFrame): The DataFrame containing the unemployment rate data, indexed by date (DatetimeIndex).

    Returns:
    None

    Example:
    plot_mean_unemployment_by_month(gender_subset)
    """
    # Extract the month from the index of the DataFrame
    index_month = dataframe.index.month

    # Compute the mean unemployment rate for each month
    unemployment_by_month = dataframe.groupby(index_month).mean()

    # Plot the mean unemployment rate for each month
    ax = unemployment_by_month.plot(fontsize=6, linewidth=1)

    # Set axis labels and legend
    ax.set_xlabel('Month', fontsize=10)
    ax.set_ylabel('Mean unemployment rate', fontsize=10)
    ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left', fontsize=10)
    plt.show()
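
The three plotting functions above differ only in the key they group by: the year-month period, the calendar year, or the month number. The sketch below shows that grouping step on its own, without plotting. The `demo` DataFrame and its values are synthetic and purely illustrative; they are not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Synthetic monthly data covering two years (values are made up)
dates = pd.date_range("2010-01-01", periods=24, freq="MS")
demo = pd.DataFrame(
    {
        "Men": np.arange(24, dtype=float),
        "Women": np.arange(24, dtype=float) + 1.0,
    },
    index=dates,
)

# One row per calendar year: averages the 12 months of each year
by_year = demo.groupby(demo.index.year).mean()

# One row per month number (1-12): averages the same month across years,
# which is what exposes seasonal patterns
by_month = demo.groupby(demo.index.month).mean()

# One row per year-month period: with monthly data this keeps every
# observation and only changes the index to monthly periods
by_year_month = demo.groupby(demo.index.to_period("M")).mean()

print(by_year)
print(by_month.head())
```

Grouping by month number collapses all years into twelve rows, so it highlights seasonality but hides the long-term trend, whereas grouping by year does the opposite.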

Exploring the dataset