Analyze Multiple Time Series

    This template provides a playbook for analyzing multiple time series simultaneously. You will take an in-depth look at your time series data by:

    1. Loading and visualizing your data
    2. Inspecting the distribution
    3. Analyzing subsets of your data
    4. Decomposing time series into seasonality, trend and noise
    5. Visualizing correlations with a clustermap
    # Load packages
    import pandas as pd
    import matplotlib.pyplot as plt
    from statsmodels.graphics import tsaplots
    import statsmodels.api as sm
    import seaborn as sns

    1. Load and visualize your data

    # Upload your data as CSV and load as a data frame
    df = pd.read_csv(
        "your_data.csv",  # Placeholder path: replace with the path to your own CSV file
        parse_dates=["datestamp"],  # Tell pandas which column(s) to parse as dates
        index_col="datestamp",  # Use a date column as your index
    )
    # Plot settings
    %config InlineBackend.figure_format='retina'
    plt.rcParams["figure.figsize"] = (18, 10)
    plt.style.use("ggplot")
    # Plot all time series in the df DataFrame
    ax = df.plot(
        colormap="Spectral",  # Set a colormap to avoid overlapping colors
        fontsize=10,  # Set fontsize
        linewidth=0.8,  # Set width of lines
    )
    # Set labels and legend
    ax.set_xlabel("Date", fontsize=12)  # X axis text
    ax.set_ylabel("Unemployment Rate", fontsize=12)  # Y axis text
    ax.set_title("Unemployment rate of U.S. workers by industry", fontsize=15)
    ax.legend(
        loc="center left",  # Set location of legend within bounding box
        bbox_to_anchor=(1.0, 0.5),  # Set location of bounding box
    )
    # Annotate your plots with vertical lines
    ax.axvline(
        "2001-07-01",  # Position of vertical line
        color="red",  # Color of line
        linestyle="--",  # Style of line
        linewidth=2,  # Thickness of line
    )
    ax.axvline("2008-09-01", color="red", linestyle="--", linewidth=2)
    # Show plot
    plt.show()

    2. Inspect the distribution

    # Generate a boxplot
    ax = df.boxplot(fontsize=10, vert=False)  # vert=False draws the boxes horizontally
    ax.set_xlabel("Unemployment Percentage")
    ax.set_title("Distribution of Unemployment by industry")

    3. Analyze subsets of your data

    a) Visualize (partial) autocorrelation

    Autocorrelation is the correlation between a time series and a lagged copy of itself. It measures how strongly the value at one point in time is related to the value a given number of periods (the lag) earlier. Partial autocorrelation measures the same relationship after removing the effect of the shorter lags in between.
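
To make the definition concrete, here is a minimal sketch that computes autocorrelation as a number rather than a plot. It uses a synthetic 12-month sine wave, not the unemployment data; the names `series`, `acf_lag_6`, `acf_lag_12` and `manual_lag_12` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series with a 12-month cycle (illustrative data only)
index = pd.date_range("2000-01-01", periods=120, freq="MS")
months = np.arange(120)
series = pd.Series(np.sin(2 * np.pi * months / 12), index=index)

# Correlation between the series and a copy of itself shifted by `lag` steps
acf_lag_6 = series.autocorr(lag=6)    # half a cycle apart: close to -1
acf_lag_12 = series.autocorr(lag=12)  # one full cycle apart: close to +1

# The same lag-12 value computed by hand, to show what autocorr does
manual_lag_12 = series.corr(series.shift(12))

print(round(acf_lag_6, 3), round(acf_lag_12, 3), round(manual_lag_12, 3))
```

A strong positive value at lag 12 is the kind of spike you would look for in the autocorrelation plot of monthly data with a yearly pattern.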

    # Display the autocorrelation plot of your time series
    fig = tsaplots.plot_acf(
        df["Agriculture"],  # Change column to inspect
        lags=24,  # Set lag period
    )
    # Show plot
    plt.show()
    # Display the partial autocorrelation plot of your time series
    fig = tsaplots.plot_pacf(
        df["Agriculture"],  # Change column to inspect
        lags=24,  # Set lag period
    )
    # Show plot
    plt.show()

    b) Group data by different time periods

    Uncover patterns by grouping your data by different time periods, e.g. yearly, monthly, or daily.

    # Extract time period of interest
    index_year = df.index.year  # Choose year, month, day etc.
    # Compute mean for each time period
    df_by_year = df.groupby(index_year).mean()  # Replace .mean() with aggregation function
    # Plot the mean for each time period
    ax = df_by_year.plot(fontsize=10, linewidth=1)
    # Set axis labels and legend
    ax.set_xlabel("Year", fontsize=12)
    ax.set_ylabel("Mean unemployment rate", fontsize=12)
    ax.axvline(
        2008,  # Position of vertical line
        color="red",  # Color of line
        linestyle="--",  # Style of line
    )
    ax.legend(
        loc="center left",  # Placement of legend within bounding box (bbox)
        bbox_to_anchor=(1.0, 0.5),  # Location of bounding box
    )
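
The same grouping pattern works for other time periods and aggregations. The sketch below groups by calendar month instead of year and applies two aggregations at once. It runs on synthetic data; the frame `demo` and its columns `series_a` and `series_b` are hypothetical stand-ins, not columns of the unemployment data.

```python
import numpy as np
import pandas as pd

# Synthetic monthly data for two hypothetical columns (illustrative only)
index = pd.date_range("2000-01-01", periods=48, freq="MS")
month_number = index.month.to_numpy()
demo = pd.DataFrame(
    {
        "series_a": 5.0 + np.where(month_number == 1, 2.0, 0.0),  # peaks every January
        "series_b": np.full(48, 3.0),  # flat, no seasonal pattern
    },
    index=index,
)

# Group by calendar month (1-12) to compare the same month across years
demo_by_month = demo.groupby(demo.index.month).mean()

# Group by year and compute several aggregations at once
demo_by_year = demo.groupby(demo.index.year).agg(["mean", "max"])

print(demo_by_month.head(3))
print(demo_by_year.head(2))
```

Grouping by month pools the same calendar month across all years, so a recurring January peak shows up in the monthly means, while grouping by year averages it away.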

    4. Decompose time series into seasonality, trend and noise

    A time series can be split into three components: trend, seasonality and noise. You can interpret them as follows:

    • Trend is the long-term increase or decrease in the level of the series.
    • Seasonality is the repeating short-term cycle in the series, such as a yearly or weekly pattern.
    • Noise is the random variation left over once trend and seasonality are removed.
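
To show how the three components relate, here is a hand-rolled sketch of a classical additive decomposition (series = trend + seasonality + noise) using only pandas. It assumes monthly data with a 12-month cycle and runs on a synthetic series; all variable names are illustrative. statsmodels offers `sm.tsa.seasonal_decompose` for the same task, and this sketch is only meant to expose the steps, not to reproduce that function exactly.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series (illustrative only): linear trend + yearly cycle + noise
rng = np.random.default_rng(0)
index = pd.date_range("2000-01-01", periods=120, freq="MS")
t = np.arange(120)
series = pd.Series(
    0.05 * t + np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.1, 120),
    index=index,
)

# Trend: centered 2x12 moving average, which averages out a 12-month cycle
ma12 = series.rolling(window=12).mean()          # trailing 12-month mean
trend = ma12.rolling(window=2).mean().shift(-6)  # average two of them, then center

# Seasonality: mean detrended value for each calendar month,
# shifted so that the twelve monthly effects sum to zero
detrended = series - trend
monthly_effect = detrended.groupby(detrended.index.month).transform("mean")
seasonal = monthly_effect - monthly_effect.mean()

# Noise: whatever is left after removing trend and seasonality
noise = series - trend - seasonal

components = pd.DataFrame({"trend": trend, "seasonal": seasonal, "noise": noise})
print(components.dropna().head())
```

The first and last six months have no trend estimate because the centered moving average needs data on both sides, so the noise is undefined there as well.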