
## Analyze Multiple Time Series

This template provides a playbook for analyzing multiple time series simultaneously. You will take an in-depth look at your time series data by:

- Loading and visualizing your data
- Inspecting the distribution
- Analyzing subsets of your data
- Decomposing time series into seasonality, trend and noise
- Visualizing correlations with a clustermap

```
# Load packages
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics import tsaplots
import statsmodels.api as sm
import seaborn as sns
```

### 1. Load and visualize your data

```
# Upload your data as CSV and load as a data frame
df = pd.read_csv(
    "data.csv",
    parse_dates=["datestamp"],  # Tell pandas which column(s) to parse as dates
    index_col="datestamp",  # Use a date column as your index
)
df.head()
```

```
# Plot settings
%config InlineBackend.figure_format='retina'
plt.rcParams["figure.figsize"] = (18, 10)
plt.style.use('ggplot')
# Plot all time series in the df DataFrame
ax = df.plot(
    colormap="Spectral",  # Set a colormap so each series gets a distinct color
    fontsize=10,  # Set font size of tick labels
    linewidth=0.8,  # Set width of lines
)
# Set labels and legend
ax.set_xlabel("Date", fontsize=12)  # X axis text
ax.set_ylabel("Unemployment Rate", fontsize=12)  # Y axis text
ax.set_title("Unemployment rate of U.S. workers by industry", fontsize=15)
ax.legend(
    loc="center left",  # Set location of legend within bounding box
    bbox_to_anchor=(1.0, 0.5),  # Set location of bounding box
)
# Annotate your plots with vertical lines
ax.axvline(
    "2001-07-01",  # Position of vertical line
    color="red",  # Color of line
    linestyle="--",  # Style of line
    linewidth=2,  # Thickness of line
)
ax.axvline("2008-09-01", color="red", linestyle="--", linewidth=2)
# Show plot
plt.show()
```

### 2. Inspect the distribution

```
df.describe()
```

```
# Generate a boxplot
ax = df.boxplot(fontsize=10, vert=False)  # vert=False draws the boxes horizontally
ax.set_xlabel("Unemployment Percentage")
ax.set_title("Distribution of Unemployment by industry")
plt.show()
```

### 3. Analyze subsets of your data

#### a) Visualize (partial) autocorrelation

Autocorrelation measures how strongly a time series is correlated with a lagged copy of itself, that is, how the value at one point in time relates to the value a fixed number of steps earlier. Partial autocorrelation measures the same relationship at a given lag after removing the effect of the shorter lags in between.

```
# Display the autocorrelation plot of your time series
fig = tsaplots.plot_acf(
    df["Agriculture"],  # Change column to inspect
    lags=24,  # Set lag period
)
# Show plot
plt.show()
```

```
# Display the partial autocorrelation plot of your time series
fig = tsaplots.plot_pacf(
    df["Agriculture"],  # Change column to inspect
    lags=24,  # Set lag period
)
# Show plot
plt.show()
```
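To make the definition of autocorrelation above concrete, the sketch below computes it by hand for a single lag. It uses a synthetic monthly series with a 12-month cycle rather than the unemployment data, and `lag_correlation` is a hypothetical helper name introduced only for illustration. Note that this shift-and-correlate estimate (the same one `Series.autocorr` uses) can differ slightly from the values drawn by `plot_acf`, which normalizes with the full-sample mean and variance.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series with a 12-month cycle (illustrative data only)
rng = np.random.default_rng(0)
months = pd.date_range("2000-01-01", periods=120, freq="MS")
values = 10 + 3 * np.sin(2 * np.pi * np.arange(120) / 12) + rng.normal(0, 0.5, 120)
series = pd.Series(values, index=months)


def lag_correlation(s, lag):
    """Pearson correlation between a series and a copy of itself shifted by `lag` steps."""
    return s.corr(s.shift(lag))


acf_12 = lag_correlation(series, 12)  # One full cycle apart: strongly positive
acf_6 = lag_correlation(series, 6)  # Half a cycle apart: strongly negative

print(f"lag 12: {acf_12:.2f}, lag 6: {acf_6:.2f}")
```

A strong positive value at lag 12 and a strong negative value at lag 6 is the pattern you would expect from data with a yearly cycle.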

#### b) Group data by different time periods

Uncover patterns by grouping your data by different time periods, e.g. by year, month, or day.

```
# Extract time period of interest
index_year = df.index.year  # Choose year, month, day etc.
# Compute mean for each time period
df_by_year = df.groupby(index_year).mean()  # Replace .mean() with another aggregation if needed
# Plot the mean for each time period
ax = df_by_year.plot(fontsize=10, linewidth=1)
# Set axis labels and legend
ax.set_xlabel("Year", fontsize=12)
ax.set_ylabel("Mean unemployment rate", fontsize=12)
ax.axvline(
    2008,  # Position of vertical line
    color="red",  # Color of line
    linestyle="--",  # Style of line
    linewidth=2,  # Thickness of line
)
ax.legend(
    loc="center left",  # Placement of legend within bounding box (bbox)
    bbox_to_anchor=(1.0, 0.5),  # Location of bounding box
)
plt.show()
```
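The same grouping approach works for other periods. As a minimal sketch, the example below groups by calendar month instead of year, which averages each month across all years and can expose a seasonal pattern. The data frame `demo` and its column `industry_a` are synthetic stand-ins, not part of the unemployment dataset.

```python
import numpy as np
import pandas as pd

# Four years of synthetic monthly data that peaks every January (illustrative only)
dates = pd.date_range("2000-01-01", periods=48, freq="MS")
month_number = dates.month.to_numpy()
demo = pd.DataFrame(
    {"industry_a": 5 + 2 * np.cos(2 * np.pi * (month_number - 1) / 12)},
    index=dates,
)

# Group by calendar month: one row per month (1 to 12), averaged over all years
by_month = demo.groupby(demo.index.month).mean()

print(by_month)
```

Because every year contributes to each monthly average, year-to-year differences are smoothed out and only the within-year pattern remains.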

### 4. Decompose time series into seasonality, trend and noise

A time series can be broken down into three components: trend, seasonality and noise. You can interpret them as follows:

- **Trend** shows you the increasing or decreasing value in the series.
- **Seasonality** highlights the repeating short-term cycle in the series.
- **Noise** is the random variation in the series.
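As a rough illustration of how these three components combine, the sketch below builds a synthetic series by adding them together. It assumes an additive model (observed = trend + seasonality + noise), which is only one common choice; a multiplicative model is another. All names and numbers here are made up for the example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
index = pd.date_range("2010-01-01", periods=72, freq="MS")  # Six years of months
t = np.arange(len(index))

trend = 0.05 * t  # Slow, steady increase
seasonality = 1.5 * np.sin(2 * np.pi * t / 12)  # Cycle that repeats every 12 months
noise = rng.normal(0, 0.2, len(index))  # Random variation

components = pd.DataFrame(
    {"trend": trend, "seasonality": seasonality, "noise": noise},
    index=index,
)
# Under the additive assumption, the observed series is the sum of its parts
components["observed"] = components[["trend", "seasonality", "noise"]].sum(axis=1)

print(components.head())
```

Decomposition runs this process in reverse: starting from the observed series alone, it estimates the trend, seasonality and noise that produced it.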