Autocorrelation in R
If you want to take our Introduction to Time Series Analysis in R course, here is the link.
Calculating Autocorrelations
Autocorrelations, or lagged correlations, are used to assess whether a time series depends on its past. For a time series x of length n, we consider the n-1 pairs of observations one time unit apart. The first such pair is (x[2], x[1]), and the next is (x[3], x[2]). Each such pair is of the form (x[t], x[t-1]), where t is the observation index, which we vary from 2 to n in this case. The lag-1 autocorrelation of x can be estimated as the sample correlation of these (x[t], x[t-1]) pairs.
In general, we can manually create these pairs of observations. First, create two vectors, x_t0 and x_t1, each with length n-1, such that the rows correspond to (x[t], x[t-1]) pairs. Then apply the cor() function to estimate the lag-1 autocorrelation.
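The manual construction can be sketched as follows. The series here is a simulated stand-in (an AR(1) process with an arbitrarily chosen coefficient and seed), not the course data, so the numbers it produces are only illustrative.

```r
# Simulated stand-in for the exercise series (assumption: not the course data)
set.seed(1)
n <- 150
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))

# Drop the first value to get x[2], ..., x[n]   (the "current" values x[t])
x_t0 <- x[-1]
# Drop the last value to get x[1], ..., x[n-1]  (the lagged values x[t-1])
x_t1 <- x[-n]

# Each row is an (x[t], x[t-1]) pair
head(cbind(x_t0, x_t1))

# Scatterplot of the pairs, then their sample correlation
plot(x_t0, x_t1)
cor(x_t0, x_t1)
```

Negative indexing removes an element, so x[-1] and x[-n] are the same series shifted by one position relative to each other.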
Luckily, the acf() command provides a shortcut. Applying acf(..., lag.max = 1, plot = FALSE) to a series x automatically calculates the lag-1 autocorrelation.
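As a small sketch of that call, again on a simulated stand-in series rather than the course data, the estimates are stored in the acf component of the returned object, with lag 0 first:

```r
# Simulated stand-in for the exercise series (assumption: not the course data)
set.seed(1)
x <- arima.sim(model = list(ar = 0.7), n = 150)

# Estimate autocorrelations at lags 0 and 1 without drawing a plot
acf_out <- acf(x, lag.max = 1, plot = FALSE)
acf_out

# Element 1 is lag 0 (always 1); element 2 is the lag-1 estimate
lag1 <- acf_out$acf[2]
lag1
```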
Finally, note that the two estimates differ slightly because they use slightly different scalings in their calculation of sample covariance, 1/(n-1) versus 1/n. Although the latter gives a biased estimate, it is preferred in time series analysis, and the resulting autocorrelation estimates differ only by a factor of approximately (n-1)/n.
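To make the comparison concrete, the sketch below computes both estimates on a simulated stand-in series (not the course data) and also writes out the calculation acf() performs: deviations from the single overall mean, summed over n-1 products in the numerator and n squared terms in the denominator. The rescaled cor() value should land close to, though not exactly on, the acf() value, because cor() also uses the means and variances of the two shortened vectors.

```r
# Simulated stand-in for the exercise series (assumption: not the course data)
set.seed(1)
n <- 150
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))

# Manual estimate from the (x[t], x[t-1]) pairs
manual <- cor(x[-1], x[-n])

# Automatic estimate from acf()
auto <- acf(x, lag.max = 1, plot = FALSE)$acf[2]

# The acf() calculation written out by hand
xbar <- mean(x)
by_hand <- sum((x[-1] - xbar) * (x[-n] - xbar)) / sum((x - xbar)^2)

# Compare: raw cor(), cor() rescaled by (n-1)/n, acf(), and the hand formula
c(manual = manual, scaled = manual * (n - 1) / n, acf = auto, by_hand = by_hand)
```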
In this exercise, you'll practice both the manual and automatic calculation of a lag-1 autocorrelation. The time series x and its length n (150) have already been loaded. The series is shown in the plot on the right.
Instructions
- Create two vectors, x_t0 and x_t1, each with length n-1, such that the rows correspond to the (x[t], x[t-1]) pairs.
- Confirm that x_t0 and x_t1 are (x[t], x[t-1]) pairs using the pre-written code.
- Use plot() to view the scatterplot of x_t0 and x_t1.
- Use cor() to view the correlation between x_t0 and x_t1.
- Use acf() with x to automatically calculate the lag-1 autocorrelation. Set the lag.max argument to 1 to produce a single lag period and set the plot argument to FALSE.
- Confirm that the difference factor is (n-1)/n using the pre-written code.
If that makes sense, keep going to the next exercise! If not, here is an overview video.
Overview Video on Autocorrelation
The Autocorrelation Function
Autocorrelations can be estimated at many lags to better assess how a time series relates to its past. We are typically most interested in how a series relates to its most recent past.
The acf(..., lag.max = ..., plot = FALSE) function estimates the autocorrelations at lags 0, 1, 2, ..., up to the value specified by the argument lag.max. In the previous exercise, you focused on the lag-1 autocorrelation by setting the lag.max argument to 1.
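A minimal sketch of estimating several lags at once and picking out individual values, using a simulated stand-in series rather than the course data. Because lag 0 comes first, the estimate at lag k sits at position k + 1.

```r
# Simulated stand-in for the exercise series (assumption: not the course data)
set.seed(1)
x <- arima.sim(model = list(ar = 0.7), n = 150)

# Autocorrelations at lags 0 through 10, without plotting
acf_out <- acf(x, lag.max = 10, plot = FALSE)
acf_out

# Flatten the stored array to a plain vector of 11 values (lags 0..10)
rho <- drop(acf_out$acf)

rho[5 + 1]    # estimate at lag 5
rho[10 + 1]   # estimate at lag 10
```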
In this exercise, you'll explore some further applications of the acf() command. Once again, the time series x has been preloaded for you and is shown in the plot on the right.
Instructions
- Use acf() to view the autocorrelations of series x from 0 to 10. Set the lag.max argument to 10 and keep the plot argument as FALSE.
- Copy and paste the autocorrelation estimate (ACF) at lag-10.
- Copy and paste the autocorrelation estimate (ACF) at lag-5.
Visualizing the Autocorrelation Function
Estimating the autocorrelation function (ACF) at many lags allows us to assess how a time series x relates to its past. The numeric estimates are important for detailed calculations, but it is also useful to visualize the ACF as a function of the lag.
In fact, the acf() command produces a figure by default. It also makes a default choice for lag.max, the maximum number of lags to be displayed.
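For reference, R's documentation gives the default as 10*log10(N/m), where N is the number of observations and m the number of series, rounded down. The sketch below checks this on a simulated stand-in series of length 150 (not the course data), for which the default works out to 21 lags.

```r
# Simulated stand-in for the exercise series (assumption: not the course data)
set.seed(1)
n <- 150
x <- arima.sim(model = list(ar = 0.7), n = n)

# Default maximum lag for a single series of length n
default_lag_max <- floor(10 * log10(n))
default_lag_max

# acf() with no lag.max returns lags 0 .. default_lag_max
acf_default <- acf(x, plot = FALSE)
length(acf_default$acf) - 1
```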
Three time series x, y, and z have been loaded into your R environment and are plotted on the right. The time series x shows strong persistence, meaning the current value is relatively close to those that precede it. The time series y shows a periodic pattern with a cycle length of approximately four observations, meaning the current value is relatively close to the observation four before it. The time series z does not exhibit any clear pattern.
In this exercise, you'll plot an estimated autocorrelation function for each time series. In the plots produced by acf(), the lag for each autocorrelation estimate is denoted on the horizontal axis and each autocorrelation estimate is indicated by the height of the vertical bars. Recall that the ACF at lag-0 is always 1.
Finally, each ACF figure includes a pair of blue, horizontal, dashed lines representing lag-wise 95% confidence intervals centered at zero. These are used for determining the statistical significance of an individual autocorrelation estimate at a given lag versus a null value of zero, i.e., no autocorrelation at that lag.
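The dashed lines in the default plot sit at plus and minus qnorm(0.975)/sqrt(n), roughly 1.96/sqrt(n). As a rough sketch, the same bound can be computed directly and used to flag lags whose estimates fall outside it. The white-noise series below is a simulated stand-in, not the course data.

```r
# Simulated white noise as a stand-in for a series with no clear pattern
# (assumption: not the course data)
set.seed(1)
n <- 150
z <- rnorm(n)

acf_z <- acf(z, plot = FALSE)

# Half-width of the 95% band drawn by the default ACF plot
bound <- qnorm(0.975) / sqrt(acf_z$n.used)
bound

# Estimates at lags 1, 2, ... (lag 0 is dropped since it is always 1)
rho <- drop(acf_z$acf)[-1]

# Lags whose estimate falls outside the band
which(abs(rho) > bound)
```

Since the band is lag-wise, about 1 in 20 lags can be expected to cross it by chance even for pure noise.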
Instructions
Use three calls of the function acf() to display the estimated ACFs of each of your three time series (x, y, and z). There is no need to specify additional arguments in your calls to acf().
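To try this outside the course environment, one option is to simulate three series with roughly the described behavior. The models below (an AR(1) process, a noisy period-4 sine wave, and white noise) are assumptions chosen for illustration and are not the course's actual data.

```r
# Hypothetical stand-ins for the three course series
set.seed(1)
n <- 150
x <- arima.sim(model = list(ar = 0.9), n = n)           # strong persistence
y <- ts(sin(2 * pi * (1:n) / 4) + rnorm(n, sd = 0.3))   # cycle of length 4
z <- ts(rnorm(n))                                       # no clear pattern

# Default ACF plots: one figure per series
acf(x)
acf(y)
acf(z)
```

With series like these, the ACF of x should decay slowly, the ACF of y should alternate with peaks at multiples of four, and most bars for z should stay inside the dashed band.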
If you want to learn more from this course, here is the link.
Check out our Time Series Analysis using R: Tutorial.