TS-LM2-Part-III.knit

Lab Manual 2 - Stationary and Non-Stationary Models

(Part III)

Department of Mathematics and Statistics
University of Calgary

Time Series Analysis
Winter 2026

© 2026 Prof. Mina Aminghafari, University of Calgary — Dept. of Mathematics and Statistics.
For educational use only. Redistribution without permission is prohibited.

Objectives

In the previous lab manual (Part II), we became familiar with how to assess stationarity by visually inspecting time series plots. In this lab, we will learn how to compute the sample autocorrelation function (ACF) and the sample partial autocorrelation function (PACF) for a time series.

We will also learn how to produce and interpret sample ACF/PACF plots. In addition, we will discuss common causes of non-stationarity and explore several methods to address them, including the Box–Cox transformation and log transformation.

Section 1: Sample ACF and PACF

In practice, we do not have access to the model itself in order to compute its expectation and autocovariance; instead, we only have a series of observations generated by the model. Therefore, we must compute the sample autocorrelation function. The sample autocorrelation function serves as an estimate of the model’s true autocorrelation function.

The sample autocovariance estimator is: \[ \hat{\gamma}(h)=\frac{1}{n}\sum_{t=1}^{n-h}(X_{t+h}-\bar{X})(X_t-\bar{X}), \] and the sample autocorrelation estimator is: \[ \hat{\rho}(h)=\frac{\hat{\gamma}(h)}{\hat{\gamma}(0)}. \]

Activity 1: Combined ACF/PACF display (cangas)

In this activity, we will take a look at the dataset cangas, which contains monthly Canadian gas production (in billions of cubic meteres) from January 1960 to February 2005, available in the expsmooth package.

# install.packages("expsmooth")/install.packages("itsmr") Copy and paste this into your console to install the package if the library is not installed.

library(expsmooth)

## Warning: package 'expsmooth' was built under R version 4.2.3

## Loading required package: forecast

## Warning: package 'forecast' was built under R version 4.2.3

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

library(itsmr)

## 
## Attaching package: 'itsmr'

## The following object is masked from 'package:forecast':
## 
##     forecast

# Load the dataset
head(cangas, 12)

##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct
## 1960 1.4306 1.3059 1.4022 1.1699 1.1161 1.0113 0.9660 0.9773 1.0311 1.2521
##         Nov    Dec
## 1960 1.4419 1.5637

plota(cangas)

Do the plots indicate a seasonal pattern? If yes, estimate the seasonal period (the number of observations per cycle) and report the period value.
Based on the ACF and PACF plots, does the time series appear to be stationary? Briefly justify your answer.

You may also alternatively use the built-in functions acf() and pacf() to compute and visualize the sample ACF and PACF plots.

par(mfrow = c(1,2))

acf(cangas, lag.max = 100, col = 'blue')
pacf(cangas, lag.max = 100, col = 'blue')

par(mfrow = c(1,1))

Section 2: Non-stationary series (common causes)

In this part, we review the factors that cause non-stationarity, including:

Non-stationarity in mean (trend):

We say that a time series is non-stationary in the mean or has a trend if the mean of \(X_t\) changes over time. For example, the data may exhibit an upward or downward trend. In this case, we write \[ \mathbb{E}(X_t) = \mu(t), \] indicating that the mean is a function of time.

Seasonality:

A time series is called seasonal if it exhibits similar behavior over different time periods. For example, consider the amount of ice cream sold in a particular city at different times of the year. It is clear that sales are highest in the summer and lowest in the winter. Therefore, across all years in which this quantity is recorded, we observe the repetition of a specific pattern. (The seasonal pattern is usually monthly and quarterly)

Cyclicality:

A cyclical time series is similar to a seasonal time series, with the difference that the length of the repeating cycle does not necessarily correspond to a seasonal period (for example, 4 or 12 months). Instead, the cycle may have a different length, such as an 8-month repeating pattern.

Non-stationarity in variance:

Sometimes, as time progresses, the amplitude of the series also changes; in other words, the variability of the series depends on time. In such cases, we say that the series is non-stationary in variance, or equivalently, that the variance is not constant, meaning \[ \mathrm{Var}(X_t) = g(t). \]

Section 3: Removing Non-stationarity

As shown in the plot below, when non-stationary variance is present, the range of the data changes over time. If we attempt to enclose all data points between two lines, those lines are not parallel. This indicates that the increments \(X_t - X_{t-1}\) do not increase by a constant amount over time but instead vary.

plot(cangas , main = 'Air Passengers plot', col = 'blue')

Does the dispersion of the data increase or decrease as time progresses?
Does the series exhibit a seasonal (periodic) component?
Does the data display a funnel-shaped pattern, where variability increases with the level of the series?
Do the data show repeating local maxima and minima at fixed time intervals?
Is there evidence of non-stationarity in the variance?

Activity 2: Variance stabilisation: Box–Cox transformation

To remove non-stationarity in variance, we apply an appropriate transformation so that the data attain approximately constant variance. A common approach is the Box–Cox transformation, which is used to stabilize variance and is defined as:

\[ f(X_t) = \begin{cases} \log(X_t), & \lambda = 0, \\ \dfrac{X_t^\lambda - 1}{\lambda}, & \lambda \neq 0. \end{cases} \]

The parameter \(\lambda\) must be determined from the data. To illustrate this, check the code below:

# install.packages("forecast") Copy and paste this into your console to install the package if the library is not installed.
library(forecast)

lambda = BoxCox.lambda(cangas)

lambda

## [1] 0.5767759

The Box–Cox parameter \(\lambda\) determines the type of transformation that is most appropriate for stabilizing the variance of the time series.

For example, if \(\lambda = 1\), then \[ f(X_t) = X_t, \]meaning that no transformation is needed, since the variance is already approximately constant over time.

If \(\lambda \neq 1\), this suggests that the variance changes with the level of the series, and a transformation may be helpful.

Different values of \(\lambda\) correspond to different variance-stabilizing transformations. For instance:

If \(\lambda = 0\), the Box–Cox transformation reduces to the log transformation: \[ f(X_t) = \log(X_t), \] which is commonly used when the variance increases rapidly as the series grows.
If \(\lambda \approx 0.5\), the transformation is similar to a square-root transformation:

\[ f(X_t) = 2(\sqrt{X_t} - 1), \] which provides a more moderate variance stabilization.

In general, applying the Box–Cox transformation helps reduce non-stationarity in variance and makes the variability of the series more constant over time.

To better understand this idea, consider the example below.

Let’s use the log transformation to see whether the time series becomes variance-stable or not. Check the plot below and answer the corresponding questions.

# A common practical choice for AirPassengers is the log-transform (variance stabilization)
plot(log(cangas), main = 'The log transformation plot', col = 'blue')

Does the dispersion of the data increase or decrease as time progresses?
Does the series exhibit a seasonal (periodic) component?
Does the data display a funnel-shaped pattern, where variability increases with the level of the series?
Do the data show repeating local maxima and minima at fixed time intervals?
Is there evidence of non-stationarity in the variance?
How does the log transformation affect the variability of the series over time?
Does the log-transformed series appear to have more stable variance than the original one?

We can see that a log transformation is not the most appropriate choice for stabilizing the variance in this series. Instead, we apply the Box–Cox transformation \(f(X_t)=\frac{X_t^\lambda - 1}{\lambda},\) using the value of \(\lambda\) estimated from the Box–Cox procedure.

cangas_bc <- BoxCox(cangas, lambda) # Transformed sample using the lambda derived above

#################

# Your Code here

###############

Plot the original time series and the transformed time series in a \(2 \times 1\) layout using par(mfrow = c(2, 1)). What changes do you observe? Do you think the series is more stable in terms of variance after the transformation?
To support your visual observations, compute the sample variance of both the original and the transformed time series over three sub-periods: 1960–1975, 1976–1990, and 1991–2005.
Compare the variances across these periods to determine whether the Box–Cox transformation has helped stabilize the variance over time.

(Hint: The data are monthly, so each year contains 12 observations. For example, 1960–1975 corresponds to \(16 \times 12 = 192\) observations, so you can use indices 1:192 for the first period, and similarly define the next intervals.)

Plot the sample ACF and PACF of the transformed series in a different \(2 \times 1\) layout. What do you observe? Does the transformed series appear more stationary? If not, what are the potential reasons (for example, remaining non=stationary reasons)?

Section 4: CO\(_2\) Dataset

A key part of your project is being able to import a dataset and convert the relevant variable into a time series object in R.

# Import your dataset 
dt_path <- file.choose()
dt <- read.csv(dt_path)
head(dt)

##   Entity     Code        Day Monthly.concentration Yearly.average.of.CO2
## 1  World OWID_WRL 1979-01-15                336.56                    NA
## 2  World OWID_WRL 1979-02-15                337.29                    NA
## 3  World OWID_WRL 1979-03-15                337.88                    NA
## 4  World OWID_WRL 1979-04-15                338.32                    NA
## 5  World OWID_WRL 1979-05-15                338.26                    NA
## 6  World OWID_WRL 1979-06-15                337.38                    NA

In many real-world scenarios, datasets are not provided in a time series format by default. Therefore, you first need to convert the relevant variable into a time series object. In R, this can be done using the ts() function by specifying the starting time and the observation frequency (e.g., 12 for monthly data).

# Convert the CO2 monthly concentration data into a time series object
dt_ts <- ts(dt, start = c(1979, 01), frequency = 12)


head(dt_ts)

##          Entity Code Day Monthly.concentration Yearly.average.of.CO2
## Jan 1979      1    1   1                336.56                    NA
## Feb 1979      1    1   2                337.29                    NA
## Mar 1979      1    1   3                337.88                    NA
## Apr 1979      1    1   4                338.32                    NA
## May 1979      1    1   5                338.26                    NA
## Jun 1979      1    1   6                337.38                    NA

Consider the Monthly concentration of co2 as you variable and answer the questions below.

Plot the series and examine its sample ACF and PACF.

y = dt_ts[,4]

#######################

# write your code here

######################

Based on the plot, discuss whether the series suggests non-stationarity in the variance.
Would you apply a log transformation before differencing? Justify your answer.
To strengthen your findings, compute the sample variance of the series over three different time intervals: 1979–1995, 1996–2010, and 2011–2024. Compare the variance values across these periods to determine whether the variability of the series remains approximately constant over time.
If the time series is not stationary, estimate a suitable value of \(\lambda\) and apply an appropriate Box–Cox transformation to help stabilize the variance.
Compute the sample variance of your series over different time intervals. Does the variance appear approximately constant over time?