Lab Manual 2 - Stationary and Non-Stationary Models

(Part III)



Department of Mathematics and Statistics
University of Calgary



Time Series Analysis
Winter 2026





© 2026 Prof. Mina Aminghafari, University of Calgary — Dept. of Mathematics and Statistics.
For educational use only. Redistribution without permission is prohibited.




Objectives

In the previous lab manual (Part II), we became familiar with how to assess stationarity by visually inspecting time series plots. In this lab, we will learn how to compute the sample autocorrelation function (ACF) and the sample partial autocorrelation function (PACF) for a time series.

We will also learn how to produce and interpret sample ACF/PACF plots. In addition, we will discuss common causes of non-stationarity and explore several methods to address them, including the Box–Cox transformation and log transformation.



Section 1: Sample ACF and PACF

In practice, we do not have access to the model itself in order to compute its expectation and autocovariance; instead, we only have a series of observations generated by the model. Therefore, we must compute the sample autocorrelation function. The sample autocorrelation function serves as an estimate of the model’s true autocorrelation function.

The sample autocovariance estimator is: \[ \hat{\gamma}(h)=\frac{1}{n}\sum_{t=1}^{n-h}(X_{t+h}-\bar{X})(X_t-\bar{X}), \] and the sample autocorrelation estimator is: \[ \hat{\rho}(h)=\frac{\hat{\gamma}(h)}{\hat{\gamma}(0)}. \]

Activity 1: Combined ACF/PACF display (cangas)

In this activity, we will take a look at the dataset cangas, which contains monthly Canadian gas production (in billions of cubic meteres) from January 1960 to February 2005, available in the expsmooth package.

# install.packages("expsmooth")/install.packages("itsmr") Copy and paste this into your console to install the package if the library is not installed.

library(expsmooth)
## Warning: package 'expsmooth' was built under R version 4.2.3
## Loading required package: forecast
## Warning: package 'forecast' was built under R version 4.2.3
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
library(itsmr)
## 
## Attaching package: 'itsmr'
## The following object is masked from 'package:forecast':
## 
##     forecast
# Load the dataset
head(cangas, 12)
##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct
## 1960 1.4306 1.3059 1.4022 1.1699 1.1161 1.0113 0.9660 0.9773 1.0311 1.2521
##         Nov    Dec
## 1960 1.4419 1.5637
plota(cangas)



You may also alternatively use the built-in functions acf() and pacf() to compute and visualize the sample ACF and PACF plots.

par(mfrow = c(1,2))

acf(cangas, lag.max = 100, col = 'blue')
pacf(cangas, lag.max = 100, col = 'blue')

par(mfrow = c(1,1))



Section 2: Non-stationary series (common causes)

In this part, we review the factors that cause non-stationarity, including:

We say that a time series is non-stationary in the mean or has a trend if the mean of \(X_t\) changes over time. For example, the data may exhibit an upward or downward trend. In this case, we write \[ \mathbb{E}(X_t) = \mu(t), \] indicating that the mean is a function of time.

A time series is called seasonal if it exhibits similar behavior over different time periods. For example, consider the amount of ice cream sold in a particular city at different times of the year. It is clear that sales are highest in the summer and lowest in the winter. Therefore, across all years in which this quantity is recorded, we observe the repetition of a specific pattern. (The seasonal pattern is usually monthly and quarterly)

A cyclical time series is similar to a seasonal time series, with the difference that the length of the repeating cycle does not necessarily correspond to a seasonal period (for example, 4 or 12 months). Instead, the cycle may have a different length, such as an 8-month repeating pattern.

Sometimes, as time progresses, the amplitude of the series also changes; in other words, the variability of the series depends on time. In such cases, we say that the series is non-stationary in variance, or equivalently, that the variance is not constant, meaning \[ \mathrm{Var}(X_t) = g(t). \]



Section 3: Removing Non-stationarity

As shown in the plot below, when non-stationary variance is present, the range of the data changes over time. If we attempt to enclose all data points between two lines, those lines are not parallel. This indicates that the increments \(X_t - X_{t-1}\) do not increase by a constant amount over time but instead vary.

plot(cangas , main = 'Air Passengers plot', col = 'blue')


Activity 2: Variance stabilisation: Box–Cox transformation

To remove non-stationarity in variance, we apply an appropriate transformation so that the data attain approximately constant variance. A common approach is the Box–Cox transformation, which is used to stabilize variance and is defined as:

\[ f(X_t) = \begin{cases} \log(X_t), & \lambda = 0, \\ \dfrac{X_t^\lambda - 1}{\lambda}, & \lambda \neq 0. \end{cases} \]

The parameter \(\lambda\) must be determined from the data. To illustrate this, check the code below:

# install.packages("forecast") Copy and paste this into your console to install the package if the library is not installed.
library(forecast)

lambda = BoxCox.lambda(cangas)

lambda
## [1] 0.5767759

The Box–Cox parameter \(\lambda\) determines the type of transformation that is most appropriate for stabilizing the variance of the time series.

For example, if \(\lambda = 1\), then \[ f(X_t) = X_t, \]meaning that no transformation is needed, since the variance is already approximately constant over time.

If \(\lambda \neq 1\), this suggests that the variance changes with the level of the series, and a transformation may be helpful.

Different values of \(\lambda\) correspond to different variance-stabilizing transformations. For instance:

\[ f(X_t) = 2(\sqrt{X_t} - 1), \] which provides a more moderate variance stabilization.

In general, applying the Box–Cox transformation helps reduce non-stationarity in variance and makes the variability of the series more constant over time.

To better understand this idea, consider the example below.


Let’s use the log transformation to see whether the time series becomes variance-stable or not. Check the plot below and answer the corresponding questions.

# A common practical choice for AirPassengers is the log-transform (variance stabilization)
plot(log(cangas), main = 'The log transformation plot', col = 'blue')


We can see that a log transformation is not the most appropriate choice for stabilizing the variance in this series. Instead, we apply the Box–Cox transformation \(f(X_t)=\frac{X_t^\lambda - 1}{\lambda},\) using the value of \(\lambda\) estimated from the Box–Cox procedure.

cangas_bc <- BoxCox(cangas, lambda) # Transformed sample using the lambda derived above

#################

# Your Code here

###############

(Hint: The data are monthly, so each year contains 12 observations. For example, 1960–1975 corresponds to \(16 \times 12 = 192\) observations, so you can use indices 1:192 for the first period, and similarly define the next intervals.)




Section 4: CO\(_2\) Dataset

A key part of your project is being able to import a dataset and convert the relevant variable into a time series object in R.

# Import your dataset 
dt_path <- file.choose()
dt <- read.csv(dt_path)
head(dt)
##   Entity     Code        Day Monthly.concentration Yearly.average.of.CO2
## 1  World OWID_WRL 1979-01-15                336.56                    NA
## 2  World OWID_WRL 1979-02-15                337.29                    NA
## 3  World OWID_WRL 1979-03-15                337.88                    NA
## 4  World OWID_WRL 1979-04-15                338.32                    NA
## 5  World OWID_WRL 1979-05-15                338.26                    NA
## 6  World OWID_WRL 1979-06-15                337.38                    NA

In many real-world scenarios, datasets are not provided in a time series format by default. Therefore, you first need to convert the relevant variable into a time series object. In R, this can be done using the ts() function by specifying the starting time and the observation frequency (e.g., 12 for monthly data).

# Convert the CO2 monthly concentration data into a time series object
dt_ts <- ts(dt, start = c(1979, 01), frequency = 12)


head(dt_ts)
##          Entity Code Day Monthly.concentration Yearly.average.of.CO2
## Jan 1979      1    1   1                336.56                    NA
## Feb 1979      1    1   2                337.29                    NA
## Mar 1979      1    1   3                337.88                    NA
## Apr 1979      1    1   4                338.32                    NA
## May 1979      1    1   5                338.26                    NA
## Jun 1979      1    1   6                337.38                    NA

Consider the Monthly concentration of co2 as you variable and answer the questions below.

y = dt_ts[,4]

#######################

# write your code here

######################