outline:

  1. Time series basics: create a (time series) ts object from scratch, read in from files, plot

  2. Check stationarity for classical times series models by simulation.

  3. Handle non-stationary time series. (The goal is to turn non-stationary time series into stationary time series through a series of filters (e.g. detrending, autocorrelation). Illustrate using real data (electricity and temperature)

Import libraries

library(tseries)
library(ggplot2)
library(zoo)
library(data.table)
library(forecast)
library(tidyverse)
library(fma)

PART I

Time series basics

Create a single time series object

create a time series (ts) object with (1) value: \(2t + e_t\), where \(e_t \sim N(0,20)\), \(t=1,2,\cdots,120\); (2) time: from Jan-2019 to Dec-2028. Name this object “tseries”.

Use ?ts to check documents.

set.seed(100) #  sets the starting number used to generate a sequence of random numbers – it ensures that you get the same result if you start with that same seed each time you run the same process.

t <- 2 * seq(from = 1, to = 120, by = 1)  + rnorm(120, mean = 0, sd = 20)

tseries <- ts(data = t, start=c(2019,1), end=c(2028,12), frequency = 12) # needs to be in equal time increments 
# a time series is an ordered sequence of values of a variable at equally spaced time intervals.

print(tseries)
##             Jan        Feb        Mar        Apr        May        Jun
## 2019  -8.043847   6.630623   4.421658  25.735696  12.339425  18.372602
## 2020  21.967321  42.796810  32.467590  31.413666  26.222915  46.217125
## 2021  33.712418  43.230989  39.595569  60.618891  34.845411  64.941520
## 2022  77.658154  84.346466  99.308047  99.404040  79.967415 112.064070
## 2023  98.875581  62.426882  93.058756  69.228041 109.577297 145.949314
## 2024 116.760085 122.623119 118.422329 179.639179 132.596683 117.739500
## 2025 122.751614 180.970435 108.758080 152.254994 132.249433 161.410790
## 2026 184.191632 168.841899 178.327357 192.347242 212.543515 177.924594
## 2027 177.350084 204.270397 174.426337 176.519305 195.341533 231.262274
## 2028 242.802029 217.851324 225.451870 229.092025 213.709323 199.415698
##             Jul        Aug        Sep        Oct        Nov        Dec
## 2019   2.364186  30.290654   1.494811  12.802757  23.797723  25.925489
## 2020  19.723716  86.205936  33.238200  59.281212  51.239226  63.468092
## 2021  60.177729  99.147512  63.241408  65.776130  56.199714  67.564115
## 2022  50.464487 100.457348  79.554333 118.444619  86.731193 122.381315
## 2023  64.561490 131.609283  86.023488 152.497448 145.625975 103.222962
## 2024 146.759885 140.033832 136.601661 138.150202 150.978065 122.712887
## 2025 178.169037 118.511905 179.936445 163.000085 139.093014 129.375769
## 2026 170.857554 212.566029 168.140852 164.848575 179.394071 240.913655
## 2027 196.617053 224.857513 180.840126 203.993882 198.471654 208.614070
## 2028 223.380491 234.567721 254.362400 230.888526 231.949180 272.303814

read in from file

Use read.table to read the data from “ts.sample.txt” into a ts object and print it out.

?read.table
ts.sample <- read.csv("ts.sample.txt", header=TRUE, sep="") # defaults between read.table and read.csv are different
ts.sample
##              x
## 1   1.73067773
## 2  -1.08274980
## 3  -2.21151312
## 4   1.50704389
## 5   0.14582993
## 6   0.40603023
## 7   1.30107501
## 8  -0.19889175
## 9  -0.81168049
## 10  0.90951703
## 11  0.08432402
## 12 -0.31358711
## 13  1.72252075
## 14 -0.18193383
## 15 -1.32927078
## 16 -0.27564470
## 17 -0.23653474
## 18  0.78487600
## 19  0.99247488
## 20  0.69994919
## 21 -1.91246878
## 22 -1.24671867
## 23  0.49855245
## 24 -0.45423027
## 25  1.49903922
## 26  1.01092209
## 27 -0.22414683
## 28 -1.60582153
## 29 -0.06655503
## 30 -0.65764414
## 31  0.14960412
## 32  1.30539387
## 33 -0.50526480
## 34 -1.12061501
## 35  0.77018017
## 36 -0.18834248
## 37 -0.31537627
## 38  0.89977692
## 39  0.68375782
## 40 -1.44608804
ts.sample <- read.table("ts.sample.txt")
ts.sample
##              x
## 1   1.73067773
## 2  -1.08274980
## 3  -2.21151312
## 4   1.50704389
## 5   0.14582993
## 6   0.40603023
## 7   1.30107501
## 8  -0.19889175
## 9  -0.81168049
## 10  0.90951703
## 11  0.08432402
## 12 -0.31358711
## 13  1.72252075
## 14 -0.18193383
## 15 -1.32927078
## 16 -0.27564470
## 17 -0.23653474
## 18  0.78487600
## 19  0.99247488
## 20  0.69994919
## 21 -1.91246878
## 22 -1.24671867
## 23  0.49855245
## 24 -0.45423027
## 25  1.49903922
## 26  1.01092209
## 27 -0.22414683
## 28 -1.60582153
## 29 -0.06655503
## 30 -0.65764414
## 31  0.14960412
## 32  1.30539387
## 33 -0.50526480
## 34 -1.12061501
## 35  0.77018017
## 36 -0.18834248
## 37 -0.31537627
## 38  0.89977692
## 39  0.68375782
## 40 -1.44608804
ts.sample = ts(ts.sample) # we don't know the frequency of this
ts.sample
## Time Series:
## Start = 1 
## End = 40 
## Frequency = 1 
##                 x
##  [1,]  1.73067773
##  [2,] -1.08274980
##  [3,] -2.21151312
##  [4,]  1.50704389
##  [5,]  0.14582993
##  [6,]  0.40603023
##  [7,]  1.30107501
##  [8,] -0.19889175
##  [9,] -0.81168049
## [10,]  0.90951703
## [11,]  0.08432402
## [12,] -0.31358711
## [13,]  1.72252075
## [14,] -0.18193383
## [15,] -1.32927078
## [16,] -0.27564470
## [17,] -0.23653474
## [18,]  0.78487600
## [19,]  0.99247488
## [20,]  0.69994919
## [21,] -1.91246878
## [22,] -1.24671867
## [23,]  0.49855245
## [24,] -0.45423027
## [25,]  1.49903922
## [26,]  1.01092209
## [27,] -0.22414683
## [28,] -1.60582153
## [29,] -0.06655503
## [30,] -0.65764414
## [31,]  0.14960412
## [32,]  1.30539387
## [33,] -0.50526480
## [34,] -1.12061501
## [35,]  0.77018017
## [36,] -0.18834248
## [37,] -0.31537627
## [38,]  0.89977692
## [39,]  0.68375782
## [40,] -1.44608804

convinient functions for ts

  1. Extract times from tseries as a vector using function time; (2) Extract start time, end time and frequency using function start, end, frequency; (3) extract a subset from Jan-2019 to June-2020 using function window.
time(tseries) # goes by twelfves  
##           Jan      Feb      Mar      Apr      May      Jun      Jul      Aug
## 2019 2019.000 2019.083 2019.167 2019.250 2019.333 2019.417 2019.500 2019.583
## 2020 2020.000 2020.083 2020.167 2020.250 2020.333 2020.417 2020.500 2020.583
## 2021 2021.000 2021.083 2021.167 2021.250 2021.333 2021.417 2021.500 2021.583
## 2022 2022.000 2022.083 2022.167 2022.250 2022.333 2022.417 2022.500 2022.583
## 2023 2023.000 2023.083 2023.167 2023.250 2023.333 2023.417 2023.500 2023.583
## 2024 2024.000 2024.083 2024.167 2024.250 2024.333 2024.417 2024.500 2024.583
## 2025 2025.000 2025.083 2025.167 2025.250 2025.333 2025.417 2025.500 2025.583
## 2026 2026.000 2026.083 2026.167 2026.250 2026.333 2026.417 2026.500 2026.583
## 2027 2027.000 2027.083 2027.167 2027.250 2027.333 2027.417 2027.500 2027.583
## 2028 2028.000 2028.083 2028.167 2028.250 2028.333 2028.417 2028.500 2028.583
##           Sep      Oct      Nov      Dec
## 2019 2019.667 2019.750 2019.833 2019.917
## 2020 2020.667 2020.750 2020.833 2020.917
## 2021 2021.667 2021.750 2021.833 2021.917
## 2022 2022.667 2022.750 2022.833 2022.917
## 2023 2023.667 2023.750 2023.833 2023.917
## 2024 2024.667 2024.750 2024.833 2024.917
## 2025 2025.667 2025.750 2025.833 2025.917
## 2026 2026.667 2026.750 2026.833 2026.917
## 2027 2027.667 2027.750 2027.833 2027.917
## 2028 2028.667 2028.750 2028.833 2028.917
start(tseries) 
## [1] 2019    1
end(tseries)
## [1] 2028   12
frequency(tseries)
## [1] 12
window(tseries, start=c(2019, 1), end=c(2020,6))
##            Jan       Feb       Mar       Apr       May       Jun       Jul
## 2019 -8.043847  6.630623  4.421658 25.735696 12.339425 18.372602  2.364186
## 2020 21.967321 42.796810 32.467590 31.413666 26.222915 46.217125          
##            Aug       Sep       Oct       Nov       Dec
## 2019 30.290654  1.494811 12.802757 23.797723 25.925489
## 2020

ts plot

Plot the ts object tseries you create using plot.ts function. (You can also try plot or ggplot)

# plot
plot.ts(tseries, col='steelblue2', lty=1, lwd=2)

#plot(tseries, col='steelblue2', lty=1, lwd=2) # how to do in ggplot?
# ggplot
df.ts = data.frame(month=time(tseries), tseries)
ggplot(df.ts, aes(x=month, y=tseries)) + geom_line(col='steelblue2', size=1)
## Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
## Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.

Can you show both points and line on the plot?

plot(tseries, type='l', col='steelblue2', lty=1, lwd=2)