library(tseries)
library(tidyverse)
library(magrittr)
library(plotly)
library(timetk)
library(lubridate)
library(pracma)
library(fma)
Assumed solution, based on comparing the given plots: A -> 4, B -> 3, C -> 2, D -> 1
Checking by code:
par(mfrow=c(2,4))
# Data set cow temperature
plot(cowtemp)
acf(cowtemp)
# Data set air passengers
plot(AirPassengers)
acf(AirPassengers)
# Data set mink trappings
plot(mink)
acf(mink)
# Data set accidental deaths in the US
plot(usdeaths)
acf(usdeaths)
True results: A -> 2, B -> 3, C -> 1, D -> 4
Failed to solve this task. Reasons: time and complexity (because this task is based on exercise 1.8 level).
# function plug-in estimator for autocorrelation
# load data
b <- beer
c <- chicken
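The plug-in estimator announced in the comment above is not implemented in this notebook. A minimal sketch, assuming the usual definition rho_hat(h) = gamma_hat(h) / gamma_hat(0) with gamma_hat(h) = 1/n * sum over t = 1..n-h of (x[t+h] - mean) * (x[t] - mean), could look as follows. The function name `acf_plugin` and the use of `AirPassengers` as test data are assumptions made for illustration, not part of the original solution.

```r
# Hypothetical helper (not from the original notebook):
# plug-in estimator of the autocorrelation at lags 0..lag_max.
acf_plugin <- function(x, lag_max = 20) {
  x <- as.numeric(x)
  n <- length(x)
  x_centered <- x - mean(x)
  # gamma_hat(h) = 1/n * sum_{t=1}^{n-h} (x[t+h] - mean) * (x[t] - mean)
  gamma_hat <- sapply(0:lag_max, function(h) {
    sum(x_centered[(1 + h):n] * x_centered[1:(n - h)]) / n
  })
  gamma_hat / gamma_hat[1]  # rho_hat(h) = gamma_hat(h) / gamma_hat(0)
}

# Compare with the built-in estimator on a base R data set
rho_manual  <- acf_plugin(AirPassengers, lag_max = 20)
rho_builtin <- as.numeric(acf(AirPassengers, lag.max = 20, plot = FALSE)$acf)
all.equal(rho_manual, rho_builtin)
```

The same call should work for `b` and `c` loaded above, since the helper only needs a numeric vector or a univariate ts object.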
set.seed(22)
d <- arima.sim(list(ar = c(0.8)), n = 1000)
plot(d)
par(mfrow=c(2,1))
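# ARMAacf returns lags 0..100, so plot against 0:100 to align the x-axis with acf()
plot(0:100, ARMAacf(ar = c(0.8), lag.max = 100), type = "h", xlab = "Lag", ylab = "Theoretical ACF")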
acf(d, lag.max = 100)
# assumption k = 100 as before
plot(0:100, ARMAacf(ar = c(0.8), lag.max = 100), ylim = c(-1, 1), xlab = "Lag", ylab = "Theoretical ACF")
The dependence decreases quickly (geometrically) from lag 1 to lag 20. Beyond that, the autocorrelation stays close to zero.
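The decay can be made precise: for an AR(1) process with coefficient 0.8 the theoretical autocorrelation is rho(h) = 0.8^h. The following sketch checks this closed form against `ARMAacf`; the variable names are chosen for illustration only.

```r
# Theoretical ACF of an AR(1) process with coefficient 0.8, lags 0..100
phi <- 0.8
acf_theo <- ARMAacf(ar = phi, lag.max = 100)

# Closed form for AR(1): rho(h) = phi^h
acf_closed <- phi^(0:100)
all.equal(unname(acf_theo), acf_closed)

# Autocorrelation remaining at lag 20 (element 21, because lag 0 comes first)
acf_closed[21]  # 0.8^20, roughly 0.0115
```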
par(mfrow=c(2,1))
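# Use ar = 0.8, the coefficient of the simulated process (ar = 1 would be a unit root)
plot(ARMAacf(ar = c(0.8), lag.max = 100, pacf = TRUE), type = "h")
pacf(d, lag.max = 100)
Observation: the estimated PACF agrees with the theoretical one at lag 1. Theoretically, the PACF of an AR(1) process with coefficient 0.8 is 0.8 at lag 1 and zero at all higher lags. Note: calling ARMAacf with ar = c(1) instead returns a value for lag 1 only and NAs for all other lags, because that process has a unit root and is not stationary.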
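For reference, a small sketch of the theoretical PACF computed with the simulation coefficient 0.8 rather than 1 (assuming 0.8 is what was intended here). For an AR(1) process only lag 1 is nonzero, so the values from lag 2 onward should vanish up to rounding.

```r
# Theoretical PACF of an AR(1) process with coefficient 0.8 (lags 1..10)
pacf_theo <- ARMAacf(ar = 0.8, lag.max = 10, pacf = TRUE)
round(pacf_theo, 6)

# Lag 1 equals the AR coefficient; higher lags vanish up to rounding
all.equal(unname(pacf_theo[1]), 0.8)
max(abs(pacf_theo[-1])) < 1e-8
```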
d <- read_table2("kreatin.txt")
d %<>% select(-X3)
plot(d)
summary(d$gehalt)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   61.00   69.00   70.00   70.12   71.00   79.00
sd(d$gehalt)
## [1] 2.758089
hist(d$gehalt, breaks = 25)
plot(density(d$gehalt))
boxplot(d$gehalt)
Assumption: the values should be constant apart from measurement error, i.e. the series should be a stationary process. The plots above do not fully confirm this assumption. The following condition does not appear to hold: all Xt are identically distributed.
ts <- ts(d$gehalt)
acf(ts)
pacf(ts)
The plots do not confirm a stationary process. There are autocorrelations at several lags (e.g. 2, 3, 4, 9), and partial autocorrelations at several lags as well. Conclusion: the autocovariance appears to depend on more than just the lag h.
Next possible step: Use differencing at lag 1 and check acf as well as pacf again.
d <- read.table("cbe.dat.txt")
ts <- ts(d, frequency = 12, start = c(1956,1))
ts <- ts[, "elec"]
plot(ts)
The series shows a clear trend as well as seasonality, so we need to address these issues first.
plot(ts_decomposition <- decompose(ts, type = "multiplicative"))
plot(ts_decomposition$seasonal[1:12], type = "l") # one season (12 months)
plot(ts_decomposition$random)
acf(ts_decomposition$random, na.action = na.pass, plot = TRUE)
One sees a clear increasing trend as well as a yearly seasonality with peaks in the middle of the year (July/August). The remainder still shows structure, e.g. at lags 9 to 11. Conclusion: the residuals do not appear to be random. This issue still needs to be solved.
Averaging window approach
plot(ts_stl <- stl(log(ts), s.window = "periodic"))
plot(ts_stl_remainder <- ts_stl$time.series[,3])
acf(ts_stl_remainder)
Same as above in b): one sees a clear increasing trend as well as a yearly seasonality with peaks in the middle of the year (July/August). The remainder still shows structure, e.g. at lags 9 to 11. Conclusion: the residuals do not appear to be random. This issue still needs to be solved.
Smoothing window approach
plot(ts_stl2 <- stl(log(ts), s.window = 12)) # s.window: span (in lags) of the loess window for the seasonal component, not a number of months; ?stl says it should be odd
plot(ts_stl_remainder2 <- ts_stl2$time.series[,3])
acf(ts_stl_remainder2)
The smoothing window approach worked better than the two previous approaches. The remainder looks largely random, with only slight signs of autocorrelation at lags 8, 19 and 20. Note: residual analysis has not been covered so far, so we do not go deeper into this topic here.
Parameter type = “multiplicative”: needed because the series has both a trend and a seasonality, and their variation changes over time (the seasonal swings grow with the level of the series).
Log-transformation: needed to turn the multiplicative model into an additive one, since the logarithm of a product is the sum of the logarithms.
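The relation between the two models can be checked numerically. The sketch below uses `AirPassengers` as a stand-in series (an assumption, since the electricity data file is not bundled with R): a multiplicative decomposition reproduces the series as a product, while the additive STL components of the logged series sum to log(x).

```r
# Stand-in monthly series with growing seasonal swings (assumption:
# AirPassengers replaces the electricity data used in the notebook)
x <- AirPassengers

# Multiplicative decomposition: x = trend * seasonal * random
dec_mult   <- decompose(x, type = "multiplicative")
recon_mult <- dec_mult$trend * dec_mult$seasonal * dec_mult$random
ok <- !is.na(recon_mult)  # the moving-average trend is NA at both ends
all.equal(as.numeric(recon_mult)[ok], as.numeric(x)[ok])

# Additive decomposition of the logged series:
# log(x) = seasonal + trend + remainder
fit_log   <- stl(log(x), s.window = "periodic")
recon_log <- as.numeric(rowSums(fit_log$time.series))
all.equal(recon_log, as.numeric(log(x)))
```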
lag 1
ts_diff_lag1 <- diff(ts, lag = 1)
plot(ts_diff_lag1)
acf(ts_diff_lag1)
pacf(ts_diff_lag1)
The time series is still not stationary after differencing at lag 1. The ACF shows clear autocorrelations at lags 5, 10, 15 and 20, and the PACF shows partial autocorrelations at lags 3, 5, 6, 8, 9, 10 and 11. The values at lags 5 and 10 deserve a closer look. Applying a log transformation beforehand should also be considered.
ts_diff_lag12 <- diff(ts, lag = 12)
head(ts, 36)
##       Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
## 1956 1254 1290 1379 1346 1535 1555 1655 1651 1500 1538 1486 1394
## 1957 1409 1387 1543 1502 1693 1616 1841 1787 1631 1649 1586 1500
## 1958 1497 1463 1648 1595 1777 1824 1994 1835 1787 1699 1633 1645
head(ts_diff_lag12, 24)
##      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1957 155  97 164 156 158  61 186 136 131 111 100 106
## 1958  88  76 105  93  84 208 153  48 156  50  47 145
plot(ts_diff_lag12)
acf(ts_diff_lag12)
pacf(ts_diff_lag12)
The time series is not stationary after differencing at lag 12 either. The ACF shows clear autocorrelations at lags 1 to 9, and the PACF shows partial autocorrelations at lags 1, 2, 10, 17 and 20. The value at lag 1 deserves a closer look. Applying a log transformation beforehand should also be considered.
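The log transformation and both differencing steps can be combined. The following is only a sketch of that idea, again with `AirPassengers` as a stand-in for the electricity series (an assumption); the variable names are illustrative.

```r
x        <- AirPassengers          # stand-in for the electricity series
x_log    <- log(x)                 # stabilise the growing variance
x_d12    <- diff(x_log, lag = 12)  # remove the yearly seasonality
x_d12_d1 <- diff(x_d12, lag = 1)   # remove the remaining trend

# Seasonal plus first differencing costs 12 + 1 observations
length(x) - length(x_d12_d1)

par(mfrow = c(2, 1))
acf(x_d12_d1)
pacf(x_d12_d1)
```

Whether the result is closer to stationary still has to be judged from the ACF and PACF, as in the two attempts above.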
ts1
d <- read.table("ts_S3_A2.dat.txt")
ts <- ts(d)
# par(mfrow=c(2,1))
ts1 <- ts[, 1]
ts2 <- ts[, 2]
cat("mean: ", mean(ts1), " ")
## mean: 51
cat("variance: ", var(ts1))
## variance: 858.5
plot(ts1)
acf(ts1)
# pacf(ts1) # see b)
Box.test(ts1, lag = 1, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: ts1
## X-squared = 16.082, df = 1, p-value = 6.067e-05
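Yes, ts1 appears to follow a stationary process. The mean (51) and the variance do not change over time, and the ACF plot shows little autocorrelation overall. Note, however, that the Ljung-Box test does not confirm the absence of autocorrelation: its p-value (6.067e-05) rejects the null hypothesis of no autocorrelation at lag 1. This does not contradict stationarity, since a stationary process can be autocorrelated.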
ts2
cat("mean: ", mean(ts2), " ")
## mean: 51
cat("variance: ", var(ts2))
## variance: 858.5
plot(ts2)
acf(ts2)
# pacf(ts2) # see b)
Box.test(ts2, lag = 1, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: ts2
## X-squared = 14.822, df = 1, p-value = 0.0001182
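Yes, ts2 appears to follow a stationary process. The mean (51) and the variance do not change over time, and the ACF plot shows little autocorrelation overall. Note, however, that the Ljung-Box test does not confirm the absence of autocorrelation: its p-value (0.0001182) rejects the null hypothesis of no autocorrelation at lag 1. This does not contradict stationarity, since a stationary process can be autocorrelated.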
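The null hypothesis of the Ljung-Box test is "no autocorrelation up to the given lag", so a small p-value signals autocorrelation and says nothing about stationarity. The sketch below illustrates this on simulated data; the seed, sample size and coefficient are arbitrary choices for illustration and are not taken from the exercise.

```r
set.seed(1)
wn  <- rnorm(500)                          # white noise: no autocorrelation
ar1 <- arima.sim(list(ar = 0.6), n = 500)  # stationary, but autocorrelated

p_wn  <- Box.test(wn,  lag = 1, type = "Ljung-Box")$p.value
p_ar1 <- Box.test(ar1, lag = 1, type = "Ljung-Box")$p.value

# The stationary AR(1) series is expected to give a very small p-value:
# H0 (no autocorrelation) is rejected although the process is stationary.
c(white_noise = p_wn, ar1 = p_ar1)
```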
pacf(ts1)
pacf(ts2)
ar(ts1)
##
## Call:
## ar(x = ts1)
##
## Coefficients:
##       1        2        3
##  0.4314  -0.1055  -0.2033
##
## Order selected 3 sigma^2 estimated as 688.2
ar(ts2)
##
## Call:
## ar(x = ts2)
##
## Coefficients:
##       1        2        3        4
##  0.2524   0.2501  -0.0427   0.2065
##
## Order selected 4 sigma^2 estimated as 660
ts1: Yes, with order 3. ts2: Yes, with order 4.
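By default `ar()` fits by Yule-Walker and picks the order with the smallest AIC, which is how the orders 3 and 4 above were selected. The following sketch shows where to read this off the fitted object; the simulated series and its coefficients are assumptions for illustration, not the exercise data.

```r
set.seed(22)
# Simulated AR(3) series; coefficients chosen for illustration only
x <- arima.sim(list(ar = c(0.4, -0.1, -0.2)), n = 500)

fit <- ar(x)            # default: Yule-Walker estimation, order chosen by AIC
fit$order               # selected order
round(fit$aic[1:6], 2)  # AIC differences to the best model for orders 0..5
round(fit$ar, 3)        # estimated coefficients
```

The entry of `fit$aic` that equals 0 belongs to the selected order; orders with only slightly larger values fit almost as well, so the selected order is an estimate rather than a certainty.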
Hint: ts1 and ts2 were in fact generated by an AR process.
set.seed(22)
d <- arima.sim(list(ar = c(0.6, -0.5, 0.4)), n = 50)
plot(d)
par(mfrow = c(2,1))
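# ARMAacf returns lags 0..100, so plot against 0:100 to align the x-axis with acf()
plot(0:100, ARMAacf(ar = c(0.6, -0.5, 0.4), lag.max = 100), type = "h", xlab = "Lag", ylab = "Theoretical ACF")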
acf(d, lag.max = 100)
par(mfrow = c(2,1))
plot(ARMAacf(ar = c(0.6, -0.5, 0.4), lag.max = 100, pacf = TRUE), type = "h")
pacf(d, lag.max = 100)
ACF: the theoretical function is quite similar to the estimated one. Lag 1 is significant in both, and neither shows significant autocorrelations after lag 1. PACF: the theoretical function differs from the estimated one. Lags 1 and 2 differ the most: they are clearly nonzero in the theoretical function but not significant in the estimated one. Otherwise both functions are quite similar.
polyroot(c(1, -1, -1, -1))
## [1] 0.543689+0.000000i -0.771845+1.115143i -0.771845-1.115143i
The first root is a real zero (imaginary part equals 0, z is approximately 0.54); the other two roots form a complex-conjugate pair.
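If the polynomial 1 - z - z^2 - z^3 is read as the characteristic polynomial of an AR(3) process with all three coefficients equal to 1 (an assumption, since the exercise text is not reproduced here), stationarity would require every root to lie outside the unit circle. A short sketch of that check:

```r
roots <- polyroot(c(1, -1, -1, -1))  # roots of 1 - z - z^2 - z^3
Mod(roots)                           # absolute values of the roots

# Stationarity would require all roots to lie outside the unit circle
all(Mod(roots) > 1)
```

Under this reading the real root at about 0.54 lies inside the unit circle, so the check returns FALSE and such a process would not be stationary.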