Series 2

Exercise 2.1 - Match plots to correlograms

Assuming the following solutions by comparing the given pictures: A -> 4, B -> 3, C -> 2, D -> 1

Checking by code:

par(mfrow=c(2,4))
# Data set cow temperature
plot(cowtemp)
acf(cowtemp)
# Data set air passengers
plot(AirPassengers)
acf(AirPassengers)
# Data set mink trappings
plot(mink)
acf(mink)
# Data set accidential deaths in the US
plot(usdeaths)
acf(usdeaths)

True results: A -> 2, B -> 3, C -> 1, D -> 4

Exercise 2.2 - calculate the lagged scatter plot and the plug-in estimator

Failed to solve this task. Reasons: time and complexity (because this task is based on exercise 1.8 level).

# function plug-in estimator for autocorrelation

# load data
b <- beer
c <- chicken

Exercise 2.3 - AR(1) process

set.seed(22)
d <- arima.sim(list(ar = c(0.8)), n = 1000)
plot(d)

par(mfrow=c(2,1))
plot(ARMAacf(ar = c(0.8), lag.max = 100), type = "h")
acf(d, lag.max = 100)

# assumption k = 100 as before
plot(ARMAacf(ar = c(0.8), lag.max = 100), ylim = c(-1, 1))

Dependence is decreasing fast from Lag 1 to lag 20. Afterwards the dependence continues stable towards zero.

par(mfrow=c(2,1))
plot(ARMAacf(ar = c(1), lag.max = 100, pacf = TRUE))
pacf(d, lag.max = 100)

Observation: good agreement only for the first lag. Theoretical PACF just generates a value for lag 1. All other values are NAs!

Exercise 2.4 Stochastic model

d <- read_table2("kreatin.txt")
d %<>% select(-X3)
plot(d)

summary(d$gehalt)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   61.00   69.00   70.00   70.12   71.00   79.00

sd(d$gehalt)

## [1] 2.758089

hist(d$gehalt, breaks = 25)

plot(density(d$gehalt))

boxplot(d$gehalt)

Assumption: There should be constanct values with error. Meaning: It should be a stationary process. Generated plots above do not confirm the assumption completly. The following aspect is not given: All Xt are not identically distributed.

ts <- ts(d$gehalt)
acf(ts)

pacf(ts)

Insights do not confirm a stationary process. There are correlations at multiple lags (e. g. 2, 3, 4, 9). One could recognize partial autocorrelations at multiple lags as well. Conclusion: One could recognize that autocovariance dependends not only at lag h.
Next possible step: Use differencing at lag 1 and check acf as well as pacf again.

Exercise 2.5 Decomposition, ACF & PACF

d <- read.table("cbe.dat.txt")
ts <- ts(d, frequency = 12, start = c(1956,1))
ts <- ts[, "elec"]
plot(ts)

Because one can see a clear trend as well as a seasonality. We need to adress this issues first.

plot(ts_decomposition <- decompose(ts, type = "multiplicative"))

plot(ts_decomposition$seasonal[1:12], type = "l") # one saison

plot(ts_decomposition$random)

acf(ts_decomposition$random, na.action = na.pass, plot = TRUE)

One sees a clear increasing trend as well as a yearly saisonality with peaks in summer. The remainder represents structure, e. g. at lag 9 to 11. Conclusion: The residuals don’t seem to appear random. That is an issue to be solved.

Averaging window approach

plot(ts_stl <- stl(log(ts), s.window = "periodic"))

plot(ts_stl_remainder <- ts_stl$time.series[,3])

acf(ts_stl_remainder)

Same as above in b): One sees a clear increasing trend as well as a yearly saisonality with peaks in summer. The remainder represents structure, e. g. at lag 9 to 11. Conclusion: The residuals don’t seem to appear random. That is an issue to be solved.

Smoothing window approach

plot(ts_stl2 <- stl(log(ts), s.window = 12)) # assumption: 12 means a year span in this case

plot(ts_stl_remainder2 <- ts_stl2$time.series[,3])

acf(ts_stl_remainder2)

Smooting window approach worked better than the two previous approaches. The remainder seems pretty stochastic with little signs of autocorrelation at lag 8, 19 and 20. Hint: we didn’t cover residual analyis so far. Therefore we do not dive deeper regarding this topic now.

Parameter type = “multiplicative”: Because there is a seasonality and a trend at the same time and both are changing over time (e. g. their variation).
Log-transformation: Needed to transform the multiplicative to an additive model.

lag 1

ts_diff_lag1 <- diff(ts, lag = 1)
plot(ts_diff_lag1)

acf(ts_diff_lag1)

pacf(ts_diff_lag1)

Time series is not stationary after differencing with lag 1. There are obvious autocorrelations at lag 5, 10, 15 and 20. The pacf shows autocorrelations at lag 3, 5, 6, 8, 9, 10 and 11. It would be interesting to analyze the values at lag 5 and 10 in more detail. One should consider applying the log transformation as well.

ts_diff_lag12 <- diff(ts, lag = 12)
head(ts, 36)

##       Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
## 1956 1254 1290 1379 1346 1535 1555 1655 1651 1500 1538 1486 1394
## 1957 1409 1387 1543 1502 1693 1616 1841 1787 1631 1649 1586 1500
## 1958 1497 1463 1648 1595 1777 1824 1994 1835 1787 1699 1633 1645

head(ts_diff_lag12, 24)

##      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1957 155  97 164 156 158  61 186 136 131 111 100 106
## 1958  88  76 105  93  84 208 153  48 156  50  47 145

plot(ts_diff_lag12)

acf(ts_diff_lag12)

pacf(ts_diff_lag12)

Time series is not stationary after differencing with lag 12 as well. There are obvious autocorrelations at lag 1 to 9. The pacf shows autocorrelations at lag 1, 2, 10, 17 and 20. It would be interesting to analyze the values at lag 1 in more detail. One should consider applying the log transformation as well.

Exercise 2.6 AR process

ts1

d <- read.table("ts_S3_A2.dat.txt")
ts <- ts(d)
# par(mfrow=c(2,1))
ts1 <- ts[, 1]
ts2 <- ts[, 2]

cat("mean: ", mean(ts1), " ")

## mean:  51

cat("variance: ", var(ts1))

## variance:  858.5

plot(ts1)

acf(ts1)

# pacf(ts1) # see b)
Box.test(ts1, lag = 1, type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  ts1
## X-squared = 16.082, df = 1, p-value = 6.067e-05

Yes, ts1 appears to follow a stationary process. The parameters mean (equals 51) and variance don’t change over time. The autocorrelations plots that there is little to no autocorrelation. Ljung-Box test confirms the insights.

ts2

cat("mean: ", mean(ts2), " ")

## mean:  51

cat("variance: ", var(ts2))

## variance:  858.5

plot(ts2)

acf(ts2)

# pacf(ts2) # see b)
Box.test(ts2, lag = 1, type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  ts2
## X-squared = 14.822, df = 1, p-value = 0.0001182

Yes, ts2 appears to follow a stationary process. The parameters mean (equals 51) and variance don’t change over time. The autocorrelations plots that there is little to no autocorrelation. Ljung-Box test confirms the insights.

pacf(ts1)

pacf(ts2)

ar(ts1)

## 
## Call:
## ar(x = ts1)
## 
## Coefficients:
##       1        2        3  
##  0.4314  -0.1055  -0.2033  
## 
## Order selected 3  sigma^2 estimated as  688.2

ar(ts2)

## 
## Call:
## ar(x = ts2)
## 
## Coefficients:
##       1        2        3        4  
##  0.2524   0.2501  -0.0427   0.2065  
## 
## Order selected 4  sigma^2 estimated as  660

ts1: Yes, with order 3. ts2: Yes, with order 4.

Hint: the ts1 and ts2 were already created by an AR process.

Exercise 2.7 AR(3) model

set.seed(22)
d <- arima.sim(list(ar = c(0.6, -0.5, 0.4)), n = 50)
plot(d)

par(mfrow = c(2,1))
plot(ARMAacf(ar = c(0.6, -0.5, 0.4), lag.max = 100), type = "h")
acf(d, lag.max = 100)

par(mfrow = c(2,1))
plot(ARMAacf(ar = c(0.6, -0.5, 0.4), lag.max = 100, pacf = TRUE), type = "h")
pacf(d, lag.max = 100)

ACF: theoretical function is quite similar to estimated function. Significant lag 1 is the same and both functions show no significant autocorrelations after lag 1. PACF: theoretical function is different to estimated function. Lag 1 and lag 2 differ to most. They appear to be significant in the theoretical function but not in the estimated one. Otherwise the both functions are quite similar to each other.

polyroot(c(1, -1, -1, -1))

## [1]  0.543689+0.000000i -0.771845+1.115143i -0.771845-1.115143i

alpha1 seems to be a rational zero (y-value equals 0, x in this case = 0.54).

Solutions_Ramon_Schildknecht_RTP Excercise Sheet - Series 2