Positive. The series follows a strong, smooth trend, so each value is close to the values just before it; there are no steep peaks or valleys in this time series.
library(forecast)  #provides Acf, Arima, tslm, and forecast
#Import the Canadian work-hours data, build an annual time series, and record its value range
CanWkHrs <- read.csv("CanadianWorkHours.csv")
CanTS <- ts(CanWkHrs$HoursPerWeek, start = c(1,1), freq=1)
yrange <- range(CanTS)
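For a visual check on the trend described above, a minimal plot sketch that reuses the yrange computed here (base R plot, matching the bty="l" style used later):
#Plot the series over its full value range
plot(CanTS, ylim = yrange, bty="l", xlab="Year index", ylab="Hours per week")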
#Compute and plot the ACF up to lag 15
CanACF <- Acf(CanTS, lag.max=15)
#Print ACF result, showing positive lag-1 autocorrelation of 0.928
CanACF
##
## Autocorrelations of series 'CanTS', by lag
##
## 0 1 2 3 4 5 6 7 8 9
## 1.000 0.928 0.839 0.752 0.665 0.571 0.473 0.369 0.265 0.164
## 10 11 12 13 14 15
## 0.047 -0.082 -0.185 -0.261 -0.310 -0.346
#ACF of the lag-1 differenced series, up to lag 12; the lag-1 autocorrelation remains positive
Acf(diff(CanTS, lag=1), lag.max=12, main="ACF plot for differenced series")
This confirms the positive autocorrelation suspected in Question 1.
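To attach a number to that claim, the lag-1 coefficient can be read off the Acf object directly; a small sketch, assuming forecast's Acf accepts plot=FALSE (index 1 of $acf holds lag 0, so lag 1 is index 2):
#Lag-1 autocorrelation of the differenced series
Acf(diff(CanTS, lag=1), lag.max=12, plot=FALSE)$acf[2]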
#Import the Wal-Mart closing prices, build a time series, and plot the lag-1 differenced series
WalMart <- read.csv("WalMartStock.csv")
WalMartTS <- ts(WalMart$Close, start = c(1,1), freq=365)
plot(diff(WalMartTS, lag=1), bty="l")
#ACF of the differenced series, up to lag 12
Acf(diff(WalMartTS, lag=1), lag.max=12, main="ACF plot for differenced series")
Relevant for testing whether the closing-price series is a random walk:
- The autocorrelations of the closing price series.
- The AR(1) slope coefficient for the closing price series.
- The AR(1) constant coefficient for the closing price series.
- The autocorrelations of the differenced series.
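#Fit an AR(1) model with a constant to the closing prices, then print it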
fit <- Arima(WalMartTS, order=c(1,0,0))
fit
## Series: WalMartTS
## ARIMA(1,0,0) with non-zero mean
##
## Coefficients:
## ar1 intercept
## 0.9558 52.9497
## s.e. 0.0187 1.3280
##
## sigma^2 estimated as 0.9815: log likelihood=-349.8
## AIC=705.59 AICc=705.69 BIC=716.13
#Two-tailed p-value (t distribution) for H0: AR(1) slope = 1, using the s.e. printed above
2*pt(-abs((1 - fit$coef["ar1"]) / 0.0187), df=length(WalMartTS)-1)
## ar1
## 0.01896261
#The same p-value using the normal distribution
2*pnorm(-abs((1-fit$coef["ar1"])/0.0187))
## ar1
## 0.01818593
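The same test can also be written without hard-coding the standard error by pulling it from the fit's coefficient covariance matrix; a sketch assuming the Arima object exposes var.coef:
#Pull the s.e. of ar1 from the covariance matrix instead of retyping it
se.ar1 <- sqrt(fit$var.coef["ar1", "ar1"])
2*pnorm(-abs((1 - fit$coef["ar1"]) / se.ar1))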
With either distribution, the p-value falls below the conventional 0.05 significance threshold, so we reject the null hypothesis that the slope equals 1: the series does not behave like a random walk.
It is impossible to obtain useful forecasts of the series: you can forecast a random walk, but the best forecast is the naive one, the most recent observed value (see the sketch below).
The changes in the series from one period to the next are random: the book notes this on page 154.
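To illustrate the naive-forecast point, the forecast package's naive() produces exactly that flat, last-value forecast; a minimal sketch:
#Naive forecast: each future point forecast is the last observed closing price
naive(WalMartTS, h=3)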
#Import the souvenir sales data and partition: the last 12 months (2001) for validation, the rest for training
Souvenir <- read.csv("SouvenirSales.csv")
SouvTS <- ts(Souvenir$Sales, start = c(1995,1), freq=12)
souvValid <- 12
souvTrain <- length(SouvTS) - souvValid
souvTrainTS <- window(SouvTS, start = c(1995,1), end = c(1995, souvTrain))
souvValidTS <- window(SouvTS, start = c(1995, souvTrain+1), end = c(1995, souvTrain+souvValid))
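A quick sanity check on the partition (a sketch using base ts helpers): training should end at Dec-2000 and validation should hold exactly souvValid months.
#Verify the split boundaries and the validation length
end(souvTrainTS)
start(souvValidTS)
length(souvValidTS) == souvValid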
#Fit a linear regression with trend and additive seasonality to log(Sales) on the training set
logSalesM <- tslm(log(souvTrainTS) ~ trend + season)
#Forecast for Feb-2002 (t = 86): intercept plus trend plus the February seasonal coefficient
feb02Forecast <- logSalesM$coef["(Intercept)"] + logSalesM$coef["trend"]*86 + logSalesM$coef["season2"]
exp(feb02Forecast)
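As a cross-check, forecast() can extend the tslm fit directly; since training ends in Dec-2000, Feb-2002 is 14 steps ahead. A minimal sketch:
#Cross-check: Feb-2002 is h = 14 months past the end of the training period
exp(forecast(logSalesM, h=14)$mean[14])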
#ACF of the regression residuals, up to lag 15
residualACF <- Acf(logSalesM$residuals, lag.max=15)
#Fit an AR(2) model to the residuals and print it
lag2Model <- Arima(logSalesM$residuals, order =c(2,0,0))
lag2Model
## Series: logSalesM$residuals
## ARIMA(2,0,0) with non-zero mean
##
## Coefficients:
## ar1 ar2 intercept
## 0.3072 0.3687 -0.0025
## s.e. 0.1090 0.1102 0.0489
##
## sigma^2 estimated as 0.0205: log likelihood=39.03
## AIC=-70.05 AICc=-69.46 BIC=-60.95
#Calculate the t-statistic for each coefficient: estimate / s.e.
#Rule of thumb from the lesson: |t| > 2 indicates statistical significance
lag2Model$coef["ar1"]/ 0.1090
## ar1
## 2.818482
lag2Model$coef["ar2"]/ 0.1102
## ar2
## 3.346186
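The same t-statistics can be computed for all coefficients at once from the model's covariance matrix, rather than retyping each standard error; a sketch assuming var.coef is available on the fit:
#All coefficient t-statistics: estimate divided by its standard error
lag2Model$coef / sqrt(diag(lag2Model$var.coef))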
Both t-statistics exceed 2, so both AR coefficients are statistically significant.
#Approximate two-tailed p-values for both coefficients, using the normal distribution
2*pnorm(-abs(lag2Model$coef["ar1"]/ 0.1090))
## ar1
## 0.004825136
2*pnorm(-abs(lag2Model$coef["ar2"]/ 0.1102))
## ar2
## 0.0008193151
The p-values support the above finding of significance, since they’re both well under 0.05.
#One-step-ahead forecast from the regression model (Jan-2001, log scale)
lrForecast <- forecast(logSalesM, h=1)
lrForecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2001 9.188097 8.91722 9.458974 8.76989 9.606304
#One-step-ahead forecast of the error from the AR(2) residual model
errorForecast <- forecast(lag2Model, h=1)
errorForecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2001 0.1078821 -0.07561892 0.2913832 -0.1727585 0.3885227
#Add the two to produce the adjusted (regression + AR error) forecast
adjustedForecast <- lrForecast$mean + errorForecast$mean
adjustedForecast
## Jan
## 2001 9.295979
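Note that this adjusted forecast is on the log scale, since the regression modeled log(Sales); exponentiating returns it to the sales scale. A minimal sketch:
#Back-transform the adjusted log-scale forecast to a sales figure
exp(adjustedForecast)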
Lag-4, since the data are quarterly: each quarter is most similar to the same quarter one year back. For example, 1986-Q1 should and does correspond reasonably well with 1985-Q1, because they are the same part of adjacent years.
#Import the appliance shipments data and build a quarterly time series
Appliance <- read.csv("ApplianceShipments.csv")
ApplianceTS <- ts(Appliance$Shipments, start = c(1985,1), end = c(1989,4), freq=4)
#Compute and plot the ACF of the quarterly series
AppACF <- Acf(ApplianceTS)
#Printed results
AppACF
##
## Autocorrelations of series 'ApplianceTS', by lag
##
## 0 1 2 3 4 5 6 7 8 9
## 1.000 0.261 -0.098 0.164 0.387 -0.030 -0.269 0.081 0.086 -0.168
## 10 11 12 13
## -0.325 -0.019 0.047 -0.096
It confirms that lag 4 has the largest autocorrelation (0.387) of any nonzero lag.
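To confirm this programmatically, one can ask which nonzero lag carries the largest autocorrelation; a sketch using the stored Acf object, where index 1 of $acf holds lag 0:
#Lag with the largest autocorrelation, after dropping the lag-0 entry
which.max(AppACF$acf[-1])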