Chapter 5
library(fpp2)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
## -- Attaching packages ---------------------------------------------- fpp2 2.4 --
## v ggplot2 3.3.5 v fma 2.4
## v forecast 8.15 v expsmooth 2.3
##
daily20 = head(elecdaily,20)
daily20_ts = ts(daily20)
plot(Demand~Temperature, data = daily20_ts)
Answer: As temperature increases, the demand of electricity increases, which is why we see a positive relationship between Demand and Temperature.
daily_tslm = tslm(Demand~Temperature, data = daily20_ts)
checkresiduals(daily_tslm$residuals)
## Warning in modeldf.default(object): Could not find appropriate degrees of
## freedom for this model.
Answer: there seems to be no correlation with the residuals, so it seems the model will be fairly accurate
forecast15 = forecast(daily_tslm, newdata = data.frame(Temperature = 15))
forecast35 = forecast(daily_tslm, newdata = data.frame(Temperature = 35))
forecast15
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 21 140.5701 108.681 172.4591 90.21166 190.9285
forecast35
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 21 275.7146 245.2278 306.2014 227.5706 323.8586
Answer: Forecasts seem realistic, if the temperature is higher we would expect
a higher demand of electricity.
autoplot(daily20_ts, facets = TRUE)
daily20 %>%
as.data.frame( ) %>%
ggplot(aes(x=Temperature, y=Demand)) +
geom_point( ) +
geom_smooth(method = "lm", se = F)
## `geom_smooth()` using formula 'y ~ x'
fit = tslm(Demand~Temperature, data = daily20_ts)
checkresiduals(fit)
##
## Breusch-Godfrey test for serial correlation of order up to 5
##
## data: Residuals from Linear regression model
## LM test = 3.8079, df = 5, p-value = 0.5774
forecast(fit, newdata = data.frame(Temperature = c(15,35)))
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 21 140.5701 108.6810 172.4591 90.21166 190.9285
## 22 275.7146 245.2278 306.2014 227.57056 323.8586
plot(Demand ~ Temperature, data = elecdaily)
2. a. Plot the winning time against the year. Describe main features of the plot
autoplot(mens400, xlab = "Time", ylab = "Year")
Answer: The winning time has decreased from the late 1800s until 2016. There are some gaps in history where I assume the even did not take place, but there is a clear decrease in winning time over the century.
fit = tslm(mens400~time(mens400), data = mens400)
fit
##
## Call:
## tslm(formula = mens400 ~ time(mens400), data = mens400)
##
## Coefficients:
## (Intercept) time(mens400)
## 172.48148 -0.06457
Answer: we can see a decrease of about 0.065 seconds per year.
autoplot(mens400, xlab = "Time", ylab = "Year") +
geom_abline(slope = fit$coefficients[2], intercept = fit$coefficients[1])
c) Plot the residuals against the year. What does this indicate about the suitability of the fitted line?
checkresiduals(fit$residuals)
## Warning in modeldf.default(object): Could not find appropriate degrees of
## freedom for this model.
Answer: Residuals seem to have a fair fit. Not too many fall outside the scopes.
time = time(mens400)
forecast(fit, newdata = data.frame(time=time + 1))
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2020 50.04947 48.45492 51.64401 47.55691 52.54202
## 2024 49.79117 48.20567 51.37668 47.31275 52.26959
## 2028 49.53288 47.95586 51.10989 47.06772 51.99803
## 2032 49.27458 47.70549 50.84367 46.82182 51.72734
## 2036 49.01629 47.45456 50.57801 46.57503 51.45754
## 2040 48.75799 47.20304 50.31294 46.32734 51.18864
## 2044 48.49969 46.95095 50.04844 46.07873 50.92066
## 2048 48.24140 46.69826 49.78454 45.82920 50.65360
## 2052 47.98310 46.44497 49.52124 45.57873 50.38748
## 2056 47.72481 46.19108 49.25854 45.32732 50.12230
## 2060 47.46651 45.93657 48.99645 45.07495 49.85807
## 2064 47.20822 45.68146 48.73498 44.82163 49.59481
## 2068 46.94992 45.42573 48.47412 44.56734 49.33251
## 2072 46.69163 45.16937 48.21388 44.31208 49.07118
## 2076 46.43333 44.91239 47.95427 44.05584 48.81082
## 2080 46.17504 44.65479 47.69528 43.79863 48.55144
## 2084 45.91674 44.39656 47.43692 43.54044 48.29304
## 2088 45.65845 44.13771 47.17918 43.28127 48.03562
## 2092 45.40015 43.87823 46.92207 43.02112 47.77918
## 2096 45.14185 43.61813 46.66558 42.76000 47.52371
## 2100 44.88356 43.35740 46.40972 42.49791 47.26921
## 2104 44.62526 43.09606 46.15447 42.23485 47.01568
## 2108 44.36697 42.83410 45.89984 41.97083 46.76311
## 2112 44.10867 42.57153 45.64581 41.70585 46.51149
## 2116 43.85038 42.30836 45.39240 41.43993 46.26082
## 2120 43.59208 42.04458 45.13958 41.17308 46.01109
## 2124 43.33379 41.78022 44.88736 40.90529 45.76229
## 2128 43.07549 41.51527 44.63572 40.63659 45.51440
## 2132 42.81720 41.24973 44.38466 40.36698 45.26741
## 2136 42.55890 40.98363 44.13417 40.09648 45.02132
## 2140 42.30061 40.71697 43.88425 39.82510 44.77611
Answer: The forecasted winning time is 50.05 seconds with the 80% interval of 48.45 to 51.64 seconds. And a 95% interval of 47.56 seconds to 52.54 seconds. I assumed the trend remained the same from the previous years.
easter(ausbeer)
## Qtr1 Qtr2 Qtr3 Qtr4
## 1956 0.67 0.33 0.00 0.00
## 1957 0.00 1.00 0.00 0.00
## 1958 0.00 1.00 0.00 0.00
## 1959 1.00 0.00 0.00 0.00
## 1960 0.00 1.00 0.00 0.00
## 1961 0.33 0.67 0.00 0.00
## 1962 0.00 1.00 0.00 0.00
## 1963 0.00 1.00 0.00 0.00
## 1964 1.00 0.00 0.00 0.00
## 1965 0.00 1.00 0.00 0.00
## 1966 0.00 1.00 0.00 0.00
## 1967 1.00 0.00 0.00 0.00
## 1968 0.00 1.00 0.00 0.00
## 1969 0.00 1.00 0.00 0.00
## 1970 1.00 0.00 0.00 0.00
## 1971 0.00 1.00 0.00 0.00
## 1972 0.33 0.67 0.00 0.00
## 1973 0.00 1.00 0.00 0.00
## 1974 0.00 1.00 0.00 0.00
## 1975 1.00 0.00 0.00 0.00
## 1976 0.00 1.00 0.00 0.00
## 1977 0.00 1.00 0.00 0.00
## 1978 1.00 0.00 0.00 0.00
## 1979 0.00 1.00 0.00 0.00
## 1980 0.00 1.00 0.00 0.00
## 1981 0.00 1.00 0.00 0.00
## 1982 0.00 1.00 0.00 0.00
## 1983 0.00 1.00 0.00 0.00
## 1984 0.00 1.00 0.00 0.00
## 1985 0.00 1.00 0.00 0.00
## 1986 1.00 0.00 0.00 0.00
## 1987 0.00 1.00 0.00 0.00
## 1988 0.00 1.00 0.00 0.00
## 1989 1.00 0.00 0.00 0.00
## 1990 0.00 1.00 0.00 0.00
## 1991 1.00 0.00 0.00 0.00
## 1992 0.00 1.00 0.00 0.00
## 1993 0.00 1.00 0.00 0.00
## 1994 0.00 1.00 0.00 0.00
## 1995 0.00 1.00 0.00 0.00
## 1996 0.00 1.00 0.00 0.00
## 1997 1.00 0.00 0.00 0.00
## 1998 0.00 1.00 0.00 0.00
## 1999 0.00 1.00 0.00 0.00
## 2000 0.00 1.00 0.00 0.00
## 2001 0.00 1.00 0.00 0.00
## 2002 1.00 0.00 0.00 0.00
## 2003 0.00 1.00 0.00 0.00
## 2004 0.00 1.00 0.00 0.00
## 2005 1.00 0.00 0.00 0.00
## 2006 0.00 1.00 0.00 0.00
## 2007 0.00 1.00 0.00 0.00
## 2008 1.00 0.00 0.00 0.00
## 2009 0.00 1.00 0.00 0.00
## 2010 0.00 1.00
Answer: The easter() function tells your what quarter Easter weekend falls on. If there is a number between 0 and 1, Easter weekend falls inbetween March and April.
An elasticity coefficient is the ratio of the percentage change in the forecast variable (y) to the percentage change in the predictor variable (x). Mathematically, the elasticity is defined as (dy/dx)*(x/y). Consider the log-log model, logy=β0+β1logx+ε. Express y as a function of x and show that the coefficient β1 is the elasticity coefficient.
autoplot(fancy)
Answer: There seems to be some seasonality in the time series. It also seems that each year, there is a bigger spike in the season.
Explain why it is necessary to take logarithms of these data before fitting a model. Answer: The logarithms will smooth out the extreme increases of each year. This will make the data easier to work with and have less bias.
Use R to fit a regression model to the lagarithms of these sales data with a linear trend, seasonal dummies and a “surfing festival” dummy variable.
fit = tslm(BoxCox(fancy, 0)~season + trend)
fit
##
## Call:
## tslm(formula = BoxCox(fancy, 0) ~ season + trend)
##
## Coefficients:
## (Intercept) season2 season3 season4 season5 season6
## 7.60586 0.25104 0.69521 0.38293 0.40799 0.44696
## season7 season8 season9 season10 season11 season12
## 0.60822 0.58535 0.66634 0.74403 1.20302 1.95814
## trend
## 0.02239
autoplot(fit$residuals)
Answer: There seems to be a slight correlation of the residuals and time.
time = time(fancy)
cbind.data.frame(month = factor(month.abb[round(12*(time - floor(time))+1)],
labels = month.abb,
ordered = T),
Residuals = fit$residuals) %>%
ggplot(aes(x = month, y = Residuals)) + geom_boxplot()
## Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
Answer: There are some irregularities per month, which can indicate some bias or correlation in certain months.
fit$coefficients
## (Intercept) season2 season3 season4 season5 season6
## 7.60586042 0.25104367 0.69520658 0.38293405 0.40799437 0.44696253
## season7 season8 season9 season10 season11 season12
## 0.60821562 0.58535238 0.66634464 0.74403359 1.20301639 1.95813657
## trend
## 0.02239298
Answer: Coefficients are positive telling me there is a positive trend through time.
checkresiduals(fit)
##
## Breusch-Godfrey test for serial correlation of order up to 17
##
## data: Residuals from Linear regression model
## LM test = 37.152, df = 17, p-value = 0.003209
Answer: Assuming an alpha value of 0.05, we can say the residuals are significantly correlated.
fancy_forecast = forecast(fit, newdata = data.frame(time=time + 1))
fancy_forecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1994 9.509264 9.246995 9.771532 9.105002 9.913525
## Feb 1994 9.782700 9.520432 10.044969 9.378439 10.186962
## Mar 1994 10.249256 9.986988 10.511525 9.844995 10.653518
## Apr 1994 9.959377 9.697108 10.221645 9.555115 10.363638
## May 1994 10.006830 9.744562 10.269098 9.602569 10.411091
## Jun 1994 10.068191 9.805923 10.330459 9.663930 10.472452
## Jul 1994 10.251837 9.989569 10.514105 9.847576 10.656099
## Aug 1994 10.251367 9.989099 10.513635 9.847106 10.655628
## Sep 1994 10.354752 10.092484 10.617020 9.950491 10.759014
## Oct 1994 10.454834 10.192566 10.717102 10.050573 10.859095
## Nov 1994 10.936210 10.673942 11.198478 10.531949 11.340471
## Dec 1994 11.713723 11.451455 11.975991 11.309462 12.117984
## Jan 1995 9.777979 9.512777 10.043182 9.369195 10.186763
## Feb 1995 10.051416 9.786214 10.316618 9.642632 10.460200
## Mar 1995 10.517972 10.252770 10.783174 10.109188 10.926756
## Apr 1995 10.228092 9.962890 10.493295 9.819308 10.636876
## May 1995 10.275546 10.010343 10.540748 9.866762 10.684330
## Jun 1995 10.336907 10.071704 10.602109 9.928123 10.745691
## Jul 1995 10.520553 10.255351 10.785755 10.111769 10.929337
## Aug 1995 10.520083 10.254880 10.785285 10.111299 10.928867
## Sep 1995 10.623468 10.358266 10.888670 10.214684 11.032252
## Oct 1995 10.723550 10.458347 10.988752 10.314766 11.132334
## Nov 1995 11.204926 10.939723 11.470128 10.796142 11.613710
## Dec 1995 11.982439 11.717236 12.247641 11.573655 12.391223
## Jan 1996 10.046695 9.777950 10.315440 9.632451 10.460940
## Feb 1996 10.320132 10.051387 10.588877 9.905887 10.734376
## Mar 1996 10.786688 10.517943 11.055433 10.372443 11.200932
## Apr 1996 10.496808 10.228063 10.765553 10.082564 10.911053
## May 1996 10.544261 10.275516 10.813007 10.130017 10.958506
## Jun 1996 10.605623 10.336878 10.874368 10.191378 11.019867
## Jul 1996 10.789269 10.520524 11.058014 10.375024 11.203513
## Aug 1996 10.788798 10.520053 11.057543 10.374554 11.203043
## Sep 1996 10.892184 10.623439 11.160929 10.477939 11.306428
## Oct 1996 10.992266 10.723521 11.261011 10.578021 11.406510
## Nov 1996 11.473641 11.204896 11.742386 11.059397 11.887886
## Dec 1996 12.251155 11.982409 12.519900 11.836910 12.665399
## Jan 1997 10.315411 10.042538 10.588283 9.894804 10.736018
## Feb 1997 10.588848 10.315975 10.861720 10.168241 11.009454
## Mar 1997 11.055403 10.782531 11.328276 10.634797 11.476010
## Apr 1997 10.765524 10.492651 11.038396 10.344917 11.186131
## May 1997 10.812977 10.540105 11.085850 10.392371 11.233584
## Jun 1997 10.874338 10.601466 11.147211 10.453732 11.294945
## Jul 1997 11.057984 10.785112 11.330857 10.637378 11.478591
## Aug 1997 11.057514 10.784642 11.330387 10.636907 11.478121
## Sep 1997 11.160899 10.888027 11.433772 10.740293 11.581506
## Oct 1997 11.260981 10.988109 11.533854 10.840375 11.681588
## Nov 1997 11.742357 11.469485 12.015230 11.321750 12.162964
## Dec 1997 12.519870 12.246998 12.792743 12.099264 12.940477
## Jan 1998 10.584127 10.306568 10.861685 10.156297 11.011957
## Feb 1998 10.857563 10.580005 11.135122 10.429733 11.285393
## Mar 1998 11.324119 11.046561 11.601678 10.896289 11.751949
## Apr 1998 11.034240 10.756681 11.311798 10.606410 11.462070
## May 1998 11.081693 10.804134 11.359252 10.653863 11.509523
## Jun 1998 11.143054 10.865495 11.420613 10.715224 11.570884
## Jul 1998 11.326700 11.049142 11.604259 10.898870 11.754530
## Aug 1998 11.326230 11.048671 11.603789 10.898400 11.754060
## Sep 1998 11.429615 11.152056 11.707174 11.001785 11.857445
## Oct 1998 11.529697 11.252138 11.807256 11.101867 11.957527
## Nov 1998 12.011073 11.733514 12.288632 11.583243 12.438903
## Dec 1998 12.788586 12.511027 13.066145 12.360756 13.216416
## Jan 1999 10.852842 10.570067 11.135618 10.416971 11.288714
## Feb 1999 11.126279 10.843503 11.409055 10.690407 11.562151
## Mar 1999 11.592835 11.310059 11.875611 11.156963 12.028707
## Apr 1999 11.302955 11.020180 11.585731 10.867084 11.738827
## May 1999 11.350409 11.067633 11.633185 10.914537 11.786280
## Jun 1999 11.411770 11.128994 11.694546 10.975898 11.847642
## Jul 1999 11.595416 11.312640 11.878192 11.159544 12.031288
## Aug 1999 11.594946 11.312170 11.877722 11.159074 12.030817
## Sep 1999 11.698331 11.415555 11.981107 11.262459 12.134203
## Oct 1999 11.798413 11.515637 12.081189 11.362541 12.234285
## Nov 1999 12.279789 11.997013 12.562564 11.843917 12.715660
## Dec 1999 13.057302 12.774526 13.340078 12.621430 13.493174
## Jan 2000 11.121558 10.833063 11.410053 10.676871 11.566246
## Feb 2000 11.394995 11.106500 11.683490 10.950307 11.839682
## Mar 2000 11.861551 11.573056 12.150046 11.416863 12.306238
## Apr 2000 11.571671 11.283176 11.860166 11.126984 12.016359
## May 2000 11.619124 11.330629 11.907620 11.174437 12.063812
## Jun 2000 11.680486 11.391991 11.968981 11.235798 12.125173
## Jul 2000 11.864132 11.575637 12.152627 11.419444 12.308819
## Aug 2000 11.863661 11.575166 12.152157 11.418974 12.308349
## Sep 2000 11.967047 11.678552 12.255542 11.522359 12.411734
## Oct 2000 12.067129 11.778633 12.355624 11.622441 12.511816
## Nov 2000 12.548504 12.260009 12.837000 12.103817 12.993192
## Dec 2000 13.326018 13.037522 13.614513 12.881330 13.770705
autoplot(fancy_forecast)
j) How could you improve these predictions by modifying the model? Answer: We can test more transformations of the variables to see which one improves our results the best. We could also find different combinations of variables to use in our model.
gas = window(gasoline, end = 2005)
autoplot(gas)
gas1 = tslm(gas ~ trend + fourier(gas, K=5))
gas2 = tslm(gas ~ trend + fourier(gas, K=10))
gas3 = tslm(gas ~ trend + fourier(gas, K=15))
autoplot(gas) + autolayer(fitted(gas1)) + autolayer(fitted(gas2)) +
autolayer(fitted(gas3))
ANswer: the trend for each of the different Fourier terms are very similar, from first glance, it seems Fourier term = 15 (gas3) seems to be the best out of gas1, gas2, and gas3.
CV(gas1)
## CV AIC AICc BIC AdjR2
## 7.553646e-02 -1.873234e+03 -1.872723e+03 -1.813596e+03 8.406928e-01
CV(gas2)
## CV AIC AICc BIC AdjR2
## 7.135754e-02 -1.915014e+03 -1.913441e+03 -1.809500e+03 8.516102e-01
CV(gas3)
## CV AIC AICc BIC AdjR2
## 7.190862e-02 -1.910231e+03 -1.906988e+03 -1.758841e+03 8.525942e-01
Answer: Looks like if we have 10 Fourier terms, we have the lowest CV and AICc values.
checkresiduals(gas2$residuals)
## Warning in modeldf.default(object): Could not find appropriate degrees of
## freedom for this model.
d) Forecast using harmonic regression.
fit = forecast(gas2, newdata = data.frame(fourier(gas, 10, 52)))
fit
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2005.014 8.745595 8.402358 9.088832 8.220249 9.270941
## 2005.033 8.604075 8.260737 8.947413 8.078574 9.129575
## 2005.052 8.607195 8.263839 8.950551 8.081666 9.132724
## 2005.071 8.683215 8.339879 9.026550 8.157718 9.208712
## 2005.090 8.746645 8.403404 9.089887 8.221293 9.271998
## 2005.110 8.783003 8.439850 9.126156 8.257785 9.308221
## 2005.129 8.833517 8.490370 9.176664 8.308309 9.358725
## 2005.148 8.917498 8.574348 9.260647 8.392285 9.442710
## 2005.167 8.997861 8.654729 9.340992 8.472676 9.523046
## 2005.186 9.029128 8.686005 9.372252 8.503956 9.554301
## 2005.205 9.018507 8.675381 9.361633 8.493330 9.543684
## 2005.225 9.017572 8.674439 9.360705 8.492385 9.542760
## 2005.244 9.056052 8.712918 9.399187 8.530863 9.581242
## 2005.263 9.105523 8.762395 9.448651 8.580344 9.630703
## 2005.282 9.121048 8.777925 9.464172 8.595876 9.646221
## 2005.301 9.105879 8.762754 9.449005 8.580703 9.631055
## 2005.320 9.112513 8.769382 9.455644 8.587329 9.637697
## 2005.340 9.175080 8.831948 9.518211 8.649895 9.700265
## 2005.359 9.259000 8.915873 9.602127 8.733822 9.784178
## 2005.378 9.296291 8.953167 9.639415 8.771118 9.821464
## 2005.397 9.268208 8.925083 9.611334 8.743032 9.793385
## 2005.416 9.234527 8.891397 9.577657 8.709345 9.759710
## 2005.435 9.270246 8.927115 9.613376 8.745062 9.795429
## 2005.455 9.381554 9.038427 9.724681 8.856376 9.906731
## 2005.474 9.497887 9.154762 9.841011 8.972713 10.023060
## 2005.493 9.546877 9.203751 9.890002 9.021700 10.072053
## 2005.512 9.524722 9.181592 9.867851 8.999539 10.049904
## 2005.531 9.487091 9.143960 9.830221 8.961908 10.012274
## 2005.550 9.482387 9.139260 9.825513 8.957209 10.007564
## 2005.570 9.509124 9.165999 9.852248 8.983950 10.034297
## 2005.589 9.536544 9.193418 9.879670 9.011367 10.061720
## 2005.608 9.546848 9.203718 9.889978 9.021665 10.072030
## 2005.627 9.541950 9.198820 9.885081 9.016767 10.067133
## 2005.646 9.517407 9.174280 9.860533 8.992229 10.042584
## 2005.665 9.453438 9.110314 9.796562 8.928265 9.978612
## 2005.685 9.344405 9.001279 9.687531 8.819228 9.869581
## 2005.704 9.226892 8.883762 9.570023 8.701709 9.752076
## 2005.723 9.160315 8.817184 9.503446 8.635131 9.685499
## 2005.742 9.174078 8.830951 9.517205 8.648901 9.699255
## 2005.761 9.241612 8.898488 9.584736 8.716439 9.766785
## 2005.780 9.310432 8.967306 9.653559 8.785255 9.835609
## 2005.800 9.348790 9.005658 9.691922 8.823604 9.873976
## 2005.819 9.354217 9.011085 9.697350 8.829031 9.879404
## 2005.838 9.326591 8.983464 9.669718 8.801413 9.851768
## 2005.857 9.258087 8.914964 9.601211 8.732915 9.783260
## 2005.876 9.163285 8.820158 9.506412 8.638107 9.688463
## 2005.895 9.102531 8.759394 9.445669 8.577338 9.627725
## 2005.915 9.142445 8.799307 9.485583 8.617250 9.667640
## 2005.934 9.276104 8.932975 9.619234 8.750923 9.801286
## 2005.953 9.396214 9.053089 9.739340 8.871039 9.921390
## 2005.972 9.374939 9.031803 9.718075 8.849747 9.900131
## 2005.991 9.184286 8.841055 9.527518 8.658948 9.709624
gas_actual = window(gasoline, start = 2005, end = 2006)
autoplot(gas_actual) + autolayer(fit$mean)
Answer: There is a huge dip in the later half of 2005 that the forecast does not completely catch, but in the beginning of the year, the forecast catches majority of the trend.
plot(huron)
Answer: Some seasonality changes throughout the years. Looks like the water level has been fairly constant throughout the years.
lm1 = tslm(huron~trend)
summary(lm1)
##
## Call:
## tslm(formula = huron ~ trend)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.50997 -0.72726 0.00083 0.74402 2.53565
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.202037 0.230111 44.335 < 2e-16 ***
## trend -0.024201 0.004036 -5.996 3.55e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.13 on 96 degrees of freedom
## Multiple R-squared: 0.2725, Adjusted R-squared: 0.2649
## F-statistic: 35.95 on 1 and 96 DF, p-value: 3.545e-08
time = time(huron)
t2 = ts(pmax(0, time-1915), start = 1875)
pw1 = tslm(huron~time+t2)
summary(pw1)
##
## Call:
## tslm(formula = huron ~ time + t2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.49626 -0.66240 -0.07139 0.85163 2.39222
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 132.90870 19.97687 6.653 1.82e-09 ***
## time -0.06498 0.01051 -6.181 1.58e-08 ***
## t2 0.06486 0.01563 4.150 7.26e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.045 on 95 degrees of freedom
## Multiple R-squared: 0.3841, Adjusted R-squared: 0.3711
## F-statistic: 29.62 on 2 and 95 DF, p-value: 1.004e-10
fit1 = forecast(lm1, newdata = data.frame(time = time))
newdata = cbind(time = time[length(time)] + seq(8),
t2 = t2[length(t2)] + seq(8)) %>% as.data.frame()
fit2 = forecast(pw1, newdata = newdata)
fit1
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 1973 7.806127 6.317648 9.294605 5.516501 10.095752
## 1974 7.781926 6.292536 9.271315 5.490899 10.072952
## 1975 7.757724 6.267406 9.248043 5.465269 10.050179
## 1976 7.733523 6.242259 9.224788 5.439613 10.027434
## 1977 7.709322 6.217094 9.201550 5.413929 10.004715
## 1978 7.685121 6.191912 9.178331 5.388219 9.982024
## 1979 7.660920 6.166712 9.155128 5.362481 9.959359
## 1980 7.636719 6.141494 9.131943 5.336717 9.936721
## 1981 7.612518 6.116259 9.108776 5.310925 9.914110
## 1982 7.588317 6.091007 9.085626 5.285107 9.891526
## 1983 7.564116 6.065738 9.062494 5.259263 9.868968
## 1984 7.539914 6.040451 9.039378 5.233391 9.846438
## 1985 7.515713 6.015146 9.016280 5.207493 9.823933
## 1986 7.491512 5.989825 8.993200 5.181569 9.801456
## 1987 7.467311 5.964486 8.970136 5.155618 9.779004
## 1988 7.443110 5.939130 8.947090 5.129640 9.756580
## 1989 7.418909 5.913757 8.924061 5.103637 9.734181
## 1990 7.394708 5.888367 8.901048 5.077606 9.711809
## 1991 7.370507 5.862960 8.878053 5.051550 9.689463
## 1992 7.346306 5.837536 8.855075 5.025468 9.667144
## 1993 7.322104 5.812095 8.832114 4.999359 9.644850
## 1994 7.297903 5.786636 8.809170 4.973224 9.622582
## 1995 7.273702 5.761161 8.786243 4.947064 9.600341
## 1996 7.249501 5.735670 8.763333 4.920877 9.578125
## 1997 7.225300 5.710161 8.740439 4.894665 9.555935
## 1998 7.201099 5.684636 8.717562 4.868427 9.533771
## 1999 7.176898 5.659093 8.694702 4.842163 9.511633
## 2000 7.152697 5.633535 8.671859 4.815873 9.489520
## 2001 7.128496 5.607959 8.649032 4.789558 9.467433
## 2002 7.104294 5.582367 8.626222 4.763217 9.445372
## 2003 7.080093 5.556759 8.603428 4.736851 9.423335
## 2004 7.055892 5.531134 8.580651 4.710460 9.401325
## 2005 7.031691 5.505492 8.557890 4.684043 9.379339
## 2006 7.007490 5.479834 8.535146 4.657601 9.357379
## 2007 6.983289 5.454160 8.512418 4.631134 9.335444
## 2008 6.959088 5.428469 8.489706 4.604642 9.313534
## 2009 6.934887 5.402763 8.467011 4.578125 9.291649
## 2010 6.910686 5.377040 8.444331 4.551582 9.269789
## 2011 6.886484 5.351300 8.421668 4.525015 9.247953
## 2012 6.862283 5.325545 8.399021 4.498424 9.226143
## 2013 6.838082 5.299774 8.376391 4.471807 9.204357
## 2014 6.813881 5.273986 8.353776 4.445166 9.182596
## 2015 6.789680 5.248183 8.331177 4.418500 9.160860
## 2016 6.765479 5.222364 8.308594 4.391810 9.139148
## 2017 6.741278 5.196529 8.286027 4.365096 9.117460
## 2018 6.717077 5.170678 8.263475 4.338357 9.095797
## 2019 6.692876 5.144811 8.240940 4.311594 9.074158
## 2020 6.668674 5.118929 8.218420 4.284806 9.052543
## 2021 6.644473 5.093031 8.195916 4.257995 9.030952
## 2022 6.620272 5.067117 8.173427 4.231159 9.009385
## 2023 6.596071 5.041188 8.150954 4.204300 8.987842
## 2024 6.571870 5.015243 8.128497 4.177417 8.966323
## 2025 6.547669 4.989283 8.106055 4.150510 8.944828
## 2026 6.523468 4.963307 8.083628 4.123579 8.923356
## 2027 6.499267 4.937316 8.061217 4.096625 8.901909
## 2028 6.475066 4.911310 8.038821 4.069647 8.880484
## 2029 6.450864 4.885289 8.016440 4.042646 8.859083
## 2030 6.426663 4.859252 7.994075 4.015621 8.837706
## 2031 6.402462 4.833200 7.971724 3.988573 8.816351
## 2032 6.378261 4.807133 7.949389 3.961502 8.795020
## 2033 6.354060 4.781051 7.927069 3.934408 8.773712
## 2034 6.329859 4.754954 7.904764 3.907290 8.752427
## 2035 6.305658 4.728842 7.882473 3.880150 8.731166
## 2036 6.281457 4.702716 7.860198 3.852987 8.709926
## 2037 6.257256 4.676574 7.837937 3.825801 8.688710
## 2038 6.233054 4.650418 7.815691 3.798592 8.667517
## 2039 6.208853 4.624247 7.793460 3.771361 8.646346
## 2040 6.184652 4.598061 7.771243 3.744107 8.625197
## 2041 6.160451 4.571861 7.749041 3.716831 8.604071
## 2042 6.136250 4.545646 7.726854 3.689532 8.582968
## 2043 6.112049 4.519417 7.704681 3.662211 8.561886
## 2044 6.087848 4.493173 7.682522 3.634868 8.540827
## 2045 6.063647 4.466915 7.660378 3.607503 8.519790
## 2046 6.039446 4.440643 7.638248 3.580116 8.498775
## 2047 6.015244 4.414356 7.616133 3.552707 8.477782
## 2048 5.991043 4.388055 7.594032 3.525276 8.456811
## 2049 5.966842 4.361740 7.571944 3.497823 8.435862
## 2050 5.942641 4.335411 7.549871 3.470348 8.414934
## 2051 5.918440 4.309068 7.527812 3.442852 8.394028
## 2052 5.894239 4.282711 7.505767 3.415335 8.373143
## 2053 5.870038 4.256340 7.483736 3.387796 8.352280
## 2054 5.845837 4.229955 7.461719 3.360235 8.331438
## 2055 5.821636 4.203556 7.439715 3.332654 8.310618
## 2056 5.797434 4.177143 7.417726 3.305051 8.289818
## 2057 5.773233 4.150717 7.395750 3.277427 8.269040
## 2058 5.749032 4.124277 7.373787 3.249782 8.248282
## 2059 5.724831 4.097824 7.351839 3.222116 8.227546
## 2060 5.700630 4.071357 7.329903 3.194430 8.206830
## 2061 5.676429 4.044876 7.307982 3.166722 8.186136
## 2062 5.652228 4.018382 7.286073 3.138994 8.165461
## 2063 5.628027 3.991875 7.264179 3.111246 8.144808
## 2064 5.603826 3.965354 7.242297 3.083477 8.124174
## 2065 5.579624 3.938820 7.220429 3.055687 8.103562
## 2066 5.555423 3.912273 7.198573 3.027878 8.082969
## 2067 5.531222 3.885713 7.176731 3.000048 8.062397
## 2068 5.507021 3.859140 7.154902 2.972198 8.041845
## 2069 5.482820 3.832554 7.133086 2.944328 8.021313
## 2070 5.458619 3.805954 7.111284 2.916437 8.000800
fit2
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 1973 8.455119 7.063583 9.846655 6.314483 10.59575
## 1974 8.454992 7.061518 9.848467 6.311374 10.59861
## 1975 8.454866 7.059398 9.850333 6.308182 10.60155
## 1976 8.454739 7.057225 9.852253 6.304906 10.60457
## 1977 8.454612 7.054998 9.854227 6.301549 10.60768
## 1978 8.454486 7.052717 9.856254 6.298109 10.61086
## 1979 8.454359 7.050384 9.858334 6.294587 10.61413
## 1980 8.454232 7.047997 9.860467 6.290984 10.61748
Answer: The linear model forecasts lower water levels than the piecewise function.
Chapter 6 1. Show that a 3×5 MA is equivalent to a 7-term weighted moving average with weights of 0.067, 0.133, 0.200, 0.200, 0.200, 0.133, and 0.067. Answer: 1/15(y + 2y + 3y + 3y + 2y + y). 1/15 = 0.067
autoplot(plastics)
Answer: The plot shows an increasing seasonal trend on the sales over time.
mult = decompose(plastics, type = "multiplicative")
autoplot(mult)
c) Do the results support the graphical interpretation from part a? Answer: yes, the trend is shoing the increase we discussed and the seasonal chart is showing the seasonality trends.
adjusted = mult$x/mult$seasonal
autoplot(adjusted)
e) Change one observation to be an outlier, and recompute the seasonally adjusted data. What is the effect of the outlier?
out_plastic = plastics
out_plastic[15] = out_plastic[15]+500
out_mult = decompose(out_plastic, type = "multiplicative")
out_adjusted = out_mult$x/out_mult$seasonal
autoplot(out_adjusted)
Answer: The addition of the outlier to the data is obvious in the plot. It also takes away from the upward trend slightly.
library(seasonal)
retaildata <- readxl::read_excel("C:/Users/Jake/Documents/PredAnalytics/retail.xlsx", skip=1)
retail_ts = ts(retaildata[,"A3349627V"], frequency = 12, start = c(1982,4))
x11_test = seas(retail_ts, x11 = "")
autoplot(x11_test)
Answer: The irregular graph highlights some outliers in the data. Also, we can still see a slight increase in the data and trend.
autoplot(cangas)
ggsubseriesplot(cangas)
ggseasonplot(cangas)
Answer: Over the years, we can see a growing increase in the production of gas, which tells me there must be a demend increase. We can also see there is less of a demand for gas in the summer months, using the same logic. I find this interesting, because the summer is when people travel more, so I would think there would be a higher demand for gas. Whereas, in the winder months, we see an increase of gas production so that makes me think that people in Canada are using gas to stay warm during these times.
cangas_stl = stl(cangas, s.window = 15)
autoplot(cangas_stl)
c) Compare the results with those obtained using SEATS and X11. How are they different?
cangas_ts = ts(cangas, frequency = 12, start = c(1960,45))
x11_cangas = seas(cangas_ts, x11 = "")
autoplot(x11_cangas)
Answer: With STL, we can see clear seasonality in the 1980s, where with the X11 and SEATS techniques, we see more movement in the seasonality in the 1960s.
stl1 = stl(bricksq, s.window = 35, t.window = 15)
autoplot(stl1)
stl2 = stl(bricksq, s.window = "periodic", t.window = 15)
autoplot(stl2)
b) Compute and plot the seasonally adjusted data
season = seasadj(stl2)
autoplot(season)
c) Use a naive method to produce forecasts of the seasonally adjusted data.
autoplot(naive(season))
d) Use stlf() to reasonalise the results, giving forecast for the orginal data.
stlf_brick = stlf(bricksq, method = "naive")
forecast_stlf = forecast(stlf_brick)
forecast_stlf
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 1994 Q4 467.0649 441.7635 492.3662 428.3698 505.7599
## 1995 Q1 426.0748 390.2933 461.8563 371.3517 480.7979
## 1995 Q2 484.0357 440.2125 527.8590 417.0139 551.0576
## 1995 Q3 494.0000 443.3973 544.6027 416.6098 571.3902
## 1995 Q4 467.0649 410.4893 523.6404 380.5400 553.5897
## 1996 Q1 426.0748 364.0994 488.0502 331.2916 520.8580
## 1996 Q2 484.0357 417.0946 550.9768 381.6582 586.4133
## 1996 Q3 494.0000 422.4370 565.5630 384.5538 603.4462
checkresiduals(forecast_stlf$residuals)
## Warning in modeldf.default(object): Could not find appropriate degrees of
## freedom for this model.
Answer: Some correlation issues seen in the ACF chart at Lag=5 & 8.
stl_robust = stl(bricksq, t.window = 15, s.window = "periodic", robust = T)
stl_robust1 = seasadj(stl_robust)
stl_robust2 = naive(stl_robust1)
forecast_robust = forecast(stl_robust2)
forecast_robust
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 1994 Q4 471.673 442.1422 501.2037 426.5095 516.8364
## 1995 Q1 471.673 429.9101 513.4358 407.8022 535.5437
## 1995 Q2 471.673 420.5242 522.8218 393.4476 549.8983
## 1995 Q3 471.673 412.6114 530.7345 381.3461 561.9998
## 1995 Q4 471.673 405.6401 537.7058 370.6845 572.6614
## 1996 Q1 471.673 399.3376 544.0083 361.0456 582.3003
## 1996 Q2 471.673 393.5419 549.8040 352.1818 591.1641
## 1996 Q3 471.673 388.1473 555.1986 343.9315 599.4144
## 1996 Q4 471.673 383.0806 560.2653 336.1827 607.1632
## 1997 Q1 471.673 378.2885 565.0575 328.8537 614.4922
checkresiduals(forecast_robust$residuals)
## Warning in modeldf.default(object): Could not find appropriate degrees of
## freedom for this model.
Answer: Not much of a difference sighted when performing robust forecast.
train = window(bricksq, end = c(1992,3))
test = window(bricksq, start = c(1992,4), end = c(1994,3))
train_naive = snaive(train)
train_stlf = stlf(train)
train_fc_naive = forecast(train_naive, h=8)
train_fc_stlf = forecast(train_stlf, h = 8)
train_fc_naive
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 1992 Q4 430 366.2905 493.7095 332.5647 527.4353
## 1993 Q1 387 323.2905 450.7095 289.5647 484.4353
## 1993 Q2 413 349.2905 476.7095 315.5647 510.4353
## 1993 Q3 451 387.2905 514.7095 353.5647 548.4353
## 1993 Q4 430 339.9011 520.0989 292.2056 567.7944
## 1994 Q1 387 296.9011 477.0989 249.2056 524.7944
## 1994 Q2 413 322.9011 503.0989 275.2056 550.7944
## 1994 Q3 451 360.9011 541.0989 313.2056 588.7944
train_fc_stlf
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 1992 Q4 424.7340 398.8399 450.6280 385.1325 464.3354
## 1993 Q1 380.2198 343.5813 416.8584 324.1860 436.2537
## 1993 Q2 441.8892 396.9916 486.7867 373.2243 510.5540
## 1993 Q3 450.9971 399.1249 502.8693 371.6655 530.3287
## 1993 Q4 424.7340 366.7065 482.7614 335.9887 513.4792
## 1994 Q1 380.2198 316.6182 443.8215 282.9495 477.4902
## 1994 Q2 441.8892 373.1527 510.6256 336.7658 547.0125
## 1994 Q3 450.9971 377.4732 524.5210 338.5520 563.4422
test
## Qtr1 Qtr2 Qtr3 Qtr4
## 1992 420
## 1993 394 462 476 443
## 1994 421 472 494
Answer: When compring to the test data, the STLF technique produces better results.
autoplot(writing)
We see a lot of seasonality here with a slight increase in the trend. The best method here is naive.
writing_stlf = stlf(writing, s.window = "periodic", robust = T, method = "naive")
autoplot(writing_stlf)
8. Do the same thing for the fancy dataset.
autoplot(fancy)
There seems to be increasing seasonality, we are going to try a Box-Cox transformation and method“rwdrift”
fancy_stlf = stlf(fancy, s.window = "periodic", lambda = BoxCox.lambda(fancy),
method = "rwdrift", robust = T)
autoplot(fancy_stlf)