Hyndman Ch5/6

R Markdown

CH5 Exercise 1

1A) As indicated in the plot above, there is a clear positive correlation between electricity demand and temperature. This can be explained by air-conditioning use: as temperature increases, more electricity is used.

## 
## Call:
## tslm(formula = Demand ~ Temperature, data = daily20)
## 
## Coefficients:
## (Intercept)  Temperature  
##      39.212        6.757

1B) Using the checkresiduals() function, we see some violations of the model assumptions in the residuals. The top residual plot does not show any significant outliers. The histogram shows the residuals are not normally distributed but slightly left-skewed. The ACF plot also shows larger autocorrelations at lags 4 and 6. However, the Breusch-Godfrey test is not significant at the 5% level (p-value = 0.5774), so the autocorrelation is not large and is unlikely to have much impact on the forecasts. The model is adequate.

## 
##  Breusch-Godfrey test for serial correlation of order up to 5
## 
## data:  Residuals from Linear regression model
## LM test = 3.8079, df = 5, p-value = 0.5774

1C) I think the forecast for 35 degrees is reasonable. The maximum temperature in the data set is 43 degrees and 32 degrees is the 3rd standard deviation, so 35 degrees is within the range of the observed data. The forecast for 15 degrees is lower than I expected, because heating should start to increase the demand for electricity at low temperatures. The lowest temperature in the data set is 19.6 degrees, so 15 degrees lies below the observed range. The forecast there is an extrapolation, which is why it is less reliable.

## `geom_smooth()` using formula 'y ~ x'

## 
##  Breusch-Godfrey test for serial correlation of order up to 5
## 
## data:  Residuals from Linear regression model
## LM test = 3.8079, df = 5, p-value = 0.5774

##          Point Forecast    Lo 80    Hi 80     Lo 95    Hi 95
## 4.285714       140.5701 108.6810 172.4591  90.21166 190.9285
## 4.428571       275.7146 245.2278 306.2014 227.57056 323.8586

1D) The prediction interval given above for 15 degrees is lower than I expected, while the interval for 35 degrees is in line with my expectation.

1E) As expected, electricity demand also increases as temperature drops below 20 degrees. This makes sense, as more electricity is needed for heating. It shows that my previous model did not capture this characteristic and is not a good predictor for low-temperature days. The data used to fit it contained only high-temperature days, which resulted in a flawed model.

Question 2

2A) As shown in the graph above, there is an inverse relationship between winning time and Olympic year. The winning time for the 400 meters has decreased, meaning speeds have increased: competitors are getting faster as time goes on. The missing data might indicate that the event was not held or that the time was not recorded.

## 
## Call:
## tslm(formula = mens400 ~ time400, data = mens400)
## 
## Coefficients:
## (Intercept)      time400  
##   172.48148     -0.06457

## `geom_smooth()` using formula 'y ~ x'

## Warning: Removed 3 rows containing non-finite values (stat_smooth).

2B) The winning time is decreasing by 0.06457 seconds per year on average.

## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

## Warning: Removed 3 rows containing non-finite values (stat_smooth).

## Warning: Removed 3 rows containing missing values (geom_point).

2C) The plot shows the residuals trending upwards, and it is unlikely that the winning time will continue to decrease at this rate for a long time. The fitted line suits the observed decrease, so it is an accurate short-term predictor but not a good long-term predictor.

##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2020       42.04231 40.44975 43.63487 39.55286 44.53176

2D) I assumed the winning time will continue to fall at the same linear rate.
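To make the linearity assumption concrete, the sketch below plugs the printed coefficients into the fitted line. It assumes `time400` is the calendar year, which is consistent with the 2020 forecast above but is not stated in the output. The variable names are illustrative only.

```r
# Coefficients copied from the tslm() output (rounded as printed).
intercept <- 172.48148
slope <- -0.06457          # seconds per year (assumes time400 = calendar year)

# Point forecast for 2020 from the fitted line.
fc_2020 <- intercept + slope * 2020

# Average change between consecutive Games, held four years apart.
per_games <- slope * 4

# Year in which the fitted line would reach a winning time of zero seconds.
zero_year <- -intercept / slope

print(c(fc_2020 = fc_2020, per_games = per_games, zero_year = zero_year))
```

The line eventually reaches zero seconds, which is impossible, so the linear trend can only be trusted over short horizons.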

Question 3

##      Qtr1 Qtr2 Qtr3 Qtr4
## 1956 0.67 0.33 0.00 0.00
## 1957 0.00 1.00 0.00 0.00
## 1958 0.00 1.00 0.00 0.00
## 1959 1.00 0.00 0.00 0.00
## 1960 0.00 1.00 0.00 0.00
## 1961 0.33 0.67 0.00 0.00
## 1962 0.00 1.00 0.00 0.00
## 1963 0.00 1.00 0.00 0.00
## 1964 1.00 0.00 0.00 0.00
## 1965 0.00 1.00 0.00 0.00
## 1966 0.00 1.00 0.00 0.00
## 1967 1.00 0.00 0.00 0.00
## 1968 0.00 1.00 0.00 0.00
## 1969 0.00 1.00 0.00 0.00
## 1970 1.00 0.00 0.00 0.00
## 1971 0.00 1.00 0.00 0.00
## 1972 0.33 0.67 0.00 0.00
## 1973 0.00 1.00 0.00 0.00
## 1974 0.00 1.00 0.00 0.00
## 1975 1.00 0.00 0.00 0.00
## 1976 0.00 1.00 0.00 0.00
## 1977 0.00 1.00 0.00 0.00
## 1978 1.00 0.00 0.00 0.00
## 1979 0.00 1.00 0.00 0.00
## 1980 0.00 1.00 0.00 0.00
## 1981 0.00 1.00 0.00 0.00
## 1982 0.00 1.00 0.00 0.00
## 1983 0.00 1.00 0.00 0.00
## 1984 0.00 1.00 0.00 0.00
## 1985 0.00 1.00 0.00 0.00
## 1986 1.00 0.00 0.00 0.00
## 1987 0.00 1.00 0.00 0.00
## 1988 0.00 1.00 0.00 0.00
## 1989 1.00 0.00 0.00 0.00
## 1990 0.00 1.00 0.00 0.00
## 1991 1.00 0.00 0.00 0.00
## 1992 0.00 1.00 0.00 0.00
## 1993 0.00 1.00 0.00 0.00
## 1994 0.00 1.00 0.00 0.00
## 1995 0.00 1.00 0.00 0.00
## 1996 0.00 1.00 0.00 0.00
## 1997 1.00 0.00 0.00 0.00
## 1998 0.00 1.00 0.00 0.00
## 1999 0.00 1.00 0.00 0.00
## 2000 0.00 1.00 0.00 0.00
## 2001 0.00 1.00 0.00 0.00
## 2002 1.00 0.00 0.00 0.00
## 2003 0.00 1.00 0.00 0.00
## 2004 0.00 1.00 0.00 0.00
## 2005 1.00 0.00 0.00 0.00
## 2006 0.00 1.00 0.00 0.00
## 2007 0.00 1.00 0.00 0.00
## 2008 1.00 0.00 0.00 0.00
## 2009 0.00 1.00 0.00 0.00
## 2010 0.00 1.00

The output is quarterly data from 1956 to 2010, ending in 2010 Q2, which gives 218 data points. The help page (?ausbeer) describes the data as quarterly beer production in Australia. The result is the Easter dummy for each quarter covered by the Australian beer data set: 1 if Easter falls in that quarter and 0 otherwise. A fractional value indicates that Easter spans two quarters.
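The fractional values can be checked with a small base R sketch. It assumes the Easter window runs from Good Friday to Easter Sunday (three days), which is consistent with the 0.67 and 0.33 entries in the table but is an assumption here. The helper names `easter_sunday` and `q1_share` are hypothetical, and the date calculation uses the anonymous Gregorian algorithm rather than any function from the forecast package.

```r
# Easter Sunday for a Gregorian year (anonymous Gregorian algorithm).
easter_sunday <- function(year) {
  a <- year %% 19
  b <- year %/% 100
  yy <- year %% 100
  d <- b %/% 4
  e <- b %% 4
  f <- (b + 8) %/% 25
  g <- (b - f + 1) %/% 3
  h <- (19 * a + b - d - g + 15) %% 30
  i <- yy %/% 4
  k <- yy %% 4
  l <- (32 + 2 * e + 2 * i - h - k) %% 7
  m <- (a + 11 * h + 22 * l) %/% 451
  month <- (h + l - 7 * m + 114) %/% 31
  day <- (h + l - 7 * m + 114) %% 31 + 1
  as.Date(sprintf("%04d-%02d-%02d", year, month, day))
}

# Share of the assumed three-day Easter window that falls in quarter 1.
q1_share <- function(year) {
  days <- easter_sunday(year) - 2:0   # Good Friday, Saturday, Easter Sunday
  mean(as.integer(format(days, "%m")) <= 3)
}

shares <- round(sapply(c(1956, 1959, 1961), q1_share), 2)
print(shares)   # 0.67 1.00 0.33, matching the Qtr1 column for those years
```

In 1956 Easter Sunday fell on 1 April, so two of the three days were in March (quarter 1) and one in April (quarter 2).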

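Question 4

$$\log(y) = \beta_0 + \beta_1 \log(x) + \varepsilon$$

$$\beta_1 \log(x) = -\beta_0 + \log(y) - \varepsilon$$

$$\beta_1 = \frac{-\beta_0 + \log(y) - \varepsilon}{\log(x)}$$

$$\beta_1 = -\frac{\beta_0}{\log(x)} + \frac{\log(y)}{\log(x)} + \varepsilon^{*}, \qquad \varepsilon^{*} = -\frac{\varepsilon}{\log(x)}$$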
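The rearrangement above isolates the coefficient algebraically. The usual way to read it as an elasticity is to differentiate the log-log model with respect to x, treating the error term as fixed. A short sketch of that step, added here as a supplement rather than as part of the original answer:

```latex
\begin{aligned}
\frac{d}{dx}\log(y) &= \frac{d}{dx}\bigl(\beta_0 + \beta_1 \log(x) + \varepsilon\bigr) \\
\frac{1}{y}\,\frac{dy}{dx} &= \frac{\beta_1}{x} \\
\beta_1 &= \frac{dy}{dx}\cdot\frac{x}{y}
\end{aligned}
```

The last line is the elasticity: the percentage change in y associated with a one percent change in x.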

Question 5

5A) As shown in the graph, there are large spikes at the end of each year around Christmas and smaller spikes in March during spring break. Tourists are more likely to travel to Australia during its summer (winter elsewhere), which boosts sales. Over time, travel has become more accessible, leading to an increase in tourists.

5B) It is necessary to log-transform the data because the spikes grow with the level of the series and are hard to compare on the original scale. The log transformation makes them comparable. The growth is also nonlinear, and the transformation helps with heteroscedasticity in the error term.

5C)

## 
## Call:
## tslm(formula = log(fancy) ~ trend + season + surf, data = fancy)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.33673 -0.12757  0.00257  0.10911  0.37671 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 7.6196670  0.0742471 102.626  < 2e-16 ***
## trend       0.0220198  0.0008268  26.634  < 2e-16 ***
## season2     0.2514168  0.0956790   2.628 0.010555 *  
## season3     0.2660828  0.1934044   1.376 0.173275    
## season4     0.3840535  0.0957075   4.013 0.000148 ***
## season5     0.4094870  0.0957325   4.277 5.88e-05 ***
## season6     0.4488283  0.0957647   4.687 1.33e-05 ***
## season7     0.6104545  0.0958039   6.372 1.71e-08 ***
## season8     0.5879644  0.0958503   6.134 4.53e-08 ***
## season9     0.6693299  0.0959037   6.979 1.36e-09 ***
## season10    0.7473919  0.0959643   7.788 4.48e-11 ***
## season11    1.2067479  0.0960319  12.566  < 2e-16 ***
## season12    1.9622412  0.0961066  20.417  < 2e-16 ***
## surf        0.5015151  0.1964273   2.553 0.012856 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.179 on 70 degrees of freedom
## Multiple R-squared:  0.9567, Adjusted R-squared:  0.9487 
## F-statistic:   119 on 13 and 70 DF,  p-value: < 2.2e-16

## Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
## Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.

5D) As shown in the plots above, the residuals seem to have a seasonal pattern. This may indicate autocorrelation.

## Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.

5E) The box plot does not show any problems.

## (Intercept)       trend     season2     season3     season4     season5 
##  7.61966701  0.02201983  0.25141682  0.26608280  0.38405351  0.40948697 
##     season6     season7     season8     season9    season10    season11 
##  0.44882828  0.61045453  0.58796443  0.66932985  0.74739195  1.20674790 
##    season12        surf 
##  1.96224123  0.50151509

5F) The coefficients show that the November and December seasons are the most important. The surfing festival also increases sales considerably. The large increase starts in season 11, and it is much faster than the increase at the beginning of the year.
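Because the model is fitted to log(fancy), each coefficient is easier to read after exponentiating. The sketch below does this for a few of the coefficients printed above. The values are copied from the output, and the variable names are illustrative.

```r
# Coefficients copied from the model output above (log scale).
coefs <- c(trend = 0.0220198, season11 = 1.2067479,
           season12 = 1.9622412, surf = 0.5015151)

# In a log-linear model, exp(coefficient) is a multiplicative effect on sales.
multiplier <- exp(coefs)

# The same effect expressed as a percentage change.
pct_change <- 100 * (multiplier - 1)

print(round(multiplier, 2))
print(round(pct_change, 1))
```

Read this way, the festival coefficient corresponds to sales roughly 1.65 times what they would otherwise be, December sales are roughly 7 times the January baseline, and the trend is a little over 2% growth per month, holding the other terms fixed.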

## 
##  Breusch-Godfrey test for serial correlation of order up to 17
## 
## data:  Residuals from Linear regression model
## LM test = 37.954, df = 17, p-value = 0.002494

5G) Looking at the Breusch-Godfrey test, the p-value (0.002494) is very low, so the null hypothesis of no serial correlation is rejected: autocorrelation remains in the residuals. Large spikes in the ACF at lags 1, 2, 3, 10, 21, 24, 26 and 28 confirm that autocorrelation is present.

5H)

## Warning in forecast.lm(fancyreg, Newthreeyr): newdata column names not
## specified, defaulting to first variable required.

##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Jan 1994       1006.002 501.0539 1510.950 227.5849 1784.419
## Feb 1994       1006.317 501.3479 1511.286 227.8675 1784.767
## Mar 1994       1006.396 501.6231 1511.168 228.2491 1784.542
## Apr 1994       1006.577 501.5658 1511.589 228.0624 1785.092
## May 1994       1006.667 501.6339 1511.699 228.1190 1785.214
## Jun 1994       1006.770 501.7159 1511.824 228.1896 1785.350
## Jul 1994       1006.995 501.9201 1512.070 228.3823 1785.608
## Aug 1994       1007.036 501.9403 1512.133 228.3910 1785.682
## Sep 1994       1007.182 502.0643 1512.299 228.5036 1785.860
## Oct 1994       1007.324 502.1850 1512.462 228.6128 1786.034
## Nov 1994       1007.847 502.6870 1513.006 229.1033 1786.590
## Dec 1994       1008.666 503.4851 1513.847 229.8900 1787.442
## Jan 1995       1006.768 501.5678 1511.967 227.9624 1785.573
## Feb 1995       1007.083 501.8618 1512.304 228.2450 1785.921
## Mar 1995       1007.161 502.1369 1512.186 228.6266 1785.696
## Apr 1995       1007.343 502.0797 1512.606 228.4399 1786.246
## May 1995       1007.432 502.1478 1512.717 228.4965 1786.368
## Jun 1995       1007.535 502.2298 1512.841 228.5670 1786.504
## Jul 1995       1007.761 502.4340 1513.088 228.7598 1786.762
## Aug 1995       1007.802 502.4542 1513.150 228.7685 1786.836
## Sep 1995       1007.947 502.5782 1513.317 228.8810 1787.014
## Oct 1995       1008.089 502.6989 1513.480 228.9903 1787.188
## Nov 1995       1008.612 503.2009 1514.024 229.4808 1787.744
## Dec 1995       1009.432 503.9990 1514.865 230.2674 1788.596
## Jan 1996       1007.533 502.0817 1512.985 228.3399 1786.727
## Feb 1996       1007.849 502.3757 1513.321 228.6225 1787.075
## Mar 1996       1007.927 502.6508 1513.203 229.0041 1786.850
## Apr 1996       1008.109 502.5936 1513.624 228.8174 1787.400
## May 1996       1008.198 502.6617 1513.734 228.8740 1787.522
## Jun 1996       1008.301 502.7437 1513.859 228.9445 1787.658
## Jul 1996       1008.527 502.9479 1514.105 229.1373 1787.916
## Aug 1996       1008.568 502.9681 1514.168 229.1460 1787.990
## Sep 1996       1008.713 503.0921 1514.334 229.2585 1788.168
## Oct 1996       1008.855 503.2128 1514.497 229.3678 1788.342
## Nov 1996       1009.378 503.7148 1515.042 229.8583 1788.898
## Dec 1996       1010.198 504.5129 1515.882 230.6449 1789.750

##              [,1]     [,2]
## Jan 1994 1510.950 1784.419
## Feb 1994 1511.286 1784.767
## Mar 1994 1511.168 1784.542
## Apr 1994 1511.589 1785.092
## May 1994 1511.699 1785.214
## Jun 1994 1511.824 1785.350
## Jul 1994 1512.070 1785.608
## Aug 1994 1512.133 1785.682
## Sep 1994 1512.299 1785.860
## Oct 1994 1512.462 1786.034
## Nov 1994 1513.006 1786.590
## Dec 1994 1513.847 1787.442
## Jan 1995 1511.967 1785.573
## Feb 1995 1512.304 1785.921
## Mar 1995 1512.186 1785.696
## Apr 1995 1512.606 1786.246
## May 1995 1512.717 1786.368
## Jun 1995 1512.841 1786.504
## Jul 1995 1513.088 1786.762
## Aug 1995 1513.150 1786.836
## Sep 1995 1513.317 1787.014
## Oct 1995 1513.480 1787.188
## Nov 1995 1514.024 1787.744
## Dec 1995 1514.865 1788.596
## Jan 1996 1512.985 1786.727
## Feb 1996 1513.321 1787.075
## Mar 1996 1513.203 1786.850
## Apr 1996 1513.624 1787.400
## May 1996 1513.734 1787.522
## Jun 1996 1513.859 1787.658
## Jul 1996 1514.105 1787.916
## Aug 1996 1514.168 1787.990
## Sep 1996 1514.334 1788.168
## Oct 1996 1514.497 1788.342
## Nov 1996 1515.042 1788.898
## Dec 1996 1515.882 1789.750

##              [,1]     [,2]
## Jan 1994 501.0539 227.5849
## Feb 1994 501.3479 227.8675
## Mar 1994 501.6231 228.2491
## Apr 1994 501.5658 228.0624
## May 1994 501.6339 228.1190
## Jun 1994 501.7159 228.1896
## Jul 1994 501.9201 228.3823
## Aug 1994 501.9403 228.3910
## Sep 1994 502.0643 228.5036
## Oct 1994 502.1850 228.6128
## Nov 1994 502.6870 229.1033
## Dec 1994 503.4851 229.8900
## Jan 1995 501.5678 227.9624
## Feb 1995 501.8618 228.2450
## Mar 1995 502.1369 228.6266
## Apr 1995 502.0797 228.4399
## May 1995 502.1478 228.4965
## Jun 1995 502.2298 228.5670
## Jul 1995 502.4340 228.7598
## Aug 1995 502.4542 228.7685
## Sep 1995 502.5782 228.8810
## Oct 1995 502.6989 228.9903
## Nov 1995 503.2009 229.4808
## Dec 1995 503.9990 230.2674
## Jan 1996 502.0817 228.3399
## Feb 1996 502.3757 228.6225
## Mar 1996 502.6508 229.0041
## Apr 1996 502.5936 228.8174
## May 1996 502.6617 228.8740
## Jun 1996 502.7437 228.9445
## Jul 1996 502.9479 229.1373
## Aug 1996 502.9681 229.1460
## Sep 1996 503.0921 229.2585
## Oct 1996 503.2128 229.3678
## Nov 1996 503.7148 229.8583
## Dec 1996 504.5129 230.6449

5I)

##          Point Forecast         Lo 80 Hi 80         Lo 95 Hi 95
## Jan 1994            Inf 4.026550e+217   Inf  6.900235e+98   Inf
## Feb 1994            Inf 5.403041e+217   Inf  9.153517e+98   Inf
## Mar 1994            Inf 7.114174e+217   Inf  1.340685e+99   Inf
## Apr 1994            Inf 6.718528e+217   Inf  1.112402e+99   Inf
## May 1994            Inf 7.191774e+217   Inf  1.177179e+99   Inf
## Jun 1994            Inf 7.806169e+217   Inf  1.263175e+99   Inf
## Jul 1994            Inf 9.575194e+217   Inf  1.531765e+99   Inf
## Aug 1994            Inf 9.770044e+217   Inf  1.545112e+99   Inf
## Sep 1994            Inf 1.105985e+218   Inf  1.729146e+99   Inf
## Oct 1994            Inf 1.247865e+218   Inf  1.928718e+99   Inf
## Nov 1994            Inf 2.061482e+218   Inf  3.149922e+99   Inf
## Dec 1994            Inf 4.579334e+218   Inf  6.917378e+99   Inf
## Jan 1995            Inf 6.731515e+217   Inf  1.006478e+99   Inf
## Feb 1995            Inf 9.032709e+217   Inf  1.335145e+99   Inf
## Mar 1995            Inf 1.189335e+218   Inf  1.955543e+99   Inf
## Apr 1995            Inf 1.123192e+218   Inf  1.622565e+99   Inf
## May 1995            Inf 1.202308e+218   Inf  1.717050e+99   Inf
## Jun 1995            Inf 1.305022e+218   Inf  1.842485e+99   Inf
## Jul 1995            Inf 1.600764e+218   Inf  2.234254e+99   Inf
## Aug 1995            Inf 1.633339e+218   Inf  2.253722e+99   Inf
## Sep 1995            Inf 1.848967e+218   Inf  2.522157e+99   Inf
## Oct 1995            Inf 2.086158e+218   Inf  2.813256e+99   Inf
## Nov 1995            Inf 3.446349e+218   Inf  4.594521e+99   Inf
## Dec 1995            Inf 7.655650e+218   Inf 1.008979e+100   Inf
## Jan 1996            Inf 1.125362e+218   Inf  1.468063e+99   Inf
## Feb 1996            Inf 1.510072e+218   Inf  1.947461e+99   Inf
## Mar 1996            Inf 1.988309e+218   Inf  2.852381e+99   Inf
## Apr 1996            Inf 1.877731e+218   Inf  2.366696e+99   Inf
## May 1996            Inf 2.009997e+218   Inf  2.504513e+99   Inf
## Jun 1996            Inf 2.181711e+218   Inf  2.687474e+99   Inf
## Jul 1996            Inf 2.676128e+218   Inf  3.258913e+99   Inf
## Aug 1996            Inf 2.730586e+218   Inf  3.287310e+99   Inf
## Sep 1996            Inf 3.091069e+218   Inf  3.678853e+99   Inf
## Oct 1996            Inf 3.487601e+218   Inf  4.103454e+99   Inf
## Nov 1996            Inf 5.761543e+218   Inf  6.701631e+99   Inf
## Dec 1996            Inf 1.279858e+219   Inf 1.471710e+100   Inf
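The Inf entries above look like numeric overflow rather than real forecasts. The sketch below rests on two assumptions that the output does not confirm: that the 5I table was produced by applying exp() to the 5H values, and that January 1994 corresponds to trend = 85 with the January baseline season and surf = 0. The coefficients are copied from the model summary.

```r
# exp() overflows to Inf in double precision once its argument passes about 709.
overflow <- exp(1006.002)          # 5H point forecast for Jan 1994
lo80_check <- exp(501.0539)        # 5H Lo 80 for Jan 1994
print(c(overflow = overflow, lo80_check = lo80_check))

# What the log-scale forecast should look like, from the printed coefficients.
intercept <- 7.6196670
trend_coef <- 0.0220198
trend_jan1994 <- 85                # assumption: 84 monthly observations precede it

log_fc <- intercept + trend_coef * trend_jan1994   # forecast on the log scale
sales_fc <- exp(log_fc)                            # back-transformed to sales
print(c(log_fc = log_fc, sales_fc = sales_fc))
```

A log-scale forecast should be near 9.5, not near 1006, which fits the warning in 5H that the new-data column names were not specified. Naming the columns of the new data to match the model's predictors would be the first thing to check.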

5J) I could try a Box-Cox transformation instead of the log transformation. The current model still shows autocorrelation, and a Box-Cox transformation may capture the exponential growth in sales more fully.

##  Time-Series [1:1355] from 1991 to 2017: 6.62 6.43 6.58 7.22 6.88 ...

##  Time-Series [1:1355] from 1991 to 2017: 6.62 6.43 6.58 7.22 6.88 ...

6A) As shown above, as the number of Fourier pairs increases, the fitted line follows the original data more closely.
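The effect of adding Fourier pairs can be illustrated in base R on simulated data. Everything in this sketch is an assumption made for illustration: the weekly period of 52, the simulated series, and the helper `fourier_terms`, which builds the sine and cosine regressors by hand instead of using the forecast package.

```r
set.seed(1)
m <- 52                       # assumed seasonal period (weekly data)
n <- 260
idx <- seq_len(n)

# Simulated series: linear trend plus two harmonics plus noise.
y <- 5 + 0.01 * idx + sin(2 * pi * idx / m) +
  0.5 * cos(4 * pi * idx / m) + rnorm(n, sd = 0.3)

# K pairs of sine/cosine regressors with period m.
fourier_terms <- function(idx, m, K) {
  cols <- lapply(seq_len(K), function(k) {
    cbind(sin(2 * pi * k * idx / m), cos(2 * pi * k * idx / m))
  })
  X <- do.call(cbind, cols)
  colnames(X) <- paste0(rep(c("S", "C"), K), rep(seq_len(K), each = 2))
  X
}

# Harmonic regression with trend, compared by AIC for K = 1..4.
aic_for <- function(K) AIC(lm(y ~ idx + fourier_terms(idx, m, K)))
aics <- sapply(1:4, aic_for)
names(aics) <- paste0("K=", 1:4)
print(round(aics, 1))
```

Each extra pair lets the fitted curve bend more, so the fit gets closer to the data. An information criterion such as AIC penalises the extra parameters, which is why the comparison in 6B is made on AIC and CV rather than on fit alone.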

## [1] 247.007
## [1] 241.1854
## [1] 197.2137
## [1] 187.0649
## [1] 145.2851
## [1] 152.2815

##            CV           AIC          AICc           BIC         AdjR2 
##  8.201921e-02 -1.813292e+03 -1.813208e+03 -1.790354e+03  8.250858e-01 
##            CV           AIC          AICc           BIC         AdjR2 
##  8.136854e-02 -1.819113e+03 -1.818957e+03 -1.787001e+03  8.269569e-01 
##            CV           AIC          AICc           BIC         AdjR2 
##  7.658846e-02 -1.863085e+03 -1.862834e+03 -1.821797e+03  8.375702e-01 
##            CV           AIC          AICc           BIC         AdjR2 
##  7.553646e-02 -1.873234e+03 -1.872723e+03 -1.813596e+03  8.406928e-01 
##            CV           AIC          AICc           BIC         AdjR2 
##  7.135754e-02 -1.915014e+03 -1.913441e+03 -1.809500e+03  8.516102e-01 
##            CV           AIC          AICc           BIC         AdjR2 
##  7.223834e-02 -1.908017e+03 -1.902469e+03 -1.710753e+03  8.540588e-01

6B) The model with K = 10 Fourier terms performs best, with the lowest AIC and CV.

6C)

## 
##  Breusch-Godfrey test for serial correlation of order up to 104
## 
## data:  Residuals from Linear regression model
## LM test = 155.45, df = 104, p-value = 0.0008135

6D)

## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.

## Warning: Removed 674 row(s) containing missing values (geom_path).

6E) The predictions are mostly in line with the actual data. The model failed to predict the large drop in 3Q2005. In general, it did a good job.

7A) Looking at the plot, there seems to be a declining trend. No seasonality is visible.

## 
## Call:
## tslm(formula = huron ~ trend, data = huron)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.50997 -0.72726  0.00083  0.74402  2.53565 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 10.202037   0.230111  44.335  < 2e-16 ***
## trend       -0.024201   0.004036  -5.996 3.55e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.13 on 96 degrees of freedom
## Multiple R-squared:  0.2725, Adjusted R-squared:  0.2649 
## F-statistic: 35.95 on 1 and 96 DF,  p-value: 3.545e-08

## 
## Call:
## tslm(formula = huron ~ laketime + t1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.49626 -0.66240 -0.07139  0.85163  2.39222 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 132.90870   19.97687   6.653 1.82e-09 ***
## laketime     -0.06498    0.01051  -6.181 1.58e-08 ***
## t1            0.06486    0.01563   4.150 7.26e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.045 on 95 degrees of freedom
## Multiple R-squared:  0.3841, Adjusted R-squared:  0.3711 
## F-statistic: 29.62 on 2 and 95 DF,  p-value: 1.004e-10

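7B) The piecewise regression model with a break at 1915 produces a better adjusted R-squared (0.3711 versus 0.2649) and a smaller residual standard error (1.045 versus 1.13). The break term is also a significant variable. However, the intercept is much larger in the piecewise model, probably because laketime is measured in calendar years rather than as a trend index starting at 1.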

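7C) The linear model shows a downward trend throughout, while the piecewise model shows a roughly flat trend after the break, where the two slope coefficients nearly cancel (-0.06498 + 0.06486 = -0.00012 per year).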
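A piecewise linear trend of this kind can be sketched in base R with a hinge term. The sketch assumes that base R's `LakeHuron` series, shifted down by 570 feet, matches the `huron` data used above, and that `t1` in the output is the hinge `pmax(0, year - 1915)`. Neither is confirmed by the output, and it uses lm() rather than tslm().

```r
# Assumption: LakeHuron (base R, 1875-1972, feet) minus 570 stands in for huron.
level <- as.numeric(LakeHuron) - 570
year <- as.numeric(time(LakeHuron))
knot <- 1915

# Hinge term: 0 before the knot, years elapsed since the knot afterwards.
hinge <- pmax(0, year - knot)

fit_linear <- lm(level ~ year)
fit_piecewise <- lm(level ~ year + hinge)

# Slope before the knot, and slope after it (the two coefficients add).
slope_before <- unname(coef(fit_piecewise)["year"])
slope_after <- slope_before + unname(coef(fit_piecewise)["hinge"])

adj_r2 <- c(linear = summary(fit_linear)$adj.r.squared,
            piecewise = summary(fit_piecewise)$adj.r.squared)

print(c(before = slope_before, after = slope_after))
print(round(adj_r2, 4))
```

Because the hinge is zero before the knot, the fitted line stays continuous at 1915 and only its slope changes.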

Chapter 6

Question 2

2A) The plot shows clear seasonality, with more sales in the summer and fewer sales at the beginning and end of the year. There is also an increasing trend.

2B)

2C) Yes, the results support my answer in part A): there is a clear increasing trend and seasonality.

2D)

2E) The outlier is not removed by the seasonal adjustment. It is still present in the seasonally adjusted data.

2F) Moving the outlier would make a difference to the trend line but little to the seasonal component, since classical multiplicative decomposition assumes the same seasonal pattern in every year.
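The behaviour described in 2E can be reproduced with base R's decompose(). This sketch uses the built-in `AirPassengers` series as a stand-in for the sales data, and the outlier position and size are arbitrary choices made for illustration.

```r
# Assumption: AirPassengers (base R) stands in for the monthly sales series.
x <- AirPassengers
outlier_pos <- 70                      # arbitrary month in the middle of the series
x[outlier_pos] <- x[outlier_pos] + 500 # add an artificial outlier

# Classical multiplicative decomposition.
dec <- decompose(x, type = "multiplicative")

# Seasonally adjusted series: data divided by the seasonal factors.
seas_adj <- x / dec$seasonal

# The seasonal factors are one set of 12 values repeated every year.
print(round(dec$figure, 2))

# The outlier is still the largest value after seasonal adjustment.
print(which.max(as.numeric(seas_adj)) == outlier_pos)
```

Dividing by a seasonal factor rescales each month but cannot absorb a one-off spike, which is why the outlier survives in the seasonally adjusted data.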

Question 3

3) The seasonal component decreases over time. Outliers are still visible at the beginning of the data set and around the 2000s.

4A) The magnitude of the seasonal component increases over time. There is a strong positive trend component, which shows that the number of workers increases over time. Large outliers exist around 1992. The second chart shows the seasonal component by month: there is a large increase in May, July and December, corresponding to the summer and Christmas holidays. The large drop in August might be due to the return to school.

4B) The recession of 1991-1992 is clearly visible from the large outliers in that period.

Question 5

5A) The plot shows a clear increasing trend and seasonality. The increase is due to higher demand and better extraction tools, which allow production to continue during cold seasons. Price also affects the seasonality of production: for example, if a colder winter or a supply shortage is forecast in July, production will increase.

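5C) Both decompositions show an increasing trend, but their seasonal components are very different. The STL seasonal component ranges from -1 to 2, while the X11 one ranges from 0.8 to 1.2. This suggests the STL component is additive (centred on 0) and the X11 component is multiplicative (centred on 1). The seasonal patterns are also quite different. The outliers differ as well: the STL remainder has more outliers in the middle of the series, while the X11 outliers are more evenly distributed.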

Question 6

6A) The trend looks very similar in the two STL models, while the seasonality differs as expected: the changing-seasonality model has a larger seasonal magnitude later in the series.

6B)

6C)

6D)

## Warning in checkresiduals(bricksqseason): The fitted degrees of freedom is based
## on the model used for the seasonally adjusted data.

## 
##  Ljung-Box test
## 
## data:  Residuals from STL +  ETS(M,N,N)
## Q* = 41.128, df = 6, p-value = 2.733e-07
## 
## Model df: 2.   Total lags used: 8

6E) The residuals do show some correlation. The ACF shows larger correlations at lags 4 and 8, and the Ljung-Box test rejects the hypothesis of uncorrelated residuals (p-value = 2.733e-07).
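The Ljung-Box test reported above can also be run directly with base R's Box.test(). The sketch below uses simulated AR(1) noise as a stand-in for the model residuals, which is an assumption made purely for illustration. The lag and `fitdf` values mirror the "Total lags used: 8" and "Model df: 2" lines in the output.

```r
set.seed(42)

# Stand-in residuals: autocorrelated AR(1) noise (not the brick data).
resid_ar <- as.numeric(arima.sim(model = list(ar = 0.6), n = 200))

# Ljung-Box test with 8 lags and 2 fitted parameters, so df = 8 - 2 = 6.
lb <- Box.test(resid_ar, lag = 8, type = "Ljung-Box", fitdf = 2)
print(lb)

# Lags whose sample autocorrelation exceeds the approximate 95% bounds.
lag_acf <- acf(resid_ar, lag.max = 8, plot = FALSE)$acf[-1]
bound <- 1.96 / sqrt(length(resid_ar))
print(which(abs(lag_acf) > bound))
```

A small p-value means the residuals still contain structure that the model has not captured, which is the situation reported in 6E.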

## Warning in checkresiduals(bricksqrobust): The fitted degrees of freedom is based
## on the model used for the seasonally adjusted data.

## 
##  Ljung-Box test
## 
## data:  Residuals from STL +  ETS(M,N,N)
## Q* = 28.163, df = 6, p-value = 8.755e-05
## 
## Model df: 2.   Total lags used: 8

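6F) Using a robust STL does not seem to make much difference. The Ljung-Box test still rejects the hypothesis of uncorrelated residuals (p-value = 8.755e-05).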

## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.

6G) The stlf() function does a better job than the snaive() method.

Question 7

Question 8


Yu Mu

11/8/2020
