#Loading necessary libraries:

library(forecast)
library(tidytext)
library(stringr)
library(ggplot2)
library(readr)
library(dplyr)
library(tidyr)
library(knitr)
library(pander)

Chapter 3: Performance Evaluation

For this assignment I will be completing the assigned problems from Chapter 3 in the Practical Time Series Forecasting with R textbook by, Galit Schmueli and Kenneth C. Lichtendahl Jr.

Problem 1: Souvenir Sales

First I will import the csv file that contains the Souvenir Sales data needed for the following questions. The Souvenir Sales data file contains monthly sales for a souvenir shop at the beach resort town in Queensland, Australia, between 1995 and 2001.

#Importing the Souvenir Sales Data

Souv_Sales <- read.csv("SouvenirSales.csv", stringsAsFactors = FALSE)

str(Souv_Sales)

## 'data.frame':    84 obs. of  2 variables:
##  $ Date : chr  "Jan-95" "Feb-95" "Mar-95" "Apr-95" ...
##  $ Sales: num  1665 2398 2841 3547 3753 ...

head(Souv_Sales)

##     Date   Sales
## 1 Jan-95 1664.81
## 2 Feb-95 2397.53
## 3 Mar-95 2840.71
## 4 Apr-95 3547.29
## 5 May-95 3752.96
## 6 Jun-95 3714.74

tail(Souv_Sales)

##     Date     Sales
## 79 1-Jul  26155.15
## 80 1-Aug  28586.52
## 81 1-Sep  30505.41
## 82 1-Oct  30821.33
## 83 1-Nov  46634.38
## 84 1-Dec 104660.67

By plotting the time series we are able to get a general idea what the data looks like before moving forward in the analysis.

Figure 1. Time Series of Souvnier Sales (in thousands)

SSales <- ts(Souv_Sales$Sales/1000, start=c(1995,1), frequency=12)

plot(SSales, ylim = c(0,120), ylab = "Sales (in thousands)", xlab = "Time", main = "Time Series of Souvenir Sales", bty = "l")

pander(SSales)

Table continues below
	Jan	Feb	Mar	Apr	May	Jun	Jul	Aug
1995	1.665	2.398	2.841	3.547	3.753	3.715	4.35	3.566
1996	2.5	5.198	7.225	4.806	5.901	4.951	6.179	4.752
1997	4.717	5.703	9.958	5.305	6.492	6.631	7.35	8.177
1998	5.921	5.815	12.42	6.37	7.609	7.225	8.121	7.979
1999	4.827	6.47	9.639	8.821	8.722	10.21	11.28	12.55
2000	7.615	9.85	14.56	11.59	9.333	13.08	16.73	19.89
2001	10.24	11.27	21.83	17.36	16	18.6	26.16	28.59

	Sep	Oct	Nov	Dec
1995	5.022	6.423	7.601	19.76
1996	5.496	5.835	12.6	28.54
1997	8.573	9.691	15.15	34.06
1998	8.093	8.477	17.91	30.11
1999	11.64	13.61	21.82	45.06
2000	23.93	25.39	36.02	80.72
2001	30.51	30.82	46.63	104.7

Generally speaking, Figure 1 shows some sort of upward trend over time and possible seasonality. In order to check if seasonality does exist in the data set we will break out the sales data by month.

Figure 2. Souvenir Sales Broken Out by Month

par(oma = c(0,0,0,2))

xrange <- c(1995, 2001)
yrange <- range(SSales)

plot(xrange, yrange, type="n", xlab = "Year", ylab="Souvenir Sales (in thousands)", axes=F)

colors <- rainbow(12)
linetype <- c(1:12)
plotchar <-c(1:12)

axis(1, at=seq(1995,2001,1), labels=format(seq(1995,2001,1)))

for ( i in 1:12) {
  currentMonth <- subset(SSales, cycle(SSales)==i)
  lines(seq(1995, 1995+length(currentMonth)-1,1), currentMonth, type="b", lwd=1,
        lty=linetype[i], col=colors[i], pch=plotchar[i])
  
}
title("Souvenir Sales Broken Out by Month")
legend(2001.35, 120, 1:12, cex=0.8, col=colors, pch=plotchar, lty=linetype, title="Month", xpd=NA)

Figure 3. Souvenir Sales Broken Out by Month (ysing ggseasonplot)

ggseasonplot(SSales)

There is definitely seasonality looking at the plots in both Figure 2 and Figure 3, it is clear that December has the highest sales for the souvenir shop every year.

Using the $decompose()$ function, the estimates for the systematic noise and components of a time series can be found.

componentsOfSSales <- decompose(SSales)

componentsOfSSales

## $x
##            Jan       Feb       Mar       Apr       May       Jun       Jul
## 1995   1.66481   2.39753   2.84071   3.54729   3.75296   3.71474   4.34961
## 1996   2.49981   5.19824   7.22514   4.80603   5.90088   4.95134   6.17912
## 1997   4.71702   5.70263   9.95758   5.30478   6.49243   6.63080   7.34962
## 1998   5.92110   5.81458  12.42125   6.36977   7.60912   7.22475   8.12122
## 1999   4.82664   6.47023   9.63877   8.82117   8.72237  10.20948  11.27655
## 2000   7.61503   9.84969  14.55840  11.58733   9.33256  13.08209  16.73278
## 2001  10.24324  11.26688  21.82684  17.35733  15.99779  18.60153  26.15515
##            Aug       Sep       Oct       Nov       Dec
## 1995   3.56634   5.02182   6.42348   7.60060  19.75621
## 1996   4.75215   5.49643   5.83510  12.60008  28.54172
## 1997   8.17662   8.57317   9.69050  15.15184  34.06101
## 1998   7.97925   8.09306   8.47670  17.91466  30.11441
## 1999  12.55222  11.63739  13.60689  21.82211  45.06069
## 2000  19.88861  23.93338  25.39135  36.02480  80.72171
## 2001  28.58652  30.50541  30.82133  46.63438 104.66067
## 
## $seasonal
##             Jan        Feb        Mar        Apr        May        Jun
## 1995 -6.6501615 -5.5621051 -0.6918707 -4.6018646 -5.0742387 -4.8274476
## 1996 -6.6501615 -5.5621051 -0.6918707 -4.6018646 -5.0742387 -4.8274476
## 1997 -6.6501615 -5.5621051 -0.6918707 -4.6018646 -5.0742387 -4.8274476
## 1998 -6.6501615 -5.5621051 -0.6918707 -4.6018646 -5.0742387 -4.8274476
## 1999 -6.6501615 -5.5621051 -0.6918707 -4.6018646 -5.0742387 -4.8274476
## 2000 -6.6501615 -5.5621051 -0.6918707 -4.6018646 -5.0742387 -4.8274476
## 2001 -6.6501615 -5.5621051 -0.6918707 -4.6018646 -5.0742387 -4.8274476
##             Jul        Aug        Sep        Oct        Nov        Dec
## 1995 -2.4526359 -2.0894193 -1.3095168 -0.4258064  6.3416020 27.3434647
## 1996 -2.4526359 -2.0894193 -1.3095168 -0.4258064  6.3416020 27.3434647
## 1997 -2.4526359 -2.0894193 -1.3095168 -0.4258064  6.3416020 27.3434647
## 1998 -2.4526359 -2.0894193 -1.3095168 -0.4258064  6.3416020 27.3434647
## 1999 -2.4526359 -2.0894193 -1.3095168 -0.4258064  6.3416020 27.3434647
## 2000 -2.4526359 -2.0894193 -1.3095168 -0.4258064  6.3416020 27.3434647
## 2001 -2.4526359 -2.0894193 -1.3095168 -0.4258064  6.3416020 27.3434647
## 
## $trend
##            Jan       Feb       Mar       Apr       May       Jun       Jul
## 1995        NA        NA        NA        NA        NA        NA  5.421133
## 1996  6.517855  6.643493  6.712677  6.707937  6.891732  7.466107  7.924554
## 1997  8.566258  8.757715  9.028598  9.317437  9.584402  9.920696 10.200837
## 1998 10.729094 10.753020 10.724792 10.654212 10.718755 10.669431 10.459387
## 1999 10.913802 11.235815 11.574035 11.935474 12.312042 13.097614 13.836559
## 2000 15.392422 15.925448 16.743464 17.746816 18.829614 20.907268 22.502653
## 2001 25.224785 25.979797 26.616045 27.116128 27.784443 29.223966        NA
##            Aug       Sep       Oct       Nov       Dec
## 1995  5.572621  5.872002  6.107134  6.249078  6.390100
## 1996  8.037954  8.172822  8.307455  8.352884  8.447509
## 1997 10.255671 10.362989 10.510016 10.600920 10.672196
## 1998 10.441103 10.352485 10.338690 10.487217 10.657966
## 1999 14.093553 14.439348 14.759589 14.900270 15.045387
## 2000 22.671211 23.033112 23.576381 24.094515 24.602210
## 2001        NA        NA        NA        NA        NA
## 
## $random
##              Jan         Feb         Mar         Apr         May
## 1995          NA          NA          NA          NA          NA
## 1996   2.6321169   4.1168522   1.2043337   2.6999580   4.0833862
## 1997   2.8009240   2.5070205   1.6208524   0.5892071   1.9822662
## 1998   1.8421673   0.6236647   2.3883287   0.3174221   1.9646037
## 1999   0.5629994   0.7965205  -1.2433947   1.4875609   1.4845666
## 2000  -1.1272306  -0.5136528  -1.4931930  -1.5576212  -4.4228150
## 2001  -8.3313839  -9.1508120  -4.0973338  -5.1569337  -6.7124146
##              Jun         Jul         Aug         Sep         Oct
## 1995          NA   1.3811125   0.0831381   0.4593348   0.7421522
## 1996   2.3126805   0.7072021  -1.1963844  -1.3668748  -2.0465482
## 1997   1.5375514  -0.3985808   0.0103681  -0.4803019  -0.3937099
## 1998   1.3827668   0.1144692  -0.3724336  -0.9499082  -1.4361836
## 1999   1.9393134  -0.1073729   0.5480868  -1.4924411  -0.7268928
## 2000  -2.9977307  -3.3172370  -0.6931819   2.2097843   2.2407755
## 2001  -5.7949882          NA          NA          NA          NA
##              Nov         Dec
## 1995  -4.9900804 -13.9773547
## 1996  -2.0944058  -7.2492538
## 1997  -1.7906816  -3.9546509
## 1998   1.0858409  -7.8870209
## 1999   0.5802375   2.6718382
## 2000   5.5886825  28.7760353
## 2001          NA          NA
## 
## $figure
##  [1] -6.6501615 -5.5621051 -0.6918707 -4.6018646 -5.0742387 -4.8274476
##  [7] -2.4526359 -2.0894193 -1.3095168 -0.4258064  6.3416020 27.3434647
## 
## $type
## [1] "additive"
## 
## attr(,"class")
## [1] "decomposed.ts"

Figure 4. Decomposition of Additive Time Series

plot(componentsOfSSales)

Figure 4 shows us that the souvenir sales data has some type of upward trend, and that the seasonality seems to be additive because the spikes in the data occur at fairly equal distances apart.

Partition the data into the training and validation period as explained in the question.

Training Period for Souvenir Sales (in thousands)

validLen<- 12
TrainLeng <- length(SSales) - validLen
  
SSales_Train <- window(SSales, start =c(1995,1), end=c(1995, TrainLeng))
pander(SSales_Train)

Table continues below
	Jan	Feb	Mar	Apr	May	Jun	Jul	Aug
1995	1.665	2.398	2.841	3.547	3.753	3.715	4.35	3.566
1996	2.5	5.198	7.225	4.806	5.901	4.951	6.179	4.752
1997	4.717	5.703	9.958	5.305	6.492	6.631	7.35	8.177
1998	5.921	5.815	12.42	6.37	7.609	7.225	8.121	7.979
1999	4.827	6.47	9.639	8.821	8.722	10.21	11.28	12.55
2000	7.615	9.85	14.56	11.59	9.333	13.08	16.73	19.89

	Sep	Oct	Nov	Dec
1995	5.022	6.423	7.601	19.76
1996	5.496	5.835	12.6	28.54
1997	8.573	9.691	15.15	34.06
1998	8.093	8.477	17.91	30.11
1999	11.64	13.61	21.82	45.06
2000	23.93	25.39	36.02	80.72

Vaidation Period for Souvenir Sales (in thousands)

SSales_Valid <- window(SSales, start=c(1995,TrainLeng+1), end=c(1995, TrainLeng+validLen))
pander(SSales_Valid)

Table continues below
	Jan	Feb	Mar	Apr	May	Jun	Jul	Aug	Sep
2001	10.24	11.27	21.83	17.36	16	18.6	26.16	28.59	30.51

	Oct	Nov	Dec
2001	30.82	46.63	104.7

(a) Why was the data partitioned?

Data partitioning is done as a preliminary step to make sure to address the problem of overfitting and to use one period of the data to develop our forecasting model or method, then we use the other period to test how the model/method performs. The data in this example was partitioned into two periods where the earlier time period will be used as the training period and the later period will be used as the validation period. Because we are looking at a time series forecasting example there is no third or additional test set, as there typically is when building predictive models using cross-sectional data. The third set in a time series would be the actual future forecasts after we valid how good the model is on the validation period.

(b) Why did the analyst choose a 12-month validation period?

The analyst chose a 12-month validation period to mimic the forecast horizon, as stated in the assignment she was given, which was to forecast for 12 months (i.e. for year 2002). The validation period selected needed to be at least a year, if it were longer than 12-months than the training period would contain less recent information and the model would miss out on the most recent information available at the time of prediction.

(c) What is the naive forecast for the validation period? (assume that you must provide forecasts for 12 months ahead)

Since the data for full-year 2001 represents the validation period, we will use that to create the naive forecast for the validation period. The naive forecast will simply be the most recent value of the series, or in other words it will be the last data point in the training period. Therefore, the analysis below should show that the sales value for December 2000 (which was $80,721.71), repeated over the full naive forecast. When plotted the naive forecast will be a straight line. However, since our previous analysis showed seasonality we will show and plot the seasonal naive forecast for the validation period as well.

Naive Forecast for the Validation Period

naive_Valid <- naive(SSales_Train, h=validLen)
naive_Valid

##          Point Forecast    Lo 80     Hi 80     Lo 95    Hi 95
## Jan 2001       80.72171 67.31575  94.12767 60.219059 101.2244
## Feb 2001       80.72171 61.76282  99.68060 51.726583 109.7168
## Mar 2001       80.72171 57.50190 103.94152 45.210077 116.2333
## Apr 2001       80.72171 53.90978 107.53364 39.716409 121.7270
## May 2001       80.72171 50.74507 110.69835 34.876389 126.5670
## Jun 2001       80.72171 47.88394 113.55948 30.500677 130.9427
## Jul 2001       80.72171 45.25287 116.19055 26.476795 134.9666
## Aug 2001       80.72171 42.80392 118.63950 22.731457 138.7120
## Sep 2001       80.72171 40.50382 120.93960 19.213758 142.2297
## Oct 2001       80.72171 38.32833 123.11509 15.886636 145.5568
## Nov 2001       80.72171 36.25916 125.18426 12.722110 148.7213
## Dec 2001       80.72171 34.28209 127.16133  9.698445 151.7450

pander(naive_Valid$mean)

Table continues below
	Jan	Feb	Mar	Apr	May	Jun	Jul	Aug
2001	80.72	80.72	80.72	80.72	80.72	80.72	80.72	80.72

	Sep	Oct	Nov	Dec
2001	80.72	80.72	80.72	80.72

Seasonal Naive Forecast for the Validation Period

snaive_valid <- snaive(SSales_Train, h=validLen)
snaive_valid

##          Point Forecast      Lo 80    Hi 80      Lo 95    Hi 95
## Jan 2001        7.61503 -0.6738117 15.90387 -5.0616594 20.29172
## Feb 2001        9.84969  1.5608483 18.13853 -2.8269994 22.52638
## Mar 2001       14.55840  6.2695583 22.84724  1.8817106 27.23509
## Apr 2001       11.58733  3.2984883 19.87617 -1.0893594 24.26402
## May 2001        9.33256  1.0437183 17.62140 -3.3441294 22.00925
## Jun 2001       13.08209  4.7932483 21.37093  0.4054006 25.75878
## Jul 2001       16.73278  8.4439383 25.02162  4.0560906 29.40947
## Aug 2001       19.88861 11.5997683 28.17745  7.2119206 32.56530
## Sep 2001       23.93338 15.6445383 32.22222 11.2566906 36.61007
## Oct 2001       25.39135 17.1025083 33.68019 12.7146606 38.06804
## Nov 2001       36.02480 27.7359583 44.31364 23.3481106 48.70149
## Dec 2001       80.72171 72.4328683 89.01055 68.0450206 93.39840

pander(snaive_valid$mean)

Table continues below
	Jan	Feb	Mar	Apr	May	Jun	Jul	Aug
2001	7.615	9.85	14.56	11.59	9.333	13.08	16.73	19.89

	Sep	Oct	Nov	Dec
2001	23.93	25.39	36.02	80.72

Figure 5. Plot of Seasonal Naive Forecast for the Validation Period

plot(snaive_valid, main="Seasonal Naive Forecast for the Validation Period", xlab="Year", ylab="Souvenir Sales (in thousands)")

(d) Compute the RMSE and MAPE for the naive forecasts

Using the $accuracy()$ function from the forecast package we can look at the RSME and MAPE of the naive forecast for the validation period quite easily.

Accuracy of the Naive Forecast for the Validation Period

accuracy(naive_Valid, SSales_Valid)

##                      ME     RMSE       MAE        MPE      MAPE     MASE
## Training set   1.113477 10.46073  5.506879  -25.27554  61.16191  1.47054
## Test set     -50.500287 56.09907 54.490114 -287.13834 290.95050 14.55087
##                    ACF1 Theil's U
## Training set -0.1968879        NA
## Test set      0.3182456  6.649124

The seasonal naive forecast for the validation period RMSE is $56,099.07 in sales and the MAPE is 290.95%. These values tell us how accurate the training period was in predicting the validation period using a naive forecast. The measures seem high, but they serve as benchmark for us to now look at the accuracy of the seasonal naive forecast and at a later date different forecast models.

Accuracy of the Seasonal Naive Forecast for the Validation Period

accuracy(snaive_valid, SSales_Valid)

##                    ME     RMSE      MAE      MPE     MAPE     MASE
## Training set 3.401361 6.467818 3.744801 22.39270 25.64127 1.000000
## Test set     7.828278 9.542346 7.828278 27.27926 27.27926 2.090439
##                   ACF1 Theil's U
## Training set 0.4140974        NA
## Test set     0.2264895 0.7373759

For the seasonal naive forecast for the validation period the RMSE is $9,542.35 in sales and the MAPE is 27.28%. This means that on average our forecast is $\approx$ 27% off from the actual data points in the validation period. These measures are lower than the naive forecast and therefore, the deviation from the actual values is lower. The seasonal naive forecast has more predicting accuracy than the naive forecast, this makes sense because from the very first figure in this assignment we saw that seasonality existed.

(e) Plot the histogram of the forecast errors that result from the naive forecasts (for the validation period) Plot also a time plot for the naive forecasts and the actual sales numbers in the validation period. What can you say about the behavior of the naive forecasts?

Figure 6. Forecast Errors of the Naive Forecast on the Validation Period

NF_Hist_valid <-hist(naive_Valid$residuals/1000, breaks=12, ylim = c(0,70), ylab="Frequency", xlab="Forecast Error (in thousands)", main="", bty="l")

multiplier <- NF_Hist_valid$counts / NF_Hist_valid$density
NF_Density <- density(naive_Valid$residuals/1000, na.rm=TRUE)
NF_Density$y <- NF_Density$y * multiplier[1]

lines(NF_Density, col="red", lwd=2)

NF_Hist_valid$counts

##  [1]  1  0  2  1  1  0  1 14 39  5  3  2  1  0  0  0  1

Figure 7. Forecast Errors of the Seasonal Naive Forecast on the Validation Period

SNF_Hist_valid <-hist(snaive_valid$residuals/1000, breaks = 12, ylim=c(1,50), ylab="Frequency", xlab="Forecast Error (in thousands)", main="", bty="l")

multiplier <- SNF_Hist_valid$counts / SNF_Hist_valid$density
SNF_Density <- density(snaive_valid$residuals/1000, na.rm=TRUE)
SNF_Density$y <- SNF_Density$y * multiplier[1]

lines(SNF_Density, col="red", lwd=2)

SNF_Hist_valid$counts

## [1]  7 43  5  4  0  0  0  0  1

From Figure 7, we can see that the forecast error terms of the seasonal naive forecast for the validation period are right skewed. Of the 60 error terms we have, only 7 are negative, which means 53 are positive. This indicates that we are under predicting much more often than we are over predicting ($\approx$ 7.5 times more likely to be under predicting than over predicting).

Figure 8. Time Plot for the Naive Forecast, Seasonal Naive Forecast, and Actual Sales in the Validation Period

plot(SSales_Valid, bty="l", xaxt="n", xlab="Year: 2001", main="Naive Forecasts vs. Actual Sales", yaxt="n", ylab="Souvenir Sales (in thousands)", col=1, lty=1, lwd=2.5, ylim=c(0,125))

axis(1, at=seq(2001,2001.917,0.08333), labels=c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"))
axis(2, las=2)

# Naive forecast red line
lines(naive_Valid$mean, col=2, lty=1, lwd=2.5)

#Seasonal Naive Forecast blue
lines(snaive_valid$mean, col=3, lty=1, lwd=2.5)


# Add legend
legend(2001.3,125, c("Actual", "Naive Forecast", "Seasonal Naive Forecast"), col=(1:3), lty=1, lwd=2.5)

Figures 7 and 8 confirm that the seasonal naive forecast is under predicting sales.

Figure 9. Plotting the Residuals by Year for the Seasonal Naive Forecast

plot(snaive_valid$residuals, bty="l", ylab="Residuals", xlab="Year", main="Residuals by Year")

Many of the residuals are above 0 which provides further evidence that our seasonal naive forecast is under predicting.

Figure 10. Normality Plot for Seasonal Naive Forecast Error Terms

qqnorm(snaive_valid$residuals[13:72])
qqline(snaive_valid$residuals[13:72], col="blue", lwd=1.5)

Figure 10 shows that the error terms are not normally distributed. For the error terms to be normally distributed, the line would need to be at a 45 degree angle from the bottom lefthand corner to the upper righthand corner or all of the dots hitting the line, which is not the case.

Figure 11. Normality Plot for Validation Period Error Terms

qqnorm(SSales_Valid - snaive_valid$mean)
qqline(SSales_Valid - snaive_valid$mean, col="red", lwd=1.5)

Figure 11 shows us that the residuals for the validation period are not normally distributed either.

(f) The analyst found a forecasting model that gives satisfcatory performance on the validation set. What must she do to use the forecasting model for generating forecasts or year 2002?

The analyst must recombine the data from the training period and validation period in order to use all available data, as well as the most recent data, to forecast for 2002. Developing a forecast based only on the training period would make it more difficult to generate an accurate forecast for 2002 because the analyst would not have the most recent data to help create the forecast.

Problem 2: Forecasting Shampoo Sales

If the goal is forecasting sales in future months, which of the following steps should be taken (choose one or more)

Partition the data into training and validation periods

Yes, the analyst should partition the data into the training and validation periods. Although three years of data is not much data to go off, the analyst would want to use the first 24 months as the training period and the last 12 months as the validation period. The data partitioning allows the analyst to develop models which will be used to test for accuracy and fit.

Examine time plots of the series and of model forecasts only for the training period

No, the analyst needs to examine time plots of the series and of the model forecasts for both the training and validation periods. Then both periods should be examined against the actual data in order to see how accurate the forecast was, similar to Figure 8.

Look at MAPE and RSME values for the training period

Yes, the analyst will often do this to help determine if the chosen model is “good enough” and is viable for use. The analyst may also use the training period metrics to compare different models (e.g. seasonal naive vs. Holt’s). The analyst should be interested in these values for the validation period, as well, to examine how accurately the training period was able to predict the validation period using the MAPE and RSME as measures.

Look at MAPE and RSME values for the validation period

Yes, the analyst should look at the MAPE and RMSE values for the validation period to see how accurately the predicted validation period matched the actual data based on the training set. The MAPE allows us to have a better understanding of how the forecast deviated (on average) from the actual values. While the RSME allows us to measure the deviation in the same units as the data series. By looking at both of these measures we can gauge how accurate our forecast model is.

Compute naive forecasts

Yes, the analyst should definitely compute naive and seasonal naive forecasts. The naive forecast uses the most recent data points to help forecast a model. Although sometimes the naive forecast is not the best forecasting model it at least provides a baseline or starting point for other forecasting models to be benchmarked against in the future.

Appendix

autoplot(snaive_valid)

autoplot(SSales, series="Data") + autolayer(fitted(snaive_valid), series="Fitted") +autolayer(snaive_valid$mean, series="Seasonal Naive")

## Warning: Removed 12 rows containing missing values (geom_path).

autoplot(residuals(snaive_valid))

Marissa Valente - Assignment 2

MBA 678: Marissa Valente

February 7, 2018