MBA 678 Assignment 1

Problem 1

Bring in the file and look at the structure.

Travel <- read.csv("Sept11Travel.csv")
str(Travel)

## 'data.frame':    172 obs. of  4 variables:
##  $ Month: Factor w/ 172 levels "1-Apr","1-Aug",..: 86 75 119 42 130 108 97 53 163 152 ...
##  $ Air  : int  35153577 32965187 39993913 37981886 38419672 42819023 45770315 48763670 38173223 39051877 ...
##  $ Rail : int  454115779 435086002 568289732 568101697 539628385 570694457 618571581 609210368 488444939 514253920 ...
##  $ VMT  : num  163 153 178 179 189 ...

Look at the start date of the data

head(Travel)

##    Month      Air      Rail    VMT
## 1 Jan-90 35153577 454115779 163.28
## 2 Feb-90 32965187 435086002 153.25
## 3 Mar-90 39993913 568289732 178.42
## 4 Apr-90 37981886 568101697 178.68
## 5 May-90 38419672 539628385 188.88
## 6 Jun-90 42819023 570694457 189.16

Air Miles

Plot the Air miles data as a time series. We will pick August 2001 as the end date since we are looking at pre-event data.

library(forecast)

## Warning: package 'forecast' was built under R version 3.4.3

library(ggplot2)

## Warning: package 'ggplot2' was built under R version 3.4.3

library(ggfortify)

## 
## Attaching package: 'ggfortify'

## The following object is masked from 'package:forecast':
## 
##     gglagplot

Air.ts <- ts(Travel$Air, start=c(1990,1), end=c(2001,8), freq=12)
plot(Air.ts, xlab="Time", ylab="Air Travellers", bty="l")

Before evaluating the time series components, we will add a trend line. We will check the correlation of linear, quadratic, and cubic trend lines to see which fits best.

AirLinear <- tslm(Air.ts ~ trend)
summary(AirLinear)

## 
## Call:
## tslm(formula = Air.ts ~ trend)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -9466409 -3410590  -681183  3360750 11823514 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 35728435     834749   42.80   <2e-16 ***
## trend         177097      10272   17.24   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4912000 on 138 degrees of freedom
## Multiple R-squared:  0.6829, Adjusted R-squared:  0.6806 
## F-statistic: 297.2 on 1 and 138 DF,  p-value: < 2.2e-16

AirQuad <- tslm(Air.ts ~ trend + I(trend^2))
summary(AirQuad)

## 
## Call:
## tslm(formula = Air.ts ~ trend + I(trend^2))
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -10706560  -3385499   -494312   3334001  11901573 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3.745e+07  1.253e+06  29.897   <2e-16 ***
## trend       1.042e+05  4.102e+04   2.541   0.0122 *  
## I(trend^2)  5.169e+02  2.818e+02   1.834   0.0688 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4870000 on 137 degrees of freedom
## Multiple R-squared:  0.6905, Adjusted R-squared:  0.686 
## F-statistic: 152.8 on 2 and 137 DF,  p-value: < 2.2e-16

AirCube <- tslm(Air.ts ~ trend + I(trend^3))
summary(AirCube)

## 
## Call:
## tslm(formula = Air.ts ~ trend + I(trend^3))
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -10824789  -3360490   -515235   3334962  11766582 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3.705e+07  1.110e+06  33.370  < 2e-16 ***
## trend       1.351e+05  2.559e+04   5.279 4.98e-07 ***
## I(trend^3)  2.355e+00  1.315e+00   1.791   0.0754 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4873000 on 137 degrees of freedom
## Multiple R-squared:  0.6902, Adjusted R-squared:  0.6857 
## F-statistic: 152.6 on 2 and 137 DF,  p-value: < 2.2e-16

We can see from the r-squared values that the quadratic trend line is the closest approximation. Next we plot it.

plot(Air.ts, xlab="Time", ylab="Air Miles", bty="l")
lines(AirQuad$fitted, lwd=2)

Next we will adjust the scale to get a different view of the data by making the y axis logarithmic

plot(Air.ts, log="y", xlab="Time", ylab="Air Miles (log scale)", bty="l")

The log scale doesn’t change the pattern much. We will now try suppressing seasonality by grouping first by quarter then by year.

quarterly <- aggregate(Air.ts, nfrequency=4, FUN=sum)
plot(quarterly, bty="l")

The quarterly data looks very similar but a bit smoother.

yearly <- aggregate(Air.ts, nfrequency=1, FUN=sum)
plot(yearly, bty="l")

Here we can see that the seasonal adjustments are gone and the resulting graphs shows that the trend is upward exponential. One final plot will be a seasonal plot of the data for each year.

ggseasonplot(Air.ts)

This confirms the seasonality and that it’s additive.

(a) For the Air data, the time series components include level, noise, and thanks to the trend line we see that the trend is upward exponential. The fact that the seasonality seems be constant between different periods would mean that it is additive.

(b) After using the trend lines and suppressing the seasonality, we can see that the trend is upward exponential.

Rail Miles

First we plot the time series.

Rail.ts <- ts(Travel$Rail, start=c(1990,1), end=c(2001,8), freq=12)
plot(Rail.ts, xlab="Time", ylab="Rail Travellers", bty="l")

We will try fitting linear, quadratic, and cubic trend lines to see which has the strongest correlation.

RailLinear <- tslm(Rail.ts ~ trend)
summary(RailLinear)

## 
## Call:
## tslm(formula = Rail.ts ~ trend)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -142225897  -40750574   -4370229   41272192  133135874 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 547632872   10511773  52.097  < 2e-16 ***
## trend         -837744     129357  -6.476 1.53e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 61860000 on 138 degrees of freedom
## Multiple R-squared:  0.2331, Adjusted R-squared:  0.2275 
## F-statistic: 41.94 on 1 and 138 DF,  p-value: 1.532e-09

RailQuad <- tslm(Rail.ts ~ trend + I(trend^2))
summary(RailQuad)

## 
## Call:
## tslm(formula = Rail.ts ~ trend + I(trend^2))
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -137634923  -37327572   -1559409   41652351  125112936 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 576828666   15618390   36.93  < 2e-16 ***
## trend        -2071369     511394   -4.05 8.52e-05 ***
## I(trend^2)       8749       3513    2.49    0.014 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 60720000 on 137 degrees of freedom
## Multiple R-squared:  0.2663, Adjusted R-squared:  0.2556 
## F-statistic: 24.86 on 2 and 137 DF,  p-value: 6.14e-10

RailCube <- tslm(Rail.ts ~ trend + I(trend^3))
summary(RailCube)

## 
## Call:
## tslm(formula = Rail.ts ~ trend + I(trend^3))
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -137779487  -36088332   -2979367   42425943  122227602 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.764e+08  1.366e+07  42.201  < 2e-16 ***
## trend       -1.749e+06  3.147e+05  -5.559 1.37e-07 ***
## I(trend^3)   5.107e+01  1.617e+01   3.158  0.00195 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 59940000 on 137 degrees of freedom
## Multiple R-squared:  0.2851, Adjusted R-squared:  0.2747 
## F-statistic: 27.32 on 2 and 137 DF,  p-value: 1.035e-10

Looking at the data, the cubic trend appeared to be closest, and the r-squared value confirms this. This will be added to the plot.

plot(Rail.ts, xlab="Time", ylab="Rail Miles", bty="l")
lines(RailCube$fitted, lwd=2)

We will change the scale to be logarithmic to see how that effects the trend.

plot(Rail.ts, log="y", xlab="Time", ylab="Rail Miles (log scale)", bty="l")

Here we see no real change in the trend. Next we will group by quarter to remove some of the seasonality.

quarterly <- aggregate(Rail.ts, nfrequency=4, FUN=sum)
plot(quarterly, bty="l")

This plot looks closer to the cubic trend line. Lastly we will add a seasonal plot.

ggseasonplot(Rail.ts)

We can see the seasonality and the additive nature of the pattern.

(a)For this time series, there is level, noise, and third level polynomial trend. For seasonality, it looks like the values in each season stay constant, making the seasonality additive.

(b) After analyzing the trend data, we see this is a third level polynomial trend.

Vehicle Miles

Plot the time series

VMT.ts <- ts(Travel$VMT, start=c(1990,1), end=c(2001,8), freq=12)
plot(VMT.ts, xlab="Time", ylab="Auto Miles", bty="l")

This looks pretty upward linear as a trend, but we will compare linear, quadratic, and cubic trend lines to see which has the strongest correlation.

VMTLinear <- tslm(VMT.ts ~ trend)
summary(VMTLinear)

## 
## Call:
## tslm(formula = VMT.ts ~ trend)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -32.554  -9.185   0.559  11.087  23.272 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 173.17137    2.42094   71.53   <2e-16 ***
## trend         0.44249    0.02979   14.85   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.25 on 138 degrees of freedom
## Multiple R-squared:  0.6152, Adjusted R-squared:  0.6124 
## F-statistic: 220.6 on 1 and 138 DF,  p-value: < 2.2e-16

VMTQuad <- tslm(VMT.ts ~ trend + I(trend^2))
summary(VMTQuad)

## 
## Call:
## tslm(formula = VMT.ts ~ trend + I(trend^2))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -33.503  -8.745   0.798  10.974  23.752 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.745e+02  3.674e+00  47.487  < 2e-16 ***
## trend       3.867e-01  1.203e-01   3.214  0.00163 ** 
## I(trend^2)  3.956e-04  8.266e-04   0.479  0.63297    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.29 on 137 degrees of freedom
## Multiple R-squared:  0.6158, Adjusted R-squared:  0.6102 
## F-statistic: 109.8 on 2 and 137 DF,  p-value: < 2.2e-16

VMTCube <- tslm(VMT.ts ~ trend + I(trend^3))
summary(VMTCube)

## 
## Call:
## tslm(formula = VMT.ts ~ trend + I(trend^3))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -33.632  -8.783   0.801  10.878  23.848 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.742e+02  3.255e+00  53.523  < 2e-16 ***
## trend       4.091e-01  7.500e-02   5.455 2.22e-07 ***
## I(trend^3)  1.869e-06  3.854e-06   0.485    0.629    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.29 on 137 degrees of freedom
## Multiple R-squared:  0.6158, Adjusted R-squared:  0.6102 
## F-statistic: 109.8 on 2 and 137 DF,  p-value: < 2.2e-16

The correlations are close, but the linear trend line has the best r-squared value. We’ll add the trend line to the plot.

plot(VMT.ts, xlab="Time", ylab="Auto Miles", bty="l")
lines(VMTLinear$fitted, lwd=2)

(a) Based on this plot, besides level and noise, the trend is upward linear and given that the values seem uniform between the seasons, it is additive seasonality.

We will change the y values to logarithmic to see how it changes the trend.

plot(VMT.ts, log="y", xlab="Time", ylab="Auto Miles (log scale)", bty="l")

The trend for auto miles appears to be the same. Next to see the seasonality we will group the data by quarter.

quarterly <- aggregate(VMT.ts, nfrequency=4, FUN=sum)
plot(quarterly, bty="l")

yearly <- aggregate(VMT.ts, nfrequency=1, FUN=sum)
plot(yearly, bty="l")

Removing the seasonality further points to the trend being upward linear. We will add a seasonal plot to confirm the pattern.

ggseasonplot(VMT.ts)

This plot confirms the additive nature of the seasonality.

(a)Based on this plot, besides level and noise, the trend is upward linear and given that the values seem uniform between the seasons, it is additive seasonality.

(b) Our work on supressing seasonality and using trend lines shows that the trend is upward linear.

Problem 3

(a)

Bring in the Appliance data and look at the structure.

AppShip <- read.csv("ApplianceShipments.csv")
str(AppShip)

## 'data.frame':    20 obs. of  2 variables:
##  $ Quarter  : Factor w/ 20 levels "Q1-1985","Q1-1986",..: 1 6 11 16 2 7 12 17 3 8 ...
##  $ Shipments: int  4009 4321 4224 3944 4123 4522 4657 4030 4493 4806 ...

We will look at the beginning and end of the data to get the range for our time series.

head(AppShip)

##   Quarter Shipments
## 1 Q1-1985      4009
## 2 Q2-1985      4321
## 3 Q3-1985      4224
## 4 Q4-1985      3944
## 5 Q1-1986      4123
## 6 Q2-1986      4522

tail(AppShip)

##    Quarter Shipments
## 15 Q3-1988      4417
## 16 Q4-1988      4258
## 17 Q1-1989      4245
## 18 Q2-1989      4900
## 19 Q3-1989      4585
## 20 Q4-1989      4533

Now we create the time series and plot

AppShip.ts <- ts(AppShip$Shipments, start=c(1985,1), end=c(1989,4), freq=4)
plot(AppShip.ts, xlab="Time", ylab="Shipments", bty="l")
title("Shipments of Household Appliances")

(b)

To determine the forecast components we will first look at trend lines to see what has the best fit.

AppShipLinear <- tslm(AppShip.ts ~ trend)
summary(AppShipLinear)

## 
## Call:
## tslm(formula = AppShip.ts ~ trend)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -339.06 -171.14    3.13  137.27  393.40 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 4167.663    111.035  37.535   <2e-16 ***
## trend         24.494      9.269   2.643   0.0165 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 239 on 18 degrees of freedom
## Multiple R-squared:  0.2795, Adjusted R-squared:  0.2395 
## F-statistic: 6.983 on 1 and 18 DF,  p-value: 0.01655

AppShipQuad <- tslm(AppShip.ts ~ trend + I(trend^2))
summary(AppShipQuad)

## 
## Call:
## tslm(formula = AppShip.ts ~ trend + I(trend^2))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -398.39 -155.53   32.63  181.62  346.62 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3982.931    173.764  22.921 3.19e-14 ***
## trend         74.876     38.109   1.965    0.066 .  
## I(trend^2)    -2.399      1.763  -1.361    0.191    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 233.6 on 17 degrees of freedom
## Multiple R-squared:  0.3503, Adjusted R-squared:  0.2739 
## F-statistic: 4.583 on 2 and 17 DF,  p-value: 0.02559

AppShipCube <- tslm(AppShip.ts ~ trend + I(trend^3))
summary(AppShipCube)

## 
## Call:
## tslm(formula = AppShip.ts ~ trend + I(trend^3))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -385.45 -160.94   36.84  186.72  341.98 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 4030.71272  151.38648  26.625 2.67e-15 ***
## trend         52.71986   23.49578   2.244   0.0385 *  
## I(trend^3)    -0.07230    0.05549  -1.303   0.2100    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 234.5 on 17 degrees of freedom
## Multiple R-squared:  0.3449, Adjusted R-squared:  0.2679 
## F-statistic: 4.476 on 2 and 17 DF,  p-value: 0.02745

The third level polynomial trend appears to be the best fit but we will remove seasonality to see how that effects the trend.

yearly <- aggregate(AppShip.ts, nfrequency=1, FUN=sum)
plot(yearly, bty="l")

Grouped by year, the trend looks more logarithmic, we will try comparing the trend to our aggregated plot.

AppShipLog <- tslm(yearly ~ trend + I(log(trend)))
summary(AppShipLog)

## 
## Call:
## tslm(formula = yearly ~ trend + I(log(trend)))
## 
## Residuals:
## Time Series:
## Start = 1985 
## End = 1989 
## Frequency = 1 
##        1        2        3        4        5 
##   50.779 -229.606  314.210 -142.720    7.337 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    16911.5      379.2  44.594 0.000502 ***
## trend           -464.3      406.9  -1.141 0.372104    
## I(log(trend))   2277.5     1012.4   2.250 0.153397    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 295.3 on 2 degrees of freedom
## Multiple R-squared:  0.9284, Adjusted R-squared:  0.8568 
## F-statistic: 12.97 on 2 and 2 DF,  p-value: 0.0716

The r-squared value is much better, and we can add the trend to the aggregated plot to confirm.

yearly <- aggregate(AppShip.ts, nfrequency=1, FUN=sum)
plot(yearly, bty="l")
lines(AppShipLog$fitted, lwd=2)

After suppressing the seasonality, we can see that the trend is logarithmic.

To find the seasonality, we will create a seasonal plot.

ggseasonplot(AppShip.ts)

Looking at the year over year trend, we can see that generally the pattern is seasonal, with more shipments in Q2 and Q3. Looking at the different in the values, they look to be the same pattern but generally not differing by a constant so I would mark them as multiplicative.

The series has level, noise, logarithmic trend, and multiplicative seasonality.

Problem 6

(a)

Bring in the data and view the start and end of the data.

Sham <- read.csv("ShampooSales.csv")
head(Sham)

##    Month Shampoo.Sales
## 1 Jan-95         266.0
## 2 Feb-95         145.9
## 3 Mar-95         183.1
## 4 Apr-95         119.3
## 5 May-95         180.3
## 6 Jun-95         168.5

tail(Sham)

##     Month Shampoo.Sales
## 31 Jul-97         575.5
## 32 Aug-97         407.6
## 33 Sep-97         682.0
## 34 Oct-97         475.3
## 35 Nov-97         581.3
## 36 Dec-97         646.9

Plot the time series

Sham.ts <- ts(Sham$Shampoo.Sales, start=c(1995,1), end=c(1997,12), freq=12)
autoplot(Sham.ts, xlab="Time", ylab="Shampoo Sales ($)")

(b)

This looks to be quadratic in trend, but we will compare it to linear and cubic trend lines as well.

ShamLinear <- tslm(Sham.ts ~ trend)
summary(ShamLinear)

## 
## Call:
## tslm(formula = Sham.ts ~ trend)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -108.74  -52.12  -16.13   43.48  194.25 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    89.14      26.72   3.336  0.00207 ** 
## trend          12.08       1.26   9.590 3.37e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 78.5 on 34 degrees of freedom
## Multiple R-squared:  0.7301, Adjusted R-squared:  0.7222 
## F-statistic: 91.97 on 1 and 34 DF,  p-value: 3.368e-11

ShamQuad <- tslm(Sham.ts ~ trend + I(trend^2))
summary(ShamQuad)

## 
## Call:
## tslm(formula = Sham.ts ~ trend + I(trend^2))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -104.148  -42.075   -8.438   33.924  144.582 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 202.8789    33.3002   6.092 7.35e-07 ***
## trend        -5.8801     4.1501  -1.417    0.166    
## I(trend^2)    0.4854     0.1088   4.461 8.93e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 62.93 on 33 degrees of freedom
## Multiple R-squared:  0.8316, Adjusted R-squared:  0.8214 
## F-statistic: 81.51 on 2 and 33 DF,  p-value: 1.708e-13

ShamCube <- tslm(Sham.ts ~ trend + I(trend^3))
summary(ShamCube)

## 
## Call:
## tslm(formula = Sham.ts ~ trend + I(trend^3))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -101.948  -35.904   -8.844   30.245  143.731 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.768e+02  2.939e+01   6.017 9.19e-07 ***
## trend       1.651e+00  2.584e+00   0.639  0.52738    
## I(trend^3)  8.542e-03  1.946e-03   4.389  0.00011 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 63.32 on 33 degrees of freedom
## Multiple R-squared:  0.8296, Adjusted R-squared:  0.8193 
## F-statistic: 80.32 on 2 and 33 DF,  p-value: 2.089e-13

The quadratic and cubic trend lines are closest to the data set. We will suppress the seasonality to see how that changes the trend. We will group by quarter.

quarterly <- aggregate(Sham.ts, nfrequency=4, FUN=sum)
autoplot(quarterly, bty="l")

We can compare this seasonality-suppressed plot to the quadratic and cubic trend lines

quarterly <- aggregate(Sham.ts, nfrequency=4, FUN=sum)
quartQuad <- tslm(quarterly ~ trend + I(trend^2))
summary(quartQuad)

## 
## Call:
## tslm(formula = quarterly ~ trend + I(trend^2))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -97.122 -50.378  -3.034  51.240 131.203 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  618.505     82.024   7.541 3.54e-05 ***
## trend        -57.425     29.010  -1.979 0.079126 .  
## I(trend^2)    12.786      2.172   5.886 0.000233 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 79.36 on 9 degrees of freedom
## Multiple R-squared:  0.9712, Adjusted R-squared:  0.9648 
## F-statistic: 151.7 on 2 and 9 DF,  p-value: 1.17e-07

quartCube <- tslm(quarterly ~ trend + I(trend^3))
summary(quartCube)

## 
## Call:
## tslm(formula = quarterly ~ trend + I(trend^3))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -118.600  -40.137   -6.262   60.701  136.910 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 521.0660    74.0276   7.039 6.06e-05 ***
## trend        14.3312    18.6188   0.770 0.461190    
## I(trend^3)    0.6382     0.1165   5.478 0.000391 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 83.95 on 9 degrees of freedom
## Multiple R-squared:  0.9678, Adjusted R-squared:  0.9606 
## F-statistic: 135.1 on 2 and 9 DF,  p-value: 1.94e-07

The trends are both better fits for the quarterly grouped plot, given the shape of the plot and the r-squared values this trend looks to be exponential.

To find if this data is seasonal, we will use the seasonal plot.

ggseasonplot(Sham.ts)

There is definitely a seasonal pattern, and given that the variance in the values between periods is not constant, it looks to be multiplicative.

For the shampoo sales, the time series components include level, noise, upward exponential trend with multiplicative seasonality.

(c)

Intuitively I would not think that shampoo sales were seasonal, however after viewing the seasonal plot of the data there is definitely a seasonal pattern to the sales. Interestingly, there seems to be a peak in sales in the fall, though we would need input on the business side to explain this pattern.

MBA 678 Assignment 1

Sean McGowan

January 26, 2018

Problem 1

Bring in the file and look at the structure.

Look at the start date of the data

Air Miles

Plot the Air miles data as a time series. We will pick August 2001 as the end date since we are looking at pre-event data.

Before evaluating the time series components, we will add a trend line. We will check the correlation of linear, quadratic, and cubic trend lines to see which fits best.

We can see from the r-squared values that the quadratic trend line is the closest approximation. Next we plot it.

Next we will adjust the scale to get a different view of the data by making the y axis logarithmic

The log scale doesn’t change the pattern much. We will now try suppressing seasonality by grouping first by quarter then by year.

The quarterly data looks very similar but a bit smoother.

Here we can see that the seasonal adjustments are gone and the resulting graphs shows that the trend is upward exponential. One final plot will be a seasonal plot of the data for each year.

This confirms the seasonality and that it’s additive.

(a) For the Air data, the time series components include level, noise, and thanks to the trend line we see that the trend is upward exponential. The fact that the seasonality seems be constant between different periods would mean that it is additive.

(b) After using the trend lines and suppressing the seasonality, we can see that the trend is upward exponential.

Rail Miles

First we plot the time series.

We will try fitting linear, quadratic, and cubic trend lines to see which has the strongest correlation.

Looking at the data, the cubic trend appeared to be closest, and the r-squared value confirms this. This will be added to the plot.

We will change the scale to be logarithmic to see how that effects the trend.

Here we see no real change in the trend. Next we will group by quarter to remove some of the seasonality.

This plot looks closer to the cubic trend line. Lastly we will add a seasonal plot.

We can see the seasonality and the additive nature of the pattern.

(a)For this time series, there is level, noise, and third level polynomial trend. For seasonality, it looks like the values in each season stay constant, making the seasonality additive.

(b) After analyzing the trend data, we see this is a third level polynomial trend.

Vehicle Miles

Plot the time series

This looks pretty upward linear as a trend, but we will compare linear, quadratic, and cubic trend lines to see which has the strongest correlation.

The correlations are close, but the linear trend line has the best r-squared value. We’ll add the trend line to the plot.

(a) Based on this plot, besides level and noise, the trend is upward linear and given that the values seem uniform between the seasons, it is additive seasonality.

We will change the y values to logarithmic to see how it changes the trend.

The trend for auto miles appears to be the same. Next to see the seasonality we will group the data by quarter.

Removing the seasonality further points to the trend being upward linear. We will add a seasonal plot to confirm the pattern.

This plot confirms the additive nature of the seasonality.

(a)Based on this plot, besides level and noise, the trend is upward linear and given that the values seem uniform between the seasons, it is additive seasonality.

(b) Our work on supressing seasonality and using trend lines shows that the trend is upward linear.

Problem 3

(a)

Bring in the Appliance data and look at the structure.

We will look at the beginning and end of the data to get the range for our time series.

Now we create the time series and plot

(b)

To determine the forecast components we will first look at trend lines to see what has the best fit.

The third level polynomial trend appears to be the best fit but we will remove seasonality to see how that effects the trend.

Grouped by year, the trend looks more logarithmic, we will try comparing the trend to our aggregated plot.

The r-squared value is much better, and we can add the trend to the aggregated plot to confirm.

After suppressing the seasonality, we can see that the trend is logarithmic.

To find the seasonality, we will create a seasonal plot.

Looking at the year over year trend, we can see that generally the pattern is seasonal, with more shipments in Q2 and Q3. Looking at the different in the values, they look to be the same pattern but generally not differing by a constant so I would mark them as multiplicative.

The series has level, noise, logarithmic trend, and multiplicative seasonality.

Problem 6

(a)

Bring in the data and view the start and end of the data.

Plot the time series

(b)

This looks to be quadratic in trend, but we will compare it to linear and cubic trend lines as well.

The quadratic and cubic trend lines are closest to the data set. We will suppress the seasonality to see how that changes the trend. We will group by quarter.

We can compare this seasonality-suppressed plot to the quadratic and cubic trend lines

The trends are both better fits for the quarterly grouped plot, given the shape of the plot and the r-squared values this trend looks to be exponential.

To find if this data is seasonal, we will use the seasonal plot.

There is definitely a seasonal pattern, and given that the variance in the values between periods is not constant, it looks to be multiplicative.

For the shampoo sales, the time series components include level, noise, upward exponential trend with multiplicative seasonality.

(c)