Chapter 1 Problems 1-5

  1. The goal of the study is descriptive. Impact assessment to evaluate the effect of an event is retrospective in nature.
  2. The forecast horizon is 33 periods or months. Monthly forecasts for the full data set are made between September 2001 through June 2004. The study also forecasted from September 2001 to December 2004 to help understand the changes that occurred over the forecast period (prediction interval). Next months forecasts are sufficient for the goal of the study, where monthly forecasts were produced for the time horizon. The data is pre-existing, so there is no issue with needing previous months data.
  3. The series requires little to no automation. There are only three series that need to be forecasted. Forecasting is a one time event. Updated forecasts are not required because there is no new information to incorporate when forecasting events that have already occurred. Data from January 1990 to April 2004 is already available.
  4. t = 1, 2, 3 is an index denoting time period months 1, 2 and 3. These would be January, February and March of 1990 respectively. The first period in the series, t = 1 refers to time period for the month of January in 1990.
  5. The values for y1, y2 and y3 are 35153577, 32965187 and 39993913 respectively.
setwd("~/Documents/MBA /MBA 678/R Stuff")
sept_11 <- read.csv("Sept11Travel.csv")
appl_ship <- read.csv("ApplianceShipments.csv")
shampoo <- read.csv("ShampooSales.csv")

Chapter 2 Problems 1, 3 and 6

## Create plot of each of the three pre-event time series (Air, Rail, Car)

# Pre-event air time series plot
pre_9_11_air <- ts(sept_11$Air, start = c(1990, 1), end = c(2001, 8), freq = 12)
plot(pre_9_11_air, xlab = "Time", ylab = "Air RPM", ylim = c(29000000, 70000000), bty = "l")

# Pre-event rail time series plot
pre_9_11_rail <- ts(sept_11$Rail, start = c(1990, 1), end = c(2001, 8), freq = 12)
plot(pre_9_11_rail, xlab = "Time", ylab = "Rail PM", ylim = c(300000000, 700000000), bty = "l")

# Pre-event car time series plot
pre_9_11_car <- ts(sept_11$VMT, start = c(1990, 1), end = c(2001, 8), freq = 12)
plot(pre_9_11_car, xlab = "Time", ylab = "Car VMT", ylim = c(150, 262), bty = "l")

The level, trend and seasonality time series components appear in all three plots. The air and car plots appear to have an upward linear trend, while the rail plots appear to have an exponential trend.

# Visualizing Air RPM Time Series
air_seasonality <- stl(pre_9_11_air, s.window = "period")
plot(air_seasonality)

seasonplot(pre_9_11_air)

monthplot(pre_9_11_air)

# Suppressing seasonality
air_quarterly <- aggregate(pre_9_11_air, nfrequency=4, FUN=sum)
plot(air_quarterly, bty="l")

air_annually <- aggregate(pre_9_11_air, nfrequency = 1, FUN=sum)
plot(air_annually, bty = "l")

# Changing scale by y-axis transformation to logarithmic scale
plot(pre_9_11_air, log = "y", xlab = "Time", ylab = "Air RPM")
title("plot(ts(pre_9_11_air) y-axis log scale")

#Fitting trend lines
airLinear <- tslm(pre_9_11_air ~ trend)
plot(pre_9_11_air, xlab = "Time", ylab = "Air RPM", ylim = c(29000000, 70000000), bty = "l")
lines(airLinear$fitted, lwd = 2, col = "blue")
summary(airLinear)
## 
## Call:
## tslm(formula = pre_9_11_air ~ trend)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -9466409 -3410590  -681183  3360750 11823514 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 35728434     834749   42.80   <2e-16 ***
## trend         177097      10272   17.24   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4912000 on 138 degrees of freedom
## Multiple R-squared:  0.6829, Adjusted R-squared:  0.6806 
## F-statistic: 297.2 on 1 and 138 DF,  p-value: < 2.2e-16
airQuad <- tslm(pre_9_11_air ~ trend + I(trend^2))
lines(airQuad$fitted, lwd = 3, lty = 3, col = "red")

summary(airQuad)
## 
## Call:
## tslm(formula = pre_9_11_air ~ trend + I(trend^2))
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -10706560  -3385499   -494312   3334001  11901573 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3.745e+07  1.253e+06  29.897   <2e-16 ***
## trend       1.042e+05  4.102e+04   2.541   0.0122 *  
## I(trend^2)  5.169e+02  2.818e+02   1.834   0.0688 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4870000 on 137 degrees of freedom
## Multiple R-squared:  0.6905, Adjusted R-squared:  0.686 
## F-statistic: 152.8 on 2 and 137 DF,  p-value: < 2.2e-16
# Visulaizing Rail PM Time Series
rail_seasonality <- stl(pre_9_11_rail, s.window = "period")
plot(rail_seasonality)

seasonplot(pre_9_11_rail)

monthplot(pre_9_11_rail)

# Changing scale by y-axis transformation to logarithmic scale
plot(pre_9_11_rail, log = "y", xlab = "Time", ylab = "Rail PM")
title("plot(ts(pre_9_11_rail) log scale")

#Fitting trend lines
railLinear <- tslm(pre_9_11_rail ~ trend)
plot(pre_9_11_rail, xlab = "Time", ylab = "Rail PM", ylim = c(300000000, 700000000), bty = "l")
lines(railLinear$fitted, lwd = 2, col = "blue")
summary(railLinear)
## 
## Call:
## tslm(formula = pre_9_11_rail ~ trend)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -142225897  -40750574   -4370229   41272192  133135874 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 547632872   10511773  52.097  < 2e-16 ***
## trend         -837744     129357  -6.476 1.53e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 61860000 on 138 degrees of freedom
## Multiple R-squared:  0.2331, Adjusted R-squared:  0.2275 
## F-statistic: 41.94 on 1 and 138 DF,  p-value: 1.532e-09
railQuad <- tslm(pre_9_11_rail ~ trend + I(trend^2))
lines(railQuad$fitted, lwd = 3, lty = 3, col = "red")
summary(railQuad)
## 
## Call:
## tslm(formula = pre_9_11_rail ~ trend + I(trend^2))
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -137634923  -37327572   -1559409   41652351  125112936 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 576828666   15618390   36.93  < 2e-16 ***
## trend        -2071369     511394   -4.05 8.52e-05 ***
## I(trend^2)       8749       3513    2.49    0.014 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 60720000 on 137 degrees of freedom
## Multiple R-squared:  0.2663, Adjusted R-squared:  0.2556 
## F-statistic: 24.86 on 2 and 137 DF,  p-value: 6.14e-10
railPoly <- tslm(pre_9_11_rail ~ trend + I(trend^4))
lines(railPoly$fitted, lwd = 3, lty = 3, col = "green")

summary(railPoly)
## 
## Call:
## tslm(formula = pre_9_11_rail ~ trend + I(trend^4))
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -136582491  -35054347   -2853657   41193098  121210435 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.749e+08  1.267e+07  45.379  < 2e-16 ***
## trend       -1.607e+06  2.492e+05  -6.448  1.8e-09 ***
## I(trend^4)   3.447e-01  9.685e-02   3.559 0.000511 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 59400000 on 137 degrees of freedom
## Multiple R-squared:  0.298,  Adjusted R-squared:  0.2878 
## F-statistic: 29.08 on 2 and 137 DF,  p-value: 2.978e-11
#Exponential model for level
rail_exp_fit_lvl <- HoltWinters(pre_9_11_rail, beta = FALSE, gamma = FALSE)
plot(rail_exp_fit_lvl)

#Exponential model for level and trend 
rail_exp_fit_lvl_trd <- HoltWinters(pre_9_11_rail, gamma = FALSE)
plot(rail_exp_fit_lvl_trd)

#Exponential model for level, trend and seasonalality
rail_exp_fit <- HoltWinters(pre_9_11_rail)
plot(rail_exp_fit)

# Visualizing Car VMT Time Series
car_seasonality <- stl(pre_9_11_car, s.window = "period")
plot(car_seasonality)

seasonplot(pre_9_11_car)

monthplot(pre_9_11_car)

#Suppressing seasonality
car_annually <- aggregate(pre_9_11_car, nfrequency = 1, FUN=sum)
plot(car_annually, bty = "l")

# Changing scale by y-axis transformation to logarithmic scale
plot(pre_9_11_car, log = "y", xlab = "Time", ylab = "Car VMT")
title("plot(ts(pre_9_11_car) log scale")

#Fitting trend lines
carLinear <- tslm(pre_9_11_car ~ trend)
plot(pre_9_11_car, xlab = "Time", ylab = "Car VMT", ylim = c(150, 262), bty = "l")
lines(carLinear$fitted, lwd = 2, col = "blue")
summary(carLinear)
## 
## Call:
## tslm(formula = pre_9_11_car ~ trend)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -32.554  -9.185   0.559  11.087  23.272 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 173.17137    2.42094   71.53   <2e-16 ***
## trend         0.44249    0.02979   14.85   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.25 on 138 degrees of freedom
## Multiple R-squared:  0.6152, Adjusted R-squared:  0.6124 
## F-statistic: 220.6 on 1 and 138 DF,  p-value: < 2.2e-16
carQuad <- tslm(pre_9_11_car ~ trend + I(trend^2))
lines(carQuad$fitted, lwd = 3, lty = 3, col = "red")

summary(carQuad)
## 
## Call:
## tslm(formula = pre_9_11_car ~ trend + I(trend^2))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -33.503  -8.745   0.798  10.974  23.752 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.745e+02  3.674e+00  47.487  < 2e-16 ***
## trend       3.867e-01  1.203e-01   3.214  0.00163 ** 
## I(trend^2)  3.956e-04  8.266e-04   0.479  0.63297    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.29 on 137 degrees of freedom
## Multiple R-squared:  0.6158, Adjusted R-squared:  0.6102 
## F-statistic: 109.8 on 2 and 137 DF,  p-value: < 2.2e-16

Problem 3

appl_ship.ts <- ts(appl_ship$Shipments, start = c(1985,1), end = c(1989, 4), freq = 4)
plot(appl_ship.ts, xlab = "Time", ylab = "Number of Appliance Shipments", ylim = c(3800, 5100), bty = "l")

appl_ship_fit <- stl(appl_ship.ts, s.window="period")
plot(appl_ship_fit)

seasonplot(appl_ship.ts)

yearly <- aggregate(appl_ship.ts, nfrequency=1, FUN=sum)
plot(yearly, bty="l")

biannual <- aggregate(appl_ship.ts, nfrequency=2, FUN=sum)
plot(biannual, bty="l")

applQuad <- tslm(biannual ~ trend + I(trend^2))
plot(biannual, xlab = "Time", ylab = "Appliance Shipments", ylim = c(8000, 10000), bty = "l")
lines(applQuad$fitted, lwd = 2, col = "red")

applQuad <- tslm(appl_ship.ts ~ trend + I(trend^2))
plot(appl_ship.ts, xlab = "Time", ylab = "Appliance Shipments", ylim = c(3800, 5100), bty = "l")
lines(applQuad$fitted, lwd = 2, col = "red")

The appliance time series data does not appear to have a seasonality component, but the decomposition suggests otherwise. It also looks like there may be a polynomial trend present. Aggregating the data makes the trendline more quadratic. All series have level (average value). The sample size of the data is very small, so any inaccuracies in measurement or other causes not accounted for could distort the time series.

names(shampoo) <- sub(" ", ".", names(shampoo))
shampoo_sales.ts <- ts(shampoo$Shampoo.Sales, start = c(1995,1), end = c(1997,12), freq = 12)
plot(shampoo_sales.ts, xlab = "Time", ylab = "Shampoo Sales", ylim = c(100, 800), bty = "l")

shampoo_sales.ts <- stl(shampoo_sales.ts, s.window="period")
plot(shampoo_sales.ts)

The Shampoo time series has level and trend. I would expect to see some seasonality in shampoo sales, but it might be more specific than csan be seen by total sales of all shampoos. Different types of shampoo may be used at different times. Also, people may shower more frequently in the summer than the winter or vice versa.