The goal of the Impact of 9/11 on Air Travel in the United States is descriptive, as they are seeking to identify what impact the past event had.
The forecast horizon in this example would be long-term, spanning years. A next-month forecast would not be sufficient.
The level of automation required for this forecast is limited. As it is a one-time process that does not need to be done repeatedly and over time. It will be one forecast using past data and a comparison to actual information.
4.The t = 1,2,3 indicates the time-series in the data set, so for this example it would be the monthly time period. t = 1 refers to January 1990, the first month in the series.
| y | RPM |
|---|---|
| y1 | 35153577 |
| y2 | 32965187 |
| y3 | 39993913 |
setwd("C:/Users/Jordan/Documents/MBA678")
travel <- read.csv("Sept11Travel_Updated.csv")
head(travel)
## Month Air Rail VMT
## 1 Jan-90 35153577 454115779 163.28
## 2 Feb-90 32965187 435086002 153.25
## 3 Mar-90 39993913 568289732 178.42
## 4 Apr-90 37981886 568101697 178.68
## 5 May-90 38419672 539628385 188.88
## 6 Jun-90 42819023 570694457 189.16
I seperated the three modes of travel into their own plot, but selecting just the related data set, I set out to plot them as time series.
flights.ts <- ts(travel$Air, start = c(1990,1), end = c(2001,8), freq =12)
plot(flights.ts, xlab = "Time (year)", ylab = "Revenue Passenger Miles Travelled")
rail.ts <- ts(travel$Rail, start = c(1990,1), end = c(2001,8), freq =12)
plot(rail.ts, xlab = "Time (year)", ylab = "Rail Passenger Miles Travelled")
auto.ts <- ts(travel$VMT, start = c(1990,1), end = c(2001,8), freq =12)
plot(auto.ts, xlab = "Time (year)", ylab = "Vehicle Miles Travelled")
library(forecast)
## Warning: package 'forecast' was built under R version 3.3.2
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 3.3.2
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: timeDate
## Warning: package 'timeDate' was built under R version 3.3.2
## This is forecast 7.3
plot(flights.ts, xlab = "Time (year)", ylab = "Miles Travelled by Air")
plot(flights.ts, log="y", xlab = "Time (year)", ylab = "Miles Travelled by Air")
zoom.flights.ts <- window(flights.ts, start = c(1999,1), end = c(2001,8))
plot(zoom.flights.ts, xlab = "Time", ylab = "Miles Travelled by Air")
travelLinear <- tslm(flights.ts ~ trend)
summary(travelLinear)
##
## Call:
## tslm(formula = flights.ts ~ trend)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9466409 -3410590 -681183 3360750 11823514
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 35728435 834749 42.80 <2e-16 ***
## trend 177097 10272 17.24 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4912000 on 138 degrees of freedom
## Multiple R-squared: 0.6829, Adjusted R-squared: 0.6806
## F-statistic: 297.2 on 1 and 138 DF, p-value: < 2.2e-16
plot(flights.ts, xlab = "Time (year)", ylab = "Miles Travelled by Air")
lines(travelLinear$fitted, lwd=2)
supseasonality <- aggregate(flights.ts, nfrequency=4, FUN=sum)
plot(supseasonality, ylab ="Miles Travelled by Air (without seasonality)", bty="l")
I then did the trend line and zoomed in plots for the other two data sets. I mixed up the length for the zoomed in time period as well.
zoom.rail.ts <- window(rail.ts, start = c(1999,1), end = c(2001,8))
plot(zoom.flights.ts, xlab = "Time", ylab = "Miles Travelled by Rail")
railLinear <- tslm(rail.ts ~ trend)
summary(railLinear)
##
## Call:
## tslm(formula = rail.ts ~ trend)
##
## Residuals:
## Min 1Q Median 3Q Max
## -142225897 -40750574 -4370229 41272192 133135874
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 547632872 10511773 52.097 < 2e-16 ***
## trend -837744 129357 -6.476 1.53e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 61860000 on 138 degrees of freedom
## Multiple R-squared: 0.2331, Adjusted R-squared: 0.2275
## F-statistic: 41.94 on 1 and 138 DF, p-value: 1.532e-09
plot(rail.ts, xlab = "Time (year)", ylab = "Miles Travelled by Rail")
lines(railLinear$fitted, lwd=2)
zoom.auto.ts <- window(auto.ts, start = c(1999,1), end = c(2001,8))
plot(zoom.auto.ts, xlab = "Time", ylab = "Miles Travelled by Automobiles")
autoLinear <- tslm(auto.ts ~ trend)
summary(autoLinear)
##
## Call:
## tslm(formula = auto.ts ~ trend)
##
## Residuals:
## Min 1Q Median 3Q Max
## -32.554 -9.185 0.559 11.087 23.272
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 173.17137 2.42094 71.53 <2e-16 ***
## trend 0.44249 0.02979 14.85 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 14.25 on 138 degrees of freedom
## Multiple R-squared: 0.6152, Adjusted R-squared: 0.6124
## F-statistic: 220.6 on 1 and 138 DF, p-value: < 2.2e-16
plot(auto.ts, xlab = "Time (year)", ylab = "Miles Travelled by Automobiles")
lines(autoLinear$fitted, lwd=2)
zoomautoLinear <- tslm(zoom.auto.ts ~ trend)
summary(zoomautoLinear)
##
## Call:
## tslm(formula = zoom.auto.ts ~ trend)
##
## Residuals:
## Min 1Q Median 3Q Max
## -34.259 -5.425 2.244 11.962 22.279
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 217.4082 5.6759 38.304 <2e-16 ***
## trend 0.6446 0.3002 2.147 0.04 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.68 on 30 degrees of freedom
## Multiple R-squared: 0.1332, Adjusted R-squared: 0.1043
## F-statistic: 4.611 on 1 and 30 DF, p-value: 0.03996
plot(zoom.auto.ts, xlab = "Time (year)", ylab = "Miles Travelled by Automobiles")
lines(zoomautoLinear$fitted, lwd=2)
appliances <- read.csv("ApplianceShipments.csv")
appliance.ts <- ts(appliances$Shipments, start = c(1985), end = c(1989), freq =12)
head(appliance.ts)
## [1] 4009 4321 4224 3944 4123 4522
plot(appliance.ts, xlab = "Time (year)", ylab = "Appliances Shipped", bty="l")
I have manually reordered the csv so that it was in the correct order, as I wasn’t sure how to do this in R easily and with so few observations it was a quick fix in Excel.
appliancesADJ <- read.csv("ApplianceShipmentsADJUSTED.csv")
applianceADJ.ts <- ts(appliancesADJ$Shipments, start = c(1985), end = c(1989), freq =12)
head(applianceADJ.ts)
## [1] 4009 4321 4224 3944 4123 4522
plot(applianceADJ.ts, xlab = "Time (year)", ylab = "Appliances Shipped", bty="l")
6.(a) I began by taking a look at the .csv file in Excel to get a sense of the format, then read in the file. I plotted as time series and set up limits for the y-variable.
shampoo <- read.csv("ShampooSales.csv")
shampoo.ts <- ts(shampoo$Shampoo.Sales, start = c(1995,1), end = c(1997,12), freq =12)
plot(shampoo.ts, xlab = "Time (year)", ylab = "Shampoo Sales", ylim = c(100,700), bty="l")
shampooLinear <- tslm(shampoo.ts ~ trend)
summary(shampooLinear)
##
## Call:
## tslm(formula = shampoo.ts ~ trend)
##
## Residuals:
## Min 1Q Median 3Q Max
## -108.74 -52.12 -16.13 43.48 194.25
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 89.14 26.72 3.336 0.00207 **
## trend 12.08 1.26 9.590 3.37e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 78.5 on 34 degrees of freedom
## Multiple R-squared: 0.7301, Adjusted R-squared: 0.7222
## F-statistic: 91.97 on 1 and 34 DF, p-value: 3.368e-11
plot(shampoo.ts, xlab = "Time (year)", ylab = "Shampoo Sales")
lines(shampooLinear$fitted, lwd=2)
6(b). There is a strong linear trend, with increasing shampoo sales over time. However there isn’t seasonality to the data, which makes sense as shampoo is a product used yearround with likely little differences over a seasonal period. I would imagine sales of most staple items or basics wouldn’t have seasonality, shampoo included. There isn’t very visual noise, but as the book suggests there is always some noise.