Homework 2: Airline Naive Models

Using Dr. Seaver’s airline data, we are to fit the no-change model as well as three other naive models. Evaluations of the various models utilizing MSE, RMSE, MPE, and MAPE are presented, and the best model is then used for a forecast over a short horizon.

setwd("~/Documents/time_series/Homework_2")

library(forecast)

## Loading required package: zoo

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

## Loading required package: timeDate

## This is forecast 6.2

library(ggplot2)
library(ggfortify)

# Reading in data
AIRLINE <- read.csv("airline.csv")

We first begin by creating a dataframe from Dr. Seaver’s data that will allow us to work with the time series conveniently:

# Remove 'trend' from data
AIRLINE$Trend <- NULL
# Remove 'commuters'
AIRLINE$Commuters <- NULL # When ready, remove this and reload the csv
# Remove 'Year'
AIRLINE$Year <- NULL # When ready, remove this and reload the csv

After formatting the dataframe, create time series objects for \(R\) and plot the series.

# Creating a time series object.
commuter_series <- ts(data = AIRLINE, start = 1982, end = 2014, frequency = 1)

# Plots time series for both 'Commuters' and 'Majair'
autoplot(commuter_series, ts.colour = "red")

# Creating a ggplot object for commuters
#commuters_plot <-  ggplot(data = air_series, aes(x = Year, y = Commuters))

# Creating ggplot object for majair
#majair_plot <- ggplot(data = majair_series, aes(x = Year, y = Majair))

# Plotting
#commuters_plot + geom_line() # need to rescale x
#majair_plot + geom_line()

Models

We will consider four naive models: the No-Change Model, the Percent Change Model, the Proportion of Most Recent Change Model, and one slightly more complex naive model.

No-Change Model

The No-Change Model is a random walk model given by

\[{ \hat { y } }_{ t }={ y }_{ t-1 }\]

where our estimate, \({ \hat { y} }\), is equal to the previous observation.

# No-change model (also called lag_1)
no_change <- lag(commuter_series, k = 1)

# Append no change to data.frame
#AIRLINE$no_change <- no_change

# Printing accuracy evaluations for model
accuracy(no_change, commuter_series)

##                  ME      RMSE       MAE       MPE     MAPE       ACF1
## Test set -0.0040625 0.9641236 0.7471875 -4.782192 24.80711 -0.5307255
##          Theil's U
## Test set  1.023921

Percent Change Model

The percent change model is small extension of the no change model, where our predicted output is some proportion, \(p\), of the no change model given by

\[{ \hat { y } }_{ t }=(1+p)\cdot { y }_{ t-1 }\]

# Setting p, the percent change
p <- 0.05

# Percent change model
percent_change <- (1+p)*(no_change)

# Append model to data.frame
#AIRLINE$percent_change <- percent_change

# Printing accuracy evaluations for model
accuracy(percent_change, commuter_series)

##                  ME    RMSE       MAE      MPE     MAPE      ACF1
## Test set -0.1597188 1.00149 0.7745625 -10.0213 26.32945 -0.529056
##          Theil's U
## Test set  1.054298

Proportion of Most Recent Change Model

Next we will consider a model as the ‘proportion of most recent change’ given by

\[{ \hat { y } }_{ t }={ y }_{ t-1 }+(1+p)({ y }_{ t-1 }-{ y }_{ t-2 })\]

where \(p\) is the percent change.

# Creating lag two component for next model. 
lag_2 <- lag(commuter_series, 2)

# Initializng percent value
p <- 0.05

# Proportion of most recent change model
proportion_change <- no_change + (1 + p)*(no_change-lag_2)

# Append model to data.frame
#AIRLINE$proportion_change <- proportion_change

# Printing accuracy evaluations for model
accuracy(proportion_change, commuter_series)

##                   ME     RMSE      MAE       MPE     MAPE       ACF1
## Test set -0.01251613 1.731603 1.336903 -5.080532 43.98106 -0.6769156
##          Theil's U
## Test set  1.791769

A More Complex Model

Our final (slightly more complex) model is given by

\[{ \hat { y } }_{ t }={ y }_{ t-1 }\left( \frac { { y }_{ t-1 } }{ { y }_{ t-2 } } \right) +{ y }_{ t-1 }\]

# Final Model
complex_model <- no_change*(no_change/lag_2) + no_change

# Append model to data.frame
#AIRLINE$complex_model <- complex_model

# Printing accuracy evaluations for model
accuracy(complex_model, commuter_series)

##                 ME     RMSE      MAE       MPE     MAPE       ACF1
## Test set -3.460607 4.422238 3.553702 -119.2907 121.1653 -0.4517987
##          Theil's U
## Test set  4.366796

Model Evaluation

For clarity, repeated here are the model evaluations measuring forecast accuracies:

Model	ME	RMSE	MAE	MPE	MAPE	ACF1
No Change Model	-0.203	4.795	3.797	-23.630	56.817	-0.414
Percent Change Model	-0.624	4.956	3.922	-29.812	60.168	-0.412
Proportion Change Model	0.016	8.269	7.073	-22.357	103.335	-0.649
Complex Model	-11.555	17.865	12.816	-190.096	202.439	-0.431

Forecast

Based upon the above measures of accuracy, we find that the No Change Model performs best. This is unsatisfying; the model sets a low threshold to compare with other, more complicated models, predicting each new value to be that of the previous.

Utilizing a \(95\)% confidence interval, the forecast for the No Change Model gives the forllowing immediate forecast horizon:

# Creating a forecast object and printing point estimate
(no_change_forecast <- forecast(no_change, h = 1, level = 95))

##      Point Forecast    Lo 95    Hi 95
## 2014       2.740404 1.438502 4.042306

# Printing a plot of the forecast to illustrate confidence interval
autoplot(no_change_forecast, ts.colour = "red")