Homework 2: Airline Commuters Naive Models

Using Dr. Seaver’s airline data, we are to fit the no-change model as well as three other naive models. Evaluations of the various models utilizing MSE, RMSE, MPE, and MAPE are presented, and the best model is then used for a forecast over a short horizon.

setwd("~/Documents/time_series/Homework_2")

library(forecast)

## Loading required package: zoo

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

## Loading required package: timeDate

## This is forecast 6.2

library(ggplot2)
library(ggfortify)

# Reading in data
AIRLINE <- read.csv("airline.csv")

We first begin by creating a dataframe from Dr. Seaver’s data that will allow us to work with the time series conveniently:

# Remove 'trend' from data
AIRLINE$Trend <- NULL
# Remove 'majair'
AIRLINE$Majair <- NULL # When ready, remove this and reload the csv
# Remove 'Year'
AIRLINE$Year <- NULL # When ready, remove this and reload the csv

After formatting the dataframe, create time series objects for \(R\) and plot the series.

# Creating a time series object.
majair_series <- ts(data = AIRLINE, start = 1982, end = 2014, frequency = 1)

# Plots time series for both 'Commuters' and 'Majair'
autoplot(majair_series, ts.colour = "red")

# Creating a ggplot object for commuters
#commuters_plot <-  ggplot(data = air_series, aes(x = Year, y = Commuters))

# Creating ggplot object for majair
#majair_plot <- ggplot(data = majair_series, aes(x = Year, y = Majair))

# Plotting
#commuters_plot + geom_line() # need to rescale x
#majair_plot + geom_line()

Models

We will consider four naive models: the No-Change Model, the Percent Change Model, the Proportion of Most Recent Change Model, and one slightly more complex naive model.

No-Change Model

The No-Change Model is a random walk model given by

\[{ \hat { y } }_{ t }={ y }_{ t-1 }\]

where our estimate, \({ \hat { y} }\), is equal to the previous observation.

# No-change model (also called lag_1)
no_change <- lag(majair_series, k = 1)

# Append no change to data.frame
#AIRLINE$no_change <- no_change

# Printing accuracy evaluations for model
accuracy(no_change, majair_series)

##              ME     RMSE      MAE       MPE     MAPE       ACF1 Theil's U
## Test set 0.2025 4.795409 3.796875 -18.62871 55.65716 -0.4144512 0.9484692

Percent Change Model

The percent change model is small extension of the no change model, where our predicted output is some proportion, \(p\), of the no change model given by

\[{ \hat { y } }_{ t }=(1+p)\cdot { y }_{ t-1 }\]

# Setting p, the percent change
p <- 0.05

# Percent change model
percent_change <- (1+p)*(no_change)

# Append model to data.frame
#AIRLINE$percent_change <- percent_change

# Printing accuracy evaluations for model
accuracy(percent_change, majair_series)

##                  ME     RMSE      MAE       MPE    MAPE       ACF1
## Test set -0.2091875 4.915895 3.918906 -24.56014 58.3819 -0.4134494
##          Theil's U
## Test set 0.9820811

Proportion of Most Recent Change Model

Next we will consider a model as the ‘proportion of most recent change’ given by

\[{ \hat { y } }_{ t }={ y }_{ t-1 }+(1+p)({ y }_{ t-1 }-{ y }_{ t-2 })\]

where \(p\) is the percent change.

# Creating lag two component for next model. 
lag_2 <- lag(majair_series, 2)

# Initializng percent value
p <- 0.05

# Proportion of most recent change model
proportion_change <- no_change + (1 + p)*(no_change-lag_2)

# Append model to data.frame
#AIRLINE$proportion_change <- proportion_change

# Printing accuracy evaluations for model
accuracy(proportion_change, majair_series)

##                  ME     RMSE      MAE       MPE     MAPE       ACF1
## Test set 0.01367742 8.267702 7.073742 -18.97467 103.2504 -0.6498769
##          Theil's U
## Test set  1.845644

A More Complex Model

Our final (slightly more complex) model is given by

\[{ \hat { y } }_{ t }={ y }_{ t-1 }\left( \frac { { y }_{ t-1 } }{ { y }_{ t-2 } } \right) +{ y }_{ t-1 }\]

# Final Model
complex_model <- no_change*(no_change/lag_2) + no_change

# Append model to data.frame
#AIRLINE$complex_model <- complex_model

# Printing accuracy evaluations for model
accuracy(complex_model, majair_series)

##                 ME     RMSE      MAE       MPE    MAPE       ACF1
## Test set -11.74275 19.01858 12.81693 -189.2299 197.681 -0.4033347
##          Theil's U
## Test set  3.624107

Model Evaluation

For clarity, repeated here are the model evaluations measuring forecast accuracies:

Model | ME | RMSE | MAE | MPE | MAPE | ACF1 |
———————- | ———- | ———- | ———- | ———- | ———- | ———- |
No Change Model | 0.004 | 0.964 | 0.747 | -5.011 | 25.876 | -0.531 |
Percent Change Model | -0.151 | 1.000 | 0.777 | -10.262 | 27.569 | -0.530 |
Proportion Change Model| -0.009 | 1.732 | 1.338 | -6.119 | 47.179 | -0.678 |
Complex Model | -3.404 | 4.145 | 1.338 | -122.007 | 122.174 | -0.468 |

Forecast

# Creating a forecast object and printing point estimate
(no_change_forecast <- forecast(no_change, h = 1, level = 95))

##      Point Forecast      Lo 95   Hi 95
## 2014       7.897118 -0.4583648 16.2526

# Printing a plot of the forecast to illustrate confidence interval
autoplot(no_change_forecast, ts.colour = "red")

(percent_change_forecast <- forecast(percent_change, h = 1, level = 95))

##      Point Forecast      Lo 95    Hi 95
## 2014       8.790537 0.04082774 17.54025

autoplot(percent_change_forecast, ts.colour = "red")

(proportion_change_forecast <- forecast(proportion_change, h = 1, level = 95))

##      Point Forecast     Lo 95    Hi 95
## 2013       8.651418 -7.564831 24.86767

autoplot(proportion_change_forecast, ts.colour = "red")

(complex_model_forecast <- forecast(complex_model, h = 1, level = 95))

##      Point Forecast     Lo 95    Hi 95
## 2013       20.59062 -9.792635 50.97387

autoplot(complex_model_forecast, ts.colour = "red")