Course Project Presentation

Shiny App "Flight Departure Delay Predictor for Newark Airport in NY"

ConnieZ

Description of the App

This Shiny App was inspired by the nycflights13 R package, which contains data on flights departing NYC in 2013, including historical weather data. The Shiny App can be found here: http://conniez.shinyapps.io/delayPredictor/

The app estimates the probability of a departure delay for a flight on a given airline under the weather conditions provided. The input values for the app are as follows: a) airline name, b) temperature (in degrees Fahrenheit), c) precipitation (in inches), d) wind speed (in miles per hour), and e) visibility level (a numeric value).

To help users enter realistic values, the app shows today's forecast for the Newark Airport area, obtained via the weatherData R package.

The app initially runs with default values. Once the user changes any input, the app refreshes instantly and displays the newly provided values, the updated estimate of the likelihood of delay, and a refreshed chart of the average delay for the chosen airline, grouped by month.
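The monthly chart rests on a simple grouped average. The sketch below shows one way such a summary could be computed in base R. The data frame `flights_demo` is made up for illustration (it only borrows the `carrier`, `month`, and `dep_delay` column names from nycflights13), and this is not necessarily how the app itself prepares its chart data.

```r
# Made-up stand-in for the flights data; column names follow nycflights13
set.seed(1)
flights_demo <- data.frame(
  carrier   = sample(c("UA", "B6"), 200, replace = TRUE),
  month     = sample(1:12, 200, replace = TRUE),
  dep_delay = round(rnorm(200, mean = 10, sd = 30))
)

# Average departure delay by month for one chosen airline
chosen <- "UA"
monthly_avg <- aggregate(dep_delay ~ month,
                         data = subset(flights_demo, carrier == chosen),
                         FUN = mean)
monthly_avg
```

The result has one row per month present in the data for the chosen airline, which is the shape a by-month chart needs.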

The Model behind the App

For this model, the "dep_delay" variable is recoded as a binary variable, delay, equal to 0 when "dep_delay" is negative (no delay) and equal to 1 when "dep_delay" is positive (a delay occurred).
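A minimal sketch of this recoding follows. The sample values are made up, and treating a departure delay of exactly 0 minutes as "no delay" is an assumption, since the description above only covers negative and positive values.

```r
# Example departure delays in minutes (made-up values)
dep_delay <- c(-5, 12, 0, 47, -1, NA)

# 1 when the flight left late, 0 otherwise; missing values stay NA.
# Assumption: a delay of exactly 0 minutes counts as "no delay".
delay <- as.integer(dep_delay > 0)
delay
```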

# read in the clean data set of flights data merged with weather data
merged <- read.csv("data/cleandata.csv", header = TRUE)
# fit the binomial (logistic) regression model
glmfit <- glm(delay ~ carrier + visib + precip + wind + 
                  temp, data = merged, family = binomial)
# display the p-values for the coefficients
as.data.frame(summary(glmfit)$coefficients[,4])
##             summary(glmfit)$coefficients[, 4]
## (Intercept)                         1.087e-09
## carrierAA                           1.158e-02
## carrierAS                           8.465e-03
## carrierB6                           3.824e-07
## carrierDL                           7.504e-03
## carrierEV                           8.713e-20
## carrierMQ                           7.293e-06
## carrierOO                           8.316e-01
## carrierUA                           1.539e-28
## carrierUS                           9.185e-01
## carrierVX                           6.723e-06
## carrierWN                           2.847e-33
## visib                               7.349e-28
## precip                              5.273e-18
## wind                                1.439e-07
## temp                                5.355e-06
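To illustrate how a fitted model of this form turns a set of inputs into a delay probability, here is a small sketch. It uses synthetic data with the same column names as the model above (the file data/cleandata.csv is not reproduced here), so the simulated values, the object names `demo`, `fit`, and `new_flight`, and the resulting numbers are all illustrative assumptions.

```r
set.seed(42)
n <- 500

# Synthetic data with the same column names as the model above (made-up values)
demo <- data.frame(
  carrier = factor(sample(c("UA", "B6", "EV"), n, replace = TRUE)),
  visib   = runif(n, 0, 10),
  precip  = rexp(n, rate = 20),
  wind    = runif(n, 0, 30),
  temp    = runif(n, 10, 95)
)

# Simulate delays that become more likely with poor visibility and strong wind
demo$delay <- rbinom(n, 1, plogis(-0.5 - 0.1 * demo$visib + 0.05 * demo$wind))

fit <- glm(delay ~ carrier + visib + precip + wind + temp,
           data = demo, family = binomial)

# One hypothetical set of user inputs
new_flight <- data.frame(
  carrier = factor("UA", levels = levels(demo$carrier)),
  visib = 8, precip = 0.1, wind = 12, temp = 60
)

# type = "response" returns a probability rather than log-odds
p_delay <- predict(fit, newdata = new_flight, type = "response")
p_delay
```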

Justification of the Model

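As the output shows, the p-values for all coefficients are significant (< 0.05) except those for carrierOO and carrierUS. To verify the significance of the model as a whole, we compare its deviance with that of a model without any predictors (the null model). The difference in deviance is tested against a chi-square distribution whose degrees of freedom equal the number of added predictors.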

with(glmfit, null.deviance - deviance) #difference in deviance for the two models
## [1] 1617
with(glmfit, df.null - df.residual) #df for the difference between the two models
## [1] 15
with(glmfit, pchisq(null.deviance-deviance, df.null-df.residual, lower.tail = FALSE))
## [1] 0
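The p-value prints as 0 because it is too small to be represented as a double, not because it is exactly zero. As a sketch, the same test can be evaluated on the log scale with the `log.p` argument of `pchisq`, using the rounded values reported above; the variable names are illustrative.

```r
# Values reported above: difference in deviance and in degrees of freedom
dev_diff <- 1617
df_diff  <- 15

# Upper-tail p-value; it underflows in double precision
p_value <- pchisq(dev_diff, df_diff, lower.tail = FALSE)

# Natural log of the same p-value, which stays finite
log_p <- pchisq(dev_diff, df_diff, lower.tail = FALSE, log.p = TRUE)

# Express it as a power of ten for readability
log10_p <- log_p / log(10)
c(p_value = p_value, log10_p = log10_p)
```

Reporting the exponent rather than 0 makes it clear that the model is overwhelmingly preferred to the null model without implying a literal probability of zero.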

Last Notes

The app also reports a confidence interval for the estimated delay probability. The chart of average departure and arrival delay by airline is built using rCharts.
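One common way to obtain such an interval for a logistic model is to build a Wald interval on the log-odds scale and transform it back to probabilities. The sketch below does this on synthetic data. Whether the app uses this exact method is an assumption, and the data and the names `demo`, `fit`, and `new_obs` are made up.

```r
set.seed(7)
n <- 400

# Synthetic data (made-up values) with two weather predictors
demo <- data.frame(wind = runif(n, 0, 30), visib = runif(n, 0, 10))
demo$delay <- rbinom(n, 1, plogis(-1 + 0.06 * demo$wind - 0.1 * demo$visib))

fit <- glm(delay ~ wind + visib, data = demo, family = binomial)

# Predict on the link (log-odds) scale together with its standard error
new_obs <- data.frame(wind = 15, visib = 6)
pr  <- predict(fit, newdata = new_obs, type = "link", se.fit = TRUE)
eta <- unname(pr$fit)
se  <- unname(pr$se.fit)

# 95% Wald interval on the link scale, mapped back to probabilities
z  <- qnorm(0.975)
ci <- c(lower    = plogis(eta - z * se),
        estimate = plogis(eta),
        upper    = plogis(eta + z * se))
ci
```

Building the interval on the link scale keeps both bounds between 0 and 1, which an interval computed directly on the probability scale would not guarantee.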

The model has a known limitation: the random sampling of the data does not include weighting based on all the variables passed to the model. This was omitted because the course project called for simplicity.

To see whether the app's prediction is close to reality, compare it with Flight Stats: http://www.flightstats.com/go/FlightStatus/flightStatusByFlight.do

Hope you enjoy the app.

Thank you!