Author

Penelope Pooler Eisenbies

Published

April 30, 2025

Import and Examine Data

Code
```{r}
#| label: import data

# data source: https://www.transtats.bts.gov/Data_Elements.aspx?Data=1

delta <- read_csv("data/delta_airlines_4_30_2025.csv",
                   show_col_types = F, skip=1)|>
  filter(Month != "TOTAL") |>
  mutate(date_som = ym(paste(Year, Month)),
         Date = ceiling_date(date_som, "month")-1,
         Total = (TOTAL/1000000) |> round(2)) |>
  select(Date, Total) |>
  glimpse(width=60)

delta_pp <- delta |>          # post_pandemic data 
  filter(Date >= "2021-06-01") |>
  glimpse(width=60)
```
Rows: 268
Columns: 2
$ Date  <date> 2002-10-31, 2002-11-30, 2002-12-31, 2003-01…
$ Total <dbl> 7.65, 7.11, 7.75, 6.76, 6.28, 7.45, 6.59, 6.…
Rows: 44
Columns: 2
$ Date  <date> 2021-06-30, 2021-07-31, 2021-08-31, 2021-09…
$ Total <dbl> 10.34, 11.50, 11.09, 9.77, 10.62, 10.63, 10.…

Interactive Time Series Plot

Code
```{r}
#| label: create dygraph

# create interactive plot
delta_xts <- xts(x=delta[,2], order.by=delta$Date)

(delta_dg <- dygraph(delta_xts[,1], 
                      main="Delta Air Lines - Total Passengers") |>
    dySeries("Total", label="Total (Mill.)", color= "blue") |>
    dyAxis("y", label = "", drawGrid = FALSE) |>
    dyAxis("x", label = "", drawGrid = FALSE) |>
    dyShading(from="2020-3-12", to="2021-6-14", color = "lightgrey") |>
    dyRangeSelector())
```

Full and Truncated Data Plots

Plot of Full Time Series

Seasonal pattern was disrupted by the pandemic when air travel was considered dangerous.

Code
```{r}
#| label: Full Time Series
#| warning: false
#| message: false

(full_plot <- delta |>
  ggplot() +
  geom_line(aes(x=Date, y=Total), linewidth=1, color="blue") +
  theme_classic() +
  scale_x_date(date_breaks = "2 years",  
               date_labels = "%Y", 
               limits=c(ymd("2001-12-31"), ymd("2025-01-31"))) +
  #scale_y_continuous(breaks=seq(0,5,.5), limits=c(0,5)) +
  labs(title="delta Airlines: October 2002 - January 2025",
       subtitle="Total Passengers (Domestic and International)",
       x="Date", y="Millions of Passengers",
       caption="Data Source: https://www.bts.gov/" ) +
  theme(plot.title = element_text(size = 15),
        plot.caption = element_text(size = 10),
        axis.title.x = element_text(size = 15),
        axis.title.y = element_text(size = 15),
        axis.text.x = element_text(size = 10),
        axis.text.y = element_text(size = 15)))

ggsave("img/delta_full_plot_2025_04_30.png", width=6, height=4, unit="in")
```

Plot of Truncated Time Series

Once vaccines became readily available, air travel began to resume it’s traditional pattern:

  • Peaks occur at the end of July

  • Low points occur at the end of Jnauary

  • Pattern is not straightforward to discern because post-pandemic data are fairly limited.

  • Data for February and March of 2024 are not available yet.

Code
```{r}
#| label: Truncated Time Series
#| warning: false
#| message: false

(short_plot <- delta_pp |>
  ggplot() +
  geom_line(aes(x=Date, y=Total), linewidth=1, color="blue") +
  theme_classic() +
  scale_x_date(date_breaks = "2 months",  
               date_labels = "%b", 
               limits=c(ymd("2021-05-31"), ymd("2025-1-31")))+
  #scale_y_continuous(limits=c(2,5)) +
  labs(title="Delta Air Lines: June 2021 - January 2025",
       subtitle="Total Passengers (Domestic and International)",
       x="Date", y="Millions of Passengers",
       caption="Data Source: https://www.bts.gov/" ) +
  theme(plot.title = element_text(size = 15),
        plot.caption = element_text(size = 10),
        axis.title.x = element_text(size = 15),
        axis.title.y = element_text(size = 15),
        axis.text.x = element_text(size = 10),
        axis.text.y = element_text(size = 15)))

#ggsave("img/delta_trnc_plot_2024_04_21.png", width=6, height=4, unit="in")
```

Forecast Modeling

Create Time Series (ts)

Code
```{r}
#| label: create time series

delta_ts <- ts(delta_pp$Total, freq=12, start=c(2021,6)) # create time series

delta_ts # display time series
```
       Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
2021                               10.34 11.50 11.09  9.77 10.62 10.63 10.38
2022  8.81  9.13 12.18 12.05 12.44 12.70 12.95 12.84 12.09 12.43 11.95 12.06
2023 11.49 10.63 13.46 13.14 14.24 14.81 15.40 14.61 13.44 14.11 13.04 13.27
2024 11.97 11.70 14.20 13.97 15.20 15.61 14.89 15.36 13.83 14.43 13.00 13.91
2025 11.92                                                                  

Model 1

Code
```{r}
#| label: Model 1

delta_model1 <- delta_ts |> auto.arima(ic="aic", seasonal=F)

delta_forecast1 <- delta_model1 |> forecast(h=12)

(delta_fcp1 <- autoplot(delta_forecast1) + 
    labs(y = "Number of Passenger (Mill.)") + 
    theme_classic())

delta_forecast1

checkresiduals(delta_forecast1, test=F)

(acr1 <- accuracy(delta_forecast1)) 
```
         Point Forecast     Lo 80    Hi 80    Lo 95    Hi 95
Feb 2025          11.92 10.512578 13.32742 9.767533 14.07247
Mar 2025          11.92  9.929604 13.91040 8.875952 14.96405
Apr 2025          11.92  9.482273 14.35773 8.191818 15.64818
May 2025          11.92  9.105155 14.73484 7.615067 16.22493
Jun 2025          11.92  8.772908 15.06709 7.106938 16.73306
Jul 2025          11.92  8.472534 15.36747 6.647555 17.19244
Aug 2025          11.92  8.196311 15.64369 6.225109 17.61489
Sep 2025          11.92  7.939209 15.90079 5.831905 18.00810
Oct 2025          11.92  7.697733 16.14227 5.462600 18.37740
Nov 2025          11.92  7.469340 16.37066 5.113303 18.72670
Dec 2025          11.92  7.252108 16.58789 4.781076 19.05892
Jan 2026          11.92  7.044546 16.79545 4.463637 19.37636
                     ME     RMSE       MAE         MPE     MAPE      MASE
Training set 0.03614409 1.085666 0.8211441 -0.07313768 6.554353 0.6523488
                   ACF1
Training set -0.2279021

Model 2

Code
```{r}
#| label: Model 2

delta_model2 <- delta_ts |> auto.arima(ic="aic", seasonal=T) 

delta_forecast2 <- delta_model2 |> forecast(h=12)

(delta_fcp2 <- autoplot(delta_forecast2) + 
    labs(y = "Number of Passenger (Mill.)") + 
    theme_classic())

delta_forecast2

checkresiduals(delta_forecast2, test=F)

(acr2 <- accuracy(delta_forecast2)) 
```
         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
Feb 2025       11.92363 11.24113 12.60612 10.87984 12.96741
Mar 2025       14.42363 13.69401 15.15324 13.30778 15.53947
Apr 2025       14.19363 13.41976 14.96749 13.01010 15.37715
May 2025       15.42363 14.60790 16.23935 14.17608 16.67117
Jun 2025       15.83363 14.97809 16.68916 14.52520 17.14205
Jul 2025       15.11363 14.22005 16.00720 13.74702 16.48023
Aug 2025       15.58363 14.65357 16.51368 14.16122 17.00603
Sep 2025       14.05363 13.08846 15.01879 12.57753 15.52972
Oct 2025       14.65363 13.65459 15.65266 13.12573 16.18152
Nov 2025       13.22363 12.19183 14.25542 11.64562 14.80163
Dec 2025       14.13363 13.07007 15.19718 12.50706 15.76019
Jan 2026       12.14363 11.04924 13.23801 10.46991 13.81734
                     ME      RMSE       MAE        MPE     MAPE      MASE
Training set -0.1163179 0.4397431 0.2979508 -0.9137009 2.266666 0.2367037
                    ACF1
Training set -0.04891274