library(readxl)
library(readr)
library(fpp3)
library(tidyverse)
library(seasonal)
Gasdata <- read_excel("~/Downloads/Gas data .xls")
gast <- Gasdata %>%
mutate(Week=yearweek(`Week of`)) %>%
as_tsibble(index=Week) %>%
mutate(Price=`Weekly U.S. All Grades All Formulations Retail Gasoline Prices Dollars per Gallon`) %>%
select(Week,Price) %>%
filter(year(Week)>=2010)
gast
## # A tsibble: 665 x 2 [1W]
## Week Price
## <week> <dbl>
## 1 2010 W01 2.72
## 2 2010 W02 2.80
## 3 2010 W03 2.79
## 4 2010 W04 2.76
## 5 2010 W05 2.72
## 6 2010 W06 2.71
## 7 2010 W07 2.66
## 8 2010 W08 2.71
## 9 2010 W09 2.76
## 10 2010 W10 2.80
## # … with 655 more rows
gast %>%
autoplot(Price)+
labs(title="Gas Price")
gast %>%
features(Price, features = guerrero)
## # A tibble: 1 × 1
## lambda_guerrero
## <dbl>
## 1 -0.342
gast %>%
autoplot(box_cox(Price,-0.342))+
labs(title="Transformed Gas Prices")
gast %>%
model(STL(Price ~ trend(window=16) + season(window="periodic"), robust = TRUE)) %>%
components() %>% autoplot() +
labs(title = "STL decomposition: Gas")
A large drop happens around 2014 and 2015 which can be attributed to the us production. There is a huge drop in 2020, this is due to covid where demand fell alot. There is a huge jump at the end, this is likely due to large increae in consumer demand.
gas_view <- gast %>%
filter(year(Week)>=2021)
gas_seas <- gast %>%
model(SNAIVE(Price))
gas_mean <- gast %>%
model(MEAN(Price))
gas_drift <- gast %>%
model(RW(Price~drift()))
seasf <- gas_seas %>%
forecast(h=2)
meanf <- gas_mean %>%
forecast(h=2)
driftf <- gas_drift %>%
forecast(h=2)
seasf
## # A fable: 2 x 4 [1W]
## # Key: .model [1]
## .model Week Price .mean
## <chr> <week> <dist> <dbl>
## 1 SNAIVE(Price) 2022 W40 N(3.3, 0.38) 3.28
## 2 SNAIVE(Price) 2022 W41 N(3.4, 0.38) 3.36
meanf
## # A fable: 2 x 4 [1W]
## # Key: .model [1]
## .model Week Price .mean
## <chr> <week> <dist> <dbl>
## 1 MEAN(Price) 2022 W40 N(3, 0.38) 3.01
## 2 MEAN(Price) 2022 W41 N(3, 0.38) 3.01
driftf
## # A fable: 2 x 4 [1W]
## # Key: .model [1]
## .model Week Price .mean
## <chr> <week> <dist> <dbl>
## 1 RW(Price ~ drift()) 2022 W40 N(3.8, 0.0031) 3.83
## 2 RW(Price ~ drift()) 2022 W41 N(3.8, 0.0062) 3.84
seasf %>%
autoplot(gas_view)
meanf %>%
autoplot(gas_view)
driftf %>%
autoplot(gas_view)
gas_seas %>%
accuracy()
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 SNAIVE(Price) Training 0.0980 0.615 0.473 0.717 16.0 1 1 0.993
gas_mean %>%
accuracy()
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 MEAN(Price) Training -2.16e-16 0.615 0.521 -4.17 17.9 1.10 1.00 0.994
gas_drift %>%
accuracy()
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 RW(Price ~ dri… Trai… 6.82e-17 0.0556 0.0395 -0.0224 1.32 0.0834 0.0904 0.589
gas_tr <- gast %>%
stretch_tsibble(.init = 5, .step = 1) %>%
relocate(Week, .id)
gas_tr %>%
model(SNAIVE(Price)) %>%
forecast(h = 1) %>%
accuracy(gast)
## Warning: The future dataset is incomplete, incomplete out-of-sample data will be treated as missing.
## 1 observation is missing at 2022 W40
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 SNAIVE(Price) Test 0.0980 0.615 0.473 0.717 16.0 1 1 0.993
gas_tr %>%
model(MEAN(Price)) %>%
forecast(h = 1) %>%
accuracy(gast)
## Warning: The future dataset is incomplete, incomplete out-of-sample data will be treated as missing.
## 1 observation is missing at 2022 W40
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 MEAN(Price) Test -0.107 0.618 0.492 -7.65 17.6 1.04 1.01 0.994
gas_tr %>%
model(RW(Price ~ drift())) %>%
forecast(h = 1) %>%
accuracy(gast)
## Warning: The future dataset is incomplete, incomplete out-of-sample data will be treated as missing.
## 1 observation is missing at 2022 W40
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 RW(Price ~ dri… Test -7.08e-4 0.0558 0.0396 -0.0270 1.32 0.0837 0.0907 0.591
driftf
## # A fable: 2 x 4 [1W]
## # Key: .model [1]
## .model Week Price .mean
## <chr> <week> <dist> <dbl>
## 1 RW(Price ~ drift()) 2022 W40 N(3.8, 0.0031) 3.83
## 2 RW(Price ~ drift()) 2022 W41 N(3.8, 0.0062) 3.84
hilo(driftf)
## # A tsibble: 2 x 6 [1W]
## # Key: .model [1]
## .model Week Price .mean `80%`
## <chr> <week> <dist> <dbl> <hilo>
## 1 RW(Price ~ drift()) 2022 W40 N(3.8, 0.0031) 3.83 [3.762314, 3.905041]80
## 2 RW(Price ~ drift()) 2022 W41 N(3.8, 0.0062) 3.84 [3.734356, 3.936355]80
## # … with 1 more variable: `95%` <hilo>
The drift model seems to fit the best. The seasonal and naive shows a rather large drop. The accuracy function also states that the drift method is best.
My point forecast for Week 41 is $3.84 with an 80% prediction interval of (3.734356, 3.936355)