1. Load Packages

2. Load Data

PPI_Lobsters <- read.csv("C:/Users/schic/OneDrive/Documents/Predictive Analytics and Forecasting/WPU02230503.csv")

The data that I selected contains monthly data on the Producer Price Index by Commodity for Processed Foods and Feeds (Lobsters). The data contains historical Index Data from December 1991 to February 2024. The original data set is missing a lot of index data between December 1991 and January 2004, so I manipulated the data beforehand to start at January 2004, however, the Index is based on pricing from December 1991 which is index value 100. The source of my data is the FRED website (https://fred.stlouisfed.org/series/WPU02230503)

3. Data Exploration/Manipulation

missmap(PPI_Lobsters)

No data is missing in my data set, so it is not necessary to impute any data points.

summary(PPI_Lobsters)
##      DATE            WPU02230503   
##  Length:242         Min.   : 72.8  
##  Class :character   1st Qu.:119.3  
##  Mode  :character   Median :144.0  
##                     Mean   :152.9  
##                     3rd Qu.:177.6  
##                     Max.   :351.9

Want to change the Date variable to an actual date classification

PPI_Lobsters$DATE <- mdy(PPI_Lobsters$DATE)

4. Historical Data Graph

ggplot(PPI_Lobsters, aes(x = DATE, y = WPU02230503)) +
  geom_line() +
  labs( x = 'Date', y = "PPI Index for Lobsters (DEC 1991 = 100)", title = "Producer Price Index by Commodity: Processed Foods and Feeds: Lobsters") +
  theme_minimal()  

6. ETS Modeling

# Create a time series object for models
PPI_TS <- ts(PPI_Lobsters$WPU02230503, start = c(2004, 1), frequency = 12)

a. Method With Trend

#multiplicative
Trend_model <- ets(PPI_TS, model = "MMN")

print(summary(Trend_model))
## ETS(M,M,N) 
## 
## Call:
##  ets(y = PPI_TS, model = "MMN") 
## 
##   Smoothing parameters:
##     alpha = 0.9999 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 121.679 
##     b = 1.0178 
## 
##   sigma:  0.1682
## 
##      AIC     AICc      BIC 
## 2888.380 2888.634 2905.825 
## 
## Training set error measures:
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -1.910486 27.84274 19.13244 -2.801717 12.60604 0.7636451
##                     ACF1
## Training set -0.03851744

b. Method With Seasonality

#additive
Seasonality_model1 <- ets(PPI_TS, model = "AAA")

print(summary(Seasonality_model1))
## ETS(A,A,A) 
## 
## Call:
##  ets(y = PPI_TS, model = "AAA") 
## 
##   Smoothing parameters:
##     alpha = 0.677 
##     beta  = 2e-04 
##     gamma = 2e-04 
## 
##   Initial states:
##     l = 115.6136 
##     b = 0.7264 
##     s = -16.431 -15.7638 -8.2331 -11.0871 -14.7674 -5.8876
##            -17.5165 -1.9738 40.6366 34.0314 18.5632 -1.5708
## 
##   sigma:  22.3818
## 
##      AIC     AICc      BIC 
## 2850.162 2852.894 2909.474 
## 
## Training set error measures:
##                       ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -0.04971876 21.62929 16.14577 -1.297027 10.94511 0.6444363
##                   ACF1
## Training set 0.0755702
#multiplicative
Seasonality_model2 <- ets(PPI_TS, model = "MAM")

print(summary(Seasonality_model2))
## ETS(M,Ad,M) 
## 
## Call:
##  ets(y = PPI_TS, model = "MAM") 
## 
##   Smoothing parameters:
##     alpha = 0.6534 
##     beta  = 1e-04 
##     gamma = 1e-04 
##     phi   = 0.975 
## 
##   Initial states:
##     l = 118.3597 
##     b = 1.4516 
##     s = 0.8907 0.8702 0.9303 0.9197 0.9016 0.9818
##            0.919 1.0012 1.2663 1.2018 1.1087 1.0086
## 
##   sigma:  0.14
## 
##      AIC     AICc      BIC 
## 2805.695 2808.762 2868.496 
## 
## Training set error measures:
##                     ME     RMSE      MAE        MPE    MAPE      MASE
## Training set 0.5597072 20.70642 15.41946 -0.9317797 10.2549 0.6154468
##                    ACF1
## Training set 0.07602288

7. Plotting Models/Forecasting

#setting up forecasts from models
forecasts1 <- forecast(Seasonality_model2, h = 8)

forecasts2 <- forecast(Seasonality_model1, h = 8)

forecasts3 <- forecast(Trend_model, h = 8)
#plotting forecasts with historical data 
autoplot(PPI_TS) +
  autolayer(forecasts1, color = "blue") + labs(title = "Forecast for Multiplicative Holt-Winters' Model") + ylab("Index Value")

autoplot(PPI_TS) +
  autolayer(forecasts2, color = "red") + labs(title = "Forecast for Additive Holt-Winters' Model") + ylab("Index Value")

autoplot(PPI_TS) +
  autolayer(forecasts3, color = "purple") + labs(title = "Forecast for Multiplicative Holt's Model") + ylab("Index Value")

plot(Seasonality_model2)

plot(Seasonality_model1)

plot(Trend_model)

8. Analysis/Comparison

Based on the information gathered above, through the summary of the models, and the forecasts given from the fitted models, I have concluded that the most efficient model is the Multiplicative Holt-Winters’ model. In terms of visual performance in forecasting, the Multiplicative Holt’s model has much greater variability in forecasting compared to the Holt-Winters’ models, therefore, I would be hesitant to use this model compared to the models that consider seasonality. When comparing the Holt-Winters’ models, I have concluded that the multiplicative version of this model yields the lowest BIC and AICc values, as well as the lowest RMSE and MAE values respectively. Therefore, this is the strongest model in this instance. This is likely because the data is not seasonally adjusted, so it contains a within-year seasonal pattern, and the seasonality is also increasing over time.