About the Dataset

This dataset contains monthly ice cream production in the United States from 1918 to 1940 (original series ID: M0174BUSM411NNBR). The series is measured in millions of gallons and is not seasonally adjusted.

Load and Clean Data

The CSV file is read into R and prepared for analysis. The observation_date column is converted into a yearmonth format, so the data can be treated as a proper time series. Irrelevant columns are removed, and the dataset is converted to a tsibble, which makes it compatible with forecasting tools in the fpp3 framework. Renaming variables helps keep the workflow clear and reproducible.

# Load required packages
library(fpp3) # tsibble, feasts, fable, and the tidyverse
library(gt)   # formatted tables

# Load in data
setwd('/Users/emilybroderick/Desktop/R/PAF')
ice_cream <- read.csv("old_ice_cream.csv")

# Clean data and convert to tsibble
ice_cream <- ice_cream %>% mutate(
  Months = yearmonth(observation_date) # Indicates time as months
) %>%
  select(-observation_date) %>% # Removes old time column
  as_tsibble(index = Months) %>% # Converts to a tsibble
  rename("production" = "M0174BUSM411NNBR") # Rename for reproducibility

Create Training and Test Sets

Training and test sets are created. The training set is composed of ~80% of the data and is used to fit the models; the test set is composed of the remaining ~20% and is used to evaluate them.

# Filter ~80% of observations into a training set (1918-1936)
train <- ice_cream %>%
  filter(year(Months) <= 1936)

# Filter ~20% of observations into a test set (1937-1940)
test <- ice_cream %>%
  filter(year(Months) > 1936)
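
A quick sanity check on the split proportions, using the objects just created (19 of 23 years fall in the training window):

# Verify the approximate 80/20 split
nrow(train) / nrow(ice_cream) # ~0.83
nrow(test) / nrow(ice_cream)  # ~0.17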

Initial Visualization of the Training Data

Time Plot

# Create a time plot to get an idea of what the data look like
time_plot <- autoplot(train, production) +
  labs(title = "Monthly Ice Cream Production in the US from 1918 to 1936",
       y = "Ice Cream Production (Millions of Gallons)")

time_plot

This plot shows the strong seasonality of these data. There is a positive trend over the first 10 years, with a sharp decrease in production in the early 1930s, likely due to the Great Depression. Around 1934, a positive trend can be seen again.

Seasonal Plot

# Create a seasonal plot of the years by month
seasonal_plot_yr <- train %>% 
  gg_season(production, period = "year") +
  labs(y="Ice Cream Production (Millions of Gallons)", 
       title="Seasonal Plot: Monthly US Ice Cream Production: 1918-1936")

seasonal_plot_yr

This plot reinforces the consistent seasonality of the data.

Seasonal Subseries Plot

seasonal_sub <- train %>%
  gg_subseries(production) +
  labs(
    y = "Ice Cream Production (Millions of Gallons)",
    title = "Seasonal Subseries Plot: Monthly US Ice Cream Production: 1918-1936"
  )

seasonal_sub

This plot shows the slight variability in production year to year, but also the strong seasonality of the data.

ACF Graph

ic_acf_feat <- train %>% 
  features(production, feat_acf)

ic_acf_feat %>%
  gt() %>%
  tab_header(title = "Time Series ACF Features")
Time Series ACF Features
acf1       acf10     diff1_acf1  diff1_acf10  diff2_acf1   diff2_acf10  season_acf1
0.8372187  2.354666  0.6068245   1.631205    -0.05561128   0.2439343    0.8976849

ic_acf <- acf(train$production, main = "ACF of Ice Cream Production")

The ACF graph shows that ice cream production is strongly correlated with its past values, especially at lags 1 through 4, and then again around lag 12. That pattern means there’s persistence in the data (what happens this month is related to what happened in recent months). The big spike at lag 12 suggests a yearly seasonal cycle where production in one month is very similar to production in the same month the previous year.

PACF Graph

ic_pacf_feat <- train %>% 
  features(production, feat_pacf)

ic_pacf_feat %>%
  gt() %>%
  tab_header(title = "Time Series PACF Features")
Time Series PACF Features
pacf5     diff1_pacf5  diff2_pacf5  season_pacf
1.312455  1.371892     1.085238    -0.009753806

ic_pacf <- pacf(train$production, main = "PACF of Ice Cream Production")

The PACF graph shows a big spike at the first lag, which means the current month is especially influenced by the immediately previous month. There are also some smaller spikes around lag 12, which line up with the seasonal effect we saw in the ACF.

Together, the graphs show that the ice cream production data has both short-term relationships (month to month) and longer-term seasonal patterns (year to year).
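
Both correlograms (together with the time plot) can also be produced in a single call with feasts, staying inside the tsibble workflow; a sketch:

# Sketch: time plot, ACF, and PACF in one figure
train %>%
  gg_tsdisplay(production, plot_type = "partial")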

Decompose the Training Set

STL Decomposition

Additive Decomposition

# Additive decomp
add_dcmp <- train %>%
  model(stl = STL(production)) # Additive decomp
components(add_dcmp) %>%
  autoplot() + # Plot the decomp
  labs(title = "Additive STL Decomposition of US Ice Cream Production")

Multiplicative Decomposition

To perform a multiplicative decomposition with STL, I took the log transform of the data. Because STL assumes an additive structure, the log transform converts a multiplicative relationship into an additive one: log(T_t × S_t × R_t) = log(T_t) + log(S_t) + log(R_t).

# Mult decomp
mult_dcmp <- train %>%
  model(stl = STL(log(production))) # Log transform
components(mult_dcmp) %>%
  autoplot() + # Plot the decomp
  labs(title = "Multiplicative STL Decomposition of US Ice Cream Production")

Which is the best?

I used accuracy() to compare the two STL decompositions because it reports error metrics (RMSE, MAE, MAPE, and others) that quantify how closely the fitted series matches the original data, giving an objective basis for choosing between them. STL is more flexible and robust than its classical counterpart, which is why only the STL decompositions were compared.

m1 <- train %>% model(add = STL(production))
m2 <- train %>% model(mult = STL(log(production)))

decomp <- rbind(accuracy(m1),accuracy(m2))

decomp %>%
  gt() %>%
  tab_header(title = "Additive vs. Multiplicative STL Decomposition")
Additive vs. Multiplicative STL Decomposition
.model  .type     ME            RMSE      MAE        MPE         MAPE      MASE       RMSSE      ACF1
add     Training  -0.002892767  1.421927  1.0619461   0.8435918  7.696093  0.5074113  0.5064117   0.2230786
mult    Training   0.064605476  1.130203  0.7606183  -0.1529892  4.829916  0.3634331  0.4025158  -0.1185394

The multiplicative model performed better, with lower RMSE, MAE, and MAPE, as well as smaller autocorrelation in the residuals, showing that it captured the proportional seasonal variation in the data more effectively. Its residuals are closer to white noise, and the data are fit more accurately. I could tell it was better both from the improved fit metrics and from inspecting the decomposed seasonal component, which looked more stable under the multiplicative approach. For forecasting, I would use the multiplicative (log-transformed) decomposition, since it accounts for seasonality that grows with the trend, making predictions more accurate and realistic for this dataset.
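
The stability claim about the seasonal component can also be quantified with STL-based features; a sketch using feat_stl from feasts (log_production is a helper column introduced here for illustration):

# Sketch: compare STL seasonal/trend strength on the raw and log scales
train %>%
  features(production, feat_stl)
train %>%
  mutate(log_production = log(production)) %>%
  features(log_production, feat_stl)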

Model and Forecast Using the Training Set with Log Transformation

ETS Model

The ETS model was chosen automatically by ETS(), which selects the specification with the lowest AICc.
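
Hand-specified candidates can be compared on the same criterion. A minimal sketch, where the two fixed specifications are illustrative assumptions rather than part of the original analysis:

# Sketch: compare candidate ETS specifications by AICc
train %>%
  model(
    auto = ETS(log(production)),
    ana  = ETS(log(production) ~ error("A") + trend("N") + season("A")),
    aaa  = ETS(log(production) ~ error("A") + trend("A") + season("A"))
  ) %>%
  glance() %>%
  select(.model, AICc)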

ets_fit <- train %>%
  model(ETS(log(production)))

report(ets_fit)
## Series: production 
## Model: ETS(A,Ad,A) 
## Transformation: log(production) 
##   Smoothing parameters:
##     alpha = 0.2810165 
##     beta  = 0.03944206 
##     gamma = 0.3169409 
##     phi   = 0.8004346 
## 
##   Initial states:
##      l[0]      b[0]       s[0]      s[-1]      s[-2]     s[-3]     s[-4]
##  1.757429 0.1639944 -0.7452937 -0.5360776 -0.1189981 0.3572363 0.7971287
##      s[-5]     s[-6]     s[-7]        s[-8]      s[-9]    s[-10]     s[-11]
##  0.8938615 0.7496235 0.4639395 0.0004961679 -0.3312976 -0.755189 -0.7754298
## 
##   sigma^2:  0.009
## 
##      AIC     AICc      BIC 
## 182.8006 186.0733 244.5288
ets_fc <- forecast(ets_fit, new_data = test)
ets_fc 
## # A fable: 48 x 4 [1M]
## # Key:     .model [1]
##    .model                 Months        production .mean
##    <chr>                   <mth>            <dist> <dbl>
##  1 ETS(log(production)) 1937 Jan  t(N(2.4, 0.009))  10.8
##  2 ETS(log(production)) 1937 Feb t(N(2.4, 0.0099))  10.9
##  3 ETS(log(production)) 1937 Mar  t(N(2.8, 0.011))  16.4
##  4 ETS(log(production)) 1937 Apr    t(N(3, 0.012))  20.0
##  5 ETS(log(production)) 1937 May  t(N(3.4, 0.013))  31.1
##  6 ETS(log(production)) 1937 Jun  t(N(3.6, 0.015))  36.8
##  7 ETS(log(production)) 1937 Jul  t(N(3.8, 0.016))  42.9
##  8 ETS(log(production)) 1937 Aug  t(N(3.6, 0.018))  36.3
##  9 ETS(log(production)) 1937 Sep  t(N(3.2, 0.019))  24.0
## 10 ETS(log(production)) 1937 Oct  t(N(2.7, 0.021))  15.7
## # ℹ 38 more rows

ARIMA Model

Stepwise

By default, ARIMA() performs a stepwise search over candidate model orders and selects the specification with the lowest AICc.

arst_fit <- train %>%
  model(ARIMA(log(production)))
report(arst_fit)
## Series: production 
## Model: ARIMA(2,0,1)(1,1,1)[12] 
## Transformation: log(production) 
## 
## Coefficients:
##          ar1     ar2      ma1    sar1     sma1
##       0.8838  0.1014  -0.4334  0.0702  -0.7153
## s.e.  0.1378  0.1331   0.1298  0.1060   0.0824
## 
## sigma^2 estimated as 0.009239:  log likelihood=197.8
## AIC=-383.59   AICc=-383.19   BIC=-363.34
arst_fc <- forecast(arst_fit, new_data = test)
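
The stepwise shortcut can miss the global AICc optimum; fable's ARIMA() accepts stepwise and approximation arguments for a fuller (slower) search. A sketch:

# Sketch: exhaustive ARIMA order search (slower than the stepwise default)
arfull_fit <- train %>%
  model(ARIMA(log(production), stepwise = FALSE, approximation = FALSE))
report(arfull_fit)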

Time Series Linear Regression Model

# Fit the linear regression model
reg_fit <- train %>%
  model(Regression = TSLM(log(production) ~ trend() + season())) # Linear Regression Model
report(reg_fit)
## Series: production 
## Model: TSLM 
## Transformation: log(production) 
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.67145 -0.13995  0.01561  0.16944  0.43458 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    1.7447859  0.0560915  31.106  < 2e-16 ***
## trend()        0.0016660  0.0002222   7.498 1.68e-12 ***
## season()year2  0.0614677  0.0715493   0.859   0.3912    
## season()year3  0.4175650  0.0715503   5.836 1.95e-08 ***
## season()year4  0.7043024  0.0715521   9.843  < 2e-16 ***
## season()year5  1.1285231  0.0715545  15.772  < 2e-16 ***
## season()year6  1.3849374  0.0715576  19.354  < 2e-16 ***
## season()year7  1.5139482  0.0715614  21.156  < 2e-16 ***
## season()year8  1.3676187  0.0715659  19.110  < 2e-16 ***
## season()year9  0.9876197  0.0715710  13.799  < 2e-16 ***
## season()year10 0.5071412  0.0715769   7.085 1.96e-11 ***
## season()year11 0.1714205  0.0715835   2.395   0.0175 *  
## season()year12 0.0275058  0.0715907   0.384   0.7012    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2205 on 215 degrees of freedom
## Multiple R-squared: 0.8717,  Adjusted R-squared: 0.8646
## F-statistic: 121.8 on 12 and 215 DF, p-value: < 2.22e-16
# Forecast 4 years
reg_fc <- reg_fit %>% forecast(new_data = test)

Ensemble Model

The Ensemble model was created as the average of the above three models (ETS, ARIMA, and Regression).

ens_fit <- train %>%
  model(((TSLM(log(production) ~ trend() + season())) + ARIMA(log(production)) + 
           ETS(log(production))) / 3)
report(ens_fit)
## Series: production 
## Model: COMBINATION 
## Combination: production * 0.333333333333333
## 
## ===========================================
## 
## Series: production 
## Model: COMBINATION 
## Combination: production + production
## 
## ====================================
## 
## Series: production 
## Model: COMBINATION 
## Combination: production + production
## 
## ====================================
## 
## Series: production 
## Model: TSLM 
## Transformation: log(production) 
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.67145 -0.13995  0.01561  0.16944  0.43458 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    1.7447859  0.0560915  31.106  < 2e-16 ***
## trend()        0.0016660  0.0002222   7.498 1.68e-12 ***
## season()year2  0.0614677  0.0715493   0.859   0.3912    
## season()year3  0.4175650  0.0715503   5.836 1.95e-08 ***
## season()year4  0.7043024  0.0715521   9.843  < 2e-16 ***
## season()year5  1.1285231  0.0715545  15.772  < 2e-16 ***
## season()year6  1.3849374  0.0715576  19.354  < 2e-16 ***
## season()year7  1.5139482  0.0715614  21.156  < 2e-16 ***
## season()year8  1.3676187  0.0715659  19.110  < 2e-16 ***
## season()year9  0.9876197  0.0715710  13.799  < 2e-16 ***
## season()year10 0.5071412  0.0715769   7.085 1.96e-11 ***
## season()year11 0.1714205  0.0715835   2.395   0.0175 *  
## season()year12 0.0275058  0.0715907   0.384   0.7012    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2205 on 215 degrees of freedom
## Multiple R-squared: 0.8717,  Adjusted R-squared: 0.8646
## F-statistic: 121.8 on 12 and 215 DF, p-value: < 2.22e-16
## 
## Series: production 
## Model: ARIMA(2,0,1)(1,1,1)[12] 
## Transformation: log(production) 
## 
## Coefficients:
##          ar1     ar2      ma1    sar1     sma1
##       0.8838  0.1014  -0.4334  0.0702  -0.7153
## s.e.  0.1378  0.1331   0.1298  0.1060   0.0824
## 
## sigma^2 estimated as 0.009239:  log likelihood=197.8
## AIC=-383.59   AICc=-383.19   BIC=-363.34
## 
## 
## Series: production 
## Model: ETS(A,Ad,A) 
## Transformation: log(production) 
##   Smoothing parameters:
##     alpha = 0.2810165 
##     beta  = 0.03944206 
##     gamma = 0.3169409 
##     phi   = 0.8004346 
## 
##   Initial states:
##      l[0]      b[0]       s[0]      s[-1]      s[-2]     s[-3]     s[-4]
##  1.757429 0.1639944 -0.7452937 -0.5360776 -0.1189981 0.3572363 0.7971287
##      s[-5]     s[-6]     s[-7]        s[-8]      s[-9]    s[-10]     s[-11]
##  0.8938615 0.7496235 0.4639395 0.0004961679 -0.3312976 -0.755189 -0.7754298
## 
##   sigma^2:  0.009
## 
##      AIC     AICc      BIC 
## 182.8006 186.0733 244.5288
ens_fc <- ens_fit %>% forecast(new_data = test)
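
Because expectation is linear, the ensemble's point forecast should be (approximately) the average of the three member models' point forecasts. A quick sketch to confirm, using the fables created above (mean_check is a helper introduced here):

# Sketch: compare the ensemble mean to the average of the member means
mean_check <- tibble(
  Months     = ets_fc$Months,
  member_avg = (ets_fc$.mean + arst_fc$.mean + reg_fc$.mean) / 3,
  ensemble   = ens_fc$.mean
)
head(mean_check)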

All Models

fits <- train %>%
  model(
    ETS = ETS(log(production)),
    ARIMA = ARIMA(log(production)),
    Regression = TSLM(log(production) ~ trend() + season()),
    Ensemble = ((TSLM(log(production) ~ trend() + season())) + ARIMA(log(production)) + 
                  ETS(log(production))) / 3
    )

fc <- forecast(fits, new_data = test)
fc 
## # A fable: 192 x 4 [1M]
## # Key:     .model [4]
##    .model   Months        production .mean
##    <chr>     <mth>            <dist> <dbl>
##  1 ETS    1937 Jan  t(N(2.4, 0.009))  10.8
##  2 ETS    1937 Feb t(N(2.4, 0.0099))  10.9
##  3 ETS    1937 Mar  t(N(2.8, 0.011))  16.4
##  4 ETS    1937 Apr    t(N(3, 0.012))  20.0
##  5 ETS    1937 May  t(N(3.4, 0.013))  31.1
##  6 ETS    1937 Jun  t(N(3.6, 0.015))  36.8
##  7 ETS    1937 Jul  t(N(3.8, 0.016))  42.9
##  8 ETS    1937 Aug  t(N(3.6, 0.018))  36.3
##  9 ETS    1937 Sep  t(N(3.2, 0.019))  24.0
## 10 ETS    1937 Oct  t(N(2.7, 0.021))  15.7
## # ℹ 182 more rows

Plot the Forecasts

ETS, ARIMA, Regression, and Ensemble Plot

fc_plot <- autoplot(ice_cream, production) +
  autolayer(fc, level = c(80, 95)) +
  theme(legend.position = "bottom") +
  labs(title = "Forecasts of Ice Cream Production for 1937-1940",
       y = "Ice Cream Production (Millions of Gallons)")

fc_plot

zoom_fc_plot <- autoplot(test, production) +
  autolayer(fc, level = c(80, 95)) +
  theme(legend.position = "bottom") +
  labs(title = "Forecasts of Ice Cream Production for 1937-1940 (Test Period Only)",
       y = "Ice Cream Production (Millions of Gallons)")

zoom_fc_plot

ETS Plot

ets_plot <- ets_fc %>%
  autoplot(ice_cream %>% 
             filter(!is.na(production)), level = c(80, 95)) +
  geom_line(aes(y = .mean), data = ets_fc, linetype = 1) +
  labs(title = "US Monthly Ice Cream Production: 1918-1940 with 4-year ETS Forecast",
       y = "Production (Millions of Gallons) ")

ets_plot

ETS Residuals

ets_fit %>% gg_tsresiduals()

The residuals for the ETS model show no distinct pattern and only modest autocorrelation (two significant lags).
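
A Ljung-Box test can back up the visual read; a sketch (lag = 24 spans two seasonal cycles for monthly data), and the same check applies to the other models' residuals below:

# Sketch: Ljung-Box test on the ETS innovation residuals
augment(ets_fit) %>%
  features(.innov, ljung_box, lag = 24)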

ARIMA Plot

ar_plot <- arst_fc %>%
  autoplot(ice_cream %>% 
             filter(!is.na(production)), level = c(80, 95)) +
  geom_line(aes(y = .mean), data = arst_fc, linetype = 1) +
  labs(title = "US Monthly Ice Cream Production: 1918-1940 with 4-year ARIMA Forecast",
       y = "Production (Millions of Gallons) ")

ar_plot

ARIMA Residuals

arst_fit %>% gg_tsresiduals()

The residuals look best for this model. They appear random, and only one lag is significantly autocorrelated.

Regression Plot

reg_plot <- reg_fc %>%
  autoplot(ice_cream %>% 
             filter(!is.na(production)), level = c(80, 95)) +
  geom_line(aes(y = .mean), data = reg_fc, linetype = 1) +
  labs(title = "US Monthly Ice Cream Production: 1918-1940 with 4-year Regression Forecast",
       y = "Production (Millions of Gallons) ")

reg_plot

Regression Residuals

reg_fit %>% gg_tsresiduals()

Overall, this model forecasts the data less accurately than the others. The residuals show a consistent pattern around the early 1930s and have significant autocorrelation, indicating that the model does not capture all of the structure in the series.

Ensemble Plot

ens_plot <- ens_fc %>% autoplot(ice_cream) + 
  geom_line(aes(y=.fitted, col=.model), data=augment(ens_fit)) +
  labs(title = "US Monthly Ice Cream Production: 1918-1940 with 4-year Ensemble Forecast",
       y = "Production (Millions of Gallons) ") +
  theme(legend.position = "bottom")

ens_plot

Ensemble Residuals

ens_fit %>% gg_tsresiduals()

This model appears to handle the trend in the data well but fails to account fully for seasonality, as seen in the strong seasonal pattern and autocorrelation of the residuals.

Evaluate Model Performance on the Test Set

# Use accuracy() to compare the models
compare <- accuracy(fc, test) 

compare %>%
  gt() %>%
  tab_header(title = "ETS, ARIMA, Regression, Ensemble Models")
ETS, ARIMA, Regression, Ensemble Models
.model      .type  ME           RMSE      MAE       MPE         MAPE       MASE  RMSSE  ACF1
ARIMA       Test   -2.37882565  4.348103  2.762233   -7.085696   9.627587  NaN   NaN    0.6956380
ETS         Test    0.16350884  2.081168  1.569081    2.928771   7.211945  NaN   NaN    0.4013132
Ensemble    Test    0.04061904  2.414650  1.902855    3.258873   8.753621  NaN   NaN    0.5441178
Regression  Test    2.33717392  3.028517  2.755360   13.933544  15.220644  NaN   NaN    0.3938588

Of the four models, the ETS model performs the best overall based on the important accuracy metrics. It has the lowest RMSE (2.08), MAE (1.57), and MAPE (7.21%), indicating that its forecasts are the most accurate and consistently closest to the actual observed values. While the Regression model has a slightly lower ACF1 value (0.39 vs. 0.40), suggesting slightly less autocorrelation in its residuals, its error metrics are much higher, meaning it is less precise overall. The ARIMA and Ensemble models both exhibit higher RMSE, MAE, and MAPE values, reflecting poorer predictive performance, as well as higher autocorrelation. Taken together, the ETS model provides the best balance of accuracy and residual independence, making it the most reliable model for forecasting this data.
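
One note on the table above: MASE and RMSSE are NaN because accuracy() was given only the test set, and those scaled metrics need the training data to compute their scaling denominator. A sketch of the fix, passing the full series instead:

# Sketch: supply the full series so MASE/RMSSE can be scaled by training errors
accuracy(fc, ice_cream) %>%
  gt() %>%
  tab_header(title = "Accuracy Evaluated Against the Full Series")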