First, split your time series into a training and a test set, such that you are training on approximately 80% of the data and testing on approximately 20% of the data. Visualize the training and test sets in a single plot.
Code
# Set the seed for reproducibility
set.seed(123)

# Determine the number of rows for training (80%) and testing (20%)
n_rows <- nrow(milk_price_tbl_ts)
train_rows <- round(0.8 * n_rows)

# Split the data into training and testing sets
train_data <- milk_price_tbl_ts %>% slice(1:train_rows)
test_data  <- milk_price_tbl_ts %>% slice((train_rows + 1):n_rows)

# Visualize the training and test sets
ggplot() +
  geom_line(data = train_data, aes(x = date, y = value, color = "Training"), linewidth = 1) +
  geom_line(data = test_data, aes(x = date, y = value, color = "Test"), linewidth = 1) +
  scale_color_manual(values = c("Training" = "blue", "Test" = "red")) +
  labs(title = "Training and Test Sets", x = "Date", y = "Milk Price") +
  theme_minimal()
Does it appear that the test set is representative of the training set? The test set does not seem to accurately represent the training set. While the training set exhibits variation over the years with an upward trend, the test set displays minimal variation but features a steep upward trend.
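A quick numeric check of this visual impression could compare the level and spread of the two splits. The sketch below is illustrative only and assumes the price column is named `value`, as in the plotting code above.

library(dplyr)

# Illustrative check: compare the level and spread of the training and test splits
bind_rows(
  train_data %>% as_tibble() %>% summarise(set = "Training", mean = mean(value), sd = sd(value)),
  test_data  %>% as_tibble() %>% summarise(set = "Test",     mean = mean(value), sd = sd(value))
)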
Section 2 - Cross-Validation Scheme
Next, set up a rolling window cross-validation scheme using stretch_tsibble. Be sure to make appropriate choices on the initial training period and the interval at which you will step through the data considering the length of your time series.
Code
# Rolling-window CV: start with 7 years of monthly data, then extend the window by 12 months per iteration
milk_train_cv <- train_data %>%
  stretch_tsibble(.init = 7 * 12, .step = 12)

milk_train_cv %>%
  ggplot() +
  geom_point(aes(date, factor(.id), color = factor(.id))) +
  ylab('Iteration') +
  ggtitle('Samples included in each CV Iteration')
Section 3 - Model Selection and Comparison
Fit the selected best ARIMA from Assignment 3 and a naive model to each fold in the cross-validation scheme you created. Then, produce a forecast of each model for each fold. Visualize the actual versus predicted for each cross-validation iteration. Does it seem like one model is likely to outperform the other?
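The cross-validated forecasts used below (`milk_train_cv_forecast`) are not created in any code shown in this document. A minimal sketch of one way to produce them with fable follows; the automatic `ARIMA()` call is a stand-in for the specific ARIMA specification selected in Assignment 3, and the 12-month horizon is an assumption matching the monthly step used above.

library(fable)       # ARIMA() and NAIVE() model definitions
library(fabletools)  # model() and forecast()

# Sketch: fit both candidate models to every CV fold, then forecast 12 months ahead.
# Replace ARIMA(value) with the specification chosen in Assignment 3.
milk_train_cv_forecast <- milk_train_cv %>%
  model(
    arima = ARIMA(value),  # stand-in for the Assignment 3 model
    naive = NAIVE(value)
  ) %>%
  forecast(h = 12)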
Code
# Assumes milk_train_cv_forecast has already been computed
milk_train_cv_forecast %>%
  as_tibble() %>%
  select(-value) %>%
  left_join(train_data, by = "date") %>%
  ggplot() +
  geom_line(aes(date, value)) +
  geom_line(aes(date, .mean, color = factor(.id), linetype = .model)) +
  scale_color_discrete(name = 'Iteration') +
  ylab('Milk Prices') +
  theme_bw()
Code
# Average RMSE at each forecast horizon, across CV folds
milk_train_cv_forecast %>%
  group_by(.id, .model) %>%
  mutate(h = row_number()) %>%
  ungroup() %>%
  as_fable(response = "value", distribution = value) %>%
  accuracy(train_data, by = c("h", ".model")) %>%
  ggplot(aes(x = h, y = RMSE, color = .model)) +
  geom_point() +
  geom_line() +
  ylab('Average RMSE at Forecasting Intervals') +
  xlab('Months in the Future')
Code
# Average MAPE at each forecast horizon, across CV folds
milk_train_cv_forecast %>%
  group_by(.id, .model) %>%
  mutate(h = row_number()) %>%
  ungroup() %>%
  as_fable(response = "value", distribution = value) %>%
  accuracy(train_data, by = c("h", ".model")) %>%
  mutate(MAPE = MAPE / 100) %>%  # rescale to a proportion for percent axis labels
  ggplot(aes(x = h, y = MAPE, color = .model)) +
  geom_point() +
  geom_line() +
  theme_bw() +
  scale_y_continuous(name = 'Average MAPE at Forecasting Intervals', labels = scales::percent)
The training set contains approximately 23 years of milk price data, comprising 273 monthly observations. At some horizons the naive model outperforms the ARIMA model, and in a few instances the ARIMA model surpasses the naive model. On the whole, however, the naive model performs better than the ARIMA model across most forecast horizons.
Compare the performance of the two models using the following metrics: RMSE, MAE, MAPE, and MASE. Which model performs better? Does the ARIMA model outperform the naive model? Does this match your intuition from the visualizations? Be sure to include visualizations and tables as appropriate, including an assessment of how the models perform at different forecasting horizons.
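The overall accuracy table below can be obtained by scoring the cross-validated forecasts against the training data; a short sketch, assuming `milk_train_cv_forecast` is a fable as above:

# Average accuracy of each model over all CV folds and horizons
milk_train_cv_forecast %>%
  accuracy(train_data) %>%
  arrange(.model)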
   .model .type         ME      RMSE       MAE       MPE     MAPE      MASE     RMSSE      ACF1
1:  arima  Test 0.04408712 0.1755522 0.1149873 1.1320650 3.373057 0.5067378 0.5676221 0.8262086
2:  naive  Test 0.02227846 0.1589794 0.1050839 0.5305063 3.109109 0.4630945 0.5140364 0.8251922
The graph shows that the naive model performs better than the ARIMA model across most forecast horizons. This is reinforced by the accuracy table, where the naive model has lower RMSE, MAE, MAPE, and MASE than the ARIMA model. At a few horizons the ARIMA model does edge ahead, but overall the naive model is the stronger performer.
Section 4
After identifying the model that performed the best in Section 3, refit that model to the entire training set and produce a forecast for the test set. Visualize the actual versus predicted for the test set, and recalculate your performance metrics on the test set for this selected model.
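The refit, test-set forecast, and test-set metrics are not shown in the code above; the sketch below is one way this could be done, assuming the naive model was the winner from Section 3 and that the series column is `value`. The full series is passed to `accuracy()` so that scaled measures such as MASE can be computed.

# Refit the selected model (naive, per the Section 3 comparison) to the entire training set
Milk_model <- train_data %>%
  model(naive = NAIVE(value))

# Forecast over the test period
test_forecast <- Milk_model %>%
  forecast(new_data = test_data)

# Visualize actual versus predicted for the test set
autoplot(test_forecast) +
  autolayer(test_data) +
  labs(title = "Actual vs Forecasted Values for Test Set", y = "Milk Prices", x = "Date") +
  theme_minimal()

# Recalculate the performance metrics on the test set
test_forecast %>%
  accuracy(milk_price_tbl_ts)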
The forecast fails to replicate the variability observed in the test set. However, it manages to capture the subtle trend evident in the test set.
Does it seem like the model is overfitting or underfitting? Be sure to include visualizations and tables as appropriate, as well as a text description of the performance of the model using appropriate metrics.
Code
# Produce forecasts for the training set
train_forecast <- Milk_model %>%
  forecast(new_data = train_data)

# Visualize actual versus forecasted values for the training set
autoplot(train_forecast) +
  autolayer(train_data) +
  labs(title = "Actual vs Forecasted Values for Training Set", y = "Milk Prices", x = "Date") +
  theme_minimal()
Plot variable not specified, automatically selected `.vars = milk_price`
To assess whether the model is underfitting or overfitting, I examined its bias by checking how closely the model's forecasts track the training data itself. Plotting actual versus forecasted values reveals high bias on the training data, and comparison with the test data shows low variance; high bias combined with low variance is indicative of underfitting.
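One way to put numbers behind this bias/variance reading is to compare the model's in-sample accuracy with the test-set metrics computed earlier; a brief sketch, assuming `Milk_model` is the mable refitted in this section:

# In-sample (training) accuracy of the refitted model; comparing these values with the
# test-set metrics above helps separate bias (poor fit on the training data) from variance
Milk_model %>%
  accuracy()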