Instructions: Please ensure that your graphs and visuals have proper titles and axis labels, where necessary. Refer to the output, whenever appropriate, when discussing the results. Lastly, remember that creativity (coupled with relevance) will be rewarded.

Question 1: Forecasting Delinquency Rate on Loans to Finance Agricultural Production

The FRED database has some valuable data on the U.S. and global macroeconomy. One such series is the “Delinquency Rate on Loans to Finance Agricultural Production, All Commercial Banks (DRFAPGACBN)”. More details of the series are available here: https://fred.stlouisfed.org/series/DRFAPGACBN

UI TRUST, a local bank, has approached you to forecast the delinquency rate of future loans. However, you have been provided with no additional guidance beyond this request. I have sent you an outline that I believe will be helpful in assisting you with this forecasting task.

  1. Using the quantmod package, pull the DRFAPGACBN series from the FRED website. Be sure to declare this series as a ts() object with the appropriate start and frequency arguments. Save the final ts variable and delrate.
getSymbols("DRFAPGACBN",src="FRED")
## [1] "DRFAPGACBN"
head(DRFAPGACBN)
##            DRFAPGACBN
## 1987-01-01      10.76
## 1987-04-01       8.15
## 1987-07-01       6.96
## 1987-10-01       6.87
## 1988-01-01       7.51
## 1988-04-01       5.67
start_year<-1987
start_quarter<-1
delrate <- ts(DRFAPGACBN, start = c(1987,1), frequency = 4)
plot(delrate,
     main = "Delinquency Rate on Loans for Agricultural Production",ylab = "Delinquency Rate (%)",xlab = "Year",col = "darkgreen",lwd = 2)

  1. Use appropriate graphs to visualize the data and briefly comment on any relevant properties that you deem pertinent.

Using the seasonal plot, there appears to be seasonality when you remove the years before 1990. As for the Trend,using the trend graph on the third graph before 1990 there appears to be a trend. After 1990 the data appears flat with a slight downward slope. But very slight.

Using the initial line graph and the seasonal graph it is additive because there is no sign of the sesonality amplifying over time.

Note:

(i) Be sure to explain how you used your graphs to arrive at your conclusion. (ii) I expect you to use/present at least three (3) graphs to supplement your answers.

autoplot(delrate) +ggtitle("Quarterly Delinquency Rate on Agricultural Loans (1987–Present)") +ylab("Delinquency Rate (%)") +xlab("Year") +theme_minimal()

ggseasonplot(delrate, year.labels = TRUE, year.labels.left = TRUE) +ggtitle("Seasonal Plot: Delinquency Rate by Quarter") +ylab("Delinquency Rate (%)") +xlab("Quarter") + theme_minimal()

  1. Subset the data into a training set, delrate.train, and a test set, delrate.test. Assign the last five (5) years of data to the test set.

Note: Instead of using mental math or the window command, I would like you to use the subset() function to split the data.

A quick Google search should help you with the syntax. My aim is for you to be comfortable exploring functions that you might not be familiar with from our class notes.

delrate.train <- subset(delrate, time(delrate) >= 1987 & time(delrate) < 2020)
delrate.test <- subset(delrate, time(delrate) >= 2020)
plot(delrate, main = "Delinquency Rate: Training vs. Test Sets", ylab = "Delinquency Rate (%)", xlab = "Year", col = "black", lwd = 2)
lines(delrate.train, col = "blue", lwd = 2)
lines(delrate.test, col = "red", lwd = 2)
legend("topright", legend = c("Full Series", "Training Set", "Test Set"), col = c("black", "blue", "red"), lwd = 2)

  1. Use the seasonal naïve model to produce forecasts over the test period. Recall that your data is quarterly.

It does not do well because it is based on the last data point. So it shows the seasonality, but fails to catch that dip in delinquencies.

n<-length(delrate)
delrate.train<-subset(delrate,start=1, end = (n-20))
delrate.test<-subset(delrate,start = (n-19), end = n)
del.snaive <- snaive(delrate.train, h = length(delrate.test))
autoplot(delrate, color = "black") +
  autolayer(delrate.train, series = "Training Data") +
  autolayer(delrate.test, series = "Actual Test Data") +
  autolayer(del.snaive$mean,series="Seasonal Naive Forecast") +
  ggtitle("Seasonal Naïve Forecast: Delinquency Rate on Agricultural Loans") +
  xlab("Year") +
  ylab("Delinquency Rate (%)") +
  scale_colour_manual(values = c("black", "blue", "red"))+ theme_minimal()+
  guides(colour = guide_legend(title = "Series"))

  1. Now, using a simple exponential smoothing model with initial = "optimal", produce the forecasts over the appropriate horizon.

This method just takes the last data point and uses it for the forecast. It does not capture seasonality or anything like the seasonal naive forecast. And since there is a dip in delinquencies, it misses the mark in capturing that shock like seasonal naive forecast.

  1. Print your model results from Part (5) and comment on the alpha value that the model chose. In particular, how does the model weigh the most recent observations in the data relative to the older ones?

With a .6238 the model is slightly weighing the most recent observations stronger than the older values.

You can modify the following syntax to display the model results: xxx$model.

  1. Using Holt's model, forecast the delinquency rate over the appropriate forecast horizon.
del.holt <- holt(delrate.train, h = length(delrate.test), initial = "optimal")
autoplot(delrate, color = "black") +
  autolayer(delrate.train, series = "Training Data") +
  autolayer(delrate.test, series = "Actual Test Data") +
  autolayer(del.holt$mean, series = "Holt Forecast") +
  ggtitle("Holt’s Linear Forecast: Delinquency Rate on Agricultural Loans") +
  xlab("Year") +
  ylab("Delinquency Rate (%)") +
  scale_colour_manual(values = c("black", "blue", "red")) +
  theme_minimal() +
  guides(colour = guide_legend(title = "Series"))

  1. What is your takeaway from your graph in Part (7)?

Holt is a smooth forecast, that does not take seasonality into account much. It seems to have the past trend in there, but indefinitely.

Was this model able to capture potential seasonality in the delinquency rates?

NO.

Do you believe the point at which you subset the data actually mattered for your model predictions above?

Yes, because it picked up the trend in the data. Maybe including the more recent data would have helped this forecast capture the shock in delinquency rates.

  1. Produce the appropriate forecast using a Holt-Winters' model with multiplicative seasonality. Store your model estimation results as del.hw then reproduce the graph above for this model’s results.
del.hw <- hw(delrate.train, seasonal = "multiplicative", h = length(delrate.test), initial = "optimal")
autoplot(delrate, color = "black") +
  autolayer(delrate.train, series = "Training Data") +
  autolayer(delrate.test, series = "Actual Test Data") +
  autolayer(del.hw$mean, series = "Holt-Winters Forecast") +
  ggtitle("Holt-Winters Forecast: Multiplicative Seasonality") +
  xlab("Year") +
  ylab("Delinquency Rate (%)") +
  scale_colour_manual(values = c("black", "blue", "red")) +
  theme_minimal() +
  guides(colour = guide_legend(title = "Series"))

  1. In a single graph, plot the full data and the point estimate of each of your models above. autoplot(delrate, color = “black”) + autolayer(del.snaive\(mean, series = "Seasonal Naïve") + autolayer(del.ses\)mean, series = “SES”) + autolayer(del.holt\(mean, series = "Holt") + autolayer(del.hw\)mean, series = “Holt-Winters”) + ggtitle(“Forecast Comparison: Seasonal Naïve, SES, Holt, and Holt-Winters”) + xlab(“Year”) + ylab(“Delinquency Rate (%)”) + scale_colour_manual(values = c( “black”, # Actual “red”, # Seasonal Naïve “blue”, # SES “green”, # Holt “orange” # Holt-Winters )) + theme_minimal() + guides(colour = guide_legend(title = “Model”))

  2. Which of the four (4) forecasting models (del.snaive, del.ses, del.holt or del.hw) provides the better forecast?

SES is the better forecast because it has the lowest RMSE, MAPE, and MSE.

Use the RMSE, MAPE, and MSE to make your justify your final determination. You are required to create a table using the kable command here. Please ensure that your table is correctly formatted to 2 decimal places and the models are appropriately labeled. acc_snaive <- accuracy(del.snaive, delrate.test) acc_ses <- accuracy(del.ses, delrate.test) acc_holt <- accuracy(del.holt, delrate.test) acc_hw <- accuracy(del.hw, delrate.test) model_comparison <- data.frame( Model = c(“Seasonal Naïve”, “SES”, “Holt”, “Holt-Winters”), RMSE = c(acc_snaive[“Test set”, “RMSE”], acc_ses[“Test set”, “RMSE”], acc_holt[“Test set”, “RMSE”], acc_hw[“Test set”, “RMSE”]), MAPE = c(acc_snaive[“Test set”, “MAPE”], acc_ses[“Test set”, “MAPE”], acc_holt[“Test set”, “MAPE”], acc_hw[“Test set”, “MAPE”]), MSE = c(acc_snaive[“Test set”, “RMSE”]^2, acc_ses[“Test set”, “RMSE”]^2, acc_holt[“Test set”, “RMSE”]^2, acc_hw[“Test set”, “RMSE”]^2)) model_comparison[ , 2:4] <- round(model_comparison[ , 2:4], 2) kable(model_comparison, caption = “Forecast Accuracy Comparison: Seasonal Naïve, SES, Holt, Holt-Winters”)

  1. Using the checkresiduals() command of your preferred model, comment on whether the residuals appear to be white noise. For your answers, be sure to discuss the null of the Ljung-Box statistics and your conclusion at the 1% and 5% levels of significance.

The P value is far below the 1% and 5% level of significance. So we reject the null. The residuals contain patterns meaning the model maybe missing trend, seasonality, or structure.

checkresiduals(del.ses)

  1. Write a brief nontechnical report to the Manager of UI TRUST reporting your findings.

Your report should include an explanation of how you arrived at the preferred model but try to avoid boring them with all the technical details involved.

In the main, this question is free-form, and my attempt to have you explain your model results to a non-scientific audience.

Well, I call this, explaining it to my grandma. What I would say is that we chose model 4, I will tell you that is SES but I wouldn’t tell her to keep things simple. I would just call it model 4 or maybe not mention that to keep down confusion. I would tell my grandmother that over the next three years that farmers will pay their loans the same way they always have, with cash. No, that is a joke to loosen up the executives. I would tell the executives that I would tell my grandma over the next three years the same amount of farmers that paid off their loans the last three years will pay their loans the next three years. Nothing will change. Things will stay the same. I hope this helps.