Forecasting - Part 2
2026-04-15
HW 9 is due on Wednesday, 4/15.
HW 10 is now posted and is due on Monday 4/27.
OPTIONAL GitHub Quarto Dashboard Workshop on Fri., 4/17, at 3:15 PM.
Lecture 27 (4/21): Come to class with a stock symbol that you are interested in.
Stock must be traded on the NYSE.
This NYSE Directory will help you.
Verify that stock can be found on Yahoo Finance.
Evaluations are VERY Important:
I will provide time in class for evaluations next week.
Please complete evaluations for ALL courses.
Review of Time Series Concepts
Brief Review of Time Series without Seasonality
Seasonality in Time Series Data
Forecasting Trends with Seasonality in R
HW 10 is now posted and is due on Monday, 4/27.
Guest Speaker - Steven Davis from DLA Piper will speak for 15 minutes.
Poll Everywhere - My User Name: penelopepoolereisenbies685
The AR in ARIMA stands for a type of regression in which you regress a variable on itself, using previous observations to predict future ones.
This is known as ___-regression.
Shows a Snapshot of One Time Period
Shows Trend over Time
auto-correlation: A variable is correlated with itself
auto-regression (AR): Using previous observations to predict into the future.
R function: auto.arima - ARIMA is an acronym:
AR: auto-regressive - p = number of lags to minimize auto-correlation
I: integrated - d = order of differencing to achieve stationarity
MA: moving average - q = number of terms in moving average
All 3 components are optimized to provide a reliable forecast.
Stationary Time Series:
Consistent mean and variance throughout time series
Time series with trends, or with seasonality, are not stationary.
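To illustrate how differencing (the "I" in ARIMA) can turn a non-stationary series into a stationary one, here is a minimal sketch in base R. The series is simulated, and the names shocks, walk, and steps are placeholders, not course data.

```r
# Simulate a random walk: each value is the previous value plus a random shock.
# A random walk wanders over time, so it does not have a consistent mean.
set.seed(1)
shocks <- rnorm(200)
walk <- ts(cumsum(shocks))

# First-order differencing (d = 1) recovers the period-to-period changes,
# which fluctuate around a constant mean with constant variance.
steps <- diff(walk)

# Compare the two series side by side.
par(mfrow = c(2, 1))
plot(walk, main = "Random walk (not stationary)")
plot(steps, main = "First difference (stationary)")
```

In this sketch a single round of differencing is enough; auto.arima chooses d automatically for real data.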
Separating a time series into different parts is how we analyze it
This is called DECOMPOSITION
Time Series Modeling decomposes the data into:
Trend
Seasonality (repeated pattern)
Residuals (what’s left over)
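As a rough illustration of decomposition, the sketch below uses base R's decompose function on AirPassengers, a monthly dataset that ships with R. This dataset is an assumption chosen for convenience and is not one of the course datasets.

```r
# AirPassengers: monthly airline passenger counts, built into R.
parts <- decompose(AirPassengers)  # additive decomposition by default

# The result holds the three pieces described above.
head(parts$trend, 12)     # trend (NA at the edges, where the moving average is undefined)
head(parts$seasonal, 12)  # seasonality: the same 12 values repeat every year
head(parts$random, 12)    # residuals: what is left over

# One plot with the observed data and all three components.
plot(parts)
```

For an additive decomposition, trend + seasonal + residuals adds back up to the original series wherever the trend is defined.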
NEW TERM: SARIMA MODEL
Lecture 25: ARIMA models
Today: ARIMA models with SEASONAL component.
SARIMA: Seasonal Auto-Regressive Integrated Moving Average.
SARIMA models:
Optimize p, d, and q for the whole time series
Also optimize p, d, and q within each season (repeating interval)
DECOMPOSITION
ARIMA models are decomposed into trend and residuals.
SARIMA models are decomposed into trend, seasonality, and residuals.
Dashed lines show peaks at irregular intervals.
Forecast Questions:
What will the estimated stock price be in April of 2027?
What ARIMA model was chosen (p,d,q)?
Model Assessment Questions:
How valid is our model?
How accurate are our estimates?
Examine Prediction Intervals and Prediction Bands
Check fit statistics
Stock Trend Forecast
Create time series using Netflix Stock data
Specify freq = 12 - 12 observations per year
Specify start = c(2010, 1) - first obs. in dataset is January 2010
Model data using auto.arima function
Specify ic = "aic" - AIC is the information criterion used to choose the model.
Specify seasonal = F - no seasonal (repeating) pattern in the data.
This code will create and save the model:
Create forecasts (until April 2027)
h = 12 indicates we want to forecast 12 months
Most recent date in forecast data is April 1, 2026
12 Months until April 1, 2027
Forecasts become less accurate the further into the future you specify.
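The steps above (create the time series, fit the model, forecast 12 months) can be sketched as follows. This assumes the forecast package is installed. Because the Netflix data file is not shown in these notes, the prices here are simulated placeholders, and the names fake_close, nflx_ts, nflx_model, and nflx_fcst are hypothetical.

```r
library(forecast)  # provides auto.arima() and forecast()

# Placeholder prices (assumption): a simulated random walk standing in for
# monthly Netflix prices from January 2010 through April 2026 (196 months).
set.seed(345)
fake_close <- 10 + cumsum(rnorm(196, mean = 0.4, sd = 3))

# Create the time series: 12 observations per year, starting January 2010.
nflx_ts <- ts(fake_close, frequency = 12, start = c(2010, 1))

# Fit a non-seasonal ARIMA model, using AIC to choose p, d, and q.
nflx_model <- auto.arima(nflx_ts, ic = "aic", seasonal = FALSE)

# Forecast 12 months ahead: May 2026 through April 2027.
nflx_fcst <- forecast(nflx_model, h = 12)
nflx_fcst        # point forecasts with 80% and 95% prediction bounds
plot(nflx_fcst)  # forecast plot with shaded prediction bands
```

With the real data, only the line that creates fake_close would change; the model chosen and the forecasts will differ from this simulated example.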
Darker purple: 80% Prediction Interval Bounds
Lighter purple: 95% Prediction Interval Bounds
Plot shows:
Based on the plot on the previous slide, we can tell how many previous periods (q) are included in each moving average in the model created by the auto.arima function in R.
What is q, the number of previous time periods in the moving average?
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
May 2026 96.28770 90.87356 101.7018 88.00749 104.5679
Jun 2026 97.88642 90.43508 105.3378 86.49058 109.2823
Jul 2026 99.36907 90.06138 108.6768 85.13418 113.6040
Aug 2026 100.03050 88.78655 111.2745 82.83436 117.2266
Sep 2026 99.90374 86.80360 113.0039 79.86881 119.9387
Oct 2026 99.59476 84.93684 114.2527 77.17741 122.0121
Nov 2026 99.72772 83.85743 115.5980 75.45620 123.9992
Dec 2026 100.48277 83.62884 117.3367 74.70690 126.2586
Jan 2027 101.54850 83.77874 119.3183 74.37199 128.7250
Feb 2027 102.44464 83.71553 121.1737 73.80094 131.0883
Mar 2027 102.90728 83.15997 122.6546 72.70638 133.1082
Apr 2027 103.04149 82.28462 123.7984 71.29660 134.7864
Interpretation of Netflix Prediction Intervals
In March of 2027, the 80% prediction interval for the Netflix stock price will be $____ wide.
To find this width, subtract the lower bound (Lo 80) from the upper bound (Hi 80) and round to the closest whole dollar.
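As a worked example of the same calculation for a different month, this short sketch uses the April 2027 row of the forecast output above. The variable names are placeholders.

```r
# April 2027 row from the forecast table: Lo 80 and Hi 80.
lo80 <- 82.28462
hi80 <- 123.7984

# Width of the 80% prediction interval = upper bound - lower bound.
width <- hi80 - lo80
width         # 41.51378
round(width)  # 42, so the interval is about $42 wide
```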
Top Plot: Spikes get larger over time
ACF: auto-correlation function.
Histogram: Distribution of residuals should be approx. normal
Assessment: Stock prices are very volatile, so this model fit is sufficient.
                       ME     RMSE      MAE       MPE     MAPE      MASE      ACF1
Training set 0.0007043361 4.159505 2.555861 -6.764627 13.76645 0.2189833 0.0568807
For BUA 345: We will use MAPE = Mean Absolute Percent Error
Despite increasing volatility, our stock price model is estimated to be 86.23% accurate.
This doesn’t guarantee that forecasts will be 86% accurate but it does improve our chances of accurate forecasting.
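The 86.23% figure comes from subtracting MAPE from 100. A minimal sketch of that arithmetic, using the MAPE value from the accuracy output above (variable names are placeholders):

```r
# MAPE from the accuracy output for the Netflix model.
mape <- 13.76645

# Percent accuracy as used in this course: 100 - MAPE.
pct_accuracy <- 100 - mape
round(pct_accuracy, 2)  # 86.23
```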
Alaska is very far north so there is
summer light (day and night)
winter darkness (day and night)
Alaska Electricity usage has a strong seasonal pattern.
Data are quarterly residential revenues:
Rows: 100
Columns: 2
$ Date <date> 2001-03-31, 2001-06-30, 2001-09-30, 2001-…
$ Revenue <dbl> 542.9275, 424.4111, 394.4869, 529.6425, 57…
Format of Time Series with Quarters:
head(ak_res_ts, 20) shows first 20 observations and format.
         Qtr1     Qtr2     Qtr3     Qtr4
2001 542.9275 424.4111 394.4869 529.6425
2002 570.5655 439.6120 408.6286 513.4108
2003 571.3253 440.5069 418.7918 556.3850
2004 612.2230 459.5118 430.6615 559.5087
2005 611.2410 447.8371 436.4646 566.1093
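To show how the frequency and start arguments produce this quarterly layout, here is a small sketch in base R that uses only the first two years of revenue values from the output above. The names revenue and ak_demo_ts are placeholders.

```r
# First eight quarterly revenues (2001 Q1 through 2002 Q4) from the table above.
revenue <- c(542.9275, 424.4111, 394.4869, 529.6425,
             570.5655, 439.6120, 408.6286, 513.4108)

# frequency = 4: four observations per year (quarters).
# start = c(2001, 1): the first observation is year 2001, period 1.
ak_demo_ts <- ts(revenue, frequency = 4, start = c(2001, 1))

ak_demo_ts         # prints with Qtr1 to Qtr4 columns and one row per year
start(ak_demo_ts)  # 2001 1
end(ak_demo_ts)    # 2002 4
```

The start argument is always written as c(year, period), where the period counts within the year according to frequency.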
If our time series from Alaska were augmented so that it started in February of 1990 (2nd month) and we had data by month (12 observations per year), how would our ts command change in R?
Hint: Our current data, ak_res, are quarterly and begin in the first quarter of 2001. The command we used to create the time series is:
ts(ak_res$Revenue, freq=4, start=c(2001,1))
ts(ak_res$Revenue, freq=1, start=c(1, 1990))
ts(ak_res$Revenue, freq=4, start=c(2, 1990))
ts(ak_res$Revenue, freq=12, start=c(1990, 2))
ts(ak_res$Revenue, freq=12, start=c(2, 1990))
ts(ak_res$Revenue, freq=4, start=c(1, 1990))
Incorrect Model: Ignores Seasonality (seasonal = F)
p, d, and q for full time series (0,0,4).
Correct Model: Includes Seasonality (seasonal = T)
p, d, and q for full time series (1,0,1) and within season (2,1,0)[4].
Our data are quarterly, with four observations per year, ending in the 4th quarter of 2025.
If the state of Alaska wants to extend the forecast until the Fall of 2027 (3rd Quarter), how would they change the R command?
Hint: Current forecast extends until the 4th quarter of 2026 and command is written as: forecast(ak_res_model2, h=4)
forecast(ak_res_model2, h=6)
forecast(ak_res_model2, h=7)
forecast(ak_res_model2, h=8)
forecast(ak_res_model2, h=9)
forecast(ak_res_model2, h=10)
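The forecast horizon h is the number of periods between the last observation and the last period you want forecasted. A minimal sketch of that count for quarterly data follows; quarters_ahead is a hypothetical helper, not part of the forecast package.

```r
# Count quarters from the last observed quarter to the last forecast quarter.
# Both arguments use the c(year, quarter) format that ts() uses for start.
quarters_ahead <- function(from, to) {
  (to[1] - from[1]) * 4 + (to[2] - from[2])
}

# Current setup: data end in 2025 Q4 and the forecast runs through 2026 Q4.
quarters_ahead(c(2025, 4), c(2026, 4))  # 4, matching h = 4 in the hint
```

The same count works for any target quarter: change the second argument and use the result as h.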
Incorrect Model: Less precise
| Year | Qtr | Pt | Lo95 | Hi95 |
|---|---|---|---|---|
| 2026 | 1 | 574.22 | 496.62 | 651.81 |
| 2026 | 2 | 457.56 | 379.41 | 535.71 |
| 2026 | 3 | 488.97 | 390.62 | 587.32 |
| 2026 | 4 | 550.52 | 450.82 | 650.21 |
Q4 Width = Hi95 - Lo95 = $650.21 - $450.82 = $199
Incorrect Model Forecasts and Prediction Bounds
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
2026 Q1 574.2158 523.4787 624.9530 496.6201 651.8116
2026 Q2 457.5595 406.4619 508.6571 379.4125 535.7066
2026 Q3 488.9728 424.6662 553.2795 390.6243 587.3214
2026 Q4 550.5177 485.3301 615.7052 450.8220 650.2134
Correct Model Forecasts and Prediction Bounds
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
2026 Q1 604.2656 585.0237 623.5075 574.8376 633.6935
2026 Q2 458.2622 437.2441 479.2802 426.1179 490.4065
2026 Q3 436.5829 414.4750 458.6908 402.7718 470.3940
2026 Q4 564.5713 541.7754 587.3672 529.7080 599.4346
Interpretation of 95% Prediction Bounds:
We are 95% certain that 4th qtr. revenue in 2026 will fall within:
Incorrect model: $450.82 to $650.21 (width: $650.21 - $450.82 = $199)
Correct model: $529.71 to $599.43 (width: $599.43 - $529.71 = $70)
Incorrect Model Accuracy:
| | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | 0.3034447 | 38.5874 | 31.8747 | -0.8747044 | 6.255521 | 2.420947 | 0.0359741 |
Correct Model Accuracy:
| | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | 0.9990916 | 14.40144 | 10.80413 | 0.1497032 | 2.012243 | 0.8205957 | -0.0141138 |
The correct model’s percent accuracy is about 98% (100 - MAPE = 100 - 2.01 = 97.99).
Always plot data, but if seasonality is difficult to discern, run both models and compare them.
Residuals (previous slide) and model accuracy (this slide) of models will indicate which model is correct.
R forecast package - simplifies forecasting.
Plot data FIRST: Check for seasonality, trend, and other patterns.
Lecture 27 (Tue. 4/21) - Come to class with a stock symbol from the NYSE that can be found on Yahoo Finance.
Lecture 28 (Thu. 4/23) - 20 min. of lecture with Poll Everywhere Questions then time for Q&A
Evaluations are VERY Important: coursefeedback.syr.edu
HW 10 includes material from Lectures 24-26
HW 9 was due on Wednesday, 4/15.
To submit an Engagement Question or Comment about material from Lecture 26: Submit it by midnight today (day of lecture).