Forecasting - Part 2
2025-04-17
HW 9 was due on Wednesday, 4/16.
HW 10 is now posted and is due on Monday 4/28.
Additional Practice Questions will be posted next week.
Lecture 27 (4/22): Come to class with a stock symbol that you are interested in.
Stock must be traded on NYSE
This NYSE Directory will help you.
Verify that stock can be found on Yahoo Finance.
Lecture 28 (4/24): Course Review and Q&A.
Evaluations are VERY Important:
I will end class a little early next week to give you time to complete evaluations in class.
Please complete evaluations for ALL courses.
Review of Time Series Concepts
Brief Review of Time Series without Seasonality
Seasonality in Time Series Data
Forecasting Trends with Seasonality in R
NEW PACKAGE FOR FORECASTING: forecast
HW 10 is now posted and is due on 4/28
Part of HW 10 pertains to today’s lecture.
Demo videos for HW 10 will be posted this weekend.
In-class Polling (Session ID: bua345s25)
The AR in ARIMA stands for a type of regression in which you regress a variable on itself, using previous observations to predict future ones.
This is known as ___-regression.
Shows a Snapshot of One Time Period
Shows Trend over Time
auto-correlation: A variable is correlated with itself
auto-regression (AR): Using previous observations to predict into the future.
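As a minimal sketch of these two ideas, the base R code below simulates a series in which each value depends on the one before it, measures its lag-1 auto-correlation, and then regresses each observation on the previous observation. The simulated data, the 0.7 coefficient, and the names `x`, `current`, and `previous` are illustrative assumptions, not part of the lecture data.

```r
# Simulated series in which each value depends on the previous one
# (illustrative data; the 0.7 coefficient is an assumption)
set.seed(345)
x <- arima.sim(model = list(ar = 0.7), n = 200)

# Auto-correlation: correlation of the series with itself one period earlier
lag1_acf <- acf(x, lag.max = 1, plot = FALSE)$acf[2]

# Auto-regression: regress each observation on the previous observation
current  <- as.numeric(x)[-1]
previous <- as.numeric(x)[-length(x)]
ar_fit   <- lm(current ~ previous)

lag1_acf                   # lag-1 auto-correlation
coef(ar_fit)["previous"]   # slope on the previous observation
```

Both numbers should land near the coefficient used in the simulation, which is the sense in which past values carry information about future ones.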
R function: auto.arima
ARIMA is an acronym:
AR: auto-regressive, p = number of lags to minimize auto-correlation
I: integrated, d = order of differencing to achieve stationarity
MA: moving average, q = number of terms in moving average
All 3 components are optimized to provide a reliable forecast.
Stationary Time Series:
Consistent mean and variance throughout time series
Time series with trends, or with seasonality, are not stationary.
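A small base R sketch of why a trend breaks stationarity and how first-order differencing (d = 1) can remove it. The series `trend_series` is made up for illustration.

```r
# A series with a steady upward trend is not stationary: its mean keeps rising
trend_series <- ts(10 + 3 * (1:12), frequency = 12, start = c(2020, 1))

# First-order differencing (d = 1) replaces each value with its change
# from the previous period
differenced <- diff(trend_series, differences = 1)

mean(window(trend_series, end = c(2020, 6)))    # mean of first six months
mean(window(trend_series, start = c(2020, 7)))  # mean of last six months (higher)
differenced                                     # same change every month
```

The original series has a different mean in each half, while the differenced series is constant, so its mean and variance no longer depend on time.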
Separating a time series into different parts is how we analyze it
This is called DECOMPOSITION
Time Series Modeling decomposes the data into:
Trend
Seasonality (repeated pattern)
Residuals (what’s left over)
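The three parts listed above can be sketched with the base R function `decompose`, which is not necessarily the tool used for the lecture plots. The quarterly series `revenue` is simulated for illustration and is not the course data.

```r
# Simulated quarterly series: upward trend + repeating pattern + noise
# (illustrative data only)
set.seed(345)
quarters <- 1:40
seasonal_pattern <- rep(c(60, -40, -70, 50), times = 10)
revenue <- ts(500 + 2 * quarters + seasonal_pattern + rnorm(40, sd = 10),
              frequency = 4, start = c(2001, 1))

# Additive decomposition: data = trend + seasonal + random (residuals)
parts <- decompose(revenue)

parts$figure        # one seasonal effect per quarter
head(parts$trend)   # smoothed trend (NA at the ends of the series)
head(parts$random)  # what is left over
plot(parts)         # stacked plot of data, trend, seasonal, and residuals
```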
NEW TERM: SARIMA MODEL
Lecture 25: ARIMA models
Today: ARIMA models with SEASONAL component.
SARIMA: Seasonal Auto-Regressive Integrated Moving Average.
SARIMA models:
optimize p, d, and q for whole time series
Also optimize p, d, and q within season (repeating intervals)
DECOMPOSITION
ARIMA models are decomposed into trend and residuals.
SARIMA models are decomposed into trend, seasonality, and residuals.
Dashed lines show peaks at irregular intervals.
Forecast Questions:
What will the estimated stock price be in April of 2026?
What ARIMA model was chosen (p,d,q)?
Model Assessment Questions:
How valid is our model?
How accurate are our estimates?
Examine Prediction Intervals and Prediction Bands
Check fit statistics
Stock Trend Forecast
Create time series using Netflix Stock data
Specify freq = 12: 12 observations per year
Specify start = c(2010, 1): first obs. in dataset is January 2010
Model data using the auto.arima function
Specify ic = "aic": aic is the information criterion used to determine the model.
Specify seasonal = F: no seasonal (repeating) pattern in the data.
This code will create and save the model:
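A minimal sketch of these steps, with simulated monthly prices standing in for the Netflix data. The names `sim_prices`, `stock_ts`, and `stock_model` are hypothetical, and the model chosen for simulated data will generally differ from the one reported in this lecture. In `auto.arima`, the argument that switches seasonality off is named `seasonal`.

```r
library(forecast)  # provides auto.arima()

# Simulated monthly prices, January 2010 through April 2025 (184 months).
# This is an assumption standing in for the Netflix stock data.
set.seed(345)
sim_prices <- 50 * exp(cumsum(rnorm(184, mean = 0.015, sd = 0.08)))

# Monthly time series: 12 observations per year, starting January 2010
stock_ts <- ts(sim_prices, frequency = 12, start = c(2010, 1))

# Let auto.arima choose p, d, and q using AIC, with no seasonal component
stock_model <- auto.arima(stock_ts, ic = "aic", seasonal = FALSE)

stock_model  # prints the chosen ARIMA(p,d,q) and its coefficients
```

Writing `freq = 12` instead of `frequency = 12` also works, because R matches the abbreviated argument name.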
Create forecasts (until April 2026)
h = 12 indicates we want to forecast 12 months
Most recent date in forecast data is April 1, 2025
12 Months until April 1, 2026
Forecasts become less accurate the further into the future you specify.
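The horizon `h` is the number of periods between the last observation and the last period to forecast. The helper below, `periods_ahead`, is a hypothetical convenience function written for this illustration. It is not part of the forecast package.

```r
# Hypothetical helper: count periods from the last observation to the target.
# `last` and `target` are c(year, period); `freq` is observations per year.
periods_ahead <- function(last, target, freq) {
  (target[1] - last[1]) * freq + (target[2] - last[2])
}

# Monthly data ending April 2025, forecast through April 2026
periods_ahead(last = c(2025, 4), target = c(2026, 4), freq = 12)  # 12

# Quarterly data ending Q4 2024, forecast through Q4 2025
periods_ahead(last = c(2024, 4), target = c(2025, 4), freq = 4)   # 4
```

The result is the value passed as `h`, for example `forecast(model, h = 12)` for a fitted model named `model`.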
Darker purple: 80% Prediction Interval Bounds
Lighter purple: 95% Prediction Interval Bounds
Plot shows: Auto-regressive lags (p = 0), Differencing (d = 1), Moving Average (q = 3)
Based on the plot on the previous slide, we can tell how many previous periods (q) are included in each moving average in the model created by the auto.arima function in R.
What is q, the number of previous time periods in the moving average?
Month | Point Forecast | Lo 80 | Hi 80 | Lo 95 | Hi 95 |
---|---|---|---|---|---|
May 2025 | 931.5097 | 887.9080 | 975.1115 | 864.8266 | 998.1929 |
Jun 2025 | 917.5952 | 854.4637 | 980.7266 | 821.0439 | 1014.1464 |
Jul 2025 | 909.0272 | 825.7946 | 992.2599 | 781.7339 | 1036.3206 |
Aug 2025 | 909.0272 | 805.4485 | 1012.6060 | 750.6172 | 1067.4372 |
Sep 2025 | 909.0272 | 788.4891 | 1029.5653 | 724.6801 | 1093.3744 |
Oct 2025 | 909.0272 | 773.6377 | 1044.4167 | 701.9668 | 1116.0876 |
Nov 2025 | 909.0272 | 760.2616 | 1057.7928 | 681.5099 | 1136.5446 |
Dec 2025 | 909.0272 | 747.9928 | 1070.0616 | 662.7463 | 1155.3081 |
Jan 2026 | 909.0272 | 736.5947 | 1081.4597 | 645.3145 | 1172.7400 |
Feb 2026 | 909.0272 | 725.9047 | 1092.1497 | 628.9655 | 1189.0889 |
Mar 2026 | 909.0272 | 715.8053 | 1102.2492 | 613.5197 | 1204.5347 |
Apr 2026 | 909.0272 | 706.2081 | 1111.8464 | 598.8421 | 1219.2124 |
Interpretation of Netflix Prediction Intervals
In February of 2026, the 80% prediction interval for the Netflix stock price will be $____ wide.
To find this width, subtract the lower bound (Lo 80) from the upper bound (Hi 80) and round to the closest whole dollar.
How to input your answer:
Round to closest whole dollar.
Don’t include dollar sign.
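The same calculation for a different month, April 2026, as a worked example in base R. The bounds are copied from the Apr 2026 row of the Netflix forecast output. The variable names are illustrative.

```r
# April 2026 bounds copied from the Netflix forecast output
lo80 <- 706.2081
hi80 <- 1111.8464
lo95 <- 598.8421
hi95 <- 1219.2124

# Width = upper bound minus lower bound, rounded to the closest whole dollar
width80 <- round(hi80 - lo80)
width95 <- round(hi95 - lo95)

width80  # 406
width95  # 620
```

The 95% interval is wider than the 80% interval for the same month because it has to cover more of the possible outcomes.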
Top Plot: Spikes get larger over time
ACF: auto-correlation function.
Histogram: Distribution of residuals should be approx. normal
Assessment: Stock prices are very volatile, so these residual diagnostics are sufficient.
Set | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
---|---|---|---|---|---|---|---|
Training set | 3.464136 | 33.6508 | 21.59716 | 1.309471 | 10.99867 | 0.213375 | -0.006420417 |
For BUA 345: We will use MAPE = Mean Absolute Percent Error
Despite increasing volatility, our stock price model is estimated to be 89% accurate (100% minus the MAPE of about 11%).
This doesn’t guarantee that forecasts will be 89% accurate but it does improve our chances of accurate forecasting.
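A short base R sketch of how MAPE is computed and how it converts to the percent accuracy used in this course. The `mape` helper and the three made-up observations are illustrative assumptions. The Netflix MAPE value is copied from the fit statistics.

```r
# Hypothetical helper: mean absolute percent error
mape <- function(actual, fitted) {
  mean(abs((actual - fitted) / actual)) * 100
}

# Made-up values: percent errors are 10%, 5%, and 0%
actual <- c(100, 200, 400)
fitted <- c(110, 190, 400)
example_mape <- mape(actual, fitted)
example_mape                     # average of 10, 5, and 0

# Netflix model: percent accuracy = 100 - MAPE
netflix_mape <- 10.99867
netflix_accuracy <- round(100 - netflix_mape)
netflix_accuracy                 # 89
```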
Alaska is very far north so there is
summer light (day and night)
winter darkness (day and night)
Alaska Electricity usage has a strong seasonal pattern.
Data are quarterly residential revenues:
Rows: 96
Columns: 2
$ Date <date> 2001-03-31, 2001-06-30, 2001-09-30, 2001-…
$ Revenue <dbl> 542.9275, 424.4111, 394.4869, 529.6425, 57…
Format of Time Series with Quarters:
head(ak_res_ts, 20) shows first 20 observations and format.
Year | Qtr1 | Qtr2 | Qtr3 | Qtr4 |
---|---|---|---|---|
2001 | 542.9275 | 424.4111 | 394.4869 | 529.6425 |
2002 | 570.5655 | 439.6120 | 408.6286 | 513.4108 |
2003 | 571.3253 | 440.5069 | 418.7918 | 556.3850 |
2004 | 612.2230 | 459.5118 | 430.6615 | 559.5087 |
2005 | 611.2410 | 447.8371 | 436.4646 | 566.1093 |
If our time series from Alaska were augmented so that it started in February of 1990 (2nd month) and we had data by month (12 observations per year), how would our ts command change in R?
Hint: Our current data, ak_res, are quarterly and begin in the first quarter of 2001. The command we used to create the time series is:
ts(ak_res$Revenue, freq=4, start=c(2001,1))
ts(ak_res$Revenue, freq=1, start=c(1, 1990))
ts(ak_res$Revenue, freq=4, start=c(2, 1990))
ts(ak_res$Revenue, freq=12, start=c(1990, 2))
ts(ak_res$Revenue, freq=12, start=c(2, 1990))
ts(ak_res$Revenue, freq=4, start=c(1, 1990))
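As a general illustration of how `ts` reads its arguments, the base R sketch below uses placeholder data and a different start date from the one in the question. `frequency` (abbreviated `freq` in the lecture commands) is the number of observations per year, and `start` is written as c(year, period within the year).

```r
values <- 1:8  # placeholder data, not the Alaska revenues

# Quarterly series starting in the 1st quarter of 2001
quarterly <- ts(values, frequency = 4, start = c(2001, 1))

# Monthly series starting in March (3rd month) of 2015
monthly <- ts(values, frequency = 12, start = c(2015, 3))

quarterly       # prints with Qtr1 to Qtr4 columns
monthly         # prints with month-name columns
start(monthly)  # year and period of the first observation
end(monthly)    # year and period of the last observation
```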
Incorrect Model: Ignores Seasonality (seasonal = F)
p, d, and q for full time series: (0,0,4).
Correct Model: Includes Seasonality (seasonal = T)
p, d, and q for full time series: (1,0,1), and within season: (2,1,0).
The model is written as ARIMA(1,0,1)(2,1,0)[4], where [4] is the number of observations per year.
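A hedged sketch of fitting both kinds of model with `auto.arima`, using simulated quarterly data in place of the Alaska revenues. The series `sim_revenue` and the model names are hypothetical, so the orders and fit statistics it produces will not match the ones reported here.

```r
library(forecast)  # provides auto.arima(), accuracy(), and forecast()

# Simulated quarterly series with a strong repeating pattern
# (an assumption standing in for the Alaska revenue data)
set.seed(345)
n_years <- 24
seasonal_pattern <- rep(c(60, -40, -70, 50), times = n_years)
sim_revenue <- ts(500 + 1.5 * seq_len(4 * n_years) + seasonal_pattern +
                    rnorm(4 * n_years, sd = 10),
                  frequency = 4, start = c(2001, 1))

# Same data, with and without a seasonal component
model_no_season <- auto.arima(sim_revenue, seasonal = FALSE)
model_season    <- auto.arima(sim_revenue, seasonal = TRUE)

model_no_season  # chosen (p,d,q)
model_season     # chosen (p,d,q) plus within-season terms and [4]

# Training-set MAPE for each model (lower is better)
accuracy(model_no_season)[, "MAPE"]
accuracy(model_season)[, "MAPE"]

# Four quarters of forecasts from the seasonal model
season_fc <- forecast(model_season, h = 4)
season_fc
```

Running both fits side by side is one way to check whether including seasonality improves the fit when the pattern is hard to see in a plot.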
Our data is quarterly and has four observations per year ending in the 4th quarter of 2024.
If the state of Alaska wants to extend the forecast until the Fall of 2026 (3rd Quarter), how would they change the R command?
Hint: Current forecast extends until the 4th quarter of 2025 and command is written as:
forecast(ak_res_model2, h=4)
forecast(ak_res_model2, h=6)
forecast(ak_res_model2, h=7)
forecast(ak_res_model2, h=8)
forecast(ak_res_model2, h=9)
forecast(ak_res_model2, h=10)
Incorrect Model: Less precise
Year | Qtr | Pt | Lo95 | Hi95 |
---|---|---|---|---|
2025 | 1 | 558.12 | 480.25 | 635.99 |
2025 | 2 | 464.34 | 385.69 | 542.99 |
2025 | 3 | 508.10 | 408.97 | 607.24 |
2025 | 4 | 513.99 | 413.64 | 614.35 |
Q4 Width = Hi - Lo = $201
Incorrect Model Forecasts and Prediction Bounds
Quarter | Point Forecast | Lo 80 | Hi 80 | Lo 95 | Hi 95 |
---|---|---|---|---|---|
2025 Q1 | 558.1209 | 507.2023 | 609.0395 | 480.2476 | 635.9942 |
2025 Q2 | 464.3424 | 412.9172 | 515.7677 | 385.6943 | 542.9906 |
2025 Q3 | 508.1026 | 443.2806 | 572.9247 | 408.9658 | 607.2394 |
2025 Q4 | 513.9923 | 448.3731 | 579.6115 | 413.6364 | 614.3481 |
Correct Model Forecasts and Prediction Bounds
Quarter | Point Forecast | Lo 80 | Hi 80 | Lo 95 | Hi 95 |
---|---|---|---|---|---|
2025 Q1 | 612.2996 | 592.9063 | 631.6930 | 582.6401 | 641.9592 |
2025 Q2 | 449.4930 | 428.3375 | 470.6485 | 417.1384 | 481.8475 |
2025 Q3 | 432.7463 | 410.5294 | 454.9631 | 398.7685 | 466.7240 |
2025 Q4 | 561.7324 | 538.8579 | 584.6070 | 526.7488 | 596.7160 |
Interpretation of 95% Prediction Bounds:
We are 95% certain that 4th qtr. revenue in 2025 will fall within:
Incorrect model: $413.64 to $614.35 (width = $614.35 - $413.64 ≈ $201)
Correct model: $526.75 to $596.72 (width = $596.72 - $526.75 ≈ $70)
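The comparison of the two models' precision can be reproduced in base R from the 2025 Q4 rows of the two forecast tables. The variable names are illustrative.

```r
# 2025 Q4 95% bounds copied from the two forecast tables
incorrect <- c(lo = 413.6364, hi = 614.3481)  # model that ignores seasonality
correct   <- c(lo = 526.7488, hi = 596.7160)  # model that includes seasonality

width_incorrect <- unname(incorrect["hi"] - incorrect["lo"])
width_correct   <- unname(correct["hi"] - correct["lo"])

round(width_incorrect)  # 201
round(width_correct)    # 70

# How many times wider the incorrect model's interval is
width_incorrect / width_correct
```

A narrower interval at the same confidence level means the seasonal model gives a more precise forecast for that quarter.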
Incorrect Model Accuracy:
Set | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
---|---|---|---|---|---|---|---|
Training set | 0.326926 | 38.68311 | 31.71156 | -0.8681514 | 6.216107 | 2.406423 | 0.0289635 |
Correct Model Accuracy:
Set | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
---|---|---|---|---|---|---|---|
Training set | 0.9936991 | 14.48846 | 10.7029 | 0.1383641 | 1.989122 | 0.8121866 | -0.0175043 |
The correct model’s percent accuracy is 98%.
Always plot data, but if seasonality is difficult to discern, run both models and compare them.
Residuals (previous slide) and model accuracy (this slide) of models will indicate which model is correct.
R forecast package simplifies forecasting.
Plot data FIRST: Check for seasonality, trend, other patterns
Lecture 27 (Tue. 4/22) - Come to class with a stock symbol from the NYSE that can be found on Yahoo Finance.
Lecture 28 (Thu. 4/24) - 20 min. of lecture with Point Solutions, then Q&A
Evaluations are VERY Important: coursefeedback.syr.edu
HW 10 includes material from Lectures 24-26
HW 9 was due on 4/16.
Including today, there are three lectures and engagement questions remaining.
To submit an Engagement Question or Comment about material from Lecture 26: Submit it by midnight today (day of lecture).