For Part 1 of this week’s discussion, I chose the example of an MLB scout evaluating high school or college baseball prospects for the amateur draft. This is a situation where judgmental forecasting can be very useful because there is often limited data available, especially for high school players. Expert opinions play a significant role in the scouting process because scouts typically have years of experience evaluating talent and many have backgrounds as former players, coaches, or evaluators. Their knowledge allows them to identify qualities that may not be fully reflected in statistics alone.

The Delphi Method can also be applied in professional baseball. Front offices often gather input from multiple scouts and decision-makers before making a draft selection. By collecting independent evaluations and comparing opinions, organizations can reduce individual biases and make more informed decisions.

Scenario analysis is another important tool. For example, two college players may have similar statistics, but one may have played against significantly stronger competition. Evaluating multiple scenarios helps teams determine how a player’s performance may translate to higher levels of competition.

One limitation of judgmental forecasting is that it is susceptible to personal biases and subjective opinions. Scouts may overvalue certain traits or rely too heavily on their intuition. To reduce these biases, teams can combine scouting evaluations with statistical analysis. Metrics such as exit velocity, bat speed, pitch velocity, and strikeout rates can provide objective evidence to support scouting reports. Teams may also evaluate a player’s character, work ethic, and ability to respond to coaching, factors that are difficult to measure with data alone.

Judgmental forecasting is most useful when there is limited historical data available or when important information cannot be easily quantified. In baseball scouting, combining judgmental forecasting with statistical methods often produces the best results. Statistical analysis provides objective measurements of performance, while scouts contribute valuable context regarding a player’s physical tools, mindset, and long-term development potential. The primary risk of relying too heavily on human intuition is that decisions may be influenced by personal biases or incomplete observations, which can ultimately lead to poor forecasting outcomes.
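
To make the idea of pooling independent scout opinions more concrete, here is a minimal sketch in base R. Everything in it is an assumption for illustration: the five scouts, their grades on the 20-80 scouting scale, the two rating rounds, and the helper `summarise_round()` are all hypothetical, and this is only one simple way a Delphi-style comparison could be summarised, not how any front office actually does it.

```r
# Hypothetical 20-80 scouting grades for one prospect (made-up numbers).
# round1: each scout grades independently.
# round2: scouts re-grade after seeing an anonymous summary of round 1.
round1 <- c(scout_a = 45, scout_b = 60, scout_c = 55, scout_d = 70, scout_e = 50)
round2 <- c(scout_a = 50, scout_b = 55, scout_c = 55, scout_d = 60, scout_e = 55)

# Hypothetical helper: the group's central grade and how much scouts disagree.
summarise_round <- function(grades) {
  g <- unname(grades)
  c(median = median(g), iqr = IQR(g), range = diff(range(g)))
}

consensus <- rbind(
  round1 = summarise_round(round1),
  round2 = summarise_round(round2)
)
print(consensus)
```

The median gives a group grade that is not dragged around by a single enthusiastic or skeptical scout, and a shrinking range between rounds is one rough sign that opinions are converging.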

library(fpp3)
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.3 ──
## ✔ tibble      3.3.1     ✔ tsibble     1.2.0
## ✔ dplyr       1.1.4     ✔ tsibbledata 0.4.1
## ✔ tidyr       1.3.2     ✔ ggtime      0.2.0
## ✔ lubridate   1.9.4     ✔ feasts      0.5.0
## ✔ ggplot2     4.0.2     ✔ fable       0.5.0
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date()    masks base::date()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval()  masks lubridate::interval()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ tsibble::setdiff()   masks base::setdiff()
## ✖ tsibble::union()     masks base::union()
head(vic_elec)
## # A tsibble: 6 x 5 [30m] <Australia/Melbourne>
##   Time                Demand Temperature Date       Holiday
##   <dttm>               <dbl>       <dbl> <date>     <lgl>  
## 1 2012-01-01 00:00:00  4383.        21.4 2012-01-01 TRUE   
## 2 2012-01-01 00:30:00  4263.        21.0 2012-01-01 TRUE   
## 3 2012-01-01 01:00:00  4049.        20.7 2012-01-01 TRUE   
## 4 2012-01-01 01:30:00  3878.        20.6 2012-01-01 TRUE   
## 5 2012-01-01 02:00:00  4036.        20.4 2012-01-01 TRUE   
## 6 2012-01-01 02:30:00  3866.        20.2 2012-01-01 TRUE
# Fit a time series linear regression of Demand on Temperature, a linear trend, and seasonality
fit <- vic_elec %>%
  model(
    tslm = TSLM(Demand ~ Temperature + trend() + season())
  )
report(fit)
## Series: Demand 
## Model: TSLM 
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1828.81  -681.44   -18.62   551.63  3840.11 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.158e+03  1.356e+01 306.550   <2e-16 ***
## Temperature  3.952e+01  6.483e-01  60.962   <2e-16 ***
## trend()     -5.102e-03  2.416e-04 -21.120   <2e-16 ***
## season()    -3.015e+00  7.331e+00  -0.411    0.681    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 840.8 on 52604 degrees of freedom
## Multiple R-squared: 0.07518, Adjusted R-squared: 0.07513
## F-statistic:  1425 on 3 and 52604 DF, p-value: < 2.22e-16
glance(fit)
## # A tibble: 1 × 15
##   .model r_squared adj_r_squared  sigma2 statistic p_value    df  log_lik    AIC
##   <chr>      <dbl>         <dbl>   <dbl>     <dbl>   <dbl> <int>    <dbl>  <dbl>
## 1 tslm      0.0752        0.0751 706928.     1425.       0     4 -428926. 7.09e5
## # ℹ 6 more variables: AICc <dbl>, BIC <dbl>, CV <dbl>, deviance <dbl>,
## #   df.residual <int>, rank <int>
# Forecast one week ahead (7 days x 48 half-hourly steps), holding Temperature at its historical mean
future_data <- new_data(vic_elec, 7 * 48) %>%
  mutate(
    Temperature = mean(vic_elec$Temperature, na.rm = TRUE)
  )

fc <- fit %>%
  forecast(new_data = future_data)

fc %>%
  autoplot(vic_elec)

# Training-set (in-sample) accuracy; no test set is held out here
fit %>%
  accuracy()
## # A tibble: 1 × 10
##   .model .type           ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>  <chr>        <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 tslm   Training -2.80e-14  841.  694. -3.25  15.4  3.26  2.97 0.984
# Residual diagnostics: time plot, ACF, and histogram of the residuals
fit %>%
  gg_tsresiduals()

The R-squared value is low, at approximately 0.075: only about 7.5% of the variation in electricity demand is explained by the variables in the model (Temperature, trend, and seasonality). These variables have some predictive power, but most of the variation in demand is driven by factors the model does not capture. The regression results show that both Temperature and trend were statistically significant predictors of electricity demand, with p-values below 0.001. The seasonality term was not statistically significant (p = 0.681), indicating that it contributed little explanatory power in this model. The training-set RMSE was approximately 841, which means the typical gap between fitted and actual demand was on the order of 841 units. Because RMSE squares the errors before averaging, it weights large misses more heavily than the MAE of about 694, which is the average absolute gap. Both are in-sample figures, so errors on new data may be larger. The lag-1 autocorrelation of the residuals (ACF1 = 0.984) is also very high, a sign that the model leaves a lot of time-dependent structure unexplained. Overall, the model was statistically significant (F-test p-value < 2.22e-16), but the low R-squared suggests that additional predictors are needed to better explain changes in electricity demand.
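
As a small worked example of how these summary measures relate to the residuals, the base R sketch below computes RMSE, MAE, and R-squared by hand. The five `actual` and `fitted` values are made-up toy numbers, not taken from `vic_elec`, and the R-squared line uses the textbook definition (one minus the ratio of residual to total sum of squares), which matches the regression R-squared only for a least-squares fit with an intercept.

```r
# Toy numbers for illustration only (hypothetical, not from vic_elec)
actual <- c(4200, 4500, 3900, 4800, 5100)
fitted <- c(4300, 4400, 4100, 4600, 4700)

resid <- actual - fitted  # -100, 100, -200, 200, 400

# RMSE: square the errors, average them, then take the square root
rmse <- sqrt(mean(resid^2))

# MAE: average of the absolute errors
mae <- mean(abs(resid))

# R-squared: share of total variation not left in the residuals
r2 <- 1 - sum(resid^2) / sum((actual - mean(actual))^2)

round(c(RMSE = rmse, MAE = mae, R2 = r2), 3)
```

In this toy case the single large miss of 400 pulls the RMSE above the MAE, which is the same pattern seen in the model's accuracy table (RMSE of about 841 versus MAE of about 694).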

Overall, I think both methods are useful depending on the situation. In the MLB scouting example, judgmental forecasting is valuable because scouts can evaluate things that statistics cannot fully measure, such as work ethic, leadership, and long-term potential. However, it can also be influenced by personal biases and subjective opinions. The time series regression model provides a more objective approach because it relies on data. In my model, temperature and trend were significant predictors of electricity demand, but the low R² value showed that much of the variation was still unexplained. This suggests that additional variables may be needed to improve the model. Personally, I think the best approach is combining both methods, using data to support decisions while also considering expert knowledge and context.

I would lean more toward judgmental forecasting because the time series regression model produced a low R-squared value of 0.075, meaning it explained only about 7.5% of the variation in electricity demand. The model identified some relationships, but most of the variation remained unexplained. In practice, I would combine both approaches by using the statistical model as a starting point and then applying expert judgment to account for factors that are not captured in the data. This would allow me to take advantage of both quantitative analysis and real-world context.
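
One possible way to use the statistical model as a starting point and layer judgment on top is scenario-based forecasting, where an expert supplies the assumptions about future conditions and the model turns them into demand forecasts. The sketch below assumes the fpp3 packages are installed and refits the same model specification used above. The scenario names (`mild`, `hot`) and the temperature values of 18 and 35 degrees Celsius are hypothetical choices for illustration, not values taken from the analysis.

```r
library(fpp3)

# Same model specification as in the post
fit <- vic_elec %>%
  model(tslm = TSLM(Demand ~ Temperature + trend() + season()))

h <- 7 * 48  # one week of half-hourly steps

# Hypothetical expert-supplied temperature assumptions (degrees Celsius)
future_scenarios <- scenarios(
  mild = new_data(vic_elec, h) %>% mutate(Temperature = 18),
  hot  = new_data(vic_elec, h) %>% mutate(Temperature = 35),
  names_to = "Scenario"
)

# One set of forecasts per scenario
fc_scen <- fit %>%
  forecast(new_data = future_scenarios)

# Average point forecast under each scenario
fc_scen %>%
  as_tibble() %>%
  group_by(Scenario) %>%
  summarise(mean_forecast = mean(.mean))
```

Compared with holding Temperature at its historical mean, this makes the judgmental input explicit: the expert's view enters through the scenario values, and the model's job is limited to translating those assumptions into demand.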