Overview

Last year I built a model to predict the results of the 2018 NCAA Men’s Basketball tournament against the spread (ATS), and it picked the right side 54.8% of the time. I am trying again this year and again keeping track of the results. This document keeps a running record of all the predictions and the model’s accuracy.

Getting Started

Helper Functions

Seven tasks are repeated in each round:

  1. Add the model’s predictions for each game to the bracket
  2. Gather and add the sportsbook spread for each game
  3. Determine whether the model’s prediction differs enough from the sportsbook spread to choose a wager
  4. Gather and add the final scores of the games
  5. Determine the ATS winner of each game and the model’s accuracy
  6. Advance the game winners into the next round
  7. Create output tables

Seven helper functions, one per task, accomplish this (not shown).

Gathering of Spreads from Internet

Throughout the tournament, spreads were gathered manually from the internet and entered into these tables, along with the date and time each spread was recorded. A spread is positive if the stronger seed is favored and negative if the weaker seed is favored. (Not shown)
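Since the spreads tables themselves are not shown, here is a minimal sketch of what one could look like. The column names (`SpreadTime`, `Favorite`) and the helper `describe_spread()` are my own assumptions for illustration; only the four play-in spread values come from this document.

```r
# Hypothetical layout for a manually entered spreads table.
# Column names are assumptions chosen to match the bracket columns used in this document.
playinspreads_example <- data.frame(
  StrongSeed = c("W16a", "X16a", "X11a", "W11a"),
  WeakSeed   = c("W16b", "X16b", "X11b", "W11b"),
  Spread     = c(4.5, 2.0, 1.5, 3.5),  # positive: the StrongSeed team is favored
  stringsAsFactors = FALSE
)

# Record when the lines were read (here: the moment the table is created)
playinspreads_example$SpreadTime <- Sys.time()

# Translate the sign convention into a plain-language label
describe_spread <- function(spread) {
  ifelse(spread > 0, "strong seed favored",
         ifelse(spread < 0, "weak seed favored", "pick'em"))
}

playinspreads_example$Favorite <- describe_spread(playinspreads_example$Spread)
print(playinspreads_example)
```

Keeping the timestamp next to each spread makes it possible to tell later which version of a line a prediction was compared against.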

Collecting Game Results from Internet

Throughout the tournament, the game results were gathered manually from the internet and entered into these tables. This task should eventually be automated. (Not shown)

Creating Initial 2019 Bracket

The prediction model is an XGBoost model with 5-fold cross validation. The cleaning and preparation of the data and the training and testing of that model are in a script which can be found here [not releasing at the moment]. The predictions for all possible pairs of opponents are saved in the file “saferesults2019.Rds”. The predictions are not Bayesian; that is, they are not updated as the tournament progresses and are based strictly on pre-tournament performance.
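To illustrate the "all possible pairs" idea, here is a minimal base-R sketch that enumerates every pairing of a small set of teams and round-trips the result through an .Rds file. The four-team subset, the column layout, and the placeholder `Prediction` values are assumptions; this is not the structure of the actual “saferesults2019.Rds” file.

```r
# Hypothetical sketch: one row per possible pairing, saved to and read back from an .Rds file.
teams <- c("Duke", "Kentucky", "Gonzaga", "Virginia")  # illustrative subset only

# combn() enumerates every unordered pair of teams
pairs <- t(combn(teams, 2))
all_pairs <- data.frame(
  TeamName.x = pairs[, 1],
  TeamName.y = pairs[, 2],
  Prediction = NA_real_,  # placeholder: real values would come from the trained model
  stringsAsFactors = FALSE
)

# Write to a temporary file rather than overwriting any real results file
path <- tempfile(fileext = ".Rds")
saveRDS(all_pairs, path)
restored <- readRDS(path)

nrow(restored)  # choose(4, 2) = 6 pairings
```

Because every pairing is scored before the tournament starts, later rounds only need a lookup into this table once the winners are known.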

Play-in Round

Model Predictions

First, we append the model’s predicted spreads, the sportsbook spreads, and the differences between the two to the bracket.

# Append model predicted spreads to bracket
playin <- predict_ncaa_round(round = 0)

# Append actual spreads from sportsbook
playin <- attachSpreads(playin, playinspreads)

In the ModelChoice variable, the model returns “No choice” if its prediction is within 2 points of the sportsbook spread; otherwise, it returns the team that it predicts will cover the spread.

# Determine who model would bet on ATS
playin <- ATSchoice(playin)

# Output of Model Predictions ATS
predictionTable(playin)
StrongSeed  WeakSeed  TeamName.x   TeamName.y    Prediction  Spread  ModelChoice  Diff
W16a        W16b      N Dakota St  NC Central    -1.2         4.5    NC Central   -5.7
X16a        X16b      F Dickinson  Prairie View   4.8         2.0    F Dickinson   2.8
X11a        X11b      Arizona St   St John’s     -0.9         1.5    St John’s    -2.4
W11a        W11b      Belmont      Temple         2.0         3.5    No choice    -1.5
Note:
A positive value in Prediction or Spread indicates TeamName.x being favored by that many points. A negative value indicates TeamName.y being favored by that many points.
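The ModelChoice rule described above can be sketched in a few lines. This is a hypothetical stand-in, not the actual `ATSchoice()` helper, and it assumes that a difference of exactly 2 points still counts as “within 2 points”.

```r
# Hypothetical version of the selection rule; not the author's ATSchoice().
choose_ats <- function(df, threshold = 2) {
  # Diff > 0: the model likes TeamName.x more than the sportsbook does
  df$Diff <- round(df$Prediction - df$Spread, 1)
  # Assumption: a difference of exactly `threshold` is treated as "No choice"
  df$ModelChoice <- ifelse(abs(df$Diff) <= threshold, "No choice",
                           ifelse(df$Diff > 0, df$TeamName.x, df$TeamName.y))
  df
}

# Play-in values copied from the table above
playin_example <- data.frame(
  TeamName.x = c("N Dakota St", "F Dickinson", "Arizona St", "Belmont"),
  TeamName.y = c("NC Central", "Prairie View", "St John's", "Temple"),
  Prediction = c(-1.2, 4.8, -0.9, 2.0),
  Spread     = c(4.5, 2.0, 1.5, 3.5),
  stringsAsFactors = FALSE
)

choose_ats(playin_example)[, c("TeamName.x", "TeamName.y", "Diff", "ModelChoice")]
```

With these inputs the sketch reproduces the Diff and ModelChoice columns of the play-in table.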

The model recommends NC Central and, to a lesser degree, Fairleigh Dickinson and St. John’s. For consistent record keeping, we will count the latter two as official recommendations as well.

Play-in Game Results and Model Performance

# Append Final Scores of Games
playin <- attachGameResults(playin, playinresults)

# Determine ATS winner and Model's Accuracy ATS
playin <- ATSresults(playin)

# Table of all outcomes
resultsTable(playin)
StrongSeed  TeamName.x   WeakSeed  TeamName.y    Prediction  Spread  ModelChoice  Team.x.score  Team.y.score  ATSWinner    Accuracy
W11a        Belmont      W11b      Temple         2.0         3.5    No choice    81            70            Belmont      NA
W16a        N Dakota St  W16b      NC Central    -1.2         4.5    NC Central   78            74            NC Central   Correct
X11a        Arizona St   X11b      St John’s     -0.9         1.5    St John’s    74            65            Arizona St   Incorrect
X16a        F Dickinson  X16b      Prairie View   4.8         2.0    F Dickinson  82            76            F Dickinson  Correct
# Summary of results
table(playin$Accuracy)
## 
##   Correct Incorrect 
##         2         1
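For readers who want to see how a game is graded against the spread, here is a minimal sketch. It is a hypothetical stand-in, not the actual `ATSresults()` helper; it assumes TeamName.x covers when its winning margin exceeds the spread, and that a margin exactly equal to the spread is a push.

```r
# Hypothetical grading rule; not the author's ATSresults().
grade_ats <- function(df) {
  margin <- df$Team.x.score - df$Team.y.score
  # TeamName.x covers when its margin exceeds the spread; an exact tie is a push
  df$ATSWinner <- ifelse(margin > df$Spread, df$TeamName.x,
                         ifelse(margin < df$Spread, df$TeamName.y, "Push"))
  # Games without a wager are not graded (NA)
  df$Accuracy <- ifelse(df$ModelChoice == "No choice", NA_character_,
                        ifelse(df$ATSWinner == "Push", "Push",
                               ifelse(df$ModelChoice == df$ATSWinner,
                                      "Correct", "Incorrect")))
  df
}

# Play-in values copied from the results table above
playin_scores <- data.frame(
  TeamName.x   = c("Belmont", "N Dakota St", "Arizona St", "F Dickinson"),
  TeamName.y   = c("Temple", "NC Central", "St John's", "Prairie View"),
  Spread       = c(3.5, 4.5, 1.5, 2.0),
  ModelChoice  = c("No choice", "NC Central", "St John's", "F Dickinson"),
  Team.x.score = c(81, 78, 74, 82),
  Team.y.score = c(70, 74, 65, 76),
  stringsAsFactors = FALSE
)

graded <- grade_ats(playin_scores)
table(graded$Accuracy)
```

For example, N Dakota St won by 4 while favored by 4.5, so NC Central covered and the model's pick is graded as correct.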

Round 1

Model Predictions

# Append model predicted spreads to bracket
round1 <- predict_ncaa_round(1)

# Append actual spreads from sportsbook
round1 <- attachSpreads(round1, round1spreads)

# Determine who model would bet on ATS
round1 <- ATSchoice(round1)

As in the play-in round, the model returns “No choice” if its prediction is within 2 points of the sportsbook spread. Otherwise, the model returns the team it predicts will cover the spread.

# Output of Model Predictions ATS
predictionTable(round1)
StrongSeed  WeakSeed  TeamName.x       TeamName.y     Prediction  Spread  ModelChoice      Diff
Y02         Y15       Kentucky         Abilene Chr    14.3        22.0    Abilene Chr      -7.7
W01         W16a      Duke             N Dakota St    20.7        27.0    N Dakota St      -6.3
W06         W11a      Maryland         Belmont        -2.7         3.0    Belmont          -5.7
X06         X11a      Buffalo          Arizona St      5.4         0.0    Buffalo           5.4
W07         W10       Louisville       Minnesota      -0.2         5.0    Minnesota        -5.2
X04         X13       Florida St       Vermont         4.8        10.0    Vermont          -5.2
X02         X15       Michigan         Montana        10.8        15.5    Montana          -4.7
X01         X16a      Gonzaga          F Dickinson    22.9        27.5    F Dickinson      -4.6
Z06         Z11       Villanova        St Mary’s CA    0.8         5.0    St Mary’s CA     -4.2
Z02         Z15       Tennessee        Colgate        13.4        17.5    Colgate          -4.1
Z03         Z14       Purdue           Old Dominion    9.1        13.0    Old Dominion     -3.9
W08         W09       VA Commonwealth  UCF             1.8        -1.5    VA Commonwealth   3.3
W04         W13       Virginia Tech    St Louis       12.7        10.0    Virginia Tech     2.7
Y03         Y14       Houston          Georgia St     14.0        11.5    Houston           2.5
Z07         Z10       Cincinnati       Iowa            6.0         3.5    Cincinnati        2.5
Z08         Z09       Mississippi      Oklahoma        4.2         2.0    Mississippi       2.2
X05         X12       Marquette        Murray St       2.3         4.0    No choice        -1.7
X07         X10       Nevada           Florida         0.3         2.0    No choice        -1.7
W03         W14       LSU              Yale            8.7         7.5    No choice         1.2
Z01         Z16       Virginia         Gardner Webb   24.5        23.5    No choice         1.0
X08         X09       Syracuse         Baylor         -3.4        -2.5    No choice        -0.9
W02         W15       Michigan St      Bradley        19.3        18.5    No choice         0.8
X03         X14       Texas Tech       N Kentucky     13.2        14.0    No choice        -0.8
Y05         Y12       Auburn           New Mexico St   5.7         6.5    No choice        -0.8
Y08         Y09       Utah St          Washington      3.2         2.5    No choice         0.7
Y07         Y10       Wofford          Seton Hall      3.6         3.0    No choice         0.6
Y04         Y13       Kansas           Northeastern    6.6         7.0    No choice        -0.4
Y01         Y16       North Carolina   Iona           23.6        24.0    No choice        -0.4
W05         W12       Mississippi St   Liberty         6.7         6.5    No choice         0.2
Z05         Z12       Wisconsin        Oregon          1.4         1.5    No choice        -0.1
Z04         Z13       Kansas St        UC Irvine       4.6         4.5    No choice         0.1
Y06         Y11       Iowa St          Ohio St         5.5         5.5    No choice         0.0
Note:
A positive value in Prediction or Spread indicates TeamName.x being favored by that many points. A negative value indicates TeamName.y being favored by that many points.

The Buffalo/Arizona St line is not in yet. The model predicts the following teams to beat the following spreads (in order of confidence):

  • Abilene Christian +22
  • North Dakota St +27
  • Belmont +3
  • Minnesota +5
  • Vermont +10
  • Montana +15.5
  • Fairleigh Dickinson +27.5
  • St. Mary’s CA +5
  • Colgate +17.5
  • Old Dominion +13
  • VA Commonwealth +1.5
  • Virginia Tech -10
  • Houston -11.5
  • Cincinnati -3.5
  • Mississippi -2

If the line you are offered differs from any of those spreads, or the line later moves, check whether the new line leaves a differential of less than 2 points from the model’s prediction. If so, the model would recommend no wager on that game.
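A quick way to run that check is sketched below. The helper `still_recommended()` is hypothetical and assumes a wager needs a differential of more than 2 points; the Virginia Tech prediction (12.7) and line (10.0) come from the table above, while the moved line of 11.0 is a made-up example.

```r
# Hypothetical helper: does a wager still clear the 2-point threshold at a new line?
still_recommended <- function(prediction, new_spread, threshold = 2) {
  abs(prediction - new_spread) > threshold
}

# Virginia Tech: model prediction of 12.7
still_recommended(12.7, 10.0)  # TRUE: differential is 2.7
still_recommended(12.7, 11.0)  # FALSE: differential is 1.7, so no wager
```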

Round 1 Game Results and Model Performance

# Append Final Scores of Games
round1 <- attachGameResults(round1, round1results)

# Determine ATS winner and Model's Accuracy ATS
round1 <- ATSresults(round1)

# Table of Outcomes in Round 1
resultsTable(round1)
StrongSeed  TeamName.x       WeakSeed  TeamName.y     Prediction  Spread  ModelChoice      Team.x.score  Team.y.score  ATSWinner      Accuracy
W01         Duke             W16a      N Dakota St    20.7        27.0    N Dakota St      85            62            N Dakota St    Correct
W02         Michigan St      W15       Bradley        19.3        18.5    No choice        76            65            Bradley        NA
W03         LSU              W14       Yale            8.7         7.5    No choice        79            74            Yale           NA
W04         Virginia Tech    W13       St Louis       12.7        10.0    Virginia Tech    66            52            Virginia Tech  Correct
W05         Mississippi St   W12       Liberty         6.7         6.5    No choice        76            88            Liberty        NA
W06         Maryland         W11a      Belmont        -2.7         3.0    Belmont          79            77            Belmont        Correct
W07         Louisville       W10       Minnesota      -0.2         5.0    Minnesota        76            86            Minnesota      Correct
W08         VA Commonwealth  W09       UCF             1.8        -1.5    VA Commonwealth  58            73            UCF            Incorrect
X01         Gonzaga          X16a      F Dickinson    22.9        27.5    F Dickinson      87            49            Gonzaga        Incorrect
X02         Michigan         X15       Montana        10.8        15.5    Montana          74            55            Michigan       Incorrect
X03         Texas Tech       X14       N Kentucky     13.2        14.0    No choice        72            57            Texas Tech     NA
X04         Florida St       X13       Vermont         4.8        10.0    Vermont          76            69            Vermont        Correct
X05         Marquette        X12       Murray St       2.3         4.0    No choice        64            83            Murray St      NA
X06         Buffalo          X11a      Arizona St      5.4         0.0    Buffalo          91            74            Buffalo        Correct
X07         Nevada           X10       Florida         0.3         2.0    No choice        61            70            Florida        NA
X08         Syracuse         X09       Baylor         -3.4        -2.5    No choice        69            78            Baylor         NA
Y01         North Carolina   Y16       Iona           23.6        24.0    No choice        88            73            Iona           NA
Y02         Kentucky         Y15       Abilene Chr    14.3        22.0    Abilene Chr      79            44            Kentucky       Incorrect
Y03         Houston          Y14       Georgia St     14.0        11.5    Houston          84            55            Houston        Correct
Y04         Kansas           Y13       Northeastern    6.6         7.0    No choice        87            53            Kansas         NA
Y05         Auburn           Y12       New Mexico St   5.7         6.5    No choice        78            77            New Mexico St  NA
Y06         Iowa St          Y11       Ohio St         5.5         5.5    No choice        59            62            Ohio St        NA
Y07         Wofford          Y10       Seton Hall      3.6         3.0    No choice        84            68            Wofford        NA
Y08         Utah St          Y09       Washington      3.2         2.5    No choice        61            78            Washington     NA
Z01         Virginia         Z16       Gardner Webb   24.5        23.5    No choice        71            56            Gardner Webb   NA
Z02         Tennessee        Z15       Colgate        13.4        17.5    Colgate          77            70            Colgate        Correct
Z03         Purdue           Z14       Old Dominion    9.1        13.0    Old Dominion     61            48            Push           Push
Z04         Kansas St        Z13       UC Irvine       4.6         4.5    No choice        64            70            UC Irvine      NA
Z05         Wisconsin        Z12       Oregon          1.4         1.5    No choice        54            72            Oregon         NA
Z06         Villanova        Z11       St Mary’s CA    0.8         5.0    St Mary’s CA     61            57            St Mary’s CA   Correct
Z07         Cincinnati       Z10       Iowa            6.0         3.5    Cincinnati       72            79            Iowa           Incorrect
Z08         Mississippi      Z09       Oklahoma        4.2         2.0    Mississippi      72            95            Oklahoma       Incorrect
# Summary of all tourney results
entiretourney <- rbind(playin, round1)
table(entiretourney$Accuracy)
## 
##   Correct Incorrect      Push 
##        11         7         1
# Recording Round 1 Winners into Round 2
bracket <- advance_winners(round1)
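To compare this year's running record with last year's 54.8%, the counts in the summary above can be turned into a hit rate. This is a small sketch that assumes pushes are left out of the denominator, since a push neither wins nor loses the wager.

```r
# Counts copied from the summary of all tourney results above
results <- c(Correct = 11, Incorrect = 7, Push = 1)

# Assumption: pushes are excluded because the stake is returned
decided  <- results[["Correct"]] + results[["Incorrect"]]
hit_rate <- results[["Correct"]] / decided

round(100 * hit_rate, 1)  # 61.1
```

Through the play-in round and Round 1, that is 11 of 18 decided wagers.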

Round 2

Model Predictions

Round 2 Game Results and Model Performance

Round 3 - Sweet 16

Model Predictions

Round 3 Game Results and Model Performance

Round 4 - Elite 8

Model Predictions

Round 4 Game Results and Model Performance

Round 5 - Final 4

Model Predictions

Round 5 Game Results and Model Performance

Round 6 - NCAA Championship

Model Predictions

Round 6 Game Results and Model Performance