Overview

Last year I built a model to predict results against the spread (ATS) for the 2018 NCAA Men’s Basketball tournament; its picks were correct 54.8% of the time. I am trying again this year and again keeping track of the results. This document keeps a running record of all the predictions and the model’s accuracy.

Getting Started

Helper Functions

Seven tasks are repeated in each round:

  1. Add the model’s predictions for each game to the bracket
  2. Gather and add the sportsbook spread for each game
  3. Determine whether the model’s prediction differs enough from the sportsbook spread to choose a wager
  4. Gather and add the final scores from the games
  5. Determine the ATS winner of each game and the model’s accuracy
  6. Advance the game winners into the next round
  7. Create output tables

Seven helper functions accomplish those tasks (code not shown).

Gathering of Spreads from Internet

Throughout the tournament, spreads were gathered manually from the internet and entered into tables, along with the date and time each spread was recorded. A spread is positive if the stronger seed is favored and negative if the weaker seed is favored. (Not shown)
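As a rough illustration of the sign convention, the play-in spreads could be stored in a small data frame like the one below. The spread values come from the play-in table later in this document. The object name, the column names (`Spread`, `RecordedAt`, `Favored`), and the overall layout are assumptions rather than the author's actual schema, and the timestamp is left as a placeholder because the recorded times are not shown.

```r
# Hypothetical layout for a manually entered spreads table (names are assumptions).
playinspreads_sketch <- data.frame(
  StrongSeed = c("W16a", "X16a", "X11a", "W11a"),
  WeakSeed   = c("W16b", "X16b", "X11b", "W11b"),
  Spread     = c(4.5, 2.0, 1.5, 3.5),  # positive: strong seed favored
  RecordedAt = as.POSIXct(NA),         # placeholder; real timestamps are not shown
  stringsAsFactors = FALSE
)

# Label the favored side according to the sign of the spread
playinspreads_sketch$Favored <- ifelse(
  playinspreads_sketch$Spread > 0, "StrongSeed",
  ifelse(playinspreads_sketch$Spread < 0, "WeakSeed", "Pick'em")
)
print(playinspreads_sketch)
```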

Collecting Game Results from Internet

Throughout the tournament, the game results were gathered manually from the internet and entered into tables. This task should eventually be automated. (Not shown)

Creating Initial 2019 Bracket

The prediction model is an XGBoost model trained with 5-fold cross-validation. The cleaning and preparation of the data and the training and testing of that model are in a script which can be found here [not releasing at the moment]. Predictions for every possible pair of opponents are saved in the file “saferesults2019.Rds”. The predictions are static (they are not updated in a Bayesian fashion as games are played); that is, they are based strictly on pre-tournament performance.
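To give a sense of what "all possible pairs of opponents" means, here is a minimal sketch that enumerates every unordered pair from a tiny subset of seed labels and then counts the pairs for a full 68-team field. The column names `TeamA` and `TeamB` are made up for illustration; the actual structure of the saved file is not shown in this document.

```r
# Toy illustration: enumerate every unordered pair of teams by seed label.
seeds <- c("W01", "W16a", "W16b", "X01")
pairs <- t(combn(seeds, 2))  # one row per pair
all_pairs <- data.frame(TeamA = pairs[, 1], TeamB = pairs[, 2],
                        stringsAsFactors = FALSE)
nrow(all_pairs)  # 6 pairs for 4 teams

# The same counting for a 68-team field (play-in teams included):
choose(68, 2)    # 2278 possible matchups
```

Precomputing every pair means any later-round matchup can be looked up without rerunning the model.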

Play-in Round

Model Predictions

First, we append the model-predicted spreads, the sportsbook spreads, and the differences between the two to the bracket.

# Append model predicted spreads to bracket
playin <- predict_ncaa_round(round = 0)

# Append actual spreads from sportsbook
playin <- attachSpreads(playin, playinspreads)

The ModelChoice variable is “No choice” if the model’s prediction is within 2 points of the sportsbook spread; otherwise it names the team that the model predicts will cover the spread.
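The selection rule can be sketched roughly as follows. This is not the author's `ATSchoice()` (which is not shown); the function name, argument names, and the choice to treat a difference of exactly 2 points as "No choice" are assumptions. The inputs are the play-in values reported in this document.

```r
# Sketch of the selection rule; function and argument names are hypothetical.
ats_choice_sketch <- function(prediction, spread, team_x, team_y, threshold = 2) {
  diff <- prediction - spread  # positive: model likes TeamName.x more than the book does
  ifelse(abs(diff) <= threshold, "No choice",
         ifelse(diff > 0, team_x, team_y))
}

# Play-in values as reported in this document
picks <- ats_choice_sketch(
  prediction = c(-1.2, 4.8, -0.9, 2.0),
  spread     = c(4.5, 2.0, 1.5, 3.5),
  team_x     = c("N Dakota St", "F Dickinson", "Arizona St", "Belmont"),
  team_y     = c("NC Central", "Prairie View", "St John's", "Temple")
)
print(picks)
```

With these inputs the sketch returns NC Central, F Dickinson, St John's, and "No choice", which agrees with the ModelChoice column for the play-in games.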

# Determine who model would bet on ATS
playin <- ATSchoice(playin)

# Output of Model Predictions ATS
predictionTable(playin)
StrongSeed | WeakSeed | TeamName.x | TeamName.y | Prediction | Spread | ModelChoice | Diff
W16a | W16b | N Dakota St | NC Central | -1.2 | 4.5 | NC Central | -5.7
X16a | X16b | F Dickinson | Prairie View | 4.8 | 2.0 | F Dickinson | 2.8
X11a | X11b | Arizona St | St John’s | -0.9 | 1.5 | St John’s | -2.4
W11a | W11b | Belmont | Temple | 2.0 | 3.5 | No choice | -1.5
Note:
A positive value in Prediction or Spread indicates TeamName.x being favored by that many points. A negative value indicates TeamName.y being favored by that many points.
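Because Spread is always stated from TeamName.x's side, turning a pick into conventional betting-line notation (for example "NC Central +4.5") requires flipping the sign when the pick is TeamName.x. The helper below is a hypothetical sketch of that conversion, not one of the author's functions.

```r
# Hypothetical helper: express a pick as a conventional betting line.
format_pick <- function(choice, team_x, spread) {
  # Spread is from TeamName.x's perspective:
  # TeamName.x's line is -spread, TeamName.y's line is +spread.
  line <- ifelse(choice == team_x, -spread, spread)
  sprintf("%s %+.1f", choice, line)
}

lines <- format_pick(
  choice = c("NC Central", "F Dickinson", "St John's"),
  team_x = c("N Dakota St", "F Dickinson", "Arizona St"),
  spread = c(4.5, 2.0, 1.5)
)
print(lines)
```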

The model recommends NC Central and, to a lesser degree, Fairleigh Dickinson and St. John’s. For consistent record keeping, we will count the latter two as official recommendations as well.

Play-in Game Results and Model Performance

# Append Final Scores of Games
playin <- attachGameResults(playin, playinresults)

# Determine ATS winner and Model's Accuracy ATS
playin <- ATSresults(playin)

# Table of all outcomes
resultsTable(playin)
StrongSeed | TeamName.x | WeakSeed | TeamName.y | Prediction | Spread | ModelChoice | Team.x.score | Team.y.score | ATSWinner | Accuracy
W11a | Belmont | W11b | Temple | 2.0 | 3.5 | No choice | 81 | 70 | Belmont | NA
W16a | N Dakota St | W16b | NC Central | -1.2 | 4.5 | NC Central | 78 | 74 | NC Central | Correct
X11a | Arizona St | X11b | St John’s | -0.9 | 1.5 | St John’s | 74 | 65 | Arizona St | Incorrect
X16a | F Dickinson | X16b | Prairie View | 4.8 | 2.0 | F Dickinson | 82 | 76 | F Dickinson | Correct
# Summary of results
table(playin$Accuracy)
## 
##   Correct Incorrect 
##         2         1
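The ATSWinner and Accuracy columns in the play-in results are consistent with a simple grading rule: compare TeamName.x's margin of victory with the spread, then compare the ATS winner with the model's pick. The sketch below is one way to write that rule; it is not the author's `ATSresults()`, and the function name and the handling of pushes are assumptions.

```r
# Sketch of grading games against the spread; names are hypothetical.
ats_result_sketch <- function(x_score, y_score, spread, team_x, team_y, choice) {
  margin <- x_score - y_score  # TeamName.x's margin of victory
  winner <- ifelse(margin > spread, team_x,
                   ifelse(margin < spread, team_y, "Push"))
  accuracy <- ifelse(choice == "No choice", NA_character_,
                     ifelse(winner == "Push", "Push",
                            ifelse(winner == choice, "Correct", "Incorrect")))
  data.frame(ATSWinner = winner, Accuracy = accuracy, stringsAsFactors = FALSE)
}

# Play-in scores, spreads, and picks as reported in this document
graded <- ats_result_sketch(
  x_score = c(81, 78, 74, 82),
  y_score = c(70, 74, 65, 76),
  spread  = c(3.5, 4.5, 1.5, 2.0),
  team_x  = c("Belmont", "N Dakota St", "Arizona St", "F Dickinson"),
  team_y  = c("Temple", "NC Central", "St John's", "Prairie View"),
  choice  = c("No choice", "NC Central", "St John's", "F Dickinson")
)
print(graded)
table(graded$Accuracy)  # games with no wager (NA) are dropped by default
```

Games without a wager get an NA accuracy, so `table()` leaves them out of the tally, which matches the 2 correct and 1 incorrect reported for the play-in round.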

Round 1

Model Predictions

# Append model predicted spreads to bracket
round1 <- predict_ncaa_round(1)

# Append actual spreads from sportsbook
round1 <- attachSpreads(round1, round1spreads)

# Determine who model would bet on ATS
round1 <- ATSchoice(round1)

As in the play-in round, ModelChoice is “No choice” if the model’s prediction is within 2 points of the sportsbook spread. Otherwise, it names the team that the model predicts will cover the spread.

# Output of Model Predictions ATS
predictionTable(round1)
StrongSeed | WeakSeed | TeamName.x | TeamName.y | Prediction | Spread | ModelChoice | Diff
Y02 | Y15 | Kentucky | Abilene Chr | 14.3 | 22.0 | Abilene Chr | -7.7
W01 | W16a | Duke | N Dakota St | 20.7 | 27.0 | N Dakota St | -6.3
W06 | W11a | Maryland | Belmont | -2.7 | 3.0 | Belmont | -5.7
X06 | X11a | Buffalo | Arizona St | 5.4 | 0.0 | Buffalo | 5.4
W07 | W10 | Louisville | Minnesota | -0.2 | 5.0 | Minnesota | -5.2
X04 | X13 | Florida St | Vermont | 4.8 | 10.0 | Vermont | -5.2
X02 | X15 | Michigan | Montana | 10.8 | 15.5 | Montana | -4.7
X01 | X16a | Gonzaga | F Dickinson | 22.9 | 27.5 | F Dickinson | -4.6
Z06 | Z11 | Villanova | St Mary’s CA | 0.8 | 5.0 | St Mary’s CA | -4.2
Z02 | Z15 | Tennessee | Colgate | 13.4 | 17.5 | Colgate | -4.1
Z03 | Z14 | Purdue | Old Dominion | 9.1 | 13.0 | Old Dominion | -3.9
W08 | W09 | VA Commonwealth | UCF | 1.8 | -1.5 | VA Commonwealth | 3.3
W04 | W13 | Virginia Tech | St Louis | 12.7 | 10.0 | Virginia Tech | 2.7
Y03 | Y14 | Houston | Georgia St | 14.0 | 11.5 | Houston | 2.5
Z07 | Z10 | Cincinnati | Iowa | 6.0 | 3.5 | Cincinnati | 2.5
Z08 | Z09 | Mississippi | Oklahoma | 4.2 | 2.0 | Mississippi | 2.2
X05 | X12 | Marquette | Murray St | 2.3 | 4.0 | No choice | -1.7
X07 | X10 | Nevada | Florida | 0.3 | 2.0 | No choice | -1.7
W03 | W14 | LSU | Yale | 8.7 | 7.5 | No choice | 1.2
Z01 | Z16 | Virginia | Gardner Webb | 24.5 | 23.5 | No choice | 1.0
X08 | X09 | Syracuse | Baylor | -3.4 | -2.5 | No choice | -0.9
W02 | W15 | Michigan St | Bradley | 19.3 | 18.5 | No choice | 0.8
X03 | X14 | Texas Tech | N Kentucky | 13.2 | 14.0 | No choice | -0.8
Y05 | Y12 | Auburn | New Mexico St | 5.7 | 6.5 | No choice | -0.8
Y08 | Y09 | Utah St | Washington | 3.2 | 2.5 | No choice | 0.7
Y07 | Y10 | Wofford | Seton Hall | 3.6 | 3.0 | No choice | 0.6
Y04 | Y13 | Kansas | Northeastern | 6.6 | 7.0 | No choice | -0.4
Y01 | Y16 | North Carolina | Iona | 23.6 | 24.0 | No choice | -0.4
W05 | W12 | Mississippi St | Liberty | 6.7 | 6.5 | No choice | 0.2
Z05 | Z12 | Wisconsin | Oregon | 1.4 | 1.5 | No choice | -0.1
Z04 | Z13 | Kansas St | UC Irvine | 4.6 | 4.5 | No choice | 0.1
Y06 | Y11 | Iowa St | Ohio St | 5.5 | 5.5 | No choice | 0.0
Note:
A positive value in Prediction or Spread indicates TeamName.x being favored by that many points. A negative value indicates TeamName.y being favored by that many points.

The Buffalo/Arizona St line is not in yet, so that game is left off the list below. The model predicts the following teams to beat the following spreads (in order of confidence):

  • Abilene Christian +22
  • North Dakota St +27
  • Belmont +3
  • Minnesota +5
  • Vermont +10
  • Montana +15.5
  • Fairleigh Dickinson +27.5
  • St. Mary’s CA +5
  • Colgate +17.5
  • Old Dominion +13
  • VA Commonwealth +1.5
  • Virginia Tech -10
  • Houston -11.5
  • Cincinnati -3.5
  • Mississippi -2

If your line differs from any of those spreads, or a line moves, check whether the difference from the model’s prediction would fall below 2 points. If so, the model would recommend no wager on that game.
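A quick way to run that check is sketched below. The function name is hypothetical, and the moved line of 2.5 in the second call is a made-up value used only to show a pick dropping out; the prediction of 4.2 and the spread of 2.0 are the Mississippi/Oklahoma values from the table above.

```r
# Hypothetical check: does a pick survive a different or moved line?
still_a_pick <- function(prediction, new_spread, threshold = 2) {
  abs(prediction - new_spread) > threshold
}

# Mississippi: model prediction 4.2, listed spread 2.0 (difference 2.2)
still_a_pick(4.2, 2.0)  # TRUE: still a recommendation
# If the line moved to 2.5 (illustrative value), the difference shrinks to 1.7
still_a_pick(4.2, 2.5)  # FALSE: no wager
```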

Round 1 Game Results and Model Performance

# Append Final Scores of Games
round1 <- attachGameResults(round1, round1results)

# Determine ATS winner and Model's Accuracy ATS
round1 <- ATSresults(round1)

# Table of Outcomes in Round 1
resultsTable(round1)
StrongSeed | TeamName.x | WeakSeed | TeamName.y | Prediction | Spread | ModelChoice | Team.x.score | Team.y.score | ATSWinner | Accuracy
W01 | Duke | W16a | N Dakota St | 20.7 | 27.0 | N Dakota St | 85 | 62 | N Dakota St | Correct
W02 | Michigan St | W15 | Bradley | 19.3 | 18.5 | No choice | 76 | 65 | Bradley | NA
W03 | LSU | W14 | Yale | 8.7 | 7.5 | No choice | 79 | 74 | Yale | NA
W04 | Virginia Tech | W13 | St Louis | 12.7 | 10.0 | Virginia Tech | 66 | 52 | Virginia Tech | Correct
W05 | Mississippi St | W12 | Liberty | 6.7 | 6.5 | No choice | 76 | 88 | Liberty | NA
W06 | Maryland | W11a | Belmont | -2.7 | 3.0 | Belmont | 79 | 77 | Belmont | Correct
W07 | Louisville | W10 | Minnesota | -0.2 | 5.0 | Minnesota | 76 | 86 | Minnesota | Correct
W08 | VA Commonwealth | W09 | UCF | 1.8 | -1.5 | VA Commonwealth | 58 | 73 | UCF | Incorrect
X01 | Gonzaga | X16a | F Dickinson | 22.9 | 27.5 | F Dickinson | 87 | 49 | Gonzaga | Incorrect
X02 | Michigan | X15 | Montana | 10.8 | 15.5 | Montana | 74 | 55 | Michigan | Incorrect
X03 | Texas Tech | X14 | N Kentucky | 13.2 | 14.0 | No choice | 72 | 57 | Texas Tech | NA
X04 | Florida St | X13 | Vermont | 4.8 | 10.0 | Vermont | 76 | 69 | Vermont | Correct
X05 | Marquette | X12 | Murray St | 2.3 | 4.0 | No choice | 64 | 83 | Murray St | NA
X06 | Buffalo | X11a | Arizona St | 5.4 | 0.0 | Buffalo | 91 | 74 | Buffalo | Correct
X07 | Nevada | X10 | Florida | 0.3 | 2.0 | No choice | 61 | 70 | Florida | NA
X08 | Syracuse | X09 | Baylor | -3.4 | -2.5 | No choice | 69 | 78 | Baylor | NA
Y01 | North Carolina | Y16 | Iona | 23.6 | 24.0 | No choice | 88 | 73 | Iona | NA
Y02 | Kentucky | Y15 | Abilene Chr | 14.3 | 22.0 | Abilene Chr | 79 | 44 | Kentucky | Incorrect
Y03 | Houston | Y14 | Georgia St | 14.0 | 11.5 | Houston | 84 | 55 | Houston | Correct
Y04 | Kansas | Y13 | Northeastern | 6.6 | 7.0 | No choice | 87 | 53 | Kansas | NA
Y05 | Auburn | Y12 | New Mexico St | 5.7 | 6.5 | No choice | 78 | 77 | New Mexico St | NA
Y06 | Iowa St | Y11 | Ohio St | 5.5 | 5.5 | No choice | 59 | 62 | Ohio St | NA
Y07 | Wofford | Y10 | Seton Hall | 3.6 | 3.0 | No choice | 84 | 68 | Wofford | NA
Y08 | Utah St | Y09 | Washington | 3.2 | 2.5 | No choice | 61 | 78 | Washington | NA
Z01 | Virginia | Z16 | Gardner Webb | 24.5 | 23.5 | No choice | 71 | 56 | Gardner Webb | NA
Z02 | Tennessee | Z15 | Colgate | 13.4 | 17.5 | Colgate | 77 | 70 | Colgate | Correct
Z03 | Purdue | Z14 | Old Dominion | 9.1 | 13.0 | Old Dominion | 61 | 48 | Push | Push
Z04 | Kansas St | Z13 | UC Irvine | 4.6 | 4.5 | No choice | 64 | 70 | UC Irvine | NA
Z05 | Wisconsin | Z12 | Oregon | 1.4 | 1.5 | No choice | 54 | 72 | Oregon | NA
Z06 | Villanova | Z11 | St Mary’s CA | 0.8 | 5.0 | St Mary’s CA | 61 | 57 | St Mary’s CA | Correct
Z07 | Cincinnati | Z10 | Iowa | 6.0 | 3.5 | Cincinnati | 72 | 79 | Iowa | Incorrect
Z08 | Mississippi | Z09 | Oklahoma | 4.2 | 2.0 | Mississippi | 72 | 95 | Oklahoma | Incorrect
# Summary of all tourney results
entiretourney <- rbind(playin, round1)
table(entiretourney$Accuracy)
## 
##   Correct Incorrect      Push 
##        11         7         1
# Recording Round 1 Winners into Round 2
bracket <- advance_winners(round1)
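The seed labels used in the later-round tables (for example `R1W7` for the winner of the W07 vs W10 game, or `R2X4` in the Sweet 16) suggest that each winner takes a slot named after the round just played, the region, and the stronger seed number of that game. The sketch below reproduces that labeling. It is an inference from the tables, not the author's `advance_winners()`, and the function name is hypothetical.

```r
# Inferred labeling scheme (an assumption based on the seed labels in the tables):
# the winner's new slot is "R<round><region><strong seed number>".
next_slot_label <- function(strong_seed, round) {
  core   <- sub("^R[0-9]+", "", strong_seed)  # drop an earlier "R1"/"R2" prefix, if any
  region <- substr(core, 1, 1)                # W, X, Y or Z
  number <- as.integer(gsub("[^0-9]", "", substring(core, 2)))  # "07" -> 7, "16a" -> 16
  paste0("R", round, region, number)
}

next_slot_label(c("W07", "Z04", "X01"), round = 1)  # "R1W7" "R1Z4" "R1X1"
next_slot_label(c("R1W2", "R1X4"), round = 2)       # "R2W2" "R2X4"
```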

Round 2

Model Predictions

# Append model predicted spreads to bracket
round2 <- predict_ncaa_round(2)

# Append actual spreads from sportsbook
round2 <- attachSpreads(round2, round2spreads)

# Determine who model would bet on ATS
round2 <- ATSchoice(round2)

# Output of model recommendations
predictionTable(round2)
StrongSeed | WeakSeed | TeamName.x | TeamName.y | Prediction | Spread | ModelChoice | Diff
R1W2 | R1W7 | Michigan St | Minnesota | 1.4 | 10.5 | Minnesota | -9.1
R1Y1 | R1Y8 | North Carolina | Washington | 3.2 | 11.5 | Washington | -8.3
R1Z1 | R1Z8 | Virginia | Oklahoma | 3.3 | 11.0 | Oklahoma | -7.7
R1Y4 | R1Y5 | Kansas | Auburn | 4.3 | -1.5 | Kansas | 5.8
R1Z4 | R1Z5 | UC Irvine | Oregon | -0.2 | 5.0 | Oregon | -5.2
R1W3 | R1W6 | LSU | Maryland | 10.0 | 5.0 | LSU | 5.0
R1X2 | R1X7 | Michigan | Florida | 1.4 | 6.0 | Florida | -4.6
R1X1 | R1X8 | Gonzaga | Baylor | 10.7 | 14.0 | Baylor | -3.3
R1Z3 | R1Z6 | Purdue | Villanova | 0.8 | 3.5 | Villanova | -2.7
R1Y3 | R1Y6 | Houston | Ohio St | 7.2 | 5.5 | No choice | 1.7
R1X3 | R1X6 | Texas Tech | Buffalo | 1.9 | 3.5 | No choice | -1.6
R1X4 | R1X5 | Florida St | Murray St | 6.1 | 4.5 | No choice | 1.6
R1W4 | R1W5 | Virginia Tech | Liberty | 9.2 | 8.5 | No choice | 0.7
R1Y2 | R1Y7 | Kentucky | Wofford | 5.6 | 5.0 | No choice | 0.6
R1W1 | R1W8 | Duke | UCF | 12.9 | 13.0 | No choice | -0.1
R1Z2 | R1Z7 | Tennessee | Iowa | 7.9 | 8.0 | No choice | -0.1
Note:
A positive value in Prediction or Spread indicates TeamName.x being favored by that many points. A negative value indicates TeamName.y being favored by that many points.

Round 2 Game Results and Model Performance

# Append Final Scores of Games
round2 <- attachGameResults(round2, round2results)

# Determine ATS winner and Model's Accuracy ATS
round2 <- ATSresults(round2)

# Table of round 2 results and outcomes
resultsTable(round2)
StrongSeed | TeamName.x | WeakSeed | TeamName.y | Prediction | Spread | ModelChoice | Team.x.score | Team.y.score | ATSWinner | Accuracy
R1W1 | Duke | R1W8 | UCF | 12.9 | 13.0 | No choice | 77 | 76 | UCF | NA
R1W2 | Michigan St | R1W7 | Minnesota | 1.4 | 10.5 | Minnesota | 70 | 50 | Michigan St | Incorrect
R1W3 | LSU | R1W6 | Maryland | 10.0 | 5.0 | LSU | 69 | 67 | Maryland | Incorrect
R1W4 | Virginia Tech | R1W5 | Liberty | 9.2 | 8.5 | No choice | 67 | 58 | Virginia Tech | NA
R1X1 | Gonzaga | R1X8 | Baylor | 10.7 | 14.0 | Baylor | 83 | 71 | Baylor | Correct
R1X2 | Michigan | R1X7 | Florida | 1.4 | 6.0 | Florida | 64 | 49 | Michigan | Incorrect
R1X3 | Texas Tech | R1X6 | Buffalo | 1.9 | 3.5 | No choice | 78 | 58 | Texas Tech | NA
R1X4 | Florida St | R1X5 | Murray St | 6.1 | 4.5 | No choice | 90 | 62 | Florida St | NA
R1Y1 | North Carolina | R1Y8 | Washington | 3.2 | 11.5 | Washington | 81 | 59 | North Carolina | Incorrect
R1Y2 | Kentucky | R1Y7 | Wofford | 5.6 | 5.0 | No choice | 62 | 56 | Kentucky | NA
R1Y3 | Houston | R1Y6 | Ohio St | 7.2 | 5.5 | No choice | 74 | 59 | Houston | NA
R1Y4 | Kansas | R1Y5 | Auburn | 4.3 | -1.5 | Kansas | 75 | 89 | Auburn | Incorrect
R1Z1 | Virginia | R1Z8 | Oklahoma | 3.3 | 11.0 | Oklahoma | 63 | 51 | Virginia | Incorrect
R1Z2 | Tennessee | R1Z7 | Iowa | 7.9 | 8.0 | No choice | 83 | 77 | Iowa | NA
R1Z3 | Purdue | R1Z6 | Villanova | 0.8 | 3.5 | Villanova | 87 | 61 | Purdue | Incorrect
R1Z4 | UC Irvine | R1Z5 | Oregon | -0.2 | 5.0 | Oregon | 54 | 73 | Oregon | Correct
# Summary of all tourney results
entiretourney <- rbind(entiretourney, round2)
table(entiretourney$Accuracy)
## 
##   Correct Incorrect      Push 
##        13        14         1
# Recording Round 2 Winners into Round 3
bracket <- advance_winners(round2)
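The cumulative tallies reported so far can be turned into a hit rate for comparison with last year's 54.8%. The sketch below assumes pushes are left out of the denominator; this document does not say how last year's figure treated them, and the function name is hypothetical.

```r
# Hit rate from a Correct/Incorrect/Push tally (pushes excluded: an assumption).
hit_rate <- function(correct, incorrect) {
  round(100 * correct / (correct + incorrect), 1)
}

# Tallies as reported in this document
through_round1 <- hit_rate(correct = 11, incorrect = 7)   # 61.1
through_round2 <- hit_rate(correct = 13, incorrect = 14)  # 48.1
print(c(through_round1, through_round2))
```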

Round 3 - Sweet 16

Model Predictions

# Append model predicted spreads to bracket
round3 <- predict_ncaa_round(3)

# Append actual spreads from sportsbook
round3 <- attachSpreads(round3, round3spreads)

# Determine who model would bet on ATS
round3 <- ATSchoice(round3)

# Output of model recommendations
predictionTable(round3)
StrongSeed | WeakSeed | TeamName.x | TeamName.y | Prediction | Spread | ModelChoice | Diff
R2W2 | R2W3 | Michigan St | LSU | -0.6 | 6.0 | LSU | -6.6
R2X1 | R2X4 | Gonzaga | Florida St | 1.9 | 7.5 | Florida St | -5.6
R2X2 | R2X3 | Michigan | Texas Tech | 0.0 | 2.0 | No choice | -2.0
R2Z2 | R2Z3 | Tennessee | Purdue | 2.9 | 1.0 | No choice | 1.9
R2Z1 | R2Z4 | Virginia | Oregon | 6.7 | 8.5 | No choice | -1.8
R2Y1 | R2Y4 | North Carolina | Auburn | 5.5 | 5.0 | No choice | 0.5
R2W1 | R2W4 | Duke | Virginia Tech | 6.9 | 7.0 | No choice | -0.1
R2Y2 | R2Y3 | Kentucky | Houston | 3.0 | 3.0 | No choice | 0.0
Note:
A positive value in Prediction or Spread indicates TeamName.x being favored by that many points. A negative value indicates TeamName.y being favored by that many points.

Round 3 Game Results and Model Performance

Round 4 - Elite 8

Model Predictions

Round 4 Game Results and Model Performance

Round 5 - Final 4

Model Predictions

Round 5 Game Results and Model Performance

Round 6 - NCAA Championship

Model Predictions

Round 6 Game Results and Model Performance