ETR Sports Data Scientist

Question 1: Simulate Super Bowl Score Results (ie 27-23, 30-21, etc.) based on pregame Vegas Odds (Game Total, Spread)

Gambling markets are efficient. Over a large enough sample, actual game results will (roughly) converge to pre-game betting lines. That is, teams favored by 3 will win by an average of 3; games with an O/U of 50 will have an average of 50 points scored. Further, for both game spreads and game totals, actual results are normally distributed around the pre-game lines with a standard deviation of ~13.

NFL Point Spreads Summary Statistics, 1999-2022
Spread	n	Avg Result	Median Result	Stdev of Result
3.0	1029	2.45	3	13.45
3.5	586	4.12	3	13.37
1.0	519	0.18	1	13.93
7.0	445	8.98	7	13.44
2.5	421	0.96	2	13.67
4.0	316	4.07	4	11.76
6.0	311	5.87	6	12.86
6.5	301	6.42	5	13.47
7.5	246	8.30	8	12.75
5.5	241	5.61	4	13.49
4.5	229	5.20	4	12.45
10.0	199	10.09	9	12.22
5.0	155	5.43	5	13.05
2.0	149	2.72	3	11.01
9.0	146	8.22	7	14.18
1.5	142	-0.58	0	12.42
9.5	140	10.26	8	13.93
10.5	126	11.13	10	13.03
8.0	92	9.63	11	15.03
8.5	89	9.45	8	14.61
11.0	77	11.18	10	11.65
14.0	76	13.37	12	13.69
13.0	67	12.87	14	12.85
13.5	57	12.33	8	14.32
11.5	41	11.12	13	15.40
12.0	32	8.56	6	13.10
12.5	32	14.78	18	15.41
0.0	31	0.97	2	13.97
14.5	22	11.68	8	14.57
17.0	16	19.31	18	10.13
15.5	15	20.93	19	12.37
16.0	14	16.50	16	8.52
16.5	13	15.92	14	14.88
15.0	10	19.60	18	14.67
17.5	4	12.75	7	20.47
18.0	4	25.00	25	16.43
20.5	4	14.75	13	8.38
19.0	3	20.33	18	18.61
22.0	2	23.00	23	2.83
18.5	1	31.00	31	NA
19.5	1	28.00	28	NA
20.0	1	26.00	26	NA
24.0	1	3.00	3	NA
27.0	1	16.00	16	NA
Data from nflseedR
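
As a quick check on the convergence claim, the sketch below takes the five most common spreads from the table above and computes the game-weighted average gap between the actual average margin and the spread. The numbers are copied from the table; the variable names are mine, and Python is used purely for illustration.

```python
# (spread, n, average result) for the five most common spreads in the table above
rows = [
    (3.0, 1029, 2.45),
    (3.5, 586, 4.12),
    (1.0, 519, 0.18),
    (7.0, 445, 8.98),
    (2.5, 421, 0.96),
]

total_games = sum(n for _, n, _ in rows)

# Game-weighted average of (actual average margin - pre-game spread)
weighted_error = sum(n * (avg - spread) for spread, n, avg in rows) / total_games

print(f"games: {total_games}, weighted avg error: {weighted_error:+.2f} points")
```

Across these 3,000 games the weighted gap is roughly -0.13 points, even though individual rows miss their spread by as much as two points.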

NFL Game Totals Summary Statistics, 1999-2022
O/U	n	Avg Total	Median Total	Stdev of Total
44.0	341	46.03	45	13.34
41.0	302	42.24	41	12.75
43.0	283	43.97	43	13.95
43.5	266	43.67	44	13.60
47.0	247	45.82	45	12.29
45.0	240	44.73	44	12.58
44.5	231	43.43	43	12.83
42.0	220	43.21	42	13.51
46.0	216	46.71	47	12.84
41.5	209	41.82	41	13.53
45.5	207	45.38	45	14.58
46.5	203	47.67	46	13.48
42.5	188	43.26	43	14.21
40.5	186	41.95	41	13.47
40.0	181	40.56	40	11.68
47.5	179	47.98	46	14.03
37.0	176	35.86	34	12.72
37.5	174	39.23	40	15.00
48.0	171	49.14	47	14.77
38.0	170	39.42	37	14.37
48.5	157	49.85	50	14.34
39.5	155	39.41	38	13.60
39.0	141	39.91	40	13.10
38.5	124	37.72	37	12.60
49.0	118	48.62	47	13.38
36.5	117	37.94	37	14.16
49.5	108	47.62	47	13.27
51.0	91	51.98	51	12.59
36.0	90	38.13	37	14.62
50.0	78	47.94	47	11.95
50.5	73	52.45	51	14.94
35.5	70	39.97	40	11.86
35.0	61	35.69	36	13.45
34.5	59	36.97	36	12.78
34.0	57	39.30	36	14.36
52.0	52	52.75	52	14.17
51.5	48	53.83	54	14.28
52.5	46	51.91	51	10.46
53.0	45	52.78	53	13.91
33.0	44	35.43	36	11.84
53.5	40	55.90	55	13.75
54.0	40	56.17	56	13.41
33.5	35	34.91	33	12.11
55.0	35	49.66	49	11.52
54.5	28	55.43	62	15.45
55.5	17	51.18	50	17.52
56.5	16	58.19	52	18.51
32.0	14	31.21	28	11.39
56.0	14	55.07	52	14.72
57.0	8	54.00	57	8.14
31.0	6	28.50	25	14.22
32.5	6	42.00	42	8.69
58.0	5	55.00	62	20.48
57.5	4	61.25	68	21.96
59.5	4	68.00	69	13.52
58.5	3	67.67	70	7.77
31.5	2	18.50	18	3.54
30.0	1	33.00	33	NA
30.5	1	19.00	19	NA
60.0	1	61.00	61	NA
61.0	1	48.00	48	NA
63.0	1	58.00	58	NA
63.5	1	105.00	105	NA
Data from nflseedR

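However, simply sampling from a normal distribution doesn’t work because of key (critical) numbers: margins such as 3 and 7 occur far more often than a smooth normal curve implies, since NFL points are mostly scored in increments of 3 and 7.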

To simulate possible game scores, I want to take advantage of these normally distributed pre-game lines while also accounting for key numbers.

To do so, I first randomly generated game results from a normal distribution with mean=1.5 (line=PHI -1.5) and stdev=13, and game totals from a normal distribution with mean=50.5 (O/U=50.5) and stdev=13.

The first few rows of randomly generated results are shown below. Note that result = home score - away score and total = home score + away score, with PHI treated as the home team.

##   result total
## 1      4    43
## 2    -13    29
## 3    -13    54
## 4    -18    51
## 5    -13    45
## 6    -10    34
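
A minimal sketch of this sampling step, written in Python for illustration rather than taken from the original analysis. It assumes that results and totals are drawn independently and rounded to whole points (the output above shows integers, but the rounding rule is not stated). The function name `simulate_lines` and the seed are hypothetical.

```python
import random

def simulate_lines(n_sims, spread=1.5, total=50.5, sd=13.0, seed=None):
    """Draw (result, total) pairs from two independent normal distributions."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sims):
        # Assumption: round each draw to the nearest whole point
        result = round(rng.gauss(spread, sd))
        game_total = round(rng.gauss(total, sd))
        draws.append((result, game_total))
    return draws

draws = simulate_lines(50_000, seed=2023)
avg_result = sum(r for r, _ in draws) / len(draws)
avg_total = sum(t for _, t in draws) / len(draws)

print(draws[:6])
print(f"avg result={avg_result:.2f}, avg total={avg_total:.2f}")
```

With 50,000 draws the averages should land close to the 1.5 spread and 50.5 total, but nothing in this step yet favors margins of 3 or 7.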

Next, I created a function that, for a given randomly generated result and total, does the following:
1. Expands the range of each by 3 in both directions
2. Looks through all historical games that fall within this expanded range
3. Randomly picks one game

For example, if result=-2 and total=50, the function extracts all previously played games where:
result > -6 AND result < 2 AND total > 46 AND total < 54 (this particular example has occurred 210 times since 1999), and then randomly selects one of these rows.
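
The matching step could be sketched as follows. This is an illustration in Python, not the original function: `pick_historical_game` is a hypothetical name, the `history` rows are made-up stand-ins for the real game log, and returning `None` when nothing matches is my assumption (the source does not say how empty matches are handled).

```python
import random

def pick_historical_game(result, total, history, window=3, rng=random):
    """Pick one past (home, away) score whose margin and total are both
    within `window` points of the simulated result and total."""
    matches = [
        (home, away)
        for home, away in history
        if abs((home - away) - result) <= window
        and abs((home + away) - total) <= window
    ]
    # Assumption: return None when no historical game falls in the window
    return rng.choice(matches) if matches else None

# Toy (home, away) scores for illustration only, NOT real historical data
history = [(24, 27), (23, 26), (20, 17), (31, 24), (27, 24), (17, 34)]

pick = pick_historical_game(-2, 50, history)
print(pick)
```

For whole-number scores, a window of 3 matches the conditions in the example: result between -5 and 1 and total between 47 and 53. Because the returned score is a real historical final, frequent scorelines such as 27-24 are picked in proportion to how often they have actually happened.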

Here are the 10 most common outcomes after running 50k simulations.

10 Most Common Scores, 50,000 simulations
PHI	KC	Result	Total	Winner	Occurrences
27	24	3	51	PHI	646
24	27	-3	51	KC	478
20	23	-3	43	KC	460
23	20	3	43	PHI	460
31	24	7	55	PHI	458
20	17	3	37	PHI	396
31	28	3	59	PHI	363
34	31	3	65	PHI	352
27	20	7	47	PHI	321
30	27	3	57	PHI	304
Avg Result=1.51, Avg Total=50.38

Question 2: Estimate the median of Travis Kelce’s longest reception for the Super Bowl using Bayesian Bootstrapping or another methodology

My first step was to examine Kelce’s relevant metrics dating back to 2018, Mahomes’ first year as the starter.

Travis Kelce Summary Statistics
Season	GP	Targets	Tgts/Gm	Yds	Yds/Tgt	Depth of Target (Median)	Depth of Target (Avg)	Longest Catch of Game (Median)	Longest Catch of Game (Avg)
2018	18	166	9.2	1467	8.84	7	9.16	24.0	23.78
2019	19	158	8.3	1436	9.09	7	8.67	20.0	22.79
2020	18	185	10.3	1776	9.60	7	8.31	24.5	25.83
2021	19	161	8.5	1424	8.84	6	7.43	20.0	24.68
2022	19	177	9.3	1514	8.55	5	6.89	23.0	25.84

Kelce’s average longest reception actually peaked this year, though his aDOT and yards/target hit a five-year low. All the while, his volume has remained consistent.

I deployed a classical bootstrap, resampling yards gained from all Kelce targets in 2021 and 2022 (pre-SB), which seemed to strike a fair balance between representativeness and sample size. Since we are looking for the median (presumably for the purpose of betting the longest reception prop), the fact that this sample does not include every possible reception length (for example, the sample includes a 69-yard and a 52-yard catch with nothing in between) is not an issue.

In each run, the number of targets was sampled from a Poisson distribution, with the lambda parameter set as Kelce’s average targets/game from 2018-2022.
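
The lambda value can be reproduced from the summary table. The sketch below pools targets and games across the five seasons, which is my assumption about how the average was taken; averaging the five per-season Tgts/Gm values also rounds to the same figure.

```python
# (games played, targets) per season, copied from the summary table above
seasons = {
    2018: (18, 166),
    2019: (19, 158),
    2020: (18, 185),
    2021: (19, 161),
    2022: (19, 177),
}

games = sum(gp for gp, _ in seasons.values())
targets = sum(tg for _, tg in seasons.values())
lam = targets / games  # pooled targets per game

print(f"{targets} targets / {games} games = {lam:.1f} per game")
```

This gives 847 targets over 93 games, or about 9.1 targets per game.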

To clarify, I wrote a function that does the following:

1. Randomly selects the number of targets (t) from a Poisson distribution with lambda=9.1
2. Randomly selects t plays from the set of Kelce’s 2021-2022 actual targets
3. Extracts yards gained from each selected play
4. Stores the longest reception of these t plays
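
The steps above can be sketched as follows, in Python for illustration. The assumptions are mine: plays are resampled with replacement, incompletions count as 0 yards, and a game with zero targets has a longest reception of 0. The `target_yards` list is a made-up placeholder (only the 52 and 69 appear in the text), so the printed median is not the result reported here. The Poisson draw uses Knuth's multiplication method to avoid external libraries.

```python
import math
import random
import statistics

def sample_poisson(lam, rng):
    """Knuth's multiplication method for one Poisson draw (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_longest_catch(target_yards, lam=9.1, n_sims=10_000, seed=None):
    """Bootstrap the longest reception in a game from per-target yardage."""
    rng = random.Random(seed)
    longest = []
    for _ in range(n_sims):
        t = sample_poisson(lam, rng)              # step 1: number of targets
        plays = rng.choices(target_yards, k=t)    # steps 2-3: resample with replacement
        longest.append(max(plays, default=0))     # step 4: longest play of the game
    return longest

# Placeholder yardage per target (0 = incompletion); NOT Kelce's real play log
target_yards = [0, 0, 0, 5, 7, 8, 9, 11, 12, 14, 17, 22, 25, 31, 52, 69]

longest = simulate_longest_catch(target_yards, seed=1)
print("median longest reception:", statistics.median(longest))
```

Swapping in the actual 2021-2022 target log for `target_yards` would give the sampling distribution whose median is compared with the prop line.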

The line appears to have been 22.5 yards, while the median of the sampling distribution is 23 yards (his actual longest catch turned out to be 22 yards!).

Question 3: Determine Brock Purdy’s true talent level YPA at this point in his career

Though projecting the “true talent level” of an NFL player is much harder than doing so for an MLB player due to the accuracy and availability of predictive statistics in the respective sports, baseball projection systems provide a useful framework for attacking this type of question. Specifically, I’m an advocate of emulating the approach used by Dan Szymborski in his ZiPS projections.

If you are unfamiliar or need a refresher on how ZiPS works, here is a brief description of the methodology (the full article can be found here).

How does “ZiPS project(s) future production? First, using both recent playing data with adjustments for…, ZiPS establishes a baseline estimate for every player being projected. To get an idea of where the player is going, the system compares that baseline to the baselines of all other players in its database…Using a whole lot of stats, information on shape, and player characteristics, ZiPS then finds a large cohort that is most similar to the player. I use Mahalanobis distance extensively for this.”

Using sources like game statistics, draft data, PFF grades, and Madden ratings (plus any additional data you may have access to), I propose clustering quarterbacks into cohorts of similar players and then formulating projections for various metrics (like YPA) from the outcomes of each cohort.
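
To make the cohort idea concrete, here is a rough two-feature sketch in Python. It is an illustration under stated assumptions, not ZiPS itself and not a finished model. The feature pair (first-season YPA, PFF grade), the pool of quarterbacks, the target player, and all function names are hypothetical, and every number is made up. The projection is simply the mean next-season YPA of the k nearest players by Mahalanobis distance.

```python
import math
import statistics

def covariance_matrix(rows):
    """Sample covariance matrix for a list of (x, y) feature pairs."""
    xs, ys = zip(*rows)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    n = len(rows) - 1
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in rows) / n
    return ((sxx, sxy), (sxy, syy))

def mahalanobis_2d(u, v, cov):
    """Mahalanobis distance between two 2-feature points (explicit 2x2 inverse)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = u[0] - v[0], u[1] - v[1]
    squared = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(squared)

def project_from_cohort(target, pool, k=3):
    """Average the next-season YPA of the k pool players nearest to target."""
    cov = covariance_matrix([features for features, _ in pool])
    ranked = sorted(pool, key=lambda p: mahalanobis_2d(p[0], target, cov))
    cohort = ranked[:k]
    return statistics.fmean([ypa for _, ypa in cohort]), cohort

# Made-up pool: ((first-season YPA, PFF grade), next-season YPA)
pool = [
    ((6.1, 62.0), 6.4),
    ((7.9, 78.0), 7.6),
    ((8.3, 81.0), 7.8),
    ((6.8, 70.0), 7.0),
    ((7.5, 74.0), 7.3),
    ((5.9, 58.0), 6.2),
]
target = (8.0, 78.0)  # hypothetical player to project

projection, cohort = project_from_cohort(target, pool)
print(f"projected YPA from cohort of {len(cohort)}: {projection:.2f}")
```

A real version would use many more features and players, which calls for a general matrix inverse rather than the 2x2 shortcut used here.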

Andrew Cohen

2023-02-13
