Comeback Analysis: Home Team Advantage in Baseball

Results

First Five Innings

Of the 2291 games where one team was down by at least five runs at some point in the first five innings, the away team was more likely to be the team in the deficit (55.74% of games). However, there was a slightly larger proportion of home teams that were in the deficit that made the comeback (5.1% of successful comebacks for the home team compared to 4.8% for the away team). When the games are split into difficulty categories, with each group having similar values for the deficit team’s win proportion of the season minus the opposing team’s win proportion, we see that the home team almost always has a higher proportion of completed comebacks. The exception is games where the team in the deficit was at a strong disadvantage (win proportion difference < -0.188), the away team completed the comeback twice as often than the home team (1.5% for the home team and 3% for the away team).

From the overall data, we can also see that the proportion of successful comebacks diminishes as the remaining innings goes down, which makes sense as the team would have less time to overcome the deficit. As the points deficit gets larger, regardless of advantage type, there are also fewer comebacks being completed, with no successful comebacks completed for a deficit of ten or more, and only one comeback completed for a nine point deficit. Who comes back more often from the same deficits isn’t as consistent, with this sample of home teams doing better in 5, 7, and 8 point deficits and away teams doing better with 6 and 9 point deficits.

Within each group the sample sizes for specific scenarios are small, so making meaningful comparisons is difficult. Some notable differences are that when at a strong disadvantage and in a five-run deficit, away teams complete the comeback almost four times as often as the home team (7.5% for away teams and 2% for home teams). Also, when the teams are roughly balanced, the home team makes the comeback almost twice as often as the away team does (6.5% for home and 3.6% for away

The following tables outline the observed proportion of games won/games total for each scenario. Entries that are just a dash were scenarios that there were no recordings for in this sample.

Overall

(Home)	8 inn. left	7	6	5	4	Total
5 pt max deficit	4/7 (0.571)	4/17 (0.235)	5/29 (0.172)	12/48 (0.250)	14/282 (0.050)	39/383 (0.102)
6 pt	0/3	1/13 (0.077)	5/21 (0.238)	0/34	2/200 (0.010)	8/271 (0.030)
7 pt	0/1	0/2	3/12 (0.250)	0/19	1/115 (0.009)	4/149 (0.027)
8 pt	-	-	0/4	0/10	1/75 (0.013)	1/89 (0.011)
9 pt	-	0/2	0/2	0/8	0/44	0/56
10+ pts	-	-	0/3	0/9	0/54	0/66
Total	4/11 (0.364)	5/34 (0.147)	13/71 (0.183)	12/128 (0.094)	18/770 (0.023)	52/1014 (0.051)

(Away)	8 inn. left	7	6	5	4	Total
5 pt max deficit	3/5 (0.600)	4/25 (0.160)	13/49 (0.265)	14/74 (0.189)	8/369 (0.022)	42/522 (0.080)
6 pt	0/1	2/6 (0.333)	4/27 (0.148)	1/28 (0.036)	6/231 (0.026)	13/293 (0.044)
7 pt	-	0/4	1/12 (0.083)	0/28	3/143 (0.021)	4/187 (0.021)
8 pt	-	0/2	0/6	0/14	1/87 (0.011)	1/109 (0.009)
9 pt	-	0/4	1/5 (0.200)	0/7	0/54	1/70 (0.014)
10+ pts	-	0/1	0/3	0/11	0/81	0/96
Total	3/6 (0.500)	6/42 (0.143)	19/102 (0.186)	15/162 (0.093)	18/965 (0.019)	61/1277 (0.048)

Strong Disadvantage

(WinDiff < -0.188)

(Home)	8 inn. left	7	6	5	4	Total
5 pt max deficit	1/1 (1.000)	0/3	0/6	0/9	1/81 (0.012)	2/100 (0.02)
6 pt	-	0/2	0/5	0/6	1/55 (0.018)	1/68 (0.015)
7 pt	-	-	0/3	0/2	1/37 (0.027)	1/42 (0.024)
8 pt	-	-	-	0/2	0/23	0/25
9 pt	-	-	-	0/2	0/13	0/15
10+ pts	-	-	-	0/5	0/14	0/19
Total	1/1 (1.000)	0/5	0/14	0/26	3/223 (0.013)	4/269 (0.015)

(Away)	8 inn. left	7	6	5	4	Total
5 pt max deficit	1/1 (1.000)	1/5 (0.200)	4/11 (0.364)	2/15 (0.133)	1/88 (0.011)	9/120 (0.075)
6 pt	-	-	0/4	0/9	0/48	0/61
7 pt	-	0/1	-	0/8	0/29	0/38
8 pt	-	0/2	0/3	0/3	0/27	0/35
9 pt	-	-	0/1	0/4	0/21	0/26
10+ pts	-	-	-	0/1	0/23	0/24
Total	1/1 (1.000)	1/8 (0.125)	4/19 (0.211)	2/40 (0.050)	1/236 (0.004)	9/304 (0.030)

Moderate Disadvantage

(-0.043 <= WinDiff < 0.037)

(Home)	8 inn. left	7	6	5	4	Total
5 pt max deficit	0/2	2/4	0/4	5/12 (0.417)	2/57 (0.035)	9/79 (0.114)
6 pt	0/1	0/4	0/3	0/9	1/46 (0.022)	1/63 (0.016)
7 pt	-	0/1	0/2	0/3	0/33	0/39
8 pt	-	-	-	0/4	0/19	0/23
9 pt	-	0/1	0/1	0/2	0/12	0/16
10+ pts	-	-	-	-	0/16	0/16
Total	0/3	2/10 (0.200)	0/10	5/30 (0.167)	3/183 (0.016)	10/236 (0.042)

(Away)	8 inn. left	7	6	5	4	Total
5 pt max deficit	1/1	2/10	2/16	4/19 (0.211)	1/94 (0.011)	10/140 (0.071)
6 pt	-	0/1	0/5	0/6	1/57 (0.018)	1/69 (0.014)
7 pt	-	0/2	0/3	0/5	0/44	0/54
8 pt	-	-	0/1	0/4	0/21	0/26
9 pt	-	0/1	-	-	0/15	0/16
10+ pts	-	-	0/1	0/4	0/19	0/24
Total	1/1	2/14 (0.143)	2/26 (0.077)	4/38 (0.105)	2/250 (0.008)	11/329 (0.033)

Balanced

(-0.043 <= WinDiff < 0.037)

(Home)	8 inn. left	7	6	5	4	Total
5 pt max deficit	1/2 (0.500)	1/3 (0.333)	1/7 (0.143)	4/15 (0.267)	5/70 (0.0714)	12/97 (0.124)
6 pt	0/1	1/4 (0.250)	2/6 (0.333)	0/9	0/56	3/76 (0.039)
7 pt	-	0/1	2/3 (0.667)	0/6	0/22	2/32 (0.063)
8 pt	-	-	0/3	0/2	0/17	0/22
9 pt	-	-	-	0/4	0/9	0/13
10+ pts	-	-	0/2	0/1	0/18	0/21
Total	1/3 (0.333)	2/8 (0.250)	5/21 (0.238)	4/37 (0.108)	5/192 (0.108)	17/261 (0.065)

(Away)	8 inn. left	7	6	5	4	Total
5 pt max deficit	-	1/5 (0.200)	1/8 (0.125)	4/21 (0.190)	0/89	6/123 (0.049)
6 pt	0/1	1/1 (1.000)	1/7 (0.143)	0/7	1/61	3/77 (0.039)
7 pt	-	-	0/5	0/5	0/33	0/43
8 pt	-	-	0/3	0/3	1/24 (0.042)	1/030 (0.033)
9 pt	-	-	1/3 (0.333)	0/2	0/6	1/11 (0.091)
10+ pts	-	0/1	-	0/2	0/18	0/21
Total	0/1	2/7 (0.286)	3/26 (0.115)	4/40 (0.100)	2/231 (0.009)	11/305 (0.036)

Advantage

(WinDiff >= 0.037)

(Home)	8 inn. left	7	6	5	4	Total
5 pt max deficit	2/2 (1.000)	1/7 (0.143)	4/12 (0.333)	3/12 (0.250)	6/74 (0.081)	16/107 (0.150)
6 pt	0/1	0/3	3/7 (0.429)	0/10	0/43	3/64 (0.047)
7 pt	0/1	0/1	1/4 (0.250)	0/8	0/23	1/37 (0.027)
8 pt	-	-	0/1	0/2	1/16 (0.063)	1/19 (0.053)
9 pt	-	-	0/1	-	0/10	0/11
10+ pts	-	-	0/1	0/3	0/6	0/10
Total	2/4 (0.500)	1/11 (0.091)	8/26 (0.308)	3/35 (0.108)	7/172 (0.086)	21/248 (0.085)

(Away)	8 inn. left	7	6	5	4	Total
5 pt max deficit	1/3 (0.333)	1/5 (0.200)	6/14 (0.429)	4/19 (0.211)	6/98 (0.061)	8/139 (0.058)
6 pt	-	1/3 (0.333)	3/11 (0.273)	1/6 (0.167)	4/64 (0.063)	9/84 (0.107)
7 pt	-	0/2	1/1 (1.000)	0/10	3/37 (0.081)	4/50 (0.080)
8 pt	-	-	0/2	0/4	0/15	0/21
9 pt	-	0/3	0/1	0/1	0/12	0/17
10+ pts	-	-	0/2	0/4	0/21	0/27
Total	1/3 (0.333)	2/13 (0.154)	10/31 (0.323)	5/44 (0.114)	13/247 (0.053)	21/338 (0.062)

Down Three then Tied

This sample includes 265 games where the team was in a deficit of three or more in innings 1-3 and tied the game in innings 4-6. The only statistically significant explanatory variables on the outcome of the game were the win differential between teams, and the remaining innings after the tie.

Of these games, the home team was the team in the deficit more often (51.7% of games), but home and away teams made the comeback at similar proportions (44.5% of ties became reversals for both home and away). In both cases the disadvantage group completed the lowest proportion of comebacks, but that number didn’t increase as the win difference went positive, with the strong advantage groups in both home and away not having the highest proportion of completed comebacks.

For innings remaining in the game, for both home and away teams, the greatest number of innings remaining post-tie had the lowest percentage of completed comebacks. When the home team was down, teams with four innings remaining were the most likely to complete the comeback (doing so 50% of the time), with three innings remaining being similar (44.8%). For the away team, teams having three innings to compete the comeback did so 51.2% of the time, followed by four innings left, then three.

(Home)	3 inn. left	4	5	Total
Disadvantage (WinDiff < -0.084)	5/20 (0.250)	2/6 (0.333)	1/3 (0.333)	8/29 (0.276)
Balanced (-0.084 <= WinDiff < 0.012)	12/27 (0.444)	2/7 (0.286)	1/5 (0.200)	15/39 (0.385)
Advantage (0.012 <= WinDiff < 0.093)	16/29 (0.552)	5/9 (0.552)	2/2 (1.000)	23/40 (0.575)
Strong Advantage (0.093 <= WinDiff)	10/20 (0.500)	2/4 (0.500)	3/5 (0.600)	15/29 (0.517)
Total	43/96 (0.448)	11/22 (0.500)	7/17 (0.412)	61/137 (0.445)

(Away)	3 inn. left	4	5	Total
Disadvantage (WinDiff < -0.084)	8/27 (0.296)	0/5	1/4 (0.0.250)	9/36 (0.250)
Balanced (-0.084 <= WinDiff < 0.012)	10/16 (0.625)	3/9 (0.333)	2/2 (1.000)	15/27 (0.556)
Advantage (0.012 <= WinDiff < 0.093)	9/15 (0.600)	3/7(0.429)	0/5	12/27 (0.444)
Strong Advantage (0.093 <= WinDiff)	17/28 (0.607)	3/5 (0.600)	1/5 (0.200)	21/38 (0.553)
Total	44/86 (0.512)	9/26 (0.346)	4/16 (0.250)	57/128(0.445)

Pairwise Comparisons

Looking only at the tables for the First Five innings, and focusing on where the total games for each points deficit are, we can check the statistical significance of the observed differences by doing a comparison test between the home and away comebak proportions. In the overall table, a home team down 5 points at most came back 36/383 times, while the away team made the comeback 42/522 times. The comparison test gives us a p-value of 0.3199, and the confidence interval includes zero, so we can say that the observed difference is not statistically significant.

# Overall
prop.test(x = c(39, 42), n = c(383, 522)) # 5pt

## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  c(39, 42) out of c(383, 522)
## X-squared = 0.98942, df = 1, p-value = 0.3199
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.01912887  0.06186468
## sample estimates:
##     prop 1     prop 2 
## 0.10182768 0.08045977

Apply the same test to the rest of the comparisons:

P-values for home/away pairwise comparisons for each scenario

	Overall	Strong Disadvantage	Moderate Disadvantage	Balanced	Advantage
5 run max deficit	0.3199	0.1204	0.4106	0.07747	0.02826
6 run	0.479	1	1	1	0.3045
7 run	1	1	NA	0.3487	0.5594
Total	0.7728	0.3675	0.7426	0.1629	0.377

Of these tests, the only comparison that had a statistically significant p-value, and the only confidence interval that did not include zero was for the five run deficit in the Advantage table. When we adjust the p-values to account for the 20 comparison tests, even this value is no longer statistically significant.

pvalues <- c(0.3199, 0.479, 1, 0.7728, 0.1204, 1, 1, 0.3675, 0.4106, 1, NA, 0.7426, 0.07747, 1, 0.3487, 
             0.1629, 0.02826, 0.3045, 0.5594, 0.377)

p.adjust(pvalues, method = "holm")

##  [1] 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000
## [10] 1.00000      NA 1.00000 1.00000 1.00000 1.00000 1.00000 0.53694 1.00000
## [19] 1.00000 1.00000

If we run the same comparison tests for the Down Three then Tied tables with the totals in each win difference category, we get similar results. None of the p-values (Disadvantage = 1, Balanced = 0.2628, Advantage = 0.4237, Strong Advantage = 0.9676, Total = 1) are significant, and all of the confidence intervals include zero.

# TTT
prop.test(x = c(8, 9), n = c(29, 36)) # Disadvantage
prop.test(x = c(15, 15), n = c(39, 27)) # Balanced
prop.test(x = c(23, 12), n = c(40, 27)) # Advantage
prop.test(x = c(15, 21), n = c(29, 38)) # Strong Advantage
prop.test(x = c(61, 57), n = c(137, 128)) # Total

Since none of the comparisons are significant for any of the scenarios, we can say that the observed differences are likely due to sampling variance, and do not indicate any differences in performance due to being home or away. This goes along with the results of the logistic regression, where home/away was not a significant factor into whether or not the team made the comeback, and also supports an idea that was talked about in the literature review, where baseball has a less observed home team advantage compared to other team sports.