Is “home team advantage” a statistically significant phenomenon in the National Football League (NFL)? If so, what variables in the data can explain this phenomenon? Linear and logistic regression are used to explore the relationship between various game data features and game outcomes. We attempt to control for variables that affect the expected result of a game and to identify whether any additional advantage remains for the home team.
The phenomenon of home-field advantage is well documented (Swartz, T. B., et al.; Jamieson, J. P.). A review of Cleveland, et al. and Swartz, T. B., et al. informed the rigorous selection and computation of control variables. However, there is evidence that the home-field advantage effect is shrinking (Kilgore, A., & Greenberg, N.), even more so during the pandemic, when some games were played with no fans in the stands (Ponzo, M., et al.; McCarrick, D., et al.).
For this analysis, multiple data sets have been joined to produce both factor and continuous variables related to a game’s time, location, and conditions as well as a team’s injuries, amount of rest, distance traveled, and relative ability. The spread of the score, where negative values indicate a home-team loss, is regressed on these variables. Additionally, we have implemented a logistic regression model in which the same explanatory variables are used to explain the log-odds of a home-team win. A random forest model is also implemented to explore relative variable importance.

# Analysis
#install.packages("tidyverse")
#install.packages()
#install.packages("devtools")
#devtools::install_github(repo = "maksimhorowitz/nflscrapR")
#install.packages("nflreadr")
#install.packages("car")
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## Warning: package 'nflreadr' was built under R version 4.1.2
## Loading required package: carData
##
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
##
## recode
##
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ tibble 3.1.6 ✓ purrr 0.3.4
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 2.1.1 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x lubridate::as.difftime() masks base::as.difftime()
## x lubridate::date() masks base::date()
## x dplyr::filter() masks stats::filter()
## x lubridate::intersect() masks base::intersect()
## x dplyr::lag() masks stats::lag()
## x car::recode() masks dplyr::recode()
## x lubridate::setdiff() masks base::setdiff()
## x purrr::some() masks car::some()
## x lubridate::union() masks base::union()
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
##
## lift
## randomForest 4.6-14
## Type rfNews() to see new features/changes/bug fixes.
##
## Attaching package: 'randomForest'
## The following object is masked from 'package:ggplot2':
##
## margin
## The following object is masked from 'package:dplyr':
##
## combine
The games dataset is the main dataset used for the analysis. It contains a row for each NFL game from 1999 to 2021, but only seasons from 2010 to 2019 are included in the analysis. Pre-season games are also excluded.
First, various attributes were selected for use as potential predictor or response variables.
One of those variables is the moneyline, a betting wager, which can be converted into the win probability implied by bets placed before the game (home_prob). spread_line is another useful variable based on betting odds; it is the point handicap used to even out two unevenly matched teams. Both of these variables are driven by advanced analytics and actual wagering, so they are complex and already account for any believed effect of the home team advantage. According to sports news outlets, this effect is generally worth as many as 3 but more realistically 1-2 spread points.
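The moneyline conversion can be sketched as follows. This is a minimal illustration using the standard American-odds convention; the function name `moneyline_to_prob` is hypothetical and not taken from the original analysis, and the conversion ignores the bookmaker's margin.

```r
# Convert an American moneyline into the implied win probability.
# Negative lines (favorites): |ml| / (|ml| + 100)
# Positive lines (underdogs): 100 / (ml + 100)
moneyline_to_prob <- function(moneyline) {
  ifelse(moneyline < 0,
         -moneyline / (-moneyline + 100),
         100 / (moneyline + 100))
}

# First four home_moneyline values from the games table
implied <- moneyline_to_prob(c(-220, 140, -280, 106))
print(round(implied, 7))
```

Applied to the first four home_moneyline values, this reproduces the corresponding home_prob values in the table (0.6875, 0.4166667, 0.7368421, 0.4854369).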
## game_id season game_type week gameday weekday gametime away_team
## 1 2010_01_MIN_NO 2010 REG 1 2010-09-09 Thursday 20:30 MIN
## 2 2010_01_MIA_BUF 2010 REG 1 2010-09-12 Sunday 13:00 MIA
## 3 2010_01_DET_CHI 2010 REG 1 2010-09-12 Sunday 13:00 DET
## 4 2010_01_IND_HOU 2010 REG 1 2010-09-12 Sunday 13:00 IND
## 5 2010_01_DEN_JAX 2010 REG 1 2010-09-12 Sunday 13:00 DEN
## 6 2010_01_CIN_NE 2010 REG 1 2010-09-12 Sunday 13:00 CIN
## home_team home_score away_score location result total overtime home_rest
## 1 NO 14 9 Home 5 23 0 7
## 2 BUF 10 15 Home -5 25 0 7
## 3 CHI 19 14 Home 5 33 0 7
## 4 HOU 34 24 Home 10 58 0 7
## 5 JAX 24 17 Home 7 41 0 7
## 6 NE 38 24 Home 14 62 0 7
## away_rest home_moneyline home_prob spread_line total_line div_game roof
## 1 7 -220 0.6875000 4.5 48.5 0 dome
## 2 7 140 0.4166667 -3.0 39.5 1 outdoors
## 3 7 -280 0.7368421 6.5 44.5 1 outdoors
## 4 7 106 0.4854369 -1.0 47.5 1 closed
## 5 7 -185 0.6491228 3.0 41.5 0 outdoors
## 6 7 -230 0.6969697 5.0 44.5 0 outdoors
## surface temp wind
## 1 sportturf NA NA
## 2 astroplay 62 7
## 3 grass 75 10
## 4 grass NA NA
## 5 grass 90 10
## 6 fieldturf 62 10
This matrix of distances will be used to determine the distance traveled (in miles) by the away team. Our hypothesis is that travel may contribute to the higher loss rate for the away team.
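The source does not show how the distance matrix was built, so the following is only a sketch of one way to construct and query such a matrix. It assumes great-circle (haversine) distances between approximate city coordinates; the coordinates, the `stadiums` table, and the helper `haversine_miles` are illustrative assumptions, not the authors' data or method.

```r
# Great-circle (haversine) distance in miles between two points.
haversine_miles <- function(lat1, lon1, lat2, lon2) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * 3958.8 * asin(sqrt(a))  # 3958.8 = mean Earth radius in miles
}

# Approximate city coordinates (illustrative, not the authors' data)
stadiums <- data.frame(
  team = c("NO", "MIN", "BUF"),
  lat  = c(29.95, 44.97, 42.77),
  lon  = c(-90.08, -93.26, -78.79)
)

# Square matrix of pairwise distances, indexed by team abbreviation
idx <- seq_len(nrow(stadiums))
dist_matrix <- outer(idx, idx, function(i, j) {
  haversine_miles(stadiums$lat[i], stadiums$lon[i],
                  stadiums$lat[j], stadiums$lon[j])
})
dimnames(dist_matrix) <- list(stadiums$team, stadiums$team)

# Distance traveled by the away team: look up [away_team, home_team]
traveled <- dist_matrix["MIN", "NO"]
```

Straight-line distance understates actual travel, so a matrix built this way is a proxy rather than a measurement.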
First, as a gut check, a regression analysis will be run to evaluate the accuracy of spread_line as a predictor of the actual result of the game. Based on the plot below, there does appear to be a strong linear relationship.
## `geom_smooth()` using formula 'y ~ x'
Using result as the response variable and spread_line as the predictor variable, we would expect the regression formula, fitted without an intercept, to indicate that the result minus the spread is zero on average, meaning the coefficient of spread_line should be 1.
##
## Call:
## lm(formula = result ~ spread_line + 0, data = games_select)
##
## Residuals:
## Min 1Q Median 3Q Max
## -52.239 -8.585 -0.361 7.804 47.142
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## spread_line 1.03412 0.03672 28.16 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.25 on 3223 degrees of freedom
## Multiple R-squared: 0.1975, Adjusted R-squared: 0.1973
## F-statistic: 793.2 on 1 and 3223 DF, p-value: < 2.2e-16
## [1] 101 2858
## [1] "Is coeff = 1? p-value: 0.352853653933552"
## 2.5 % 97.5 %
## spread_line 0.962127 1.106111
Indeed, the coefficient for spread_line is statistically significant, the overall regression is statistically significant, and there are no obvious violations of regression assumptions. At an alpha level of 0.05, the coefficient for spread_line is not statistically significantly different from 1 in either direction (p = 0.35), and the 95% confidence interval shows that it is likely between 0.96 and 1.11.
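The test of whether the coefficient equals 1 can be reproduced from the printed summary alone. The sketch below assumes a two-sided t-test on the reported estimate and standard error; the original code for this step is not shown, so the authors may have computed it differently.

```r
# Values copied from the regression summary for result ~ spread_line + 0
estimate  <- 1.03412   # coefficient of spread_line
std_error <- 0.03672   # its standard error
df_resid  <- 3223      # residual degrees of freedom

# t-statistic for H0: coefficient = 1 (rather than the default H0: coefficient = 0)
t_stat  <- (estimate - 1) / std_error
p_value <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

# 95% confidence interval for the coefficient
ci <- estimate + c(-1, 1) * qt(0.975, df = df_resid) * std_error

print(c(t = t_stat, p = p_value))
print(ci)
```

Up to rounding of the inputs, this agrees with the reported p-value of about 0.353 and the interval of roughly 0.962 to 1.106.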
It is reasonable to conclude that spread_line is a fairly accurate predictor of the result of a game, although the R-squared value (about 0.20) is not particularly high. However, it is not suitable as a control variable for evaluating home team advantage: because the spread already takes home team advantage into account, controlling for it would remove our ability to identify home team advantage separately in our modeling.
Every year, a few regular season games are played at a neutral location, usually overseas. The following regression analysis evaluates whether this factor (a game played at a neutral site rather than at home) is associated with a decrease in the result. The Super Bowl is also played at a neutral location, but it is excluded from the analysis because it is not a regular season game. Additionally, only seasons in which games of this type were actually played are included.
##
## Call:
## lm(formula = result ~ location, data = games_reg)
##
## Residuals:
## Min 1Q Median 3Q Max
## -51.018 -9.018 0.982 7.982 55.982
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.0177 0.2664 7.575 4.71e-14 ***
## locationNeutral -4.1528 2.4333 -1.707 0.088 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 14.71 on 3086 degrees of freedom
## Multiple R-squared: 0.000943, Adjusted R-squared: 0.0006192
## F-statistic: 2.913 on 1 and 3086 DF, p-value: 0.08799
## # A tibble: 2 × 2
## location games
## <fct> <int>
## 1 Home 3051
## 2 Neutral 37
The results of this regression analysis can be interpreted as comparing the average result of a game played at the “home” team’s home field with that of a game played at a neutral location. The negative coefficient for locationNeutral (-4.15) suggests that playing at a neutral location is associated with a lower result for the designated home team. However, with a p-value of 0.088, the regression coefficient and the overall regression are significant only at a 0.10 alpha level, not at a 0.05 alpha level. Only 37 games (~1%) were played at a neutral location, and the R-squared for this model is extremely low. In summary, this model indicates that a more thorough analysis is needed.
To perform a more thorough analysis, additional predicting and controlling variables should be analyzed, with a focus on the regular season games that are played at a home location.
For this analysis, the data will be reformatted to a structure with the following key variables: team; opponent; home, a dummy variable set to 1 if the “team” is the home team; team_score, the final score of the “team”; opponent_score, the final score of the “opponent”; and result, team_score minus opponent_score. Each game will appear twice, once from the home team’s perspective and once from the away team’s. Other variables will be kept and assessed as predictor variables. Only regular season games will be considered.
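The reshaping described above can be sketched in base R as follows. The toy input reuses two games and the column names from the games table shown earlier; the intermediate names `home_rows` and `away_rows` are illustrative, and the original (unshown) code may well have used tidyverse verbs instead.

```r
# Toy input with the same column names as the games table
games <- data.frame(
  game_id    = c("2010_01_MIN_NO", "2010_01_MIA_BUF"),
  home_team  = c("NO", "BUF"),
  away_team  = c("MIN", "MIA"),
  home_score = c(14, 10),
  away_score = c(9, 15)
)

# One row from the home team's point of view ...
home_rows <- data.frame(
  game_id = games$game_id,
  team = games$home_team, opponent = games$away_team, home = 1,
  team_score = games$home_score, opponent_score = games$away_score
)
# ... and one from the away team's point of view
away_rows <- data.frame(
  game_id = games$game_id,
  team = games$away_team, opponent = games$home_team, home = 0,
  team_score = games$away_score, opponent_score = games$home_score
)

# Stack both views so each game appears twice
team_games <- rbind(home_rows, away_rows)
team_games$result <- team_games$team_score - team_games$opponent_score
print(team_games)
```

Because every game contributes two rows with opposite signs, the result column sums to zero across the reshaped table, which is worth keeping in mind when interpreting intercepts in the later models.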
Additionally, a control variable, weighted_point_diff, will be computed. This variable is the weighted average point differential (team score - opponent score) for the “team” for that season, weighted ~50% in favor of the prior three games when three prior games exist. For the first 3 games of each season, the overall season point differential is used, and a dummy variable (first_three_game) is added to indicate that this value is computed differently.
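The exact weighting scheme is not given in the text, so the sketch below shows just one possible reading: half the weight on the mean of the three prior games and half on the season-long mean. The function name and the `recent_weight` parameter are assumptions made for illustration.

```r
# results: one team's point differentials for a season, in game order.
# Assumed weighting: recent_weight on the mean of the three prior games,
# the remainder on the season-long mean.
weighted_point_diff <- function(results, recent_weight = 0.5) {
  n <- length(results)
  season_avg <- mean(results)
  out <- rep(season_avg, n)  # games 1-3 fall back to the season average
  if (n > 3) {
    for (i in 4:n) {
      recent_avg <- mean(results[(i - 3):(i - 1)])  # the three prior games
      out[i] <- recent_weight * recent_avg + (1 - recent_weight) * season_avg
    }
  }
  data.frame(
    weighted_point_diff = out,
    first_three_game    = as.numeric(seq_len(n) <= 3)
  )
}

# Made-up point differentials for a five-game stretch
example <- weighted_point_diff(c(3, -7, 10, 4, -2))
print(example)
```

Under this reading, the season-long mean includes games that have not yet been played at prediction time, which is acceptable for a control variable but would leak information in a forecasting setting.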
## `summarise()` has grouped output by 'season'. You can override using the
## `.groups` argument.
## # A tibble: 6 × 23
## # Groups: season, team [1]
## season week gameday weekday gametime traveled team opponent
## <int> <int> <dttm> <chr> <chr> <int> <chr> <chr>
## 1 2010 1 2010-09-12 00:00:00 Sunday 16:15 1517 ARI STL
## 2 2010 2 2010-09-19 00:00:00 Sunday 13:00 1868 ARI ATL
## 3 2010 3 2010-09-26 00:00:00 Sunday 16:15 745 ARI OAK
## 4 2010 4 2010-10-03 00:00:00 Sunday 16:15 358 ARI SD
## 5 2010 5 2010-10-10 00:00:00 Sunday 16:05 1548 ARI NO
## 6 2010 7 2010-10-24 00:00:00 Sunday 16:05 1513 ARI SEA
## # … with 15 more variables: home <dbl>, team_score <int>, opponent_score <int>,
## # result <int>, team_rest <int>, opponent_rest <int>, roof <chr>,
## # surface <chr>, temp <dbl>, wind <int>, team_game_number <int>,
## # weighted_point_diff <dbl>, first_three_game <dbl>, primetime_game <dbl>,
## # outdoor_game <dbl>
## Simple Model

The first linear regression model below will use result as the response variable and just weighted_point_diff and home as predicting variables to start.
##
## Call:
## lm(formula = result ~ home + weighted_point_diff, data = team_games)
##
## Residuals:
## Min 1Q Median 3Q Max
## -50.784 -8.707 0.165 8.531 51.337
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.0798 0.2480 -8.386 <2e-16 ***
## home 4.1498 0.3507 11.832 <2e-16 ***
## weighted_point_diff 0.7840 0.0264 29.694 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.7 on 6099 degrees of freedom
## Multiple R-squared: 0.1426, Adjusted R-squared: 0.1423
## F-statistic: 507.1 on 2 and 6099 DF, p-value: < 2.2e-16
## [1] 2486 1031
The model is statistically significant overall, and the coefficient for home is 4.15 and statistically significant, meaning that playing at home is associated with an increase of about 4.15 points in the result, holding the weighted point differential constant. The model does not seem to violate normality and constant variance assumptions badly, although there might be a slight tail to the residuals and a possible negative trend in variance. The R-squared value is relatively low, so adding additional predictor variables may help. However, the direction of the result is promising in confirming the home team advantage hypothesis at a high level.
Note: versions of this model removing the first three games or adding an interaction term for the weighted point differential and first three games did not significantly alter the results of this model.
## Adding Predicting Variables
##
## Call:
## lm(formula = result ~ home + weighted_point_diff + team_rest +
## opponent_rest + primetime_game + outdoor_game + temp, data = team_games)
##
## Residuals:
## Min 1Q Median 3Q Max
## -50.906 -8.743 0.153 8.545 51.313
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.352010 0.900166 -2.613 0.009001 **
## home 4.176994 0.351232 11.892 < 2e-16 ***
## weighted_point_diff 0.794601 0.026567 29.909 < 2e-16 ***
## team_rest 0.108921 0.089452 1.218 0.223404
## opponent_rest -0.047464 0.089467 -0.531 0.595766
## primetime_game -1.803615 0.529212 -3.408 0.000658 ***
## outdoor_game -0.212524 0.600447 -0.354 0.723393
## temp 0.005122 0.008519 0.601 0.547698
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.69 on 6094 degrees of freedom
## Multiple R-squared: 0.1445, Adjusted R-squared: 0.1435
## F-statistic: 147 on 7 and 6094 DF, p-value: < 2.2e-16
The home variable is still statistically significant.

Adding injury data: we collected data on reported injuries and filtered out players who either never started or had questionable report statuses. The regression variables are the numbers of injured players on the home team and on the away team.
## ── nflverse ───────────────────────────────────────────────────────────────────
## # A tibble: 6 × 16
## season season_type team week gsis_id position full_name first_name last_name
## <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 2010 POST BAL 1 00-002… LB Dannell … Dannell Ellerbe
## 2 2010 POST BAL 1 00-002… LB Tavares … Tavares Gooden
## 3 2010 POST BAL 1 00-002… DT Haloti N… Haloti Ngata
## 4 2010 POST BAL 1 00-002… S Ed Reed Ed Reed
## 5 2010 POST BAL 1 00-002… LB Terrell … Terrell Suggs
## 6 2010 POST BAL 1 00-002… CB Josh Wil… Josh Wilson
## # … with 7 more variables: report_primary_injury <chr>,
## # report_secondary_injury <chr>, report_status <chr>,
## # practice_primary_injury <chr>, practice_secondary_injury <chr>,
## # practice_status <chr>, date_modified <dttm>
Adding standings: using standings for the current season would be “cheating”, since they would include information about whether the team won or lost the game, but using the win rate for the previous season is reasonable here.
We then fit the same regression model with the injury, previous-season win rate, team, opponent, and travel variables added.
##
## Call:
## lm(formula = result ~ home + weighted_point_diff + team_rest +
## opponent_rest + primetime_game + outdoor_game + temp + n_injured_home +
## n_injured_away + home_wr_prev + away_wr_prev + as.factor(team) +
## as.factor(opponent) + traveled, data = df_combine)
##
## Residuals:
## Min 1Q Median 3Q Max
## -48.163 -8.194 -0.043 8.299 48.205
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.750e-04 1.827e+00 0.000 0.99979
## home 4.137e+00 3.434e-01 12.048 < 2e-16 ***
## weighted_point_diff 6.944e-01 3.058e-02 22.709 < 2e-16 ***
## team_rest 1.008e-01 8.794e-02 1.147 0.25154
## opponent_rest -5.529e-02 8.795e-02 -0.629 0.52962
## primetime_game -5.382e-01 5.400e-01 -0.997 0.31898
## outdoor_game 1.555e-01 6.801e-01 0.229 0.81911
## temp -3.355e-03 8.789e-03 -0.382 0.70266
## n_injured_home -1.522e-01 7.531e-02 -2.021 0.04334 *
## n_injured_away 2.164e-01 7.513e-02 2.881 0.00398 **
## home_wr_prev 2.590e+00 9.730e-01 2.662 0.00779 **
## away_wr_prev -5.203e+00 9.650e-01 -5.392 7.25e-08 ***
## as.factor(team)ATL -2.566e-01 1.399e+00 -0.183 0.85443
## as.factor(team)BAL 5.409e-01 1.437e+00 0.376 0.70664
## as.factor(team)BUF -5.426e-01 1.432e+00 -0.379 0.70467
## as.factor(team)CAR -5.117e-01 1.416e+00 -0.361 0.71791
## as.factor(team)CHI -1.611e-01 1.417e+00 -0.114 0.90946
## as.factor(team)CIN -3.444e-01 1.429e+00 -0.241 0.80957
## as.factor(team)CLE -1.112e+00 1.440e+00 -0.772 0.44022
## as.factor(team)DAL 1.643e-01 1.386e+00 0.118 0.90568
## as.factor(team)DEN -6.682e-02 1.427e+00 -0.047 0.96264
## as.factor(team)DET -1.931e-02 1.396e+00 -0.014 0.98896
## as.factor(team)GB 7.197e-01 1.430e+00 0.503 0.61486
## as.factor(team)HOU -1.590e+00 1.393e+00 -1.141 0.25378
## as.factor(team)IND -1.540e+00 1.409e+00 -1.093 0.27441
## as.factor(team)JAX -2.721e+00 1.449e+00 -1.879 0.06033 .
## as.factor(team)KC -5.657e-03 1.440e+00 -0.004 0.99686
## as.factor(team)LA 1.665e+00 1.817e+00 0.916 0.35954
## as.factor(team)LAC 2.411e-01 1.989e+00 0.121 0.90351
## as.factor(team)LV -6.734e-01 3.392e+00 -0.199 0.84264
## as.factor(team)MIA -1.093e+00 1.429e+00 -0.765 0.44437
## as.factor(team)MIN 1.586e-01 1.390e+00 0.114 0.90917
## as.factor(team)NE 1.441e+00 1.489e+00 0.968 0.33315
## as.factor(team)NO 4.866e-01 1.404e+00 0.347 0.72882
## as.factor(team)NYG -8.027e-01 1.417e+00 -0.566 0.57115
## as.factor(team)NYJ -1.232e+00 1.437e+00 -0.857 0.39132
## as.factor(team)OAK -1.299e+00 1.510e+00 -0.861 0.38947
## as.factor(team)PHI -2.787e-01 1.416e+00 -0.197 0.84399
## as.factor(team)PIT 2.380e-01 1.445e+00 0.165 0.86920
## as.factor(team)SD -4.723e-01 1.639e+00 -0.288 0.77327
## as.factor(team)SEA 9.623e-01 1.423e+00 0.676 0.49896
## as.factor(team)SF 3.308e-01 1.410e+00 0.235 0.81457
## as.factor(team)STL -7.918e-01 1.692e+00 -0.468 0.63975
## as.factor(team)TB -2.491e-01 1.416e+00 -0.176 0.86041
## as.factor(team)TEN -1.990e+00 1.426e+00 -1.396 0.16281
## as.factor(team)WAS -8.984e-01 1.422e+00 -0.632 0.52741
## as.factor(opponent)ATL -1.168e+00 1.398e+00 -0.835 0.40355
## as.factor(opponent)BAL -4.502e+00 1.428e+00 -3.153 0.00163 **
## as.factor(opponent)BUF -4.929e-01 1.432e+00 -0.344 0.73063
## as.factor(opponent)CAR 1.771e-01 1.417e+00 0.125 0.90050
## as.factor(opponent)CHI -6.718e-02 1.417e+00 -0.047 0.96218
## as.factor(opponent)CIN -7.395e-01 1.429e+00 -0.518 0.60474
## as.factor(opponent)CLE 2.681e+00 1.438e+00 1.865 0.06223 .
## as.factor(opponent)DAL -2.333e+00 1.385e+00 -1.685 0.09205 .
## as.factor(opponent)DEN -1.505e+00 1.426e+00 -1.056 0.29112
## as.factor(opponent)DET -1.807e-01 1.396e+00 -0.129 0.89702
## as.factor(opponent)GB -4.650e+00 1.422e+00 -3.270 0.00108 **
## as.factor(opponent)HOU 1.071e+00 1.393e+00 0.768 0.44232
## as.factor(opponent)IND 3.658e-01 1.409e+00 0.260 0.79516
## as.factor(opponent)JAX 5.915e+00 1.440e+00 4.107 4.06e-05 ***
## as.factor(opponent)KC -3.075e+00 1.435e+00 -2.142 0.03220 *
## as.factor(opponent)LA -5.818e+00 1.810e+00 -3.214 0.00131 **
## as.factor(opponent)LAC -1.689e+00 1.988e+00 -0.850 0.39551
## as.factor(opponent)LV 2.612e+00 3.390e+00 0.770 0.44114
## as.factor(opponent)MIA 1.620e+00 1.429e+00 1.134 0.25694
## as.factor(opponent)MIN -1.396e+00 1.390e+00 -1.004 0.31546
## as.factor(opponent)NE -8.354e+00 1.460e+00 -5.720 1.11e-08 ***
## as.factor(opponent)NO -4.542e+00 1.396e+00 -3.253 0.00115 **
## as.factor(opponent)NYG 1.338e+00 1.417e+00 0.944 0.34511
## as.factor(opponent)NYJ 3.773e+00 1.432e+00 2.634 0.00846 **
## as.factor(opponent)OAK 3.433e+00 1.506e+00 2.279 0.02268 *
## as.factor(opponent)PHI -1.870e+00 1.415e+00 -1.322 0.18634
## as.factor(opponent)PIT -3.245e+00 1.440e+00 -2.253 0.02428 *
## as.factor(opponent)SD -1.467e+00 1.639e+00 -0.895 0.37080
## as.factor(opponent)SEA -4.528e+00 1.417e+00 -3.196 0.00140 **
## as.factor(opponent)SF -1.790e+00 1.410e+00 -1.270 0.20420
## as.factor(opponent)STL 1.256e+00 1.691e+00 0.743 0.45762
## as.factor(opponent)TB 2.461e-01 1.416e+00 0.174 0.86208
## as.factor(opponent)TEN 1.631e+00 1.426e+00 1.144 0.25275
## as.factor(opponent)WAS 2.023e+00 1.420e+00 1.425 0.15433
## traveled -8.516e-05 2.846e-04 -0.299 0.76478
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.27 on 5927 degrees of freedom
## (94 observations deleted due to missingness)
## Multiple R-squared: 0.2037, Adjusted R-squared: 0.193
## F-statistic: 18.96 on 80 and 5927 DF, p-value: < 2.2e-16
## [1] 356 151
This model uses home_win as a binary response variable, where a home-team win = 1 and a home-team loss = 0. The explanatory variables are the same as above.
##
## Call:
## glm(formula = home_win ~ home + weighted_point_diff + team_rest +
## opponent_rest + primetime_game + outdoor_game + temp + n_injured_home +
## n_injured_away + home_wr_prev + away_wr_prev + as.factor(team) +
## as.factor(opponent) + traveled, family = binomial(link = "logit"),
## data = df_combine)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.2817 -1.0322 -0.3953 1.0277 2.2850
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.319e-02 2.979e-01 -0.111 0.911274
## home 5.819e-01 5.653e-02 10.294 < 2e-16 ***
## weighted_point_diff 9.136e-02 5.226e-03 17.480 < 2e-16 ***
## team_rest 2.173e-02 1.447e-02 1.502 0.133082
## opponent_rest -1.876e-02 1.446e-02 -1.297 0.194637
## primetime_game -7.068e-02 8.765e-02 -0.806 0.420026
## outdoor_game 1.483e-02 1.116e-01 0.133 0.894278
## temp -4.105e-04 1.450e-03 -0.283 0.777030
## n_injured_home -2.693e-02 1.242e-02 -2.169 0.030095 *
## n_injured_away 3.611e-02 1.241e-02 2.909 0.003628 **
## home_wr_prev 3.993e-01 1.596e-01 2.503 0.012325 *
## away_wr_prev -7.596e-01 1.586e-01 -4.789 1.67e-06 ***
## as.factor(team)ATL -1.511e-01 2.260e-01 -0.669 0.503661
## as.factor(team)BAL -1.182e-01 2.349e-01 -0.503 0.614669
## as.factor(team)BUF -1.118e-01 2.328e-01 -0.480 0.631163
## as.factor(team)CAR -2.103e-01 2.303e-01 -0.913 0.361020
## as.factor(team)CHI -9.976e-02 2.292e-01 -0.435 0.663401
## as.factor(team)CIN -2.495e-01 2.318e-01 -1.077 0.281667
## as.factor(team)CLE -3.611e-01 2.383e-01 -1.515 0.129789
## as.factor(team)DAL -1.407e-01 2.229e-01 -0.631 0.528043
## as.factor(team)DEN -4.517e-02 2.329e-01 -0.194 0.846215
## as.factor(team)DET -2.500e-01 2.259e-01 -1.106 0.268526
## as.factor(team)GB 1.452e-01 2.357e-01 0.616 0.537995
## as.factor(team)HOU -2.771e-01 2.273e-01 -1.219 0.222817
## as.factor(team)IND -1.574e-01 2.296e-01 -0.686 0.492974
## as.factor(team)JAX -6.136e-01 2.464e-01 -2.490 0.012782 *
## as.factor(team)KC 6.978e-02 2.379e-01 0.293 0.769301
## as.factor(team)LA 1.521e-01 2.998e-01 0.507 0.611880
## as.factor(team)LAC -1.024e-01 3.174e-01 -0.323 0.746996
## as.factor(team)LV 7.221e-01 5.425e-01 1.331 0.183135
## as.factor(team)MIA -1.481e-02 2.313e-01 -0.064 0.948933
## as.factor(team)MIN -1.106e-01 2.239e-01 -0.494 0.621232
## as.factor(team)NE 4.407e-02 2.505e-01 0.176 0.860352
## as.factor(team)NO -5.075e-02 2.287e-01 -0.222 0.824392
## as.factor(team)NYG -2.285e-01 2.302e-01 -0.993 0.320789
## as.factor(team)NYJ -8.092e-02 2.357e-01 -0.343 0.731336
## as.factor(team)OAK -7.074e-03 2.454e-01 -0.029 0.977005
## as.factor(team)PHI -2.546e-01 2.286e-01 -1.114 0.265420
## as.factor(team)PIT 5.844e-02 2.366e-01 0.247 0.804898
## as.factor(team)SD -4.132e-01 2.635e-01 -1.568 0.116931
## as.factor(team)SEA 5.053e-03 2.318e-01 0.022 0.982605
## as.factor(team)SF -7.350e-02 2.305e-01 -0.319 0.749838
## as.factor(team)STL -2.331e-01 2.765e-01 -0.843 0.399093
## as.factor(team)TB -1.913e-01 2.299e-01 -0.832 0.405330
## as.factor(team)TEN -2.555e-01 2.309e-01 -1.106 0.268526
## as.factor(team)WAS -1.627e-01 2.302e-01 -0.707 0.479630
## as.factor(opponent)ATL 1.269e-02 2.258e-01 0.056 0.955188
## as.factor(opponent)BAL -3.399e-01 2.337e-01 -1.454 0.145810
## as.factor(opponent)BUF 2.881e-02 2.322e-01 0.124 0.901270
## as.factor(opponent)CAR 1.953e-01 2.290e-01 0.853 0.393720
## as.factor(opponent)CHI 1.412e-01 2.283e-01 0.618 0.536433
## as.factor(opponent)CIN 1.029e-01 2.310e-01 0.445 0.656162
## as.factor(opponent)CLE 6.216e-01 2.389e-01 2.602 0.009262 **
## as.factor(opponent)DAL -9.999e-02 2.223e-01 -0.450 0.652890
## as.factor(opponent)DEN -9.003e-02 2.299e-01 -0.392 0.695375
## as.factor(opponent)DET 2.218e-01 2.252e-01 0.985 0.324535
## as.factor(opponent)GB -6.674e-01 2.355e-01 -2.834 0.004604 **
## as.factor(opponent)HOU 2.557e-01 2.265e-01 1.129 0.258933
## as.factor(opponent)IND 3.764e-02 2.287e-01 0.165 0.869261
## as.factor(opponent)JAX 1.071e+00 2.446e-01 4.380 1.19e-05 ***
## as.factor(opponent)KC -4.348e-01 2.347e-01 -1.852 0.063989 .
## as.factor(opponent)LA -6.867e-01 3.012e-01 -2.280 0.022623 *
## as.factor(opponent)LAC -5.050e-02 3.252e-01 -0.155 0.876592
## as.factor(opponent)LV -4.170e-01 5.468e-01 -0.763 0.445670
## as.factor(opponent)MIA 1.351e-01 2.323e-01 0.582 0.560872
## as.factor(opponent)MIN -4.723e-02 2.244e-01 -0.210 0.833316
## as.factor(opponent)NE -9.314e-01 2.471e-01 -3.769 0.000164 ***
## as.factor(opponent)NO -4.353e-01 2.282e-01 -1.908 0.056383 .
## as.factor(opponent)NYG 3.512e-01 2.300e-01 1.527 0.126866
## as.factor(opponent)NYJ 4.587e-01 2.335e-01 1.964 0.049529 *
## as.factor(opponent)OAK 3.220e-01 2.440e-01 1.319 0.187042
## as.factor(opponent)PHI -8.119e-03 2.273e-01 -0.036 0.971506
## as.factor(opponent)PIT -4.711e-01 2.377e-01 -1.982 0.047468 *
## as.factor(opponent)SD 2.045e-01 2.658e-01 0.769 0.441692
## as.factor(opponent)SEA -4.436e-01 2.305e-01 -1.924 0.054297 .
## as.factor(opponent)SF -7.453e-02 2.277e-01 -0.327 0.743445
## as.factor(opponent)STL 3.119e-01 2.776e-01 1.123 0.261250
## as.factor(opponent)TB 2.411e-01 2.291e-01 1.052 0.292601
## as.factor(opponent)TEN 2.479e-01 2.309e-01 1.073 0.283052
## as.factor(opponent)WAS 3.744e-01 2.308e-01 1.622 0.104811
## traveled -9.168e-06 4.653e-05 -0.197 0.843810
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 8328.8 on 6007 degrees of freedom
## Residual deviance: 7349.5 on 5927 degrees of freedom
## (94 observations deleted due to missingness)
## AIC: 7511.5
##
## Number of Fisher Scoring iterations: 4
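As in the linear models, home remains statistically significant; apart from it, the only statistically significant variables in the logistic model are control variables (weighted point differential, injury counts, previous-season win rates, and some team and opponent indicators).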
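Logistic coefficients are on the log-odds scale, so a short conversion helps with interpretation. The sketch below takes the printed home estimate and, purely for illustration, assumes a baseline game whose other terms sum to log-odds of 0 (a 50% win probability).

```r
# Estimate for `home` copied from the logistic regression output
home_coef <- 0.5819

# Multiplicative change in the odds of winning when playing at home
odds_ratio <- exp(home_coef)

# Predicted win probabilities for an otherwise identical game,
# assuming (for illustration only) baseline log-odds of 0
p_away <- plogis(0)
p_home <- plogis(0 + home_coef)

print(c(odds_ratio = odds_ratio, p_away = p_away, p_home = p_home))
```

Under that assumed baseline, playing at home multiplies the odds of winning by about 1.79, which corresponds to moving from a 50% to roughly a 64% win probability.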
None of the explanatory variables explored (rest, primetime, roof, temperature, travel) has shown a significant relationship with home-team wins. However, there is still an anomaly in the data that can be analyzed. The phenomenon of home-team advantage appears to be declining in recent years. Anecdotally, home-team advantage is often attributed to the psycho-social effect of the cheers from fans when playing at home. The pandemic provides a useful natural experiment to test this hypothesis: in 2020 and 2021, some games were played with no fans in the stands.
Here we have created a binary variable for games played with no fans in the stands and used it as an explanatory variable in a logistic regression.
##
## Call:
## glm(formula = home_win ~ pandemic, family = binomial(link = "logit"),
## data = df_combine)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.175 -1.175 -1.174 1.180 1.181
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.005718 0.026736 -0.214 0.831
## pandemic -0.002187 0.092844 -0.024 0.981
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 8459.1 on 6101 degrees of freedom
## Residual deviance: 8459.1 on 6100 degrees of freedom
## AIC: 8463.1
##
## Number of Fisher Scoring iterations: 3
##
## Call:
## glm(formula = home_win ~ home + weighted_point_diff + team_rest +
## opponent_rest + primetime_game + outdoor_game + temp + n_injured_home +
## n_injured_away + home_wr_prev + away_wr_prev + as.factor(team) +
## as.factor(opponent) + traveled + pandemic, family = binomial(link = "logit"),
## data = df_combine)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.2791 -1.0320 -0.3951 1.0290 2.2703
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.359e-02 2.989e-01 -0.146 0.884062
## home 5.819e-01 5.653e-02 10.294 < 2e-16 ***
## weighted_point_diff 9.141e-02 5.228e-03 17.485 < 2e-16 ***
## team_rest 2.178e-02 1.447e-02 1.505 0.132248
## opponent_rest -1.872e-02 1.446e-02 -1.294 0.195495
## primetime_game -7.061e-02 8.765e-02 -0.806 0.420519
## outdoor_game 1.984e-02 1.122e-01 0.177 0.859663
## temp -4.805e-04 1.459e-03 -0.329 0.741901
## n_injured_home -2.586e-02 1.267e-02 -2.041 0.041217 *
## n_injured_away 3.714e-02 1.265e-02 2.936 0.003329 **
## home_wr_prev 3.991e-01 1.596e-01 2.502 0.012363 *
## away_wr_prev -7.600e-01 1.586e-01 -4.792 1.66e-06 ***
## as.factor(team)ATL -1.501e-01 2.260e-01 -0.664 0.506494
## as.factor(team)BAL -1.186e-01 2.349e-01 -0.505 0.613615
## as.factor(team)BUF -1.124e-01 2.328e-01 -0.483 0.629312
## as.factor(team)CAR -2.098e-01 2.303e-01 -0.911 0.362267
## as.factor(team)CHI -1.008e-01 2.293e-01 -0.440 0.660193
## as.factor(team)CIN -2.494e-01 2.318e-01 -1.076 0.281779
## as.factor(team)CLE -3.621e-01 2.384e-01 -1.519 0.128847
## as.factor(team)DAL -1.407e-01 2.229e-01 -0.631 0.527958
## as.factor(team)DEN -4.467e-02 2.328e-01 -0.192 0.847856
## as.factor(team)DET -2.496e-01 2.259e-01 -1.105 0.269115
## as.factor(team)GB 1.439e-01 2.357e-01 0.611 0.541519
## as.factor(team)HOU -2.766e-01 2.273e-01 -1.217 0.223754
## as.factor(team)IND -1.590e-01 2.297e-01 -0.692 0.488827
## as.factor(team)JAX -6.126e-01 2.464e-01 -2.486 0.012924 *
## as.factor(team)KC 6.943e-02 2.380e-01 0.292 0.770467
## as.factor(team)LA 1.503e-01 2.999e-01 0.501 0.616298
## as.factor(team)LAC -1.056e-01 3.174e-01 -0.333 0.739251
## as.factor(team)LV 7.320e-01 5.429e-01 1.348 0.177567
## as.factor(team)MIA -1.465e-02 2.314e-01 -0.063 0.949525
## as.factor(team)MIN -1.099e-01 2.239e-01 -0.491 0.623639
## as.factor(team)NE 3.948e-02 2.507e-01 0.158 0.874846
## as.factor(team)NO -5.002e-02 2.288e-01 -0.219 0.826905
## as.factor(team)NYG -2.292e-01 2.302e-01 -0.996 0.319370
## as.factor(team)NYJ -8.000e-02 2.357e-01 -0.339 0.734260
## as.factor(team)OAK -4.650e-03 2.455e-01 -0.019 0.984888
## as.factor(team)PHI -2.542e-01 2.286e-01 -1.112 0.266184
## as.factor(team)PIT 5.876e-02 2.366e-01 0.248 0.803897
## as.factor(team)SD -4.115e-01 2.636e-01 -1.561 0.118511
## as.factor(team)SEA 5.322e-03 2.318e-01 0.023 0.981683
## as.factor(team)SF -7.290e-02 2.305e-01 -0.316 0.751822
## as.factor(team)STL -2.296e-01 2.766e-01 -0.830 0.406586
## as.factor(team)TB -1.911e-01 2.300e-01 -0.831 0.405937
## as.factor(team)TEN -2.551e-01 2.309e-01 -1.105 0.269236
## as.factor(team)WAS -1.641e-01 2.303e-01 -0.712 0.476163
## as.factor(opponent)ATL 1.396e-02 2.258e-01 0.062 0.950721
## as.factor(opponent)BAL -3.401e-01 2.337e-01 -1.456 0.145507
## as.factor(opponent)BUF 2.820e-02 2.322e-01 0.121 0.903349
## as.factor(opponent)CAR 1.962e-01 2.290e-01 0.857 0.391493
## as.factor(opponent)CHI 1.403e-01 2.284e-01 0.614 0.539022
## as.factor(opponent)CIN 1.029e-01 2.311e-01 0.445 0.656096
## as.factor(opponent)CLE 6.205e-01 2.389e-01 2.598 0.009379 **
## as.factor(opponent)DAL -9.994e-02 2.223e-01 -0.449 0.653084
## as.factor(opponent)DEN -8.973e-02 2.300e-01 -0.390 0.696374
## as.factor(opponent)DET 2.219e-01 2.252e-01 0.985 0.324484
## as.factor(opponent)GB -6.685e-01 2.355e-01 -2.838 0.004538 **
## as.factor(opponent)HOU 2.564e-01 2.265e-01 1.132 0.257698
## as.factor(opponent)IND 3.607e-02 2.287e-01 0.158 0.874685
## as.factor(opponent)JAX 1.071e+00 2.446e-01 4.380 1.19e-05 ***
## as.factor(opponent)KC -4.348e-01 2.347e-01 -1.852 0.063962 .
## as.factor(opponent)LA -6.879e-01 3.012e-01 -2.283 0.022403 *
## as.factor(opponent)LAC -5.363e-02 3.253e-01 -0.165 0.869064
## as.factor(opponent)LV -4.074e-01 5.473e-01 -0.744 0.456683
## as.factor(opponent)MIA 1.355e-01 2.324e-01 0.583 0.559887
## as.factor(opponent)MIN -4.635e-02 2.244e-01 -0.207 0.836367
## as.factor(opponent)NE -9.348e-01 2.473e-01 -3.780 0.000157 ***
## as.factor(opponent)NO -4.346e-01 2.282e-01 -1.905 0.056786 .
## as.factor(opponent)NYG 3.504e-01 2.301e-01 1.523 0.127768
## as.factor(opponent)NYJ 4.587e-01 2.336e-01 1.964 0.049562 *
## as.factor(opponent)OAK 3.242e-01 2.441e-01 1.328 0.184189
## as.factor(opponent)PHI -7.821e-03 2.273e-01 -0.034 0.972556
## as.factor(opponent)PIT -4.708e-01 2.377e-01 -1.981 0.047590 *
## as.factor(opponent)SD 2.061e-01 2.658e-01 0.775 0.438109
## as.factor(opponent)SEA -4.432e-01 2.305e-01 -1.923 0.054536 .
## as.factor(opponent)SF -7.378e-02 2.277e-01 -0.324 0.745937
## as.factor(opponent)STL 3.153e-01 2.777e-01 1.135 0.256317
## as.factor(opponent)TB 2.412e-01 2.291e-01 1.053 0.292363
## as.factor(opponent)TEN 2.482e-01 2.309e-01 1.075 0.282352
## as.factor(opponent)WAS 3.732e-01 2.308e-01 1.617 0.105971
## traveled -9.472e-06 4.654e-05 -0.204 0.838709
## pandemic 4.790e-02 1.135e-01 0.422 0.673042
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 8328.8 on 6007 degrees of freedom
## Residual deviance: 7349.3 on 5926 degrees of freedom
## (94 observations deleted due to missingness)
## AIC: 7513.3
##
## Number of Fisher Scoring iterations: 4
The regression on ‘pandemic’ alone suggests that the log-odds of a home-team win fall by 0.257 when no fans are in the stands, which lends credence to the assertion that fan support improves the home team’s performance. However, when we include this variable in a regression with all other explanatory variables, as in the output above, ‘pandemic’ is not significant (p = 0.673).
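To put the coefficient on a more intuitive scale, a log-odds estimate can be exponentiated into an odds ratio. The sketch below uses only the value quoted in the text; the variable names are illustrative and are not taken from the original analysis code.

```r
# Coefficient on 'pandemic' as quoted in the text (log-odds scale).
pandemic_log_odds <- -0.257

# Exponentiating a log-odds coefficient gives an odds ratio.
odds_ratio <- exp(pandemic_log_odds)

# Percent change in the odds of a home win when pandemic = 1.
pct_change_in_odds <- (odds_ratio - 1) * 100

round(odds_ratio, 3)
round(pct_change_in_odds, 1)
```

Under this estimate the odds of a home win are roughly 23% lower in games played without fans, although the full model indicates that the effect does not survive the inclusion of the other controls.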
A random forest model can improve predictive accuracy, but this comes at the cost of interpretability. The model is effectively a “black box”: it is difficult to explain specific effects or to determine why a particular outcome was predicted.
##
## Call:
## randomForest(formula = home_win ~ season + weekday + gametime + away_team + home_team + overtime + home_rest + away_rest + div_game + roof + surface + pandemic + spread_line, data = train_rf, importance = TRUE, mtry = 1, ntree = 500)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 1
##
## OOB estimate of error rate: 38.09%
## Confusion matrix:
## 0 1 class.error
## 0 420 1009 0.7060882
## 1 219 1576 0.1220056
##
## Call:
## randomForest(formula = home_win ~ season + weekday + gametime + away_team + home_team + overtime + home_rest + away_rest + div_game + roof + surface + pandemic + spread_line, data = train_rf, importance = TRUE, mtry = 2, ntree = 500)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 2
##
## OOB estimate of error rate: 36.85%
## Confusion matrix:
## 0 1 class.error
## 0 769 660 0.4618614
## 1 528 1267 0.2941504
##
## Call:
## randomForest(formula = home_win ~ season + weekday + gametime + away_team + home_team + overtime + home_rest + away_rest + div_game + roof + surface + pandemic + spread_line, data = train_rf, importance = TRUE, mtry = 4, ntree = 500)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 4
##
## OOB estimate of error rate: 37.44%
## Confusion matrix:
## 0 1 class.error
## 0 765 664 0.4646606
## 1 543 1252 0.3025070
##
## Call:
## randomForest(formula = home_win ~ season + weekday + gametime + away_team + home_team + overtime + home_rest + away_rest + div_game + roof + surface + pandemic + spread_line, data = train_rf, importance = TRUE, mtry = 7, ntree = 500)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 7
##
## OOB estimate of error rate: 38.68%
## Confusion matrix:
## 0 1 class.error
## 0 750 679 0.4751575
## 1 568 1227 0.3164345
##
## Call:
## randomForest(formula = home_win ~ season + weekday + gametime + away_team + home_team + overtime + home_rest + away_rest + div_game + roof + surface + pandemic + spread_line, data = train_rf, importance = TRUE, mtry = 13, ntree = 500)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 13
##
## OOB estimate of error rate: 38.4%
## Confusion matrix:
## 0 1 class.error
## 0 762 667 0.4667600
## 1 571 1224 0.3181058
The best mtry (the number of predictors sampled for splitting at each node) was mtry = 2, with an OOB error estimate of 36.85%.
##
## Call:
## randomForest(formula = home_win ~ season + weekday + gametime + away_team + home_team + overtime + home_rest + away_rest + div_game + roof + surface + pandemic + spread_line, data = train_rf, importance = TRUE, mtry = 2, ntree = 500)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 2
##
## OOB estimate of error rate: 36.85%
## Confusion matrix:
## 0 1 class.error
## 0 769 660 0.4618614
## 1 528 1267 0.2941504
## OOB
## 0.3684864
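The mtry comparison above can be sketched as a simple loop. Because `train_rf` is not reproduced here, this example runs on simulated data: the data frame `sim`, its coefficients, and the reduced grid of mtry values are assumptions made for illustration, with column names borrowed from the model formula.

```r
library(randomForest)

set.seed(42)  # forests are stochastic, so fix the seed for reproducibility

# Simulated stand-in for train_rf (hypothetical data, not the NFL set).
n <- 500
sim <- data.frame(
  spread_line = rnorm(n, mean = 2, sd = 6),
  home_rest   = sample(4:14, n, replace = TRUE),
  away_rest   = sample(4:14, n, replace = TRUE),
  div_game    = rbinom(n, 1, 0.35)
)
# The outcome depends mostly on spread_line, mirroring its role as a control.
sim$home_win <- factor(rbinom(n, 1, plogis(0.15 * sim$spread_line)))

# Fit one forest per candidate mtry and record the final OOB error.
mtry_grid <- c(1, 2, 4)
oob <- sapply(mtry_grid, function(m) {
  fit <- randomForest(home_win ~ ., data = sim, mtry = m, ntree = 200)
  unname(fit$err.rate[fit$ntree, "OOB"])
})

results <- data.frame(mtry = mtry_grid, OOB = oob)
best_mtry <- results$mtry[which.min(results$OOB)]
results
best_mtry
```

Since OOB error varies from run to run, differences of a percentage point or less between mtry values should be read with caution.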
Plotting the model helps us visualize the OOB error rate (black line) as more trees are added to the forest. The error rate stabilizes at around 75 trees and decreases slowly thereafter.
## OOB 0 1
## [1,] 0.4332192 0.4603175 0.4126506
## [2,] 0.4201245 0.4486423 0.3977798
## [3,] 0.4261481 0.4507966 0.4066667
## [4,] 0.4258919 0.4384359 0.4159525
## [5,] 0.4295071 0.4311146 0.4282163
## [6,] 0.4316595 0.4527178 0.4146635
## [7,] 0.4292667 0.4556041 0.4080796
## [8,] 0.4279337 0.4522111 0.4083045
## [9,] 0.4168514 0.4422395 0.3963345
## [10,] 0.4187362 0.4488356 0.3945578
## [11,] 0.4131793 0.4673226 0.3698707
## [12,] 0.3989411 0.4491941 0.3587444
## [13,] 0.4156814 0.4737579 0.3691877
## [14,] 0.4106310 0.4786564 0.3562640
## [15,] 0.4116185 0.4737579 0.3620112
## [16,] 0.4074534 0.4814556 0.3484087
## [17,] 0.4023595 0.4793562 0.3409598
## [18,] 0.4011790 0.4891533 0.3311037
## [19,] 0.4016749 0.4849545 0.3353760
## [20,] 0.3982630 0.4821554 0.3314763
## [21,] 0.4022953 0.4856543 0.3359331
## [22,] 0.3967122 0.4772568 0.3325905
## [23,] 0.3967122 0.4828551 0.3281337
## [24,] 0.3967122 0.4870539 0.3247911
## [25,] 0.3988834 0.4863541 0.3292479
## [26,] 0.3951613 0.4835549 0.3247911
## [27,] 0.3939206 0.4849545 0.3214485
## [28,] 0.3933002 0.4821554 0.3225627
## [29,] 0.3892680 0.4786564 0.3181058
## [30,] 0.3830645 0.4702589 0.3136490
## [31,] 0.3874069 0.4674598 0.3236769
## [32,] 0.3864764 0.4653604 0.3236769
## [33,] 0.3836849 0.4639608 0.3197772
## [34,] 0.3833747 0.4611617 0.3214485
## [35,] 0.3793424 0.4506648 0.3225627
## [36,] 0.3864764 0.4590623 0.3286908
## [37,] 0.3843052 0.4569629 0.3264624
## [38,] 0.3864764 0.4632610 0.3253482
## [39,] 0.3830645 0.4520644 0.3281337
## [40,] 0.3821340 0.4499650 0.3281337
## [41,] 0.3861663 0.4583625 0.3286908
## [42,] 0.3864764 0.4555633 0.3314763
## [43,] 0.3861663 0.4520644 0.3337047
## [44,] 0.3855459 0.4618614 0.3247911
## [45,] 0.3812035 0.4569629 0.3208914
## [46,] 0.3839950 0.4534640 0.3286908
## [47,] 0.3815136 0.4513646 0.3259053
## [48,] 0.3839950 0.4513646 0.3303621
## [49,] 0.3843052 0.4576627 0.3259053
## [50,] 0.3849256 0.4597621 0.3253482
## [51,] 0.3833747 0.4611617 0.3214485
## [52,] 0.3830645 0.4597621 0.3220056
## [53,] 0.3870968 0.4639608 0.3259053
## [54,] 0.3830645 0.4625612 0.3197772
## [55,] 0.3821340 0.4569629 0.3225627
## [56,] 0.3861663 0.4632610 0.3247911
## [57,] 0.3852357 0.4590623 0.3264624
## [58,] 0.3846154 0.4583625 0.3259053
## [59,] 0.3815136 0.4555633 0.3225627
## [60,] 0.3808933 0.4541638 0.3225627
## [61,] 0.3793424 0.4534640 0.3203343
## [62,] 0.3802730 0.4548635 0.3208914
## [63,] 0.3849256 0.4674598 0.3192201
## [64,] 0.3833747 0.4639608 0.3192201
## [65,] 0.3833747 0.4646606 0.3186630
## [66,] 0.3836849 0.4653604 0.3186630
## [67,] 0.3824442 0.4681596 0.3142061
## [68,] 0.3821340 0.4667600 0.3147632
## [69,] 0.3821340 0.4632610 0.3175487
## [70,] 0.3815136 0.4625612 0.3169916
## [71,] 0.3808933 0.4646606 0.3142061
## [72,] 0.3790323 0.4576627 0.3164345
## [73,] 0.3793424 0.4583625 0.3164345
## [74,] 0.3777916 0.4604619 0.3119777
## [75,] 0.3777916 0.4597621 0.3125348
## [76,] 0.3793424 0.4611617 0.3142061
## [77,] 0.3787221 0.4632610 0.3114206
## [78,] 0.3799628 0.4653604 0.3119777
## [79,] 0.3812035 0.4688593 0.3114206
## [80,] 0.3812035 0.4681596 0.3119777
## [81,] 0.3774814 0.4639608 0.3086351
## [82,] 0.3784119 0.4632610 0.3108635
## [83,] 0.3765509 0.4639608 0.3069638
## [84,] 0.3756203 0.4590623 0.3091922
## [85,] 0.3734491 0.4569629 0.3069638
## [86,] 0.3753102 0.4604619 0.3075209
## [87,] 0.3740695 0.4562631 0.3086351
## [88,] 0.3784119 0.4625612 0.3114206
## [89,] 0.3753102 0.4583625 0.3091922
## [90,] 0.3774814 0.4618614 0.3103064
## [91,] 0.3787221 0.4618614 0.3125348
## [92,] 0.3768610 0.4618614 0.3091922
## [93,] 0.3750000 0.4590623 0.3080780
## [94,] 0.3746898 0.4611617 0.3058496
## [95,] 0.3781017 0.4632610 0.3103064
## [96,] 0.3765509 0.4618614 0.3086351
## [97,] 0.3777916 0.4604619 0.3119777
## [98,] 0.3740695 0.4590623 0.3064067
## [99,] 0.3753102 0.4597621 0.3080780
## [100,] 0.3784119 0.4646606 0.3097493
## [101,] 0.3743797 0.4611617 0.3052925
## [102,] 0.3759305 0.4667600 0.3036212
## [103,] 0.3790323 0.4674598 0.3086351
## [104,] 0.3759305 0.4625612 0.3069638
## [105,] 0.3771712 0.4646606 0.3075209
## [106,] 0.3787221 0.4667600 0.3086351
## [107,] 0.3771712 0.4702589 0.3030641
## [108,] 0.3768610 0.4674598 0.3047354
## [109,] 0.3784119 0.4674598 0.3075209
## [110,] 0.3765509 0.4674598 0.3041783
## [111,] 0.3750000 0.4618614 0.3058496
## [112,] 0.3781017 0.4674598 0.3069638
## [113,] 0.3768610 0.4653604 0.3064067
## [114,] 0.3787221 0.4639608 0.3108635
## [115,] 0.3746898 0.4597621 0.3069638
## [116,] 0.3743797 0.4590623 0.3069638
## [117,] 0.3728288 0.4604619 0.3030641
## [118,] 0.3737593 0.4618614 0.3036212
## [119,] 0.3753102 0.4632610 0.3052925
## [120,] 0.3731390 0.4590623 0.3047354
## [121,] 0.3759305 0.4618614 0.3075209
## [122,] 0.3718983 0.4611617 0.3008357
## [123,] 0.3734491 0.4625612 0.3025070
## [124,] 0.3718983 0.4583625 0.3030641
## [125,] 0.3687965 0.4534640 0.3013928
## [126,] 0.3697270 0.4534640 0.3030641
## [127,] 0.3684864 0.4527642 0.3013928
## [128,] 0.3687965 0.4569629 0.2986072
## [129,] 0.3675558 0.4576627 0.2958217
## [130,] 0.3678660 0.4562631 0.2974930
## [131,] 0.3681762 0.4520644 0.3013928
## [132,] 0.3691067 0.4520644 0.3030641
## [133,] 0.3712779 0.4541638 0.3052925
## [134,] 0.3706576 0.4520644 0.3058496
## [135,] 0.3725186 0.4527642 0.3086351
## [136,] 0.3722084 0.4541638 0.3069638
## [137,] 0.3718983 0.4513646 0.3086351
## [138,] 0.3731390 0.4562631 0.3069638
## [139,] 0.3740695 0.4569629 0.3080780
## [140,] 0.3722084 0.4555633 0.3058496
## [141,] 0.3734491 0.4548635 0.3086351
## [142,] 0.3728288 0.4534640 0.3086351
## [143,] 0.3734491 0.4527642 0.3103064
## [144,] 0.3728288 0.4527642 0.3091922
## [145,] 0.3715881 0.4541638 0.3058496
## [146,] 0.3709677 0.4548635 0.3041783
## [147,] 0.3725186 0.4555633 0.3064067
## [148,] 0.3700372 0.4534640 0.3036212
## [149,] 0.3703474 0.4520644 0.3052925
## [150,] 0.3709677 0.4513646 0.3069638
## [151,] 0.3731390 0.4548635 0.3080780
## [152,] 0.3743797 0.4569629 0.3086351
## [153,] 0.3731390 0.4520644 0.3103064
## [154,] 0.3750000 0.4569629 0.3097493
## [155,] 0.3740695 0.4583625 0.3069638
## [156,] 0.3722084 0.4576627 0.3041783
## [157,] 0.3750000 0.4597621 0.3075209
## [158,] 0.3728288 0.4576627 0.3052925
## [159,] 0.3734491 0.4604619 0.3041783
## [160,] 0.3718983 0.4562631 0.3047354
## [161,] 0.3737593 0.4597621 0.3052925
## [162,] 0.3734491 0.4583625 0.3058496
## [163,] 0.3750000 0.4611617 0.3064067
## [164,] 0.3743797 0.4604619 0.3058496
## [165,] 0.3746898 0.4618614 0.3052925
## [166,] 0.3750000 0.4639608 0.3041783
## [167,] 0.3756203 0.4639608 0.3052925
## [168,] 0.3740695 0.4639608 0.3025070
## [169,] 0.3750000 0.4625612 0.3052925
## [170,] 0.3759305 0.4639608 0.3058496
## [171,] 0.3743797 0.4611617 0.3052925
## [172,] 0.3731390 0.4632610 0.3013928
## [173,] 0.3753102 0.4639608 0.3047354
## [174,] 0.3746898 0.4646606 0.3030641
## [175,] 0.3722084 0.4583625 0.3036212
## [176,] 0.3762407 0.4653604 0.3052925
## [177,] 0.3771712 0.4660602 0.3064067
## [178,] 0.3790323 0.4667600 0.3091922
## [179,] 0.3802730 0.4716585 0.3075209
## [180,] 0.3805831 0.4737579 0.3064067
## [181,] 0.3771712 0.4702589 0.3030641
## [182,] 0.3765509 0.4695591 0.3025070
## [183,] 0.3765509 0.4702589 0.3019499
## [184,] 0.3762407 0.4688593 0.3025070
## [185,] 0.3753102 0.4639608 0.3047354
## [186,] 0.3746898 0.4653604 0.3025070
## [187,] 0.3762407 0.4681596 0.3030641
## [188,] 0.3759305 0.4667600 0.3036212
## [189,] 0.3756203 0.4688593 0.3013928
## [190,] 0.3756203 0.4681596 0.3019499
## [191,] 0.3756203 0.4660602 0.3036212
## [192,] 0.3762407 0.4709587 0.3008357
## [193,] 0.3746898 0.4681596 0.3002786
## [194,] 0.3725186 0.4625612 0.3008357
## [195,] 0.3731390 0.4653604 0.2997214
## [196,] 0.3737593 0.4653604 0.3008357
## [197,] 0.3762407 0.4716585 0.3002786
## [198,] 0.3734491 0.4632610 0.3019499
## [199,] 0.3750000 0.4667600 0.3019499
## [200,] 0.3737593 0.4681596 0.2986072
## [201,] 0.3737593 0.4667600 0.2997214
## [202,] 0.3725186 0.4639608 0.2997214
## [203,] 0.3731390 0.4618614 0.3025070
## [204,] 0.3746898 0.4639608 0.3036212
## [205,] 0.3762407 0.4667600 0.3041783
## [206,] 0.3762407 0.4660602 0.3047354
## [207,] 0.3759305 0.4674598 0.3030641
## [208,] 0.3756203 0.4688593 0.3013928
## [209,] 0.3750000 0.4702589 0.2991643
## [210,] 0.3746898 0.4674598 0.3008357
## [211,] 0.3753102 0.4702589 0.2997214
## [212,] 0.3765509 0.4716585 0.3008357
## [213,] 0.3743797 0.4681596 0.2997214
## [214,] 0.3750000 0.4681596 0.3008357
## [215,] 0.3731390 0.4660602 0.2991643
## [216,] 0.3759305 0.4702589 0.3008357
## [217,] 0.3746898 0.4674598 0.3008357
## [218,] 0.3753102 0.4660602 0.3030641
## [219,] 0.3762407 0.4688593 0.3025070
## [220,] 0.3759305 0.4688593 0.3019499
## [221,] 0.3759305 0.4674598 0.3030641
## [222,] 0.3777916 0.4702589 0.3041783
## [223,] 0.3771712 0.4695591 0.3036212
## [224,] 0.3750000 0.4667600 0.3019499
## [225,] 0.3759305 0.4695591 0.3013928
## [226,] 0.3756203 0.4695591 0.3008357
## [227,] 0.3790323 0.4730581 0.3041783
## [228,] 0.3774814 0.4716585 0.3025070
## [229,] 0.3777916 0.4737579 0.3013928
## [230,] 0.3768610 0.4723583 0.3008357
## [231,] 0.3756203 0.4702589 0.3002786
## [232,] 0.3750000 0.4681596 0.3008357
## [233,] 0.3762407 0.4695591 0.3019499
## [234,] 0.3753102 0.4702589 0.2997214
## [235,] 0.3734491 0.4667600 0.2991643
## [236,] 0.3746898 0.4716585 0.2974930
## [237,] 0.3731390 0.4681596 0.2974930
## [238,] 0.3743797 0.4702589 0.2980501
## [239,] 0.3728288 0.4702589 0.2952646
## [240,] 0.3743797 0.4695591 0.2986072
## [241,] 0.3746898 0.4695591 0.2991643
## [242,] 0.3750000 0.4702589 0.2991643
## [243,] 0.3740695 0.4695591 0.2980501
## [244,] 0.3743797 0.4695591 0.2986072
## [245,] 0.3740695 0.4674598 0.2997214
## [246,] 0.3737593 0.4667600 0.2997214
## [247,] 0.3743797 0.4667600 0.3008357
## [248,] 0.3750000 0.4674598 0.3013928
## [249,] 0.3750000 0.4688593 0.3002786
## [250,] 0.3743797 0.4674598 0.3002786
## [251,] 0.3718983 0.4653604 0.2974930
## [252,] 0.3715881 0.4660602 0.2963788
## [253,] 0.3731390 0.4667600 0.2986072
## [254,] 0.3725186 0.4674598 0.2969359
## [255,] 0.3743797 0.4695591 0.2986072
## [256,] 0.3718983 0.4660602 0.2969359
## [257,] 0.3753102 0.4688593 0.3008357
## [258,] 0.3737593 0.4674598 0.2991643
## [259,] 0.3737593 0.4667600 0.2997214
## [260,] 0.3756203 0.4695591 0.3008357
## [261,] 0.3731390 0.4674598 0.2980501
## [262,] 0.3734491 0.4667600 0.2991643
## [263,] 0.3734491 0.4639608 0.3013928
## [264,] 0.3743797 0.4646606 0.3025070
## [265,] 0.3740695 0.4660602 0.3008357
## [266,] 0.3728288 0.4625612 0.3013928
## [267,] 0.3756203 0.4660602 0.3036212
## [268,] 0.3765509 0.4667600 0.3047354
## [269,] 0.3771712 0.4674598 0.3052925
## [270,] 0.3762407 0.4667600 0.3041783
## [271,] 0.3750000 0.4660602 0.3025070
## [272,] 0.3737593 0.4646606 0.3013928
## [273,] 0.3734491 0.4646606 0.3008357
## [274,] 0.3728288 0.4632610 0.3008357
## [275,] 0.3737593 0.4653604 0.3008357
## [276,] 0.3740695 0.4646606 0.3019499
## [277,] 0.3743797 0.4646606 0.3025070
## [278,] 0.3750000 0.4674598 0.3013928
## [279,] 0.3746898 0.4653604 0.3025070
## [280,] 0.3737593 0.4667600 0.2997214
## [281,] 0.3731390 0.4646606 0.3002786
## [282,] 0.3734491 0.4660602 0.2997214
## [283,] 0.3722084 0.4646606 0.2986072
## [284,] 0.3715881 0.4653604 0.2969359
## [285,] 0.3709677 0.4653604 0.2958217
## [286,] 0.3728288 0.4674598 0.2974930
## [287,] 0.3728288 0.4674598 0.2974930
## [288,] 0.3731390 0.4667600 0.2986072
## [289,] 0.3715881 0.4639608 0.2980501
## [290,] 0.3718983 0.4653604 0.2974930
## [291,] 0.3743797 0.4688593 0.2991643
## [292,] 0.3715881 0.4639608 0.2980501
## [293,] 0.3722084 0.4646606 0.2986072
## [294,] 0.3706576 0.4646606 0.2958217
## [295,] 0.3700372 0.4611617 0.2974930
## [296,] 0.3706576 0.4604619 0.2991643
## [297,] 0.3697270 0.4576627 0.2997214
## [298,] 0.3703474 0.4583625 0.3002786
## [299,] 0.3722084 0.4590623 0.3030641
## [300,] 0.3712779 0.4597621 0.3008357
## [301,] 0.3715881 0.4611617 0.3002786
## [302,] 0.3734491 0.4632610 0.3019499
## [303,] 0.3728288 0.4611617 0.3025070
## [304,] 0.3706576 0.4597621 0.2997214
## [305,] 0.3703474 0.4569629 0.3013928
## [306,] 0.3709677 0.4583625 0.3013928
## [307,] 0.3694169 0.4548635 0.3013928
## [308,] 0.3700372 0.4576627 0.3002786
## [309,] 0.3697270 0.4569629 0.3002786
## [310,] 0.3700372 0.4569629 0.3008357
## [311,] 0.3709677 0.4576627 0.3019499
## [312,] 0.3718983 0.4583625 0.3030641
## [313,] 0.3700372 0.4569629 0.3008357
## [314,] 0.3706576 0.4576627 0.3013928
## [315,] 0.3706576 0.4576627 0.3013928
## [316,] 0.3691067 0.4548635 0.3008357
## [317,] 0.3681762 0.4520644 0.3013928
## [318,] 0.3712779 0.4583625 0.3019499
## [319,] 0.3703474 0.4562631 0.3019499
## [320,] 0.3706576 0.4562631 0.3025070
## [321,] 0.3703474 0.4555633 0.3025070
## [322,] 0.3697270 0.4555633 0.3013928
## [323,] 0.3703474 0.4562631 0.3019499
## [324,] 0.3703474 0.4562631 0.3019499
## [325,] 0.3715881 0.4583625 0.3025070
## [326,] 0.3722084 0.4611617 0.3013928
## [327,] 0.3722084 0.4590623 0.3030641
## [328,] 0.3722084 0.4597621 0.3025070
## [329,] 0.3734491 0.4604619 0.3041783
## [330,] 0.3734491 0.4611617 0.3036212
## [331,] 0.3728288 0.4604619 0.3030641
## [332,] 0.3722084 0.4583625 0.3036212
## [333,] 0.3725186 0.4590623 0.3036212
## [334,] 0.3709677 0.4576627 0.3019499
## [335,] 0.3722084 0.4590623 0.3030641
## [336,] 0.3718983 0.4590623 0.3025070
## [337,] 0.3728288 0.4604619 0.3030641
## [338,] 0.3728288 0.4604619 0.3030641
## [339,] 0.3718983 0.4583625 0.3030641
## [340,] 0.3715881 0.4597621 0.3013928
## [341,] 0.3715881 0.4604619 0.3008357
## [342,] 0.3718983 0.4611617 0.3008357
## [343,] 0.3703474 0.4597621 0.2991643
## [344,] 0.3709677 0.4590623 0.3008357
## [345,] 0.3709677 0.4583625 0.3013928
## [346,] 0.3709677 0.4597621 0.3002786
## [347,] 0.3715881 0.4611617 0.3002786
## [348,] 0.3700372 0.4583625 0.2997214
## [349,] 0.3709677 0.4597621 0.3002786
## [350,] 0.3703474 0.4604619 0.2986072
## [351,] 0.3681762 0.4576627 0.2969359
## [352,] 0.3694169 0.4590623 0.2980501
## [353,] 0.3706576 0.4604619 0.2991643
## [354,] 0.3697270 0.4611617 0.2969359
## [355,] 0.3694169 0.4597621 0.2974930
## [356,] 0.3681762 0.4583625 0.2963788
## [357,] 0.3694169 0.4597621 0.2974930
## [358,] 0.3687965 0.4569629 0.2986072
## [359,] 0.3694169 0.4604619 0.2969359
## [360,] 0.3681762 0.4569629 0.2974930
## [361,] 0.3681762 0.4576627 0.2969359
## [362,] 0.3678660 0.4583625 0.2958217
## [363,] 0.3672457 0.4576627 0.2952646
## [364,] 0.3678660 0.4569629 0.2969359
## [365,] 0.3700372 0.4590623 0.2991643
## [366,] 0.3691067 0.4583625 0.2980501
## [367,] 0.3672457 0.4569629 0.2958217
## [368,] 0.3678660 0.4583625 0.2958217
## [369,] 0.3681762 0.4597621 0.2952646
## [370,] 0.3687965 0.4590623 0.2969359
## [371,] 0.3694169 0.4590623 0.2980501
## [372,] 0.3681762 0.4597621 0.2952646
## [373,] 0.3687965 0.4611617 0.2952646
## [374,] 0.3691067 0.4618614 0.2952646
## [375,] 0.3687965 0.4597621 0.2963788
## [376,] 0.3684864 0.4604619 0.2952646
## [377,] 0.3672457 0.4597621 0.2935933
## [378,] 0.3681762 0.4597621 0.2952646
## [379,] 0.3669355 0.4590623 0.2935933
## [380,] 0.3672457 0.4576627 0.2952646
## [381,] 0.3672457 0.4611617 0.2924791
## [382,] 0.3666253 0.4576627 0.2941504
## [383,] 0.3678660 0.4590623 0.2952646
## [384,] 0.3669355 0.4583625 0.2941504
## [385,] 0.3681762 0.4604619 0.2947075
## [386,] 0.3672457 0.4583625 0.2947075
## [387,] 0.3675558 0.4604619 0.2935933
## [388,] 0.3672457 0.4590623 0.2941504
## [389,] 0.3681762 0.4611617 0.2941504
## [390,] 0.3666253 0.4590623 0.2930362
## [391,] 0.3672457 0.4583625 0.2947075
## [392,] 0.3675558 0.4597621 0.2941504
## [393,] 0.3675558 0.4590623 0.2947075
## [394,] 0.3697270 0.4639608 0.2947075
## [395,] 0.3694169 0.4632610 0.2947075
## [396,] 0.3675558 0.4611617 0.2930362
## [397,] 0.3694169 0.4625612 0.2952646
## [398,] 0.3684864 0.4611617 0.2947075
## [399,] 0.3694169 0.4618614 0.2958217
## [400,] 0.3684864 0.4632610 0.2930362
## [401,] 0.3684864 0.4618614 0.2941504
## [402,] 0.3697270 0.4618614 0.2963788
## [403,] 0.3694169 0.4611617 0.2963788
## [404,] 0.3694169 0.4618614 0.2958217
## [405,] 0.3697270 0.4639608 0.2947075
## [406,] 0.3700372 0.4632610 0.2958217
## [407,] 0.3694169 0.4632610 0.2947075
## [408,] 0.3697270 0.4639608 0.2947075
## [409,] 0.3697270 0.4639608 0.2947075
## [410,] 0.3700372 0.4646606 0.2947075
## [411,] 0.3703474 0.4646606 0.2952646
## [412,] 0.3691067 0.4639608 0.2935933
## [413,] 0.3703474 0.4646606 0.2952646
## [414,] 0.3700372 0.4632610 0.2958217
## [415,] 0.3697270 0.4625612 0.2958217
## [416,] 0.3697270 0.4625612 0.2958217
## [417,] 0.3681762 0.4611617 0.2941504
## [418,] 0.3684864 0.4625612 0.2935933
## [419,] 0.3687965 0.4646606 0.2924791
## [420,] 0.3694169 0.4639608 0.2941504
## [421,] 0.3703474 0.4660602 0.2941504
## [422,] 0.3691067 0.4632610 0.2941504
## [423,] 0.3697270 0.4646606 0.2941504
## [424,] 0.3697270 0.4653604 0.2935933
## [425,] 0.3694169 0.4660602 0.2924791
## [426,] 0.3691067 0.4653604 0.2924791
## [427,] 0.3697270 0.4674598 0.2919220
## [428,] 0.3687965 0.4646606 0.2924791
## [429,] 0.3697270 0.4632610 0.2952646
## [430,] 0.3684864 0.4618614 0.2941504
## [431,] 0.3684864 0.4632610 0.2930362
## [432,] 0.3678660 0.4611617 0.2935933
## [433,] 0.3691067 0.4632610 0.2941504
## [434,] 0.3691067 0.4639608 0.2935933
## [435,] 0.3694169 0.4653604 0.2930362
## [436,] 0.3678660 0.4625612 0.2924791
## [437,] 0.3684864 0.4618614 0.2941504
## [438,] 0.3675558 0.4604619 0.2935933
## [439,] 0.3675558 0.4611617 0.2930362
## [440,] 0.3678660 0.4604619 0.2941504
## [441,] 0.3678660 0.4604619 0.2941504
## [442,] 0.3669355 0.4576627 0.2947075
## [443,] 0.3691067 0.4611617 0.2958217
## [444,] 0.3684864 0.4611617 0.2947075
## [445,] 0.3675558 0.4618614 0.2924791
## [446,] 0.3675558 0.4625612 0.2919220
## [447,] 0.3672457 0.4611617 0.2924791
## [448,] 0.3684864 0.4618614 0.2941504
## [449,] 0.3687965 0.4625612 0.2941504
## [450,] 0.3678660 0.4611617 0.2935933
## [451,] 0.3691067 0.4632610 0.2941504
## [452,] 0.3672457 0.4597621 0.2935933
## [453,] 0.3691067 0.4625612 0.2947075
## [454,] 0.3684864 0.4625612 0.2935933
## [455,] 0.3687965 0.4632610 0.2935933
## [456,] 0.3687965 0.4632610 0.2935933
## [457,] 0.3687965 0.4632610 0.2935933
## [458,] 0.3675558 0.4632610 0.2913649
## [459,] 0.3663151 0.4611617 0.2908078
## [460,] 0.3666253 0.4625612 0.2902507
## [461,] 0.3684864 0.4632610 0.2930362
## [462,] 0.3700372 0.4639608 0.2952646
## [463,] 0.3691067 0.4639608 0.2935933
## [464,] 0.3681762 0.4632610 0.2924791
## [465,] 0.3675558 0.4625612 0.2919220
## [466,] 0.3666253 0.4625612 0.2902507
## [467,] 0.3666253 0.4625612 0.2902507
## [468,] 0.3672457 0.4646606 0.2896936
## [469,] 0.3669355 0.4632610 0.2902507
## [470,] 0.3672457 0.4646606 0.2896936
## [471,] 0.3669355 0.4632610 0.2902507
## [472,] 0.3675558 0.4653604 0.2896936
## [473,] 0.3678660 0.4646606 0.2908078
## [474,] 0.3684864 0.4639608 0.2924791
## [475,] 0.3675558 0.4632610 0.2913649
## [476,] 0.3687965 0.4639608 0.2930362
## [477,] 0.3678660 0.4625612 0.2924791
## [478,] 0.3675558 0.4625612 0.2919220
## [479,] 0.3678660 0.4632610 0.2919220
## [480,] 0.3672457 0.4632610 0.2908078
## [481,] 0.3672457 0.4632610 0.2908078
## [482,] 0.3675558 0.4632610 0.2913649
## [483,] 0.3678660 0.4625612 0.2924791
## [484,] 0.3669355 0.4618614 0.2913649
## [485,] 0.3663151 0.4611617 0.2908078
## [486,] 0.3672457 0.4611617 0.2924791
## [487,] 0.3666253 0.4618614 0.2908078
## [488,] 0.3672457 0.4632610 0.2908078
## [489,] 0.3660050 0.4625612 0.2891365
## [490,] 0.3672457 0.4639608 0.2902507
## [491,] 0.3678660 0.4639608 0.2913649
## [492,] 0.3684864 0.4646606 0.2919220
## [493,] 0.3681762 0.4646606 0.2913649
## [494,] 0.3687965 0.4653604 0.2919220
## [495,] 0.3678660 0.4632610 0.2919220
## [496,] 0.3681762 0.4632610 0.2924791
## [497,] 0.3684864 0.4625612 0.2935933
## [498,] 0.3672457 0.4618614 0.2919220
## [499,] 0.3678660 0.4618614 0.2930362
## [500,] 0.3684864 0.4618614 0.2941504
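The trend in the matrix above can be summarized from a handful of rows. The sketch below copies the OOB column at selected tree counts from the printed output; with a fitted model, the same column would come from `model$err.rate[, "OOB"]`. The choice of checkpoints is an assumption made for illustration.

```r
# OOB error at selected tree counts, copied from the printed matrix.
trees <- c(1, 25, 50, 75, 100, 200, 300, 400, 500)
oob   <- c(0.4332192, 0.3988834, 0.3849256, 0.3777916, 0.3784119,
           0.3737593, 0.3712779, 0.3684864, 0.3684864)

# Improvement over the first 75 trees versus the remaining 425.
early_gain <- oob[trees == 1]  - oob[trees == 75]
late_gain  <- oob[trees == 75] - oob[trees == 500]

round(c(early_gain = early_gain, late_gain = late_gain), 4)

# Simple line plot of the OOB error at the chosen checkpoints.
plot(trees, oob, type = "b",
     xlab = "Number of trees", ylab = "OOB error rate")
```

Most of the improvement occurs within the first 75 trees; the remaining trees lower the OOB error by less than one percentage point.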
Next, we examine feature importance.
## 0 1 MeanDecreaseAccuracy MeanDecreaseGini
## season -0.824065338 3.5375269 2.2726333 115.89691
## weekday -0.007968778 1.0188478 0.9178493 24.09883
## gametime 0.943995532 2.6589447 2.7300114 77.13024
## away_team 1.726107054 1.3062148 2.1429214 161.65806
## home_team 2.454405682 4.0575166 4.9836879 137.16545
## overtime 4.404647769 1.0555048 3.7921414 17.71329
## home_rest 2.515946880 -1.5492105 0.4823032 61.74505
## away_rest 1.412803787 -0.5757318 0.5629243 61.59363
## div_game -0.904686559 -1.0153317 -1.3960091 28.38718
## roof 4.457274861 4.3393853 6.9384926 38.88757
## surface 2.103333952 4.5652054 5.3052787 54.60199
## pandemic -0.742003531 1.7597976 0.8472992 11.49511
## spread_line 59.237340102 49.2501658 69.3726657 320.05080
Plotting feature importance against the mean decrease in accuracy shows that spread_line is by far the most important variable in determining whether the home team wins. Recall that spread_line is our control variable for the relative strength of the two teams.
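A minimal sketch of how such an importance table and plot can be produced with the randomForest package follows. It runs on simulated data: `sim`, `noise_a`, and `noise_b` are hypothetical names, and only `spread_line` is constructed to carry signal.

```r
library(randomForest)

set.seed(7)

# Simulated stand-in for train_rf: one informative predictor, two noise columns.
n <- 500
sim <- data.frame(
  spread_line = rnorm(n, mean = 2, sd = 6),
  noise_a     = rnorm(n),
  noise_b     = rnorm(n)
)
sim$home_win <- factor(rbinom(n, 1, plogis(0.3 * sim$spread_line)))

# importance = TRUE is required for the permutation-based accuracy measure.
fit <- randomForest(home_win ~ ., data = sim, importance = TRUE, ntree = 300)

# Matrix with per-class columns plus MeanDecreaseAccuracy and MeanDecreaseGini.
imp <- importance(fit)
imp_sorted <- imp[order(imp[, "MeanDecreaseAccuracy"], decreasing = TRUE), ]
imp_sorted

# Dot charts of both importance measures.
varImpPlot(fit)

top_feature <- rownames(imp_sorted)[1]
top_feature
```

MeanDecreaseGini tends to favor predictors with many levels or values, which may be one reason away_team and home_team rank higher on that measure than on MeanDecreaseAccuracy in the table above.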
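The OOB estimate of the error rate is 36.85%, which is only modestly better than always predicting a home win (a rule that would be wrong about 44.3% of the time on this training set), so the model's predictive power is limited.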
##
## Call:
## randomForest(formula = home_win ~ season + weekday + gametime + away_team + home_team + overtime + home_rest + away_rest + div_game + roof + surface + pandemic + spread_line, data = train_rf, importance = TRUE, mtry = 2, ntree = 500)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 2
##
## OOB estimate of error rate: 36.85%
## Confusion matrix:
## 0 1 class.error
## 0 769 660 0.4618614
## 1 528 1267 0.2941504
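To judge how much the forest improves on a naive rule, the OOB error can be compared with the error of always predicting the majority class. The sketch below rebuilds the confusion matrix from the printed output; the object names are illustrative.

```r
# Confusion matrix from the mtry = 2 forest (rows = actual, columns = predicted).
conf <- matrix(c(769,  660,
                 528, 1267),
               nrow = 2, byrow = TRUE,
               dimnames = list(actual = c("0", "1"), predicted = c("0", "1")))

n_games <- sum(conf)

# Share of games misclassified out-of-bag.
oob_error <- (conf["0", "1"] + conf["1", "0"]) / n_games

# Error of a rule that always predicts the more common outcome.
home_win_rate  <- sum(conf["1", ]) / n_games
baseline_error <- 1 - max(home_win_rate, 1 - home_win_rate)

round(c(oob_error = oob_error, baseline_error = baseline_error), 4)
```

The forest reduces the error from about 44.3% to 36.85%, and most of the remaining mistakes are home losses predicted as wins.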
## Model R2 Adj_R2 Num_Vars
## 1 Model1 0.1425707 0.1422895 2
## 2 Model2 0.1444697 0.1434869 7
## 3 Model3 0.2037454 0.1929979 80
## Model AIC Deviance Num_Vars
## 1 Logistic Model 7511.460 7349.460 80
## 2 Pandemic Model 8463.115 8459.115 1
## 3 Pandemic Full Model 7513.281 7349.281 81
## Model OOB
## 1 mtry1 0.3808933
## 2 mtry2 0.3684864
## 3 mtry4 0.3743797
## 4 mtry7 0.3867866
## 5 mtry13 0.3839950
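The deviances in the logistic table allow a quick likelihood-ratio check of whether adding ‘pandemic’ improves the full model. This sketch assumes the two models are nested and were fitted to the same observations, which the tables suggest but do not confirm.

```r
# Residual deviances copied from the summary table.
dev_without_pandemic <- 7349.460  # "Logistic Model", 80 variables
dev_with_pandemic    <- 7349.281  # "Pandemic Full Model", 81 variables

# Likelihood-ratio statistic: drop in deviance from adding one variable.
lr_stat <- dev_without_pandemic - dev_with_pandemic
df_diff <- 81 - 80

# Upper-tail chi-squared p-value.
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)

round(c(lr_stat = lr_stat, p_value = p_value), 3)
```

The resulting p-value is close to the Wald p-value of 0.673 reported for ‘pandemic’ in the full model, and the higher AIC of the model that includes it (7513.281 versus 7511.460) points the same way.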
to be added…
2021 NFL Game Data. (n.d.). Retrieved from http://www.habitatring.com/
NFL Football Stadiums - Quest for 31. (n.d.). Retrieved from http://www.nflfootballstadiums.com/
Sharpe, L. (n.d.). nfldata/rosters.csv. nflverse. Retrieved from https://github.com/nflverse/nfldata/blob/master/data/rosters.csv
Cleveland, T. (2021, September 14). Numbers that matter for predicting NFL win totals: Sharp Football. Retrieved from https://www.sharpfootballanalysis.com/betting/numbers-that-matter-for-predicting-nfl-win-totals-part-one/
Jamieson, J. P. (2010). The Home Field Advantage in Athletics: A Meta-Analysis. Journal of Applied Social Psychology, 40(7), 1819-1848. doi:10.1111/j.1559-1816.2010.00641.x
Kilgore, A., & Greenberg, N. (2022, January 15). Analysis | NFL home-field advantage was endangered before the pandemic. Now it’s almost extinct. Retrieved from https://www.washingtonpost.com/sports/2022/01/14/nfl-home-field-advantage-pandemic/
Mccarrick, D., Bilalic, M., Neave, N., & Wolfson, S. (2021). Home advantage during the COVID-19 pandemic: Analyses of European football leagues. Psychology of Sport and Exercise, 56, 102013. doi:10.1016/j.psychsport.2021.102013
Ponzo, M., & Scoppa, V. (2014). Does the Home Advantage Depend on Crowd Support? Evidence from Same-Stadium Derbies. SSRN Electronic Journal. doi:10.2139/ssrn.2426859
Swartz, T. B., & Arce, A. (2014). New Insights Involving the Home Team Advantage. International Journal of Sports Science & Coaching, 9(4), 681-692. doi:10.1260/1747-9541.9.4.681