Nicholas Capofari
May 12, 2017
rvest package.
Welch Two Sample t-test Comparing Home Win % to Road Win %
Welch Two Sample t-test Comparing Road Win % to BB Road Win %
Home Team (Opponent)
After careful investigation, aided by statistical tests and visualizations, the following variables were retained to build the model.
Real Team Age
Pythagorean Win %
Distance Travelled
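Pythagorean Win % estimates a team's winning percentage from the points it scores and allows. The following is a minimal Python sketch of that calculation, not the author's code. The function names are hypothetical, and the exponent is an assumption: values of roughly 14 to 16.5 are commonly cited for the NBA, and the source does not say which one was used.

```python
def pythagorean_win_pct(points_for: float, points_against: float,
                        exponent: float = 14.0) -> float:
    """Estimate win % as PF^k / (PF^k + PA^k).

    The default exponent k = 14.0 is an assumption; the source does not
    state the value used.
    """
    if points_for <= 0 or points_against <= 0:
        raise ValueError("point totals must be positive")
    # Algebraically equal to PF^k / (PF^k + PA^k), but avoids raising
    # large season totals to a high power.
    return 1.0 / (1.0 + (points_against / points_for) ** exponent)


def away_home_pyt_diff(away_pf: float, away_pa: float,
                       home_pf: float, home_pa: float,
                       exponent: float = 14.0) -> float:
    """Away-minus-home difference in Pythagorean Win %."""
    away = pythagorean_win_pct(away_pf, away_pa, exponent)
    home = pythagorean_win_pct(home_pf, home_pa, exponent)
    return away - home
```

A team that scores exactly as many points as it allows gets 0.5, and two identical teams produce a difference of 0.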
Logistic regression is a tool for building models when the response variable is categorical with two levels (here, win or loss). I used the logit transformation so that the predicted values fall between 0 and 1. The closer a prediction is to 1, the more likely the team is to win.
\[
\begin{aligned}
\log\left(\frac{p_{i}}{1-p_{i}}\right) = {} & -0.0877 \\
& -0.3569\times\textrm{OT Game Before} \\
& +0.1778\times\textrm{BB for Opp} \\
& +0.1933\times\textrm{OT Game Before Opp} \\
& +0.0969\times\textrm{Real Team Age} \\
& +0.7339\times\textrm{Last 10 Opp} \\
& -0.0982\times\textrm{Real Team Age Opp} \\
& -0.9838\times\textrm{Last 10 Opp} \\
& +2.3824\times\textrm{Away Home Difference Pyt Win Pct}
\end{aligned}
\]
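The fitted equation gives log-odds, so a win probability comes from applying the inverse logit. The Python sketch below illustrates that step with the intercept and two of the coefficients from the equation. The function names and the `other_terms` parameter are hypothetical; `other_terms` stands for the summed contribution of the remaining predictors and defaults to 0 purely for illustration, which is not a realistic game.

```python
import math

# Coefficients copied from the fitted equation in the text.
INTERCEPT = -0.0877
COEF_OT_GAME_BEFORE = -0.3569
COEF_PYT_DIFF = 2.3824


def inv_logit(log_odds: float) -> float:
    """Map log-odds to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def partial_win_probability(ot_game_before: int, pyt_diff: float,
                            other_terms: float = 0.0) -> float:
    """Win probability using a subset of the model's terms.

    other_terms is the summed contribution of every predictor not listed
    here; leaving it at 0.0 is an illustrative assumption only.
    """
    log_odds = (INTERCEPT
                + COEF_OT_GAME_BEFORE * ot_game_before
                + COEF_PYT_DIFF * pyt_diff
                + other_terms)
    return inv_logit(log_odds)
```

Under these assumptions, a larger away-minus-home Pythagorean difference raises the probability, while an overtime game the night before lowers it, matching the signs of the coefficients.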
The residuals, plotted in the order of their corresponding observations, show no patterns that warrant further investigation.
As the difference between the Pythagorean Win %s (Away-Home) increases, the residuals become much more pronounced.
The model was applied to all NBA games that fit the same criteria for the 2014-2015, 2015-2016, and 2016-2017 seasons (761 games).
\[ \textrm{Probability of Win}=\frac{\textrm{Money to Bet}}{\textrm{Money to Bet + Money to Win}} \]
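The formula above can be sketched in Python as follows. `implied_probability` follows the formula directly. The moneyline helper is an assumption: the source does not say which odds format was used, so it covers the American moneyline convention, where -150 means risking 150 to win 100 and +130 means risking 100 to win 130. Both function names are hypothetical.

```python
from typing import Tuple


def implied_probability(money_to_bet: float, money_to_win: float) -> float:
    """Probability of a win implied by the stake and the potential profit."""
    if money_to_bet <= 0 or money_to_win <= 0:
        raise ValueError("both amounts must be positive")
    return money_to_bet / (money_to_bet + money_to_win)


def moneyline_to_stake_and_profit(moneyline: int) -> Tuple[float, float]:
    """Convert an American moneyline to (money to bet, money to win).

    Assumes American odds; the source does not state the odds format.
    """
    if moneyline < 0:
        return float(-moneyline), 100.0   # favorite: risk |line| to win 100
    if moneyline > 0:
        return 100.0, float(moneyline)    # underdog: risk 100 to win line
    raise ValueError("a moneyline of 0 is not defined")
```

For example, `implied_probability(*moneyline_to_stake_and_profit(-150))` evaluates 150 / (150 + 100).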
Is the model better at predicting wins than simply using the difference between the away team's and home team's Pythagorean Win %?
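One way to frame this comparison is to score both approaches by the share of games they call correctly. The Python sketch below is only an illustration of that idea, not the author's evaluation. It assumes the modeled team is the away team, that the model picks a win when its probability exceeds 0.5, and that the baseline picks a win when the Away-Home Pythagorean difference is positive. All function names and the threshold are hypothetical.

```python
from typing import List, Sequence


def model_picks(probabilities: Sequence[float],
                threshold: float = 0.5) -> List[bool]:
    """Pick a win whenever the model's probability exceeds the threshold."""
    return [p > threshold for p in probabilities]


def baseline_picks(pyt_diffs: Sequence[float]) -> List[bool]:
    """Pick a win whenever the Away-Home Pythagorean difference is positive."""
    return [d > 0 for d in pyt_diffs]


def accuracy(picks: Sequence[bool], outcomes: Sequence[bool]) -> float:
    """Share of games where the pick matched the actual result."""
    if len(picks) != len(outcomes) or len(outcomes) == 0:
        raise ValueError("picks and outcomes must be non-empty and equal length")
    hits = sum(p == o for p, o in zip(picks, outcomes))
    return hits / len(outcomes)
```

Calling `accuracy(model_picks(probs), outcomes)` and `accuracy(baseline_picks(diffs), outcomes)` on the same held-out games gives two directly comparable numbers.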
Predicting NBA wins and losses is very difficult. Focusing on a very specific set of games made it easier to build a model that produced somewhat meaningful results.
Future research will try to incorporate the team statistics that drive winning percentage, specifically the four factors: shooting, turnovers, rebounding, and free throws.
R Packages