For this analysis, I use a dataset titled NBA championship from the Kaggle website. It covers every team that has won an NBA championship since 1980, the year the Los Angeles Lakers, led by Kareem Abdul-Jabbar, won the franchise's seventh title. Each row represents one game that the eventual champion played in that year's NBA Finals. The dataset records whether the eventual champion won or lost the game, how many games each series lasted, and basic statistics about each team.
This study examines how playing at home or away, assists, rebounds, blocks, free throws, field goals attempted, and winning the game affect a team's ability to make field goals. In basketball, a field goal is any basket made during live play, as opposed to a free throw; it is worth two points, or three points when shot from beyond the three-point line.
Because the three-point shot was not added until the 1979-1980 season, for a long time making two-point field goals was the best way, and at one point (apart from free throws) the only way, to score and win a game. In NBA circles it is also widely accepted that assists, rebounds, blocks, free throws, and field goals attempted affect a team's momentum and ultimately its ability to make field goals. Only recently did the three-point shot become important in basketball, since many people considered it a gimmick. These are the reasons this analysis studies field goals in detail and does not use the three-point variable.
To examine field goals made in each game, we use a simulation-based method of analysis with the Zelig package. In this exploratory analysis, I expect the most complex model to be the best model. I also expect that playing at home will result in more field goals made.
library(readr)
library(Zelig)
library(texreg)
library(dplyr)
NBAchampionsdata <- read_csv("Desktop/NBAchampionsdata.csv")
OneNBAchampionsdata <- NBAchampionsdata %>%
  select(FG, Home, FT, AST, STL, BLK, Win, FGA) %>%
  mutate(Home = as.factor(Home))
Poisson1 <- zelig(FG ~ Home, model = "poisson", data = OneNBAchampionsdata, cite = F)
Poisson2 <- zelig(FG ~ Home + FT, model = "poisson", data = OneNBAchampionsdata, cite = F)
Poisson3 <- zelig(FG ~ Home + FT + AST, model = "poisson", data = OneNBAchampionsdata, cite = F)
Poisson4 <- zelig(FG ~ Home + FT + AST + STL, model = "poisson", data = OneNBAchampionsdata, cite = F)
Poisson5 <- zelig(FG ~ Home + FT + AST + STL + BLK, model = "poisson", data = OneNBAchampionsdata, cite = F)
Poisson6 <- zelig(FG ~ Home + FT + AST + STL + BLK + FGA, model = "poisson", data = OneNBAchampionsdata, cite = F)
htmlreg(list(Poisson1,Poisson2,Poisson3,Poisson4,Poisson5,Poisson6))
| | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 |
|---|---|---|---|---|---|---|
| (Intercept) | 3.61*** | 3.67*** | 3.25*** | 3.23*** | 3.22*** | 2.71*** |
| | (0.02) | (0.04) | (0.05) | (0.06) | (0.06) | (0.10) |
| HomeHome | 0.05* | 0.05* | -0.01 | -0.01 | -0.01 | -0.00 |
| | (0.02) | (0.02) | (0.02) | (0.02) | (0.02) | (0.02) |
| FT | | -0.00 | -0.00 | -0.00 | -0.00 | -0.00 |
| | | (0.00) | (0.00) | (0.00) | (0.00) | (0.00) |
| AST | | | 0.02*** | 0.02*** | 0.02*** | 0.01*** |
| | | | (0.00) | (0.00) | (0.00) | (0.00) |
| STL | | | | 0.00 | 0.00 | 0.00 |
| | | | | (0.00) | (0.00) | (0.00) |
| BLK | | | | | 0.00 | -0.00 |
| | | | | | (0.00) | (0.00) |
| FGA | | | | | | 0.01*** |
| | | | | | | (0.00) |
| AIC | 1430.36 | 1428.78 | 1315.71 | 1316.24 | 1318.13 | 1285.30 |
| BIC | 1437.14 | 1438.96 | 1329.28 | 1333.21 | 1338.49 | 1309.06 |
| Log Likelihood | -713.18 | -711.39 | -653.85 | -653.12 | -653.06 | -635.65 |
| Deviance | 225.27 | 221.69 | 106.62 | 105.15 | 105.04 | 70.22 |
| Num. obs. | 220 | 220 | 220 | 220 | 220 | 220 |
| ***p < 0.001, **p < 0.01, *p < 0.05 | | | | | | |
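The fit statistics in the table can also be compared with a few lines of arithmetic. The sketch below only reuses numbers reported in the table; the object names are hypothetical, the likelihood-ratio test is an addition that is not part of the original analysis, and it assumes Models 5 and 6 are nested and differ by a single parameter (FGA). The coefficients are rounded to two decimals in the table, so the exponentiated values are approximate.

```r
# Fit statistics copied from the regression table (not re-estimated here)
aic <- c(Poisson1 = 1430.36, Poisson2 = 1428.78, Poisson3 = 1315.71,
         Poisson4 = 1316.24, Poisson5 = 1318.13, Poisson6 = 1285.30)
deviance_m5 <- 105.04
deviance_m6 <- 70.22

# Model with the lowest AIC, and each model's distance from it
best_model <- names(which.min(aic))
delta_aic  <- aic - min(aic)

# Likelihood-ratio test of Model 6 against Model 5
# (assumption: nested Poisson models differing by one parameter, FGA)
lr_stat <- deviance_m5 - deviance_m6
lr_p    <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Poisson coefficients are on the log scale, so exponentiate to read them:
# Model 1 intercept -> mean field goals in an away game
# Model 1 Home coefficient -> multiplicative change when playing at home
away_mean_m1       <- exp(3.61)
home_rate_ratio_m1 <- exp(0.05)

print(best_model)
print(round(delta_aic, 2))
print(c(lr_stat = lr_stat, lr_p = lr_p))
print(round(c(away_mean_m1 = away_mean_m1,
              home_rate_ratio_m1 = home_rate_ratio_m1), 3))
```

With these inputs, Model 6 has the lowest AIC, and the deviance drops by 34.82 when FGA is added to Model 5. Under Model 1, the intercept corresponds to roughly 37 field goals in an away game, and the Home coefficient to an increase of roughly 5 percent.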
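In the R chunk above, the dependent variable is the number of field goals made (FG). The independent variables are Home, FT (free throws), AST (assists), STL (steals), BLK (blocks), and FGA (field goals attempted); Win is selected into the data frame but does not enter any of the six models. As expected, the best model, Poisson6, is also the most complex, with the lowest AIC (1285.30) and BIC (1309.06). The Home coefficient is positive and significant in Models 1 and 2 but falls to roughly zero once assists are added in Model 3. With this in mind, we select Poisson6 and run a simulation with the Zelig package using the Zelig 5 syntax.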
Poisson6$setx(Home="Away")
Poisson6$setx1(Home="Home")
Poisson6$sim()
Poisson6$graph()
In this kind of simulation, an expected value (EV) is the mean of the outcome for a given set of predictor values, here the average number of field goals for a given profile. A predicted value (PV) is a single simulated outcome drawn for that profile, so it carries random game-to-game variation in addition to the uncertainty in the estimated coefficients, and its distribution is wider than that of the EV. The first difference (FD) is the difference between the expected values under two profiles, here E(Y|X1) - E(Y|X), the expected field goals at home minus the expected field goals away.
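To make these three quantities concrete, here is a minimal base-R sketch of the general simulation approach. It is not Zelig's own code, and it runs on synthetic stand-in data rather than the Kaggle dataset: the data frame `toy`, its coefficients, and the reduced formula `FG ~ Home + FGA` are all assumptions made for illustration.

```r
set.seed(1)

# Synthetic stand-in data (NOT the Kaggle data): 220 games, no true home effect
n   <- 220
toy <- data.frame(
  Home = factor(sample(c("Away", "Home"), n, replace = TRUE)),
  FGA  = round(rnorm(n, mean = 82, sd = 6))
)
toy$FG <- rpois(n, lambda = exp(2.7 + 0.012 * toy$FGA))

fit <- glm(FG ~ Home + FGA, family = poisson, data = toy)

# 1. Draw 1000 coefficient vectors from N(coef, vcov) via a Cholesky factor
n_sims <- 1000
z      <- matrix(rnorm(n_sims * length(coef(fit))), nrow = n_sims)
draws  <- sweep(z %*% chol(vcov(fit)), 2, coef(fit), "+")

# 2. Two profiles: x = away game, x1 = home game, FGA held at its mean
#    (order matches the coefficients: intercept, HomeHome, FGA)
x  <- c(1, 0, mean(toy$FGA))
x1 <- c(1, 1, mean(toy$FGA))

# 3. Expected values: mean field goals implied by each coefficient draw
ev_x  <- as.vector(exp(draws %*% x))
ev_x1 <- as.vector(exp(draws %*% x1))

# 4. Predicted values: one Poisson outcome per draw, adding game-level noise
pv_x  <- rpois(n_sims, lambda = ev_x)
pv_x1 <- rpois(n_sims, lambda = ev_x1)

# 5. First difference: home expected value minus away expected value
fd <- ev_x1 - ev_x

print(summary(ev_x))
print(summary(pv_x))
print(summary(fd))
```

Because the predicted values add Poisson noise on top of the coefficient uncertainty, their summary is noticeably more spread out than that of the expected values, which mirrors the difference between the two kinds of chart described below.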
The plot generated by Zelig contains seven charts. The first chart, in the top left-hand corner, is titled Predicted Values: Y|X. It shows the predicted number of field goals championship teams make while playing in their opponent’s stadium, which averages about 39. The chart below it is titled Expected Values: E(Y|X). According to this chart, champion teams make a little over 37 field goals on average while playing in their opponent’s stadium.
The first chart in the top right-hand corner is titled Predicted Values: Y|X1. It shows the predicted number of field goals championship teams make while playing in their home stadium. Playing at home does not appear to affect their shot-making much, since the predicted average is 38 field goals. The chart below it is titled Expected Values: E(Y|X1). Its results are similar to those of the other expected-values chart: championship teams playing at home are expected to make a little over 37 field goals on average.
The idea that playing at home rather than in an opponent’s stadium does not affect a championship team’s ability to make field goals is supported by the chart in the middle, titled First Difference: E(Y|X1) - E(Y|X). It is obtained by subtracting the simulated values of E(Y|X) from those of E(Y|X1). As the chart shows, the average difference is around zero, which supports the notion that championship teams make about the same number of field goals at home and away.
The bottom two graphs are titled Comparison (Y|X) and (Y|X1) and Comparison E(Y|X) and E(Y|X1), respectively. The first overlays the two predicted-value graphs, and the second overlays the two expected-value graphs. These overlay graphs lead to the same conclusion as the charts above.
Poisson6$get_qi(xvalue="x1",qi="fd")%>%
data.frame()%>%
summary()
## .
## Min. :-2.9660
## 1st Qu.:-0.7218
## Median :-0.1118
## Mean :-0.1204
## 3rd Qu.: 0.4741
## Max. : 3.1360
Using the R chunk above to look at the first-difference numbers derived from the expected-value charts, we find that championship teams make on average about 0.12 fewer field goals at home than in an opponent’s stadium (mean -0.1204, median -0.1118). Taken at face value, this suggests that championship teams score slightly more on their opponent’s court than on their own, although the simulated differences range from about -3.0 to 3.1, so the gap is small relative to the simulation uncertainty. One possible reading is that away games are harder to win, since a team generally has to make more field goals to win them.
This dataset has one limitation: it does not include individual player statistics. Having them would allow a more complex analysis, since we could examine how one or more players influence whether a team wins a game and how many field goals it makes. Further studies should examine the effect of the three-point shot and how it alters the number of field goals a team makes. In summary, the most complex model was the best model, and championship teams made slightly more field goals in away games than in home games.