library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(readxl)
library(pwr)
library(car)
## Loading required package: carData
##
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
##
## recode
ufc_data <- read_excel("~/Downloads/UFC_Dataset.xls")
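# Keep male bouts in standard weight classes and only the variables of interest
ufc_data <- ufc_data |>
  filter(Gender == "MALE", !is.na(WeightClass), WeightClass != "Catch Weight") |>
  select(Winner, HeightDif, ReachDif, AgeDif)
# Code the outcome as 1 when the Red corner wins, 0 otherwise
ufc_data <- ufc_data |>
  mutate(WinnerBinary = ifelse(Winner == "Red", 1, 0))
# Logistic regression of a Red-corner win on the three differentials
model <- glm(WinnerBinary ~ HeightDif + ReachDif + AgeDif, data = ufc_data, family = binomial)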
summary(model)
##
## Call:
## glm(formula = WinnerBinary ~ HeightDif + ReachDif + AgeDif, family = binomial,
## data = ufc_data)
##
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.343131   0.028172  12.180  < 2e-16 ***
## HeightDif    0.001675   0.005604   0.299 0.764950
## ReachDif    -0.016133   0.004296  -3.755 0.000173 ***
## AgeDif      -0.023740   0.005505  -4.312 1.61e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 7140.5 on 5257 degrees of freedom
## Residual deviance: 7095.0 on 5254 degrees of freedom
## AIC: 7103
##
## Number of Fisher Scoring iterations: 4
# Profile-likelihood 95% confidence intervals for the coefficients
conf_interval <- confint(model)
## Waiting for profiling to be done...
print(conf_interval)
##                   2.5 %       97.5 %
## (Intercept)  0.28800167  0.398441009
## HeightDif   -0.00931893  0.012640362
## ReachDif    -0.02459394 -0.007771656
## AgeDif      -0.03454879 -0.012965533
Height Difference: Height difference did not have a statistically significant influence on the likelihood of victory. The 95% confidence interval for its coefficient (-0.0093 to 0.0126) includes zero. The GLM estimate is slightly positive (0.0017), which hints at a small increase in the likelihood of winning for the taller fighter, but the p-value (0.765) is much greater than .05, so the effect isn't statistically significant.
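As a rough cross-check on the "interval includes zero" reasoning, the sketch below recomputes approximate Wald 95% intervals (estimate ± z × standard error) from the estimates and standard errors printed in the summary output. This is an illustration only: `confint()` on a GLM reports profile-likelihood intervals, so the numbers should agree closely but not exactly. The vector names `est` and `se` are introduced here for illustration.

```r
# Estimates and standard errors copied from the summary(model) output above.
est <- c(HeightDif = 0.001675, ReachDif = -0.016133, AgeDif = -0.023740)
se  <- c(HeightDif = 0.005604, ReachDif = 0.004296, AgeDif = 0.005505)

# Wald 95% interval: estimate +/- z * SE, with z = qnorm(0.975) (about 1.96).
z <- qnorm(0.975)
wald_ci <- cbind(lower = est - z * se, upper = est + z * se)

# TRUE when the interval contains zero, i.e. not significant at the 5% level.
includes_zero <- wald_ci[, "lower"] < 0 & wald_ci[, "upper"] > 0

print(round(wald_ci, 5))
print(includes_zero)
```

Only the HeightDif interval should straddle zero, which matches the p-values in the summary table.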
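Reach Difference & Age Difference: The GLM coefficients and their confidence intervals both suggest that these differentials have a statistically significant influence on the likelihood of winning a fight. Both coefficients are negative, so the log-odds of a Red-corner win fall as ReachDif or AgeDif increases. Interestingly, this suggests that a longer reach may be associated with a higher likelihood of losing, and that younger fighters are more likely to win, although the direction of both readings depends on how the dataset signs the differences (Red minus Blue or the reverse). Both findings are supported by p-values less than .05 and 95% confidence intervals that don't include zero.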
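Because logistic regression coefficients are on the log-odds scale, exponentiating them may make the effect sizes easier to read. The sketch below does this with the estimates and interval bounds reported above, hard-coded so it runs without the dataset. The object names `est`, `ci`, and `odds_ratios` are illustrative and not part of the original analysis.

```r
# Coefficients and 95% interval bounds copied from the output above.
est <- c(HeightDif = 0.001675, ReachDif = -0.016133, AgeDif = -0.023740)
ci <- rbind(
  HeightDif = c(-0.00931893,  0.012640362),
  ReachDif  = c(-0.02459394, -0.007771656),
  AgeDif    = c(-0.03454879, -0.012965533)
)

# exp() turns a log-odds coefficient into an odds ratio: the multiplicative
# change in the odds of a Red-corner win per one-unit increase in the predictor.
odds_ratios <- cbind(OR = exp(est), lower = exp(ci[, 1]), upper = exp(ci[, 2]))

print(round(odds_ratios, 3))
```

With the fitted model in memory, `exp(coef(model))` and `exp(confint(model))` should give the same quantities directly. An odds ratio below 1 with an interval that excludes 1 corresponds to a negative coefficient whose interval excludes zero.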