February 8, 2026
My topic of choice is simple linear regression. Simple linear regression is a way to understand how two variables relate to one another. I am going to take this topic from statistics and apply it using basketball as an example.
In basketball, statistics can be a powerful tool for helping teams analyze specific patterns. For example, one of the plots I will be making will show the relationship between the likelihood of winning a game and the number of jump shots attempted.
P.S. The colors are purple and gold to represent Kobe Bryant
My goal in using simple linear regression is to have a chart that compares a team’s field goal percentage with the percentage of games it wins. I will also use the
For this I will use NBA statistics from the 2021-2022 season as a reference.
On the next two slides I will explain the math for simple linear regression.
Y = β₀ + β₁X + ε
This is the formula used to describe the relationship between our two variables.
Y is the dependent variable
X is the independent variable
β₀ is the intercept (the constant)
β₁ is the slope
ε is the error term
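To make the role of each symbol concrete, here is a minimal R sketch that generates data from the model. The values of beta0, beta1, and the noise level are made up for illustration and are not taken from the NBA data.

```r
# Hypothetical values chosen only for illustration (not from the NBA data)
set.seed(1)                                   # make the random draws reproducible
beta0 <- 2                                    # intercept (the constant)
beta1 <- 0.5                                  # slope
x     <- seq(1, 10, by = 1)                   # independent variable X
eps   <- rnorm(length(x), mean = 0, sd = 0.3) # random error term
y     <- beta0 + beta1 * x + eps              # dependent variable Y

# Without the error term, every point would sit exactly on the line
y_line <- beta0 + beta1 * x
print(data.frame(x, y, y_line))
```

The gap between y and y_line in each row is exactly the error for that point.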
After plotting the points, we find the line that passes as close as possible to all of them (the best-fit line). It is mostly used as a prediction tool.
Y = β₀ + β₁X (new formula for the best-fit line)
(the same as before, just without the error term)
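One common way to find the best-fit line is least squares, which picks the slope and intercept that make the squared vertical distances from the points to the line as small as possible. The sketch below computes those two numbers by hand and compares them with R's built-in lm(). The x and y values are made up for illustration, and the names b0 and b1 are my own.

```r
# Made-up example data, for illustration only
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Least-squares estimates computed by hand
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
b0 <- mean(y) - b1 * mean(x)                                     # intercept

# The same estimates from R's built-in linear model function
fit <- lm(y ~ x)

print(c(intercept = b0, slope = b1))
print(coef(fit))
```

Both print statements should show the same intercept and slope, which is a quick check that lm() is doing the least-squares calculation.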
To find this line in RStudio:
1.) State what the dependent and independent variables are.
2.) Then make a linear model from those variables to estimate the slope and the intercept.
3.) Put the points on a plot, then use geom_smooth(formula = y ~ x, method = "lm", se = FALSE) to put the best-fit line onto that same plot.
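The three steps above can be sketched as follows. This is only a small example on made-up numbers, not the NBA data, and the names toy, fit, and p are placeholders of my own.

```r
library(ggplot2)

# Made-up numbers for illustration only; these are NOT NBA statistics
toy <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(3, 5, 4, 6, 8)
)

# Step 1: y is the dependent variable, x is the independent variable
# Step 2: fit the linear model and read off the intercept and slope
fit <- lm(y ~ x, data = toy)
print(coef(fit))

# Step 3: plot the points and let geom_smooth() draw the best-fit line
p <- ggplot(toy, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(formula = y ~ x, method = "lm", se = FALSE)
print(p)

# The fitted line can also be used as a prediction tool
pred <- predict(fit, newdata = data.frame(x = 6))
print(pred)
```

The line drawn by geom_smooth() uses the same slope and intercept that coef(fit) prints, so the plot and the model agree.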
library(ggplot2)

# NbaData is assumed to be loaded already. data.frame() rewrites column
# names such as `Win%` and `Team FG%` to Win. and Team.FG.
Nbaplot1 <- data.frame(NbaData)
main <- ggplot(data = Nbaplot1, aes(x = Win., y = Team.FG.)) +
  geom_point(color = "gold4", size = 3, shape = 18) +
  ggtitle("Win Percentage vs Team Field Goal Percentage") +
  geom_smooth(formula = y ~ x, method = "lm", se = FALSE, color = "purple3") +
  theme_minimal()
# Center the title
main + theme(plot.title = element_text(hjust = 0.5))
Nbaplot2 <- data.frame(NbaData)
# The stray color = category argument is dropped: no category variable exists here
main <- ggplot(data = Nbaplot2, aes(x = Win., y = Opponent.FG.)) +
  geom_point(color = "gold4", size = 3, shape = 18) +
  ggtitle("Win Percentage vs Opponent Field Goal Percentage") +
  geom_smooth(formula = y ~ x, method = "lm", se = FALSE, color = "purple3") +
  theme_minimal()
# Center the title
main + theme(plot.title = element_text(hjust = 0.5))
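library(plotly)

Nbaplot3 <- data.frame(NbaData)
# Pull the three columns out of the original data by their original names
WinPercentage <- NbaData$`Win%`
TeamFG <- NbaData$`Team FG%`
OpponentFG <- NbaData$`Opponent FG%`
# 3D scatter plot of win percentage against both field goal percentages
NbaPlot3D <- plot_ly(Nbaplot3, x = ~WinPercentage, y = ~TeamFG, z = ~OpponentFG,
                     type = "scatter3d", mode = "markers")
NbaPlot3D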