February 8, 2026

Statistics Analysis

Slide #1

  • My topic of choice is simple linear regression. Simple linear regression is a way to understand how two variables relate to one another. I am going to take this topic from statistics and apply it using basketball as an example.

  • In basketball, statistics can be a powerful tool for helping teams analyze specific patterns. For example, one of the plots I will be making shows the correlation between a team's likelihood of winning a game and the number of jump shots it attempts.

  • P.S. The colors are purple and gold to represent Kobe Bryant.

Slide #2

  • My goal for using simple linear regression is to have a chart that compares a team’s field goal percentage with the percentage of games it wins. I will also use the

  • For this, I will use NBA statistics from the 2021-2022 season as a reference.

  • On the next two slides I will explain the math for simple linear regression.

Slide #3

  • Y = β₀ + β₁X + ε

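  • This is the formula that describes the relationship between our two variables.

  • Y is the dependent variable (the outcome we want to explain or predict)

  • X is the independent variable (the predictor)

  • β₀ is the intercept (the constant: the value of Y when X = 0)

  • β₁ is the slope (the change in Y for each one-unit increase in X)

  • ε is the error term (the part of Y that the line does not explain)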
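
  • To make the pieces of the formula concrete, here is a minimal R sketch that builds Y from made-up values. The numbers chosen for β₀ and β₁, the variable names, and the normally distributed error are all assumptions for illustration, not NBA data.

```r
# Minimal sketch with made-up numbers (assumed values, not NBA data)
set.seed(42)                      # make the random error reproducible

beta0 <- 2                        # assumed intercept: value of Y when X = 0
beta1 <- 0.5                      # assumed slope: change in Y per unit of X

x   <- seq(1, 50)                 # independent variable X
eps <- rnorm(length(x), mean = 0, sd = 1)  # error term: random noise around the line
y   <- beta0 + beta1 * x + eps    # dependent variable Y, built from the formula

# Without the error term, every point would sit exactly on the line
y_no_error <- beta0 + beta1 * x

# The gap between each observed Y and the line is that point's error
gap <- y - y_no_error
head(data.frame(x, y, y_no_error, gap))
```

  • The gap column is the same as eps, which is why ε is described as the part of Y that the straight line leaves unexplained.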

Slide #4

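  • After forming the scatter plot, we find the line that passes as close as possible to all of the points (the best-fit line). This line is mostly used as a prediction tool.

  • Y = β₀ + β₁X (new formula for the best-fit line)

  • (essentially the same as before, just without the error term, because the line gives the predicted value of Y)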

  • To find this line in RStudio:

  • 1.) State what the dependent and independent variables are.

  • 2.) Fit a linear model on those variables; the model estimates the slope and the intercept.

  • 3.) Put the points on a plot, then use geom_smooth(formula = y ~ x, method = "lm", se = FALSE) to draw the best-fit line on that same plot.
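
  • The three steps above can be sketched roughly as follows. The data frame team_stats, its columns, and its values are hypothetical (they are not real NBA statistics), and treating win percentage as the dependent variable is an assumption made for this sketch. The cross-check uses the standard least-squares formulas: slope = cov(X, Y) / var(X) and intercept = mean(Y) - slope × mean(X).

```r
# Hypothetical example data (NOT real NBA statistics)
team_stats <- data.frame(
  fg_pct  = c(44.1, 45.0, 45.8, 46.3, 46.9, 47.4, 48.0, 48.5),
  win_pct = c(30.5, 41.5, 39.0, 52.4, 56.1, 53.7, 64.6, 62.2)
)

# Step 1: state the variables (assumption: win_pct depends on fg_pct)
x <- team_stats$fg_pct    # independent variable
y <- team_stats$win_pct   # dependent variable

# Step 2: fit the linear model; lm() estimates the intercept and the slope
fit <- lm(win_pct ~ fg_pct, data = team_stats)
print(coef(fit))

# Cross-check against the closed-form least-squares estimates
slope_manual     <- cov(x, y) / var(x)
intercept_manual <- mean(y) - slope_manual * mean(x)

# Step 3: scatter plot plus best-fit line (only runs if ggplot2 is installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(team_stats, aes(x = fg_pct, y = win_pct)) +
    geom_point() +
    geom_smooth(formula = y ~ x, method = "lm", se = FALSE)
  print(p)
}
```

  • The slope and intercept computed by hand should match coef(fit), which shows that geom_smooth(method = "lm") draws the same line that lm() estimates.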

Statistics Analysis (Plots)

Slide #5

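library(ggplot2)  # provides ggplot(), geom_point(), geom_smooth()

# data.frame() converts column names such as `Win%` and `Team FG%`
# into the syntactic names Win. and Team.FG.
Nbaplot1 = data.frame(NbaData)

main = ggplot(data = Nbaplot1, aes(x = Win., y = Team.FG.)) +
  # one gold diamond per row of the data
  geom_point(color = "gold4", size = 3, shape = 18) +
  ggtitle("Win Percentage vs Team Field Goal Percentage") +
  # best-fit line from a linear model, drawn without a confidence band
  geom_smooth(formula = y ~ x, method = "lm", se = FALSE, color = "purple3") +
  theme_minimal()

# center the plot title
main + theme(plot.title = element_text(hjust = 0.5))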

Slide #6

Slide #7

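library(ggplot2)  # provides ggplot(), geom_point(), geom_smooth()

# data.frame() converts `Opponent FG%` into the syntactic name Opponent.FG.
Nbaplot2 = data.frame(NbaData)

# the stray "color = category" argument is dropped: it sat outside aes(),
# and the point and line colors are set explicitly below
main = ggplot(data = Nbaplot2, aes(x = Win., y = Opponent.FG.)) +
  geom_point(color = "gold4", size = 3, shape = 18) +
  ggtitle("Win Percentage vs Opponent Field Goal Percentage") +
  # best-fit line from a linear model, drawn without a confidence band
  geom_smooth(formula = y ~ x, method = "lm", se = FALSE, color = "purple3") +
  theme_minimal()

# center the plot title
main + theme(plot.title = element_text(hjust = 0.5))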

Slide #8

Slide #9

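library(plotly)  # provides plot_ly()

Nbaplot3 = data.frame(NbaData)

# pull the three columns out of NbaData by their original names
WinPercentage = NbaData$`Win%`
TeamFG = NbaData$`Team FG%`
OpponentFG = NbaData$`Opponent FG%`

# interactive 3D scatter plot with one marker per row of the data
NbaPlot3D = plot_ly(Nbaplot3, x = ~WinPercentage, y = ~TeamFG,
  z = ~OpponentFG, type = "scatter3d", mode = "markers")

NbaPlot3D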

Slide #10