Incorporating more numeric explanatory variables

Our original analysis looked at the likelihood of scoring as a function of the distance from the goal. Our new variables look separately at the \(x\) and \(y\) distances from the goal, and at the relationship between the time left in the period and the likelihood of scoring a goal.

We thought it would be interesting to see whether the \(x\) or \(y\) distance on its own has a larger effect on the likelihood of making the shot than the total distance does (for example, it may be much harder to make the shot from a large \(y\) distance away from the goal, while the \(x\) distance doesn’t matter as much).

We also thought that players may work harder to score when they are under time pressure. Looking at the likelihood of scoring alongside how much time is left in the period could tell us whether this time pressure has an effect (and, if it does, whether that effect is positive or negative). A player may also be willing to take a poor shot when there are only a couple of seconds left, because a miss at the end of the period has few repercussions (you don’t have to keep playing and worry about the other team taking possession).

Setting up a multiple linear regression model

Constructing the design matrix and determining the fitted model from scratch
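
As a sketch of what "from scratch" means here, the model and its least-squares estimate can be written in matrix form. The symbols \(\beta\), \(\varepsilon\), \(d_i\), \(x_i\), \(y_i\) and \(t_i\) are notation introduced for this illustration: they stand for the coefficient vector, the error term, and the values of Rounded Distance, st_x, st_y and timemsec for shot \(i\). The lowercase \(y_i\) is the st_y coordinate, not the response \(Y\).

\[
Y = X\beta + \varepsilon, \qquad
X = \begin{bmatrix}
1 & d_1 & x_1 & y_1 & t_1 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
1 & d_n & x_n & y_n & t_n
\end{bmatrix}, \qquad
\hat{\beta} = (X^{\top} X)^{-1} X^{\top} Y
\]

The leading column of ones in \(X\) corresponds to the intercept, so \(\hat{\beta}\) has five entries: one intercept and one slope per explanatory variable.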

library(readxl)
playsfiltered_M4_new_3_ <- read_excel("~/Downloads/4) Second Semester Senior/Linear Modeling/playsfiltered M4 new(3).xlsx")
## Warning in read_fun(path = enc2native(normalizePath(path)), sheet_i = sheet, :
## Expecting numeric in D21996 / R21996C4: got 'NA'
## (55 further warnings of the same form: columns D through G of rows 21996, 43516, 43933, 101975, 173689, 230338, 298627, 324446, 324447, 324448, 324449, 324450, 356919 and 357709 each contain 'NA')
#View(playsfiltered_M4_new_3_)

#Importing the data
hockey<-playsfiltered_M4_new_3_
head(hockey)
dim(hockey)
## [1] 373203     11
#Looking at the data
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.0.6     ✓ dplyr   1.0.4
## ✓ tidyr   1.1.2     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
hockey%>%
  select(Goal, `Rounded Distance`, st_x, st_y, `Time Remaining in period`)%>%
  pairs()

hockey2<-hockey%>%
  select(Goal, `Rounded Distance`, st_x, st_y, `Time Remaining in period`)%>%
  na.omit()

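#Rescale time remaining by a factor of 60 (seconds to minutes, assuming the original column is in seconds)
hockey2$timemsec<- hockey2$`Time Remaining in period`/60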

dim(hockey)
## [1] 373203     11
dim(hockey2)
## [1] 373189      6
#First we must create our response vector
Y<-hockey2$Goal

attach(hockey2)



#Design Matrix
xMat<-matrix(c(rep(1, dim(hockey2)[1]), `Rounded Distance`, st_x, st_y, timemsec), nrow=dim(hockey2)[1])
head(xMat)
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1   20   71   -9 19.1
## [2,]    1    5   88   -5 19.1
## [3,]    1   34   56   -7 19.1
## [4,]    1   57   37   24 19.1
## [5,]    1   21  -71   11 19.1
## [6,]    1   38   57  -20 19.1
#Now we must find our fitted model from scratch

#first we find the transpose
#t(xMat)

#then we perform matrix multiplication
#t(xMat)%*%xMat

#we find the inverse
#solve(t(xMat)%*%xMat)

#finally bringing it all together to find our beta hat vector
solve(t(xMat)%*%xMat)%*%t(xMat)%*%Y
##               [,1]
## [1,]  9.312634e-02
## [2,] -1.538530e-03
## [3,]  5.922780e-04
## [4,] -1.133118e-06
## [5,] -9.641602e-04
#Double check using our snazzy lm() function
mod<-lm(Y~`Rounded Distance`+ st_x + st_y+ timemsec)
mod
## 
## Call:
## lm(formula = Y ~ `Rounded Distance` + st_x + st_y + timemsec)
## 
## Coefficients:
##        (Intercept)  `Rounded Distance`                st_x                st_y  
##          9.313e-02          -1.539e-03           5.923e-04          -1.133e-06  
##           timemsec  
##         -9.642e-04
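
The same check can be reproduced on a small simulated data set, which may be useful for anyone without access to the spreadsheet. This is a minimal sketch, not the analysis above: the data are random stand-ins, the variable names (`dist`, `x_pos`, `y_pos`, `mins`, `goal`) are hypothetical, and it solves the normal equations with `solve(A, b)` rather than forming the inverse explicitly, which is a common alternative to `solve(t(X) %*% X) %*% t(X) %*% Y`.

```r
# Simulated stand-in data (hypothetical; NOT the hockey data set)
set.seed(1)
n     <- 200
dist  <- runif(n, 0, 90)     # stand-in for `Rounded Distance`
x_pos <- runif(n, -99, 99)   # stand-in for st_x
y_pos <- runif(n, -42, 42)   # stand-in for st_y
mins  <- runif(n, 0, 20)     # stand-in for timemsec
goal  <- rbinom(n, 1, 0.1)   # 0/1 response, stand-in for Goal

# Design matrix: a column of ones for the intercept, then one column per predictor
X <- cbind(1, dist, x_pos, y_pos, mins)

# Solve the normal equations (X'X) b = X'y directly instead of inverting X'X
beta_hat <- solve(crossprod(X), crossprod(X, goal))

# Compare with lm(); the two sets of estimates should agree up to rounding error
fit <- lm(goal ~ dist + x_pos + y_pos + mins)
all.equal(as.numeric(beta_hat), as.numeric(coef(fit)))
```

Here `crossprod(X)` computes the same matrix as `t(X) %*% X`, and `crossprod(X, goal)` the same vector as `t(X) %*% goal`. Passing the data through the formula instead of relying on `attach()` also keeps the variables used by `lm()` explicit.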