*** Correlation is a statistical relationship between two variables. Correlations are useful because they can indicate a predictive relationship that can be exploited in practice. The correlation coefficient ranges from -1 to 1. As a rule of thumb, an absolute value of 0 to 0.3 indicates a weak correlation, 0.3 to 0.7 a moderate correlation, and above 0.7 a strong correlation. The sign shows whether the two variables are positively or negatively correlated. https://en.wikipedia.org/wiki/Correlation ***
*** Covariance is a measure of the relationship between two variables. It evaluates how much the two variables change together. The sign indicates whether the relationship is positive or negative: positive means the two variables tend to move in the same direction, and negative means they tend to move in opposite directions. ***
*** https://corporatefinanceinstitute.com/resources/data-science/covariance/ ***
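*** The link between the two measures can be sketched in a few lines of R. The vectors x and y below are made-up toy values, not taken from the playoff data; the sketch only assumes the base R functions cov(), cor() and sd(). It shows that correlation is covariance divided by the product of the two standard deviations, and that rescaling a variable changes the covariance but not the correlation. ***

```r
# Toy data (hypothetical values, not from the playoff files)
x <- c(10, 15, 20, 25, 30)
y <- c(12, 18, 21, 30, 33)

cov_xy <- cov(x, y)   # sample covariance
cor_xy <- cor(x, y)   # Pearson correlation

# Correlation is covariance divided by the product of the standard deviations
cor_manual <- cov_xy / (sd(x) * sd(y))

# Multiplying x by 100 (a change of units) multiplies the covariance by 100
# but leaves the correlation unchanged
cov_scaled <- cov(100 * x, y)
cor_scaled <- cor(100 * x, y)

print(c(cov = cov_xy, cor = cor_xy, cor_manual = cor_manual))
print(c(cov_scaled = cov_scaled, cor_scaled = cor_scaled))
```

*** This is why the strength thresholds are stated for correlation rather than covariance: only the sign of a covariance can be compared across variables measured in different units. ***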
df1 <- read.csv('lebron_playoffs.csv')
df2 <- read.csv('jordan_playoffs.csv')
df3 <- rbind(df1, df2)
head(df3)
## game date series series_game team opp result mp fg fga fgp three
## 1 1 2006-04-22 EC1 1 CLE WAS W (+11) 48 12 27 0.444 1
## 2 2 2006-04-25 EC1 2 CLE WAS L (-5) 43 7 25 0.280 1
## 3 3 2006-04-28 EC1 3 CLE WAS W (+1) 48 16 28 0.571 3
## 4 4 2006-04-30 EC1 4 CLE WAS L (-10) 45 13 23 0.565 7
## 5 5 2006-05-03 EC1 5 CLE WAS W (+1) 46 14 23 0.609 0
## 6 6 2006-05-05 EC1 6 CLE WAS W (+1) 53 15 25 0.600 1
## threeatt threep ft fta ftp orb drb trb ast stl blk tov pts game_score
## 1 4 0.250 7 11 0.636 3 8 11 11 0 0 4 32 23.3
## 2 6 0.167 11 15 0.733 2 7 9 2 2 3 10 26 6.7
## 3 5 0.600 6 9 0.667 1 4 5 3 2 0 4 41 27.4
## 4 12 0.583 5 7 0.714 1 5 6 5 0 0 7 38 23.0
## 5 1 0.000 17 18 0.944 5 2 7 6 2 0 4 45 38.4
## 6 6 0.167 1 3 0.333 0 7 7 7 2 1 5 32 22.8
## plus_minus
## 1 11
## 2 -2
## 3 1
## 4 -16
## 5 -4
## 6 1
*** To briefly explain the dataset: I am interested in how the game result (y) of the two GOAT players, LeBron James and Michael Jordan, relates to the remaining columns, which serve as the independent variables. ***
library(stargazer)
##
## Please cite as:
## Hlavac, Marek (2018). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2.2. https://CRAN.R-project.org/package=stargazer
stargazer(df3, type = "text")
##
## ================================================================
## Statistic N Mean St. Dev. Min Pctl(25) Pctl(75) Max
## ----------------------------------------------------------------
## game 439 9.706 5.841 1 5 14 23
## series_game 439 3.187 1.660 1 2 4 7
## mp 439 41.670 4.363 24 39.5 44 57
## fg 439 11.068 3.690 2 9 13 24
## fga 439 22.517 5.977 8 18 26 45
## fgp 439 0.492 0.108 0.111 0.423 0.560 0.846
## three 439 1.280 1.335 0 0 2 7
## threeatt 439 3.829 2.507 0 2 5 12
## threep 402 0.310 0.261 0.000 0.000 0.500 1.000
## ft 439 7.285 4.053 0 4 10 23
## fta 439 9.355 4.681 0 6 12 28
## ftp 435 0.766 0.179 0.000 0.667 0.889 1.000
## orb 439 1.592 1.388 0 1 2 8
## drb 439 6.380 2.985 0 4 8 16
## trb 439 7.973 3.333 0 6 10 19
## ast 439 6.590 2.938 1 4 8 16
## stl 439 1.870 1.339 0 1 3 6
## blk 439 0.929 1.021 0 0 1 5
## tov 439 3.408 2.017 0 2 5 10
## pts 439 30.702 8.725 7 25 36 63
## game_score 439 24.246 8.281 -0.700 19.150 29.550 49.800
## plus_minus 260 5.008 14.012 -32.000 -4.000 14.250 46.000
## ----------------------------------------------------------------
I set type = "text" to produce ASCII text output rather than LaTeX code.
corr <- cor(df3$pts, df3$fga)
corr
## [1] 0.7154493
cov <- cov(df3$pts, df3$fga)
cov
## [1] 37.3099
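*** The code above checks the correlation and covariance between field goal attempts (fga) and points scored (pts). The correlation is about 0.72, which indicates a strong positive correlation between the two variables. The covariance of 37.3099 is also positive, which is consistent with the positive relationship, although its magnitude depends on the units of the variables. The result agrees with general understanding: players who take more shots tend to score more points. ***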
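*** As a rough consistency check, the reported correlation can be recovered from the reported covariance and the standard deviations in the summary table. The numbers below are copied by hand from the output above, so the standard deviations are rounded to three decimals and the match is only approximate. ***

```r
# Values copied from the output above (rounded as printed)
cov_pts_fga  <- 37.3099     # cov(df3$pts, df3$fga)
sd_pts       <- 8.725       # St. Dev. of pts in the stargazer table
sd_fga       <- 5.977       # St. Dev. of fga in the stargazer table
cor_reported <- 0.7154493   # cor(df3$pts, df3$fga)

# Correlation = covariance / (sd of pts * sd of fga)
cor_from_cov <- cov_pts_fga / (sd_pts * sd_fga)

print(cor_from_cov)
# TRUE if the two values agree up to the rounding of the inputs
print(abs(cor_from_cov - cor_reported) < 0.001)
```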