Introduction:
The purpose of this analysis was to identify whether there is any relationship between the number of games an MLB hitter plays and their on-base percentage, runs batted in, and league (either NL or AL). The data set used below comes from the Lahman baseball database, which compiles batting statistics for all MLB hitters since 1871. The Lahman datasets are carefully compiled and kept up to date with the most recent seasons.
Packages Used
library(tidyverse)
library(Zelig)
library(pander)
library(texreg)
library(visreg)
library(lmtest)
library(sjmisc)
Data Used
Baseball_Batting_Stats <- read_csv("/Users/andyr1017/Downloads/baseballdatabank-2017.1 4/core/Batting.csv")
head(Baseball_Batting_Stats)
Piping Code
Below is the code needed to refine the dataset to only the columns used in the analysis and to restrict it to the 2016 season. I created the variable “On_Base_Percentage” from the variables “Hits,” “Base_On_Balls,” and “Stirke_Out” (the identifier is spelled this way in the code). It measures how often a hitter reached base safely via a walk or a hit: (Hits + Base_On_Balls) / (Hits + Base_On_Balls + Stirke_Out). The variable “Games_Played” serves as the continuous dependent variable, analyzed against On_Base_Percentage, Runs_Batted_In, and League.
Baseball_Batting_Stats1 <- Baseball_Batting_Stats %>%
  # Give the Lahman columns descriptive names
  rename(Y2016 = yearID,
         Team_2016 = teamID,
         Games_Played = G,
         At_Bat = AB,
         Hits = H,
         Home_Run = HR,
         Runs_Batted_In = RBI,
         Base_On_Balls = BB,
         Intentional_Base_On_Balls = IBB,
         Stirke_Out = SO,
         League = lgID) %>%
  # Keep only the columns used in the analysis
  select(Y2016,
         Team_2016,
         Games_Played,
         At_Bat,
         Hits,
         Home_Run,
         Runs_Batted_In,
         Base_On_Balls,
         Intentional_Base_On_Balls,
         Stirke_Out,
         League) %>%
  # Derived ratios; a zero denominator produces NaN or Inf
  mutate(On_Base_Percentage = (Hits + Base_On_Balls) / (Hits + Base_On_Balls + Stirke_Out),
         Home_Run_Stirke_Out = Stirke_Out / Home_Run,
         Home_Run_Base_On_Balls = Base_On_Balls / Home_Run,
         Home_Run_Hits = Hits / Home_Run) %>%
  # Keep 2016 rows with a defined ratio (rows where it is NaN are dropped)
  filter(On_Base_Percentage >= 0, Y2016 == 2016)

head(Baseball_Batting_Stats1)
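As a quick check on the ratio defined in the mutate() step, the sketch below applies the same formula to a few made-up rows (the numbers are illustrative and not taken from the Lahman data). It also shows why the filter only keeps rows with a defined ratio: a hitter with no hits, walks, or strikeouts produces 0/0, which R evaluates to NaN. Note that this ratio is a simplified measure; the official on-base percentage also accounts for at-bats, hit-by-pitch, and sacrifice flies.

```r
# Toy rows with invented numbers (not from the Lahman data)
toy <- data.frame(
  Hits          = c(150, 0, 40),
  Base_On_Balls = c(60, 0, 10),
  Stirke_Out    = c(90, 0, 50)   # spelling matches the identifier used in the analysis
)

# Same ratio as in the analysis: times on base via hit or walk,
# divided by hits + walks + strikeouts
toy$On_Base_Percentage <- with(
  toy,
  (Hits + Base_On_Balls) / (Hits + Base_On_Balls + Stirke_Out)
)

# The second row is 0/0 = NaN, and NaN >= 0 evaluates to NA.
# dplyr::filter() drops rows whose condition is NA; this base R
# subsetting mimics that behavior.
kept <- toy[!is.na(toy$On_Base_Percentage) & toy$On_Base_Percentage >= 0, ]

print(kept)
```

Only the first and third toy rows survive, which mirrors how hitters with no recorded hits, walks, or strikeouts fall out of the filtered 2016 data.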
Linear Regression Models
Model3
In this model, I again used Games_Played as the dependent variable and regressed it on On_Base_Percentage and Runs_Batted_In while introducing League. I added League to identify whether playing in the AL or the NL makes a significant difference, and I interacted Runs_Batted_In with League to see whether the effect of runs batted in differs between the leagues. The results show that neither League (p = 0.740) nor its interaction with Runs_Batted_In (p = 0.107) is statistically significant, so league contributes little to a player's “playrate.”
Model3 <- lm(Games_Played ~ On_Base_Percentage + Runs_Batted_In * League, Baseball_Batting_Stats1)
summary(Model3)
Call:
lm(formula = Games_Played ~ On_Base_Percentage + Runs_Batted_In *
    League, data = Baseball_Batting_Stats1)
Residuals:
    Min      1Q  Median      3Q     Max
-76.277 -15.099  -4.185  10.953  84.595
Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
(Intercept)             21.39029    1.82515  11.720  < 2e-16 ***
On_Base_Percentage      12.55268    3.08042   4.075 4.97e-05 ***
Runs_Batted_In           1.47596    0.03495  42.228  < 2e-16 ***
LeagueNL                -0.57201    1.72331  -0.332    0.740
Runs_Batted_In:LeagueNL  0.07719    0.04782   1.614    0.107
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 21.36 on 986 degrees of freedom
Multiple R-squared: 0.816, Adjusted R-squared: 0.8153
F-statistic: 1093 on 4 and 986 DF, p-value: < 2.2e-16
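To read the interaction term, it can help to plug the estimates from the summary above into the fitted equation by hand. The sketch below does that. The helper predict_games() and the example hitter (a ratio of 0.5 and 50 RBI) are hypothetical illustrations; only the coefficient values are taken from the output.

```r
# Coefficient estimates copied from the Model3 summary output
coefs <- c(
  intercept = 21.39029,
  obp       = 12.55268,
  rbi       = 1.47596,
  nl        = -0.57201,
  rbi_nl    = 0.07719
)

# Hypothetical helper: fitted Games_Played for a given hitter
predict_games <- function(obp, rbi, league) {
  is_nl <- as.numeric(league == "NL")   # AL is the reference level
  coefs[["intercept"]] +
    coefs[["obp"]] * obp +
    coefs[["rbi"]] * rbi +
    coefs[["nl"]] * is_nl +
    coefs[["rbi_nl"]] * rbi * is_nl
}

# Slope of Games_Played with respect to RBI in each league
rbi_slope_al <- coefs[["rbi"]]
rbi_slope_nl <- coefs[["rbi"]] + coefs[["rbi_nl"]]

# Fitted values for the same made-up hitter in each league
games_al <- predict_games(obp = 0.5, rbi = 50, league = "AL")
games_nl <- predict_games(obp = 0.5, rbi = 50, league = "NL")

print(c(AL_slope = rbi_slope_al, NL_slope = rbi_slope_nl))
print(c(AL_games = games_al, NL_games = games_nl))
```

Because the LeagueNL and interaction estimates are small relative to their standard errors, the AL and NL lines are statistically indistinguishable in this model.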
Tables and Graphics
library(texreg)
htmlreg(list(Model1, Model2, Model3))
Statistical models

|                         | Model 1  | Model 2  | Model 3  |
|-------------------------|----------|----------|----------|
| (Intercept)             | 16.66*** | 21.01*** | 21.39*** |
|                         | (2.91)   | (1.41)   | (1.83)   |
| On_Base_Percentage      | 92.03*** | 12.80*** | 12.55*** |
|                         | (5.62)   | (3.06)   | (3.08)   |
| Runs_Batted_In          |          | 1.51***  | 1.48***  |
|                         |          | (0.03)   | (0.03)   |
| LeagueNL                |          |          | -0.57    |
|                         |          |          | (1.72)   |
| Runs_Batted_In:LeagueNL |          |          | 0.08     |
|                         |          |          | (0.05)   |
| R2                      | 0.21     | 0.82     | 0.82     |
| Adj. R2                 | 0.21     | 0.82     | 0.82     |
| Num. obs.               | 991      | 991      | 991      |
| RMSE                    | 44.10    | 21.37    | 21.36    |

***p < 0.001, **p < 0.01, *p < 0.05
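One way to check formally whether League adds anything beyond the simpler model is an F-test between nested fits with anova(). The sketch below runs on simulated data: the data frame sim and every number in it are invented for illustration, with no league effect built in, so its output says nothing about the real results. With the actual fitted objects, the analogous call would be anova(Model2, Model3).

```r
set.seed(1)

# Simulated data for illustration only (not the Lahman data)
n <- 200
sim <- data.frame(
  Runs_Batted_In = rpois(n, lambda = 30),
  League         = factor(sample(c("AL", "NL"), n, replace = TRUE))
)

# Games depend on RBI plus noise; League has no effect by construction
sim$Games_Played <- 20 + 1.5 * sim$Runs_Batted_In + rnorm(n, sd = 20)

# Nested models: the full model adds League and its interaction with RBI
reduced <- lm(Games_Played ~ Runs_Batted_In, data = sim)
full    <- lm(Games_Played ~ Runs_Batted_In * League, data = sim)

# F-test of the two extra terms (League main effect and interaction)
comparison <- anova(reduced, full)
print(comparison)
```

A large p-value in the second row of the comparison would indicate that the extra League terms do not improve the fit beyond what chance would produce.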
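# visreg() puts a predictor on the x-axis; Games_Played is the response,
# so plot Runs_Batted_In and split by League to show the fitted interaction
visreg(Model3, "Runs_Batted_In", by = "League")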

ggplot(data = Baseball_Batting_Stats1) +
geom_point(mapping = aes(x = Runs_Batted_In , y = Games_Played))

Conclusion
Based on this analysis, one can conclude that On_Base_Percentage and Runs_Batted_In are statistically significant predictors of how many games a hitter plays. However, a hitter's league has no significant effect on the number of games played.