Analysis of the 100 best hitters in baseball
# read_csv(), %>%, mutate() and ggplot() all come from the tidyverse
library(tidyverse)
mydata <- read_csv("https://myxavier-my.sharepoint.com/:x:/g/personal/janishefskib_xavier_edu/ETjz1TyRf5xNjPd7ooW4qLUBJ6TsJMh43uRJT8ecuqgwvQ?download=1")
Introduction
I am going to analyze data on the top 100 baseball hitters by year. The dataset includes each player's statistics for the year as well as any awards they received for their performance. You can learn more about the data at this link: https://myxavier-my.sharepoint.com/:x:/g/personal/janishefskib_xavier_edu/ETjz1TyRf5xNjPd7ooW4qLUBJ6TsJMh43uRJT8ecuqgwvQ?download=1
Research Question
Do players with more hits also have more RBIs?
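# The real column names sit in the first data row: promote them, then drop that row
colnames(mydata) <- as.character(unlist(mydata[1, ]))
mydata <- mydata[-1, ]
# Convert the two columns used in the plot to numeric
mydata <- mydata %>%
  mutate(RBI = as.numeric(RBI),
         H = as.numeric(H))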
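The header fix above can be easier to follow on a tiny example. The sketch below uses a made-up three-row data frame (the names raw and toy and all of its values are hypothetical, not taken from the dataset) and base R only, so it runs without downloading anything:

```r
# Hypothetical toy data: the real column names sit in the first data row.
raw <- data.frame(
  X1 = c("Player", "A", "B"),
  X2 = c("H", "150", "180"),
  X3 = c("RBI", "90", "110"),
  stringsAsFactors = FALSE
)

# Promote the first row to column names, then drop that row.
colnames(raw) <- as.character(unlist(raw[1, ]))
toy <- raw[-1, ]

# The values are still character strings, so convert them before plotting.
toy$H <- as.numeric(toy$H)
toy$RBI <- as.numeric(toy$RBI)

# Inspect the resulting column names and types.
str(toy)
```

The conversion step matters because every column starts out as text when the header row is mixed in with the data.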
mydata %>%
  ggplot(aes(x = RBI, y = H)) +
  geom_point(color = "darkgreen", alpha = .5, size = .75) +
  labs(x = "Runs Batted In (RBI)", y = "Hits") +
  scale_x_continuous(breaks = seq(0, 200, by = 25)) +
  geom_smooth(method = "lm")
The analysis performed above shows the relationship between hits and OBP (on-base percentage). Although I expected a more drastic relationship, there is still a small correlation between hits and OBP. In terms of the question I posed, hits don’t contribute as much to OBP as some people think. This graph suggests that, most likely, walks contribute more to a high OBP.
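One way to put a number on a relationship like the one plotted above is Pearson's correlation together with the slope of the fitted line. This is only a minimal sketch: the vectors hits and rbi below are made-up values, not figures from the dataset.

```r
# Hypothetical values for five players (not taken from the dataset).
hits <- c(120, 150, 160, 180, 200)
rbi  <- c(60, 85, 70, 100, 95)

# Pearson correlation: ranges from -1 to 1, with 0 meaning no linear relationship.
r <- cor(hits, rbi)

# The same kind of straight line that geom_smooth(method = "lm") draws.
fit <- lm(rbi ~ hits)
slope <- coef(fit)[["hits"]]

# For a one-predictor linear model, R-squared equals the squared correlation.
r_squared <- summary(fit)$r.squared

print(r)
print(slope)
print(r_squared)
```

With the real data, the same calls could be applied to the plotted columns, for example cor(mydata$H, mydata$RBI, use = "complete.obs") to skip rows with missing values.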