Administrative

Please indicate

  • Roughly how much time you spent on this HW so far: 1.5 hours
  • The URL of the RPubs published URL here.
  • What gave you the most trouble: creating the plot of slugging percentage
  • Any comments you have: Not too sure if I did the 10 most successful seasons part correctly. I interpreted it as the seasons in which the Angels won the most games.

Load the tidyverse and Lahman packages.

library(tidyverse)
## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Conflicts with tidy packages ----------------------------------------------
## filter(): dplyr, stats
## lag():    dplyr, stats
library(Lahman)

Problem 1.

Define two new variables in the Teams data frame: batting average (BA) and slugging percentage (SLG). Batting average is the ratio of hits (H) to at-bats (AB), and slugging percentage is the total bases divided by at-bats. To compute the total bases, you get 1 for a single, 2 for a double, 3 for a triple, and 4 for a home run.

Team <- Teams

Use mutate to add batting average.

Team <- mutate(Team, BA = H / AB)

Use mutate to add slugging percentage.

Team <-mutate(Team, SLG = ((H+2*X2B+3*X3B+4*HR)/AB))
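One thing worth checking in the formula above: in the Lahman Teams table, H counts every hit, so doubles, triples, and home runs are already included in it once. Under that reading, total bases would be H + X2B + 2*X3B + 3*HR rather than H + 2*X2B + 3*X3B + 4*HR. The sketch below uses a made-up one-row data frame (the numbers and the names toy, X1B, TB, and SLG_alt are illustrative assumptions, not Lahman data) to show the two equivalent ways of writing the conventional formula. The SLG values printed later in this report come from the formula above, so they would change if it were swapped.

```r
library(dplyr)

# Made-up batting line for illustration only (not real Lahman data)
toy <- data.frame(AB = 100, H = 30, X2B = 5, X3B = 1, HR = 4)

toy_slg <- toy %>%
  mutate(
    X1B     = H - X2B - X3B - HR,                 # singles = hits that are not extra-base hits
    TB      = X1B + 2 * X2B + 3 * X3B + 4 * HR,   # total bases from the problem statement
    SLG     = TB / AB,
    SLG_alt = (H + X2B + 2 * X3B + 3 * HR) / AB   # same quantity written without X1B
  )

toy_slg
```

With 20 singles, 5 doubles, 1 triple, and 4 home runs in 100 at-bats, both versions give 49 total bases, while the formula used in this report would count 59.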

Problem 2.

Plot a time series of SLG since 1954 by league (lgID). Is slugging percentage typically higher in the American League (AL) or the National League?

Since 1954, slugging percentage has typically been higher in the AL. This is probably due to the designated hitter rule, which the AL adopted in 1973.

# Average team SLG per league and season, 1954 onward
slugplot <- filter(Team, yearID > 1953) %>%
  select(SLG, lgID, yearID) %>%
  group_by(yearID, lgID) %>%
  summarise(SLG_LG = mean(SLG)) %>%  # one row per league-season rather than one per team
  ungroup()

slug_lg_plot <-
  ggplot(slugplot, aes(x = yearID, y = SLG_LG, color = lgID)) +
  geom_point() +
  geom_smooth() +
  labs(title = "Trend in Slugging Percentage since 1954", subtitle = "NL vs. AL", y = "Slugging Percentage", x = "Year")

slug_lg_plot
## `geom_smooth()` using method = 'loess'
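The "typically higher" question can also be checked numerically instead of by eye. The following is a minimal sketch, assuming a table with one league average per season; toy_lg, al_vs_nl, and share_al_higher are hypothetical names and the values are made up for illustration.

```r
library(dplyr)

# Made-up league averages, one row per league and season (not real data)
toy_lg <- data.frame(
  yearID = c(2001, 2001, 2002, 2002, 2003, 2003),
  lgID   = c("AL", "NL", "AL", "NL", "AL", "NL"),
  SLG_LG = c(0.43, 0.42, 0.41, 0.42, 0.44, 0.40),
  stringsAsFactors = FALSE
)

# Put the two leagues side by side for each season and compare them
al_vs_nl <- toy_lg %>%
  group_by(yearID) %>%
  summarise(
    AL = SLG_LG[lgID == "AL"],
    NL = SLG_LG[lgID == "NL"]
  ) %>%
  mutate(al_higher = AL > NL)

# Share of seasons in which the AL average exceeds the NL average
share_al_higher <- mean(al_vs_nl$al_higher)
share_al_higher
```

Applied to the real league averages, a share well above 0.5 would support the reading of the plot.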

Problem 3.

Display the top 15 teams ranked in terms of slugging percentage in MLB history. Repeat this using teams since 1969.

Arrange Team to display the top 15 teams by slugging percentage in MLB history.

Top15<- arrange(Team, desc(SLG)) %>%
  select(yearID,teamID,SLG)
head(Top15, 15)
##    yearID teamID       SLG
## 1    2003    BOS 0.6033975
## 2    1927    NYA 0.5922947
## 3    1997    SEA 0.5908443
## 4    1996    SEA 0.5906845
## 5    1930    NYA 0.5904919
## 6    1994    CLE 0.5900050
## 7    2001    COL 0.5880492
## 8    1936    NYA 0.5871937
## 9    2009    NYA 0.5818021
## 10   2004    BOS 0.5807692
## 11   1995    CLE 0.5799523
## 12   2000    HOU 0.5797127
## 13   1930    CHN 0.5791077
## 14   2003    ATL 0.5790123
## 15   1999    TEX 0.5783047

Filter to seasons from 1969 onward.

Top15<- arrange(Team, desc(SLG)) %>%
  select(yearID,teamID,SLG) %>%
  filter(yearID > 1968)
head(Top15, 15)
##    yearID teamID       SLG
## 1    2003    BOS 0.6033975
## 2    1997    SEA 0.5908443
## 3    1996    SEA 0.5906845
## 4    1994    CLE 0.5900050
## 5    2001    COL 0.5880492
## 6    2009    NYA 0.5818021
## 7    2004    BOS 0.5807692
## 8    1995    CLE 0.5799523
## 9    2000    HOU 0.5797127
## 10   2003    ATL 0.5790123
## 11   1999    TEX 0.5783047
## 12   1996    CLE 0.5766590
## 13   2000    SFN 0.5760101
## 14   1997    COL 0.5755845
## 15   2001    TEX 0.5753738
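Since the two rankings differ only in the starting year, the shared steps could be wrapped in a small function. This is a sketch only: top_slg and its since argument are hypothetical names, and the toy data frame is made up so the block runs without the Lahman package.

```r
library(dplyr)

# Hypothetical helper: top n rows by SLG, optionally restricted to seasons >= since
top_slg <- function(df, n = 15, since = -Inf) {
  df %>%
    filter(yearID >= since) %>%
    select(yearID, teamID, SLG) %>%
    arrange(desc(SLG)) %>%
    head(n)
}

# Made-up rows for illustration only (not real Lahman data)
toy <- data.frame(
  yearID = c(1927, 1968, 1969, 2003),
  teamID = c("AAA", "BBB", "CCC", "DDD"),
  SLG    = c(0.50, 0.35, 0.40, 0.45),
  stringsAsFactors = FALSE
)

top_all  <- top_slg(toy, n = 2)                # all seasons
top_1969 <- top_slg(toy, n = 2, since = 1969)  # 1969 onward
```

With the real data, top_slg(Team) and top_slg(Team, since = 1969) should give the same rows as the two code chunks above.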

Problem 4.

The Angels have at times been called the California Angels (CAL), the Anaheim Angels (ANA), and the Los Angeles Angels (LAA). Find the 10 most successful seasons in Angels history. Have they ever won the World Series?

They won the World Series in 2002. I decided that the 10 most successful seasons would be the seasons in which they won the most games.

Filter and select to get the Angels' seasons.

Angels<- select(Team, WSWin, W, teamID, yearID) %>%
  filter(teamID %in% c("CAL", "ANA", "LAA")) %>%
  arrange(desc(W))

head(Angels, 10)
##    WSWin   W teamID yearID
## 1      N 100    LAA   2008
## 2      Y  99    ANA   2002
## 3      N  98    LAA   2014
## 4      N  97    LAA   2009
## 5      N  95    LAA   2005
## 6      N  94    LAA   2007
## 7      N  93    CAL   1982
## 8      N  92    CAL   1986
## 9      N  92    ANA   2004
## 10     N  91    CAL   1989
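Ranking by raw wins is one reasonable definition of success. An alternative is winning percentage, W / (W + L), which does not penalize seasons with fewer games played. The sketch below uses made-up seasons to show how the two rankings can differ; toy_seasons, WPct, by_wins, and by_pct are hypothetical names, and W and L mirror the column names in Teams.

```r
library(dplyr)

# Made-up seasons for illustration only: season "A" is a shortened season
toy_seasons <- data.frame(
  season = c("A", "B", "C"),
  W      = c(60, 90, 85),
  L      = c(40, 72, 77),
  stringsAsFactors = FALSE
)

# Winning percentage adjusts for the number of games played
toy_seasons <- toy_seasons %>%
  mutate(WPct = W / (W + L))

by_wins <- arrange(toy_seasons, desc(W))     # ranking used in this report
by_pct  <- arrange(toy_seasons, desc(WPct))  # alternative ranking

by_pct
```

Here season A ranks last by wins but first by winning percentage, so the choice of measure matters mainly when season lengths differ.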