Define two new variables in the Teams data frame: batting average (BA) and slugging percentage (SLG). Batting average is the ratio of hits (H) to at-bats (AB), and slugging percentage is the total bases divided by at-bats. To compute the total bases, you get 1 for a single, 2 for a double, 3 for a triple, and 4 for a home run.
library(tidyverse)
## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Conflicts with tidy packages ----------------------------------------------
## filter(): dplyr, stats
## lag(): dplyr, stats
library(Lahman)
library(ggthemes)
data(Teams)
Teams1 <- Teams %>%
  # H counts every hit once, so doubles, triples, and home runs add 1, 2, and 3 extra bases
  mutate(BA = H / AB,
         SLG = (H + X2B + 2 * X3B + 3 * HR) / AB)
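The SLG expression works because H already counts each hit once, so a double, a triple, and a home run contribute 1, 2, and 3 extra bases on top of that. As a small check, the sketch below uses the 2002 Anaheim Angels batting totals (the same row of Teams that is printed later in this document) and confirms that counting bases hit by hit gives the same total as the shortcut. The variable names are illustrative only.

```r
# 2002 Anaheim Angels batting totals, as listed in the Teams table
AB  <- 5678
H   <- 1603
X2B <- 333
X3B <- 32
HR  <- 152

# Singles are the hits that are not doubles, triples, or home runs
singles <- H - X2B - X3B - HR

# Total bases counted directly: 1, 2, 3, and 4 bases per hit type
tb_direct <- 1 * singles + 2 * X2B + 3 * X3B + 4 * HR

# Total bases using the shortcut from the mutate() call
tb_shortcut <- H + X2B + 2 * X3B + 3 * HR

BA  <- H / AB
SLG <- tb_shortcut / AB

print(c(tb_direct = tb_direct, tb_shortcut = tb_shortcut))
print(round(c(BA = BA, SLG = SLG), 3))
```

Both totals agree, which is why the shorter expression can be used inside mutate().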
Plot a time series of SLG since 1954 by league (lgID). Is slugging percentage typically higher in the American League (AL) or the National League (NL)?
library(ggthemes)
Teams1 %>%
  filter(yearID >= 1954) %>%
  ggplot(aes(x = yearID, y = SLG)) +
  geom_point(size = 0.5, color = "firebrick", alpha = 0.5) +
  geom_smooth(method = "lm", se = TRUE) +
  facet_wrap(~ lgID) +
  labs(x = "Year (post-1954)", y = "Slugging percentage") +
  theme_economist()
ANSWER: Slugging percentage (SLG) is typically higher in the American League (AL).
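The faceted plot shows one point per team-season. To compare the leagues more directly, one option is to pool total bases and at-bats by league and year before dividing, which gives a single SLG value per league per year. The sketch below illustrates the idea in base R on a small made-up data frame; the `toy` values and the `TB` column are hypothetical and are not Lahman data.

```r
# Hypothetical team-season totals (NOT real Lahman data)
toy <- data.frame(
  yearID = c(2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001),
  lgID   = c("AL", "AL", "NL", "NL", "AL", "AL", "NL", "NL"),
  AB     = c(5500, 5600, 5400, 5500, 5550, 5650, 5450, 5500),
  TB     = c(2400, 2300, 2200, 2250, 2350, 2450, 2300, 2200)
)

# Sum total bases and at-bats within each league and year
league <- aggregate(cbind(TB, AB) ~ yearID + lgID, data = toy, FUN = sum)

# League-level slugging percentage: pooled total bases over pooled at-bats
league$SLG <- league$TB / league$AB

# Average of the yearly league values, one number per league
league_mean <- tapply(league$SLG, league$lgID, mean)

print(league)
print(league_mean)
```

Pooling before dividing weights each team by its at-bats, which is slightly different from averaging the team-level SLG values.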
Display the top 15 teams ranked in terms of slugging percentage in MLB history. Repeat this using teams since 1969.
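Teams1 %>%
  # Sort by SLG (highest first) and keep the first 15 team-seasons;
  # tail() on the unsorted data would only return its last 15 rows
  arrange(desc(SLG)) %>%
  head(15) %>%
  # Label bars by season and team, since a franchise can appear more than once
  mutate(team = paste(yearID, teamID)) %>%
  ggplot(aes(x = reorder(team, SLG), y = SLG, fill = lgID)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(x = NULL, y = NULL,
       title = "Top 15 teams in MLB history",
       subtitle = "Based on slugging percentage") +
  theme_fivethirtyeight() +
  theme(legend.position = "none")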
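Teams1 %>%
  filter(yearID >= 1969) %>%
  # Sort by SLG (highest first) and keep the first 15 team-seasons;
  # tail() on the unsorted data would only return its last 15 rows
  arrange(desc(SLG)) %>%
  head(15) %>%
  # Label bars by season and team, since a franchise can appear more than once
  mutate(team = paste(yearID, teamID)) %>%
  ggplot(aes(x = reorder(team, SLG), y = SLG, fill = lgID)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(x = NULL, y = NULL,
       title = "Top 15 MLB teams post-1969",
       subtitle = "Based on slugging percentage") +
  theme_fivethirtyeight() +
  theme(legend.position = "none")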
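Picking the top n rows by a column means sorting the data frame itself before taking rows from one end. The base R sketch below shows the difference on a small hypothetical data frame (the team codes and SLG values are made up): sorting and then calling head() returns the highest values, while tail() on the unsorted data simply returns the last rows.

```r
# Hypothetical team-seasons (NOT real Lahman data)
toy <- data.frame(
  teamID = c("AAA", "BBB", "CCC", "DDD", "EEE"),
  yearID = c(2001, 2002, 2003, 2004, 2005),
  SLG    = c(0.410, 0.455, 0.390, 0.470, 0.430)
)

# Reorder the rows by SLG, highest first, then keep the first three
top3 <- head(toy[order(toy$SLG, decreasing = TRUE), ], n = 3)

# For comparison: the last three rows in the original (unsorted) order
last3 <- tail(toy, n = 3)

print(top3)
print(last3)
```

Here `top3` holds the three highest SLG values, whereas `last3` holds whichever rows happen to come last, which for data ordered by year means the most recent seasons.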
The Angels have at times been called the California Angels (CAL), the Anaheim Angels (ANA), and the Los Angeles Angels (LAA). Find the 10 most successful seasons in Angels history. Have they ever won the World Series?
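Teams %>%
  mutate(Success = W / L) %>%
  filter(teamID %in% c("CAL", "ANA", "LAA")) %>%
  # Rank seasons by win-loss ratio, best first, and keep the top 10;
  # tail() on the unsorted data would only return the 10 most recent seasons
  arrange(desc(Success)) %>%
  head(10)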
## yearID lgID teamID franchID divID Rank G Ghome W L DivWin WCWin
## 1 2008 AL LAA ANA W 1 162 81 100 62 Y N
## 2 2014 AL LAA ANA W 1 162 81 98 64 Y N
## 3 2009 AL LAA ANA W 1 162 81 97 65 Y N
## 4 2007 AL LAA ANA W 1 162 81 94 68 Y N
## 5 2006 AL LAA ANA W 2 162 81 89 73 N N
## 6 2012 AL LAA ANA W 3 162 81 89 73 N N
## 7 2011 AL LAA ANA W 2 162 81 86 76 N N
## 8 2015 AL LAA ANA W 3 162 81 85 77 N N
## 9 2010 AL LAA ANA W 3 162 81 80 82 N N
## 10 2013 AL LAA ANA W 3 162 81 78 84 N N
## LgWin WSWin R AB H X2B X3B HR BB SO SB CS HBP SF RA ER
## 1 N N 765 5540 1486 274 25 159 481 987 129 48 52 50 697 644
## 2 N N 773 5652 1464 304 31 155 492 1266 81 39 60 54 630 590
## 3 N N 883 5622 1604 293 33 173 547 1054 148 63 41 52 761 715
## 4 N N 822 5554 1578 324 23 123 507 883 139 55 40 65 731 674
## 5 N N 766 5609 1539 309 29 159 486 914 148 57 42 53 732 652
## 6 N N 767 5536 1518 273 22 187 449 1113 134 33 47 41 699 640
## 7 N N 667 5513 1394 289 34 155 442 1086 135 52 51 32 633 581
## 8 N N 661 5417 1331 243 21 176 435 1150 52 34 58 40 675 630
## 9 N N 681 5488 1363 276 19 155 466 1070 104 52 52 37 702 651
## 10 N N 733 5588 1476 270 39 164 523 1221 82 34 48 64 737 685
## ERA CG SHO SV IPouts HA HRA BBA SOA E DP FP
## 1 3.99 7 10 66 4354 1455 160 457 1106 91 159 0.985
## 2 3.58 3 13 46 4448 1307 126 504 1342 83 127 0.986
## 3 4.45 9 13 51 4335 1513 180 523 1062 85 174 0.986
## 4 4.23 5 9 43 4305 1480 151 477 1156 101 154 0.983
## 5 4.04 5 12 50 4358 1410 158 471 1164 124 154 0.979
## 6 4.02 6 16 38 4300 1339 186 483 1157 98 141 0.984
## 7 3.57 12 11 39 4395 1388 142 476 1058 93 157 0.985
## 8 3.94 2 12 46 4322 1355 166 466 1221 93 108 0.984
## 9 4.04 10 9 39 4348 1422 148 565 1130 113 116 0.981
## 10 4.23 4 12 41 4373 1475 167 533 1200 112 135 0.981
## name park attendance BPF
## 1 Los Angeles Angels of Anaheim Angel Stadium 3336747 103
## 2 Los Angeles Angels of Anaheim Angel Stadium of Anaheim 3095935 96
## 3 Los Angeles Angels of Anaheim Angel Stadium 3240386 99
## 4 Los Angeles Angels of Anaheim Angel Stadium 3365632 101
## 5 Los Angeles Angels of Anaheim Angel Stadium 3406790 100
## 6 Los Angeles Angels of Anaheim Angel Stadium of Anaheim 3061770 92
## 7 Los Angeles Angels of Anaheim Angel Stadium 3166321 93
## 8 Los Angeles Angels of Anaheim Angel Stadium of Anaheim 3012765 94
## 9 Los Angeles Angels of Anaheim Angel Stadium 3250816 98
## 10 Los Angeles Angels of Anaheim Angel Stadium of Anaheim 3019505 94
## PPF teamIDBR teamIDlahman45 teamIDretro Success
## 1 102 LAA ANA ANA 1.6129032
## 2 95 LAA ANA ANA 1.5312500
## 3 98 LAA ANA ANA 1.4923077
## 4 100 LAA ANA ANA 1.3823529
## 5 100 LAA ANA ANA 1.2191781
## 6 92 LAA ANA ANA 1.2191781
## 7 93 LAA ANA ANA 1.1315789
## 8 95 LAA ANA ANA 1.1038961
## 9 98 LAA ANA ANA 0.9756098
## 10 94 LAA ANA ANA 0.9285714
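ANSWER: Seasons are ranked by win-loss ratio (Success = W / L). The table printed above covers only the 2006 to 2015 seasons, so it omits earlier seasons such as 2002 (99-63, a ratio of about 1.57), which ranks above every season in the table except 2008.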
Teams %>%
  filter(WSWin == "Y" & teamID %in% c("CAL", "ANA", "LAA"))
## yearID lgID teamID franchID divID Rank G Ghome W L DivWin WCWin
## 1 2002 AL ANA ANA W 2 162 81 99 63 N Y
## LgWin WSWin R AB H X2B X3B HR BB SO SB CS HBP SF RA ER ERA
## 1 Y Y 851 5678 1603 333 32 152 462 805 117 51 74 64 644 595 3.69
## CG SHO SV IPouts HA HRA BBA SOA E DP FP name
## 1 7 14 54 4357 1345 169 509 999 87 151 0.986 Anaheim Angels
## park attendance BPF PPF teamIDBR teamIDlahman45
## 1 Edison International Field 2305547 100 99 ANA ANA
## teamIDretro
## 1 ANA
ANSWER: Yes, the Angels have won the World Series once, in 2002.
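As a cross-check on the ranking, the W and L values printed in the two tables above are enough to recompute the win-loss ratio by hand. The sketch below copies five of those seasons into a small data frame (`angels` is an illustrative name) and sorts them. Winning percentage, W / (W + L), is included as a common alternative measure; it is an addition for comparison, not part of the original analysis.

```r
# W and L for five Angels seasons, copied from the output above
angels <- data.frame(
  yearID = c(2008, 2014, 2009, 2007, 2002),
  W      = c(100, 98, 97, 94, 99),
  L      = c(62, 64, 65, 68, 63)
)

# Win-loss ratio, as used in the Success column
angels$Success <- angels$W / angels$L

# Winning percentage, shown for comparison
angels$WinPct <- angels$W / (angels$W + angels$L)

# Sort the seasons from best to worst under each measure
ranked_ratio <- angels[order(angels$Success, decreasing = TRUE), ]
ranked_pct   <- angels[order(angels$WinPct, decreasing = TRUE), ]

print(ranked_ratio)
```

Among these five seasons, both measures give the same order, with the 2002 championship season placing between 2008 and 2014.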