library(ggplot2)

# Histogram of batting averages in bins of width .010
ggplot(mlb_players_18, aes(x = AVG)) +
  geom_histogram(binwidth = 0.01, fill = "steelblue", color = "white") +
  labs(title = "Distribution of Batting Averages", x = "Batting Average", y = "Player Count") +
  theme_minimal()
Observation: Most batting averages cluster between .200 and .300, the range where MLB hitters typically land.
ggplot(mlb_players_18, aes(x = position, y = OPS)) +
  geom_boxplot(fill = "darkorange", outlier.color = "black") +
  labs(title = "OPS by Player Position", x = "Position", y = "OPS") +
  theme_minimal()
Observation: Outfielders and first basemen tend to have higher OPS values, consistent with the power-hitting roles those positions usually fill.
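The positional pattern in the boxplot can be checked numerically with a per-position summary. The following is a minimal sketch, assuming `mlb_players_18` is the data set shipped with the `openintro` package (the source does not say where it is loaded from) and that `dplyr` is installed. The names `ops_by_position`, `n_players`, and `median_OPS` are introduced here for illustration only.

```r
library(dplyr)
library(openintro)  # assumption: provides the mlb_players_18 data frame

# Median OPS and player count for each position, highest median first
ops_by_position <- mlb_players_18 %>%
  group_by(position) %>%
  summarize(
    n_players  = n(),
    median_OPS = median(OPS, na.rm = TRUE)
  ) %>%
  arrange(desc(median_OPS))

ops_by_position
```

Reading `n_players` next to each median helps flag positions with only a handful of players, where a high or low median may not mean much.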
ggplot(mlb_players_18, aes(x = AB, y = HR)) +
  geom_point(alpha = 0.6, color = "darkgreen") +
  labs(title = "Home Runs vs At-Bats", x = "At-Bats", y = "Home Runs") +
  theme_minimal()
Observation: Players with more at-bats generally have more home runs, but a few players post high HR totals in fewer ABs, which means a higher home-run rate per at-bat.
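One way to make the efficiency point concrete is to compute home runs per at-bat directly. This is a rough sketch under the same assumption that `mlb_players_18` comes from the `openintro` package. The 100 at-bat cutoff and the names `hr_rate` and `HR_per_AB` are hypothetical choices for illustration, not part of the original analysis.

```r
library(dplyr)
library(openintro)  # assumption: provides the mlb_players_18 data frame

# Home runs per at-bat, keeping only players with a reasonable sample of ABs
hr_rate <- mlb_players_18 %>%
  filter(AB >= 100) %>%                 # hypothetical cutoff to avoid tiny samples
  mutate(HR_per_AB = HR / AB) %>%
  arrange(desc(HR_per_AB))

# Ten highest home-run rates
head(hr_rate, 10)
```

Without a minimum on `AB`, a player with one home run in a few at-bats would top the list, so some cutoff is usually worth keeping even if the exact value changes.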
library(dplyr)

# Total RBI per team, ordered from highest to lowest
mlb_players_18 %>%
  group_by(team) %>%
  summarize(total_RBI = sum(RBI, na.rm = TRUE)) %>%
  ggplot(aes(x = reorder(team, -total_RBI), y = total_RBI)) +
  geom_col(fill = "tomato") +
  labs(title = "Total RBI by Team", x = "Team", y = "Total RBI") +
  theme_minimal() +
  # Apply the label rotation after theme_minimal(); a complete theme added last would reset it
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
Observation: A few teams stand well above the rest in total RBI, which points to deeper, more productive lineups.
ggplot(mlb_players_18, aes(x = OBP, y = SLG)) +
  geom_point(alpha = 0.5, color = "purple") +
  labs(title = "Slugging vs On-Base Percentage", x = "On-Base % (OBP)", y = "Slugging % (SLG)") +
  theme_minimal()
Observation: OBP and SLG are strongly positively correlated: players who get on base often also tend to hit for power. Part of this relationship is built in, since hits raise both statistics.
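The strength of the relationship seen in the scatterplot can be quantified with a Pearson correlation coefficient. This is a small sketch, again assuming `mlb_players_18` is the `openintro` data set. The variable name `obp_slg_cor` is introduced here for illustration.

```r
library(openintro)  # assumption: provides the mlb_players_18 data frame

# Pearson correlation between OBP and SLG, ignoring rows with missing values
obp_slg_cor <- cor(mlb_players_18$OBP, mlb_players_18$SLG, use = "complete.obs")

obp_slg_cor
```

A value close to 1 would back up the visual impression. Because players with very few at-bats can sit at extreme values of both statistics, it may be worth recomputing the coefficient on a subset with a minimum number of at-bats.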