Please indicate
Use the mlb_teams.csv data set to create an informative data graphic that illustrates the relationship between winning percentage (WPct) and payroll in context.
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 3.2.5
## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Warning: package 'ggplot2' was built under R version 3.2.5
## Warning: package 'tibble' was built under R version 3.2.5
## Warning: package 'tidyr' was built under R version 3.2.5
## Warning: package 'readr' was built under R version 3.2.5
## Warning: package 'purrr' was built under R version 3.2.5
## Warning: package 'dplyr' was built under R version 3.2.5
## Conflicts with tidy packages ----------------------------------------------
## filter(): dplyr, stats
## lag(): dplyr, stats
MLB_Data <- read.csv("https://raw.githubusercontent.com/cmsc205/data/master/mlb_teams.csv")
head(MLB_Data)
## yearID teamID lgID W L WPct attendance normAttend payroll
## 1 2008 ARI NL 82 80 0.5061728 2509924 0.5838859 66202712
## 2 2008 ATL NL 72 90 0.4444444 2532834 0.5892155 102365683
## 3 2008 BAL AL 68 93 0.4223602 1950075 0.4536477 67196246
## 4 2008 BOS AL 95 67 0.5864198 3048250 0.7091172 133390035
## 5 2008 CHA AL 89 74 0.5460123 2500648 0.5817280 121189332
## 6 2008 CHN NL 97 64 0.6024845 3300200 0.7677285 118345833
## metroPop name
## 1 4489109 Arizona Diamondbacks
## 2 5614323 Atlanta Braves
## 3 2785874 Baltimore Orioles
## 4 4732161 Boston Red Sox
## 5 9554598 Chicago White Sox
## 6 9554598 Chicago Cubs
library(ggthemes)
## Warning: package 'ggthemes' was built under R version 3.2.5
Variables: winning percentage (quantitative) and payroll (quantitative).
Graph: scatterplot, with winning percentage on the x-axis and payroll on the y-axis.
ggplot(MLB_Data, aes(x=WPct, y=payroll))+
geom_point(alpha =.7, color = "orange")+
labs(x="Winning percentage",
y= "Team payroll",
title = "Winning Percentage and Payroll of MLB Teams") +
scale_color_solarized()
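One way to put the scatterplot "in context" is to rescale payroll and add reference layers. The sketch below is only an illustration, not the required solution: it rebuilds the six rows shown in the head() output above so that it runs without a download, and the names mlb_sample, payroll_millions and p_mlb are made up for this example. The trend line and the 0.500 reference line are assumptions about what "context" might mean here.

```r
library(ggplot2)

# First six rows of mlb_teams.csv, copied from the head() output above.
mlb_sample <- data.frame(
  teamID  = c("ARI", "ATL", "BAL", "BOS", "CHA", "CHN"),
  WPct    = c(0.5061728, 0.4444444, 0.4223602, 0.5864198, 0.5460123, 0.6024845),
  payroll = c(66202712, 102365683, 67196246, 133390035, 121189332, 118345833)
)

# Express payroll in millions so the axis labels stay readable.
mlb_sample$payroll_millions <- mlb_sample$payroll / 1e6

p_mlb <- ggplot(mlb_sample, aes(x = WPct, y = payroll_millions)) +
  geom_point(alpha = 0.7, color = "orange") +
  # Linear trend line (assumed useful context, not part of the assignment).
  geom_smooth(method = "lm", se = FALSE, color = "grey40") +
  # A .500 record separates winning seasons from losing ones.
  geom_vline(xintercept = 0.5, linetype = "dashed") +
  labs(x = "Winning percentage", y = "Team payroll (millions)")
```

Printing p_mlb draws the plot. Swapping mlb_sample for the full MLB_Data (after adding the same payroll_millions column) should give the complete picture.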
Using data from the nasaweather R package, use the path geometry (i.e. use a geom_path layer) to plot the path of each tropical storm in the storms data table. Use color to distinguish the storms from one another, and use faceting to plot each year in its own panel.
Hint: Don’t forget to install and load the nasaweather R package!
Storms <- nasaweather::storms
Storms
## # A tibble: 2,747 × 11
## name year month day hour lat long pressure wind
## <chr> <int> <int> <int> <int> <dbl> <dbl> <int> <int>
## 1 Allison 1995 6 3 0 17.4 -84.3 1005 30
## 2 Allison 1995 6 3 6 18.3 -84.9 1004 30
## 3 Allison 1995 6 3 12 19.3 -85.7 1003 35
## 4 Allison 1995 6 3 18 20.6 -85.8 1001 40
## 5 Allison 1995 6 4 0 22.0 -86.0 997 50
## 6 Allison 1995 6 4 6 23.3 -86.3 995 60
## 7 Allison 1995 6 4 12 24.7 -86.2 987 65
## 8 Allison 1995 6 4 18 26.2 -86.2 988 65
## 9 Allison 1995 6 5 0 27.6 -86.1 988 65
## 10 Allison 1995 6 5 6 28.5 -85.6 990 60
## # ... with 2,737 more rows, and 2 more variables: type <chr>,
## # seasday <int>
ggplot(Storms, aes(x=long, y=lat))+
geom_path(aes(col=name))+
facet_wrap(~year)
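A detail worth keeping in mind with geom_path: it connects observations in the order they appear in the data, so a track only looks right if the rows are in time order. The sketch below illustrates this with a shuffled copy of the first four Allison rows printed above; the object names track, track_sorted and p_storm are made up for this example. Longitude is placed on the x-axis here, as on a map.

```r
library(ggplot2)

# First four Allison observations from the output above,
# deliberately stored out of time order for illustration.
track <- data.frame(
  name = "Allison",
  year = 1995L,
  hour = c(12L, 0L, 18L, 6L),
  lat  = c(19.3, 17.4, 20.6, 18.3),
  long = c(-85.7, -84.3, -85.8, -84.9)
)

# geom_path() joins points in row order, so sort by time first.
track_sorted <- track[order(track$year, track$hour), ]

p_storm <- ggplot(track_sorted, aes(x = long, y = lat, colour = name)) +
  geom_path() +
  facet_wrap(~year)
```

The storms table appears to be stored in time order already, so the sort is a safeguard rather than a required step.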
Using the data set Top25CommonFemaleNames.csv, recreate the “Median Ages for Females with the 25 Most Common Names” graphic from FiveThirtyEight (link to graphic; link to full article).
Female_Names <- read.csv("https://raw.githubusercontent.com/cmsc205/data/master/Top25CommonFemaleNames.csv")
ggplot(Female_Names, aes(x=reorder(name, -median_age), y = median_age))+
geom_linerange(aes(ymin=q1_age, ymax=q3_age), col = "orange", size = 4, alpha = 0.7)+
geom_point(col="red")+
ylim(9,70)+
coord_flip()+
labs(x= NULL, y = NULL,
title = "Median Ages for Females With the 25 Most\nCommon Names",
subtitle = "Among Americans estimated to be alive as of Jan. 1, 2014")
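The reorder(name, -median_age) call is what controls the vertical order of the names. A minimal sketch of that behavior, using a made-up three-row data frame (toy and ordered_names are hypothetical names, and the ages are invented for illustration):

```r
# Hypothetical names and ages, for illustration only.
toy <- data.frame(
  name = c("A", "B", "C"),
  median_age = c(30, 50, 40)
)

# reorder() sorts factor levels by increasing value of its second argument,
# so negating the age puts the oldest name first.
ordered_names <- reorder(toy$name, -toy$median_age)
levels(ordered_names)
```

After coord_flip(), the first level is drawn at the bottom of the axis, so the oldest name ends up at the bottom and the youngest at the top.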