This Coding Problem Set 2 is to be completed after you have worked through Coding Lecture 2. Take some time to read through the notes from Coding Lecture 2. As in Coding Problem Set 1, you should go through each code block in the Coding Lecture and see if you can explain what all of the code does.
In the Stat Lecture: Tools for Comparisons, Prof. Wyner discussed the relationship between a team’s payroll and its winning percentage. We will create plots from his analysis in the following problems using the dataset mlb_relative_payrolls.csv, which you should download into the “data” folder of your working directory. You should save all of the code for this analysis in an R script called “ps2_mlb_payroll.R”.
Read the data into a tbl called relative_payroll using read_csv().
Troubleshooting: Make sure your working directory is correctly
set to the folder ‘data’ on your desktop. Ensure the payroll file
is in that folder and that its file name matches the one you pass
to read_csv().
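The troubleshooting steps above can be sketched in code. This is a minimal sketch, not the official solution: it assumes the readr package is installed, and it checks two candidate locations because the file may sit either directly in the working directory or in a "data" subfolder of it. The name candidates is purely illustrative.

```r
library(readr)

# Two places the file might live, relative to the working directory.
# (Assumption: adjust these if your folder layout is different.)
candidates <- c(
  "mlb_relative_payrolls.csv",
  file.path("data", "mlb_relative_payrolls.csv")
)

print(getwd())                 # the folder R is currently working in
print(file.exists(candidates)) # TRUE marks a path that R can actually find

found <- candidates[file.exists(candidates)]
if (length(found) > 0) {
  relative_payroll <- read_csv(found[1])
  print(names(relative_payroll)) # check the column names before plotting
} else {
  message("File not found: check the working directory and the file name.")
}
```

If both checks print FALSE, the usual culprits are a working directory that points somewhere else or a file name that differs slightly from the one in the code.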
Make a histogram of the relative payrolls, using ggplot() and
geom_histogram(). Play around with different binwidths.
Troubleshooting: Are your aesthetic mappings (aes) correct? Are the required aesthetics defined (a histogram needs only x; a scatterplot needs x and y)? Did you tell ggplot which tbl to use?
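To see what changing the binwidth does, here is a minimal sketch on simulated data. The ggplot2 layer for histograms is geom_histogram(). The tbl toy_payroll and the column name Relative_Payroll are assumptions made for illustration; the values are random draws, not the real payroll data, so substitute your own tbl and the column name that appears in the file.

```r
library(ggplot2)
library(tibble)

# Simulated stand-in values, NOT the real MLB data.
set.seed(1)
toy_payroll <- tibble(Relative_Payroll = rnorm(200, mean = 1, sd = 0.35))

# A histogram needs only an x aesthetic; binwidth sets the width of each bar.
p_narrow <- ggplot(toy_payroll, aes(x = Relative_Payroll)) +
  geom_histogram(binwidth = 0.05)

p_wide <- ggplot(toy_payroll, aes(x = Relative_Payroll)) +
  geom_histogram(binwidth = 0.25)

print(p_narrow) # many thin bars: more detail, but noisier
print(p_wide)   # a few wide bars: smoother, but detail is lost
```

Comparing the two plots side by side is a quick way to judge which binwidth shows the shape of the distribution without hiding or exaggerating its features.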
Make a scatterplot with geom_point(). Put relative
payroll on the horizontal axis (x-axis) and winning percentage on the
vertical axis (y-axis).
Without executing the code below, see if you can figure out what it does and what it will generate:
relative_payroll %>%
  ggplot(aes(x = Year, y = Team_Payroll)) +
  geom_point()
Execute the code above. What can you say about how team payrolls have evolved over time?
Now make a second, related plot that visualizes how relative payrolls have evolved over time.
Using the labs() function, add an appropriate title to the plot
above and relabel its y-axis.
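As a hedged illustration of how labs() attaches to a plot, the sketch below adds it as one more layer with +. The tbl toy and its numbers are made up for the example and are not real payroll figures; the title and axis label are placeholders for you to replace with wording of your own.

```r
library(ggplot2)
library(tibble)

# Made-up values for illustration only, NOT real payroll figures.
toy <- tibble(
  Year = c(2000, 2000, 2001, 2001),
  Team_Payroll = c(50, 90, 55, 100)
)

# labs() is added like any other layer; title and y override the defaults.
p <- ggplot(toy, aes(x = Year, y = Team_Payroll)) +
  geom_point() +
  labs(title = "Placeholder title", y = "Placeholder y-axis label")

print(p)
```

The same labs() call also accepts x, subtitle, and caption arguments if you want to relabel more of the plot.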