library(ggplot2)
library(dplyr)
library(fivethirtyeight)
names(college_all_ages)
##  [1] "major_code"                  "major"                      
##  [3] "major_category"              "total"                      
##  [5] "employed"                    "employed_fulltime_yearround"
##  [7] "unemployed"                  "unemployment_rate"          
##  [9] "p25th"                       "median"                     
## [11] "p75th"

Question 1:

  1. We will use the “college_all_ages” data set, which contains 173 rows (one per major) and 11 variables. Take a few minutes to look at the data file.
  1. Create a histogram of the variable “median” which is the median earnings of full-time, year-round workers. Write your code below in the provided code chunk.
ggplot(college_all_ages, aes(x=median))+
  geom_histogram(binwidth=1200,color="pink",fill="black")
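
The bin width of 1200 above is a judgment call. As a rough check, the sketch below looks at the spread of “median” and estimates how many bins a given width produces. The names `binwidth` and `n_bins` are illustrative only and are not part of the assignment.

```r
library(fivethirtyeight)

# Range and quartiles of median earnings across majors
summary(college_all_ages$median)

# Approximate number of bins produced by a chosen bin width
binwidth <- 1200  # same width as in the histogram above
n_bins <- ceiling(diff(range(college_all_ages$median, na.rm = TRUE)) / binwidth)
n_bins
```

If `n_bins` turns out very small or very large, the histogram will look too coarse or too noisy, and the width can be adjusted accordingly.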

  1. Create a scatterplot of the variables “unemployment_rate” (x-axis) and “median” (y-axis). Write your code below.
ggplot(data=college_all_ages,aes(x=unemployment_rate,y=median))+
  geom_point(alpha=0.7)

  1. Write a few sentences describing the scatterplot.

This scatterplot compares the unemployment rate and the median earnings of college majors. Each point is one major: the x-axis shows the major's unemployment rate and the y-axis shows the median earnings of its full-time, year-round workers. I made the points partly transparent with “alpha” so that overlapping points remain visible.
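
A description of a scatterplot is easier to defend with a number behind it. As a minimal sketch, the Pearson correlation between the two plotted variables can be computed as follows. The variable name `r` is illustrative, and a correlation only summarizes linear association.

```r
library(fivethirtyeight)

# Pearson correlation between unemployment rate and median earnings,
# computed across majors; rows with missing values are dropped
r <- cor(college_all_ages$unemployment_rate,
         college_all_ages$median,
         use = "complete.obs")
r
```

A value near 0 would suggest little linear relationship, while a value toward -1 or 1 would suggest a stronger one.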

  2. Let’s reduce the data set to only the major categories “Agriculture & Natural Resources”, “Computers & Mathematics”, “Engineering”, “Biology & Life Science”, and “Business”. The following code will do that.

# Keep only the five major categories of interest; labels must match the data exactly
college_reduced<-college_all_ages%>%
  filter(major_category %in% c("Business","Engineering",
                               "Computers & Mathematics","Biology & Life Science",
                               "Agriculture & Natural Resources"))
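
Because the filter matches category labels as exact strings, a misspelled label silently drops every row of that category. One way to guard against this, sketched here under the assumption that listing the labels first is acceptable, is to count the rows per category and check the wanted labels against the data. The vector `wanted` is a hypothetical example with two labels.

```r
library(dplyr)
library(fivethirtyeight)

# List the exact category labels and the number of majors in each
category_counts <- college_all_ages %>%
  count(major_category, sort = TRUE)
category_counts

# Hypothetical check: each wanted label must exist in the data
wanted <- c("Business", "Engineering")
missing_labels <- setdiff(wanted, college_all_ages$major_category)
missing_labels  # character(0) when every label matches
```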
  1. Create a boxplot that compares the median earnings for different major_category.
# One box of median earnings per major category
ggplot(college_reduced,aes(x=major_category,y=median))+
  geom_boxplot(color="green",fill="black")+
  coord_flip() # horizontal boxes keep the long category labels readable

  1. Create a violin plot that compares the unemployment rate (unemployment_rate) among the different major_category.
ggplot(college_reduced,aes(x=factor(major_category),y=unemployment_rate))+
  geom_violin(color="blue",fill="red")
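
The violin plot shows the shape of each distribution but not exact values. As a hedged companion, the sketch below tabulates the median and interquartile range of the unemployment rate per major category. It uses the full data set so that it runs on its own, and the names `unemp_summary`, `n_majors`, `median_unemployment`, and `iqr_unemployment` are illustrative.

```r
library(dplyr)
library(fivethirtyeight)

# Per-category summary of unemployment rates.
# stats::median is spelled out because the data set also has a column named "median".
unemp_summary <- college_all_ages %>%
  group_by(major_category) %>%
  summarise(
    n_majors = n(),
    median_unemployment = stats::median(unemployment_rate, na.rm = TRUE),
    iqr_unemployment = IQR(unemployment_rate, na.rm = TRUE)
  ) %>%
  arrange(desc(median_unemployment))
unemp_summary
```

Categories with few majors produce unstable summaries, so the `n_majors` column is worth reading alongside the other two.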

Knit this document and print it out to turn in Wednesday, Feb. 12th.