Choose one of David Robinson’s tidytuesday screencasts, watch the video, and summarise. https://www.youtube.com/channel/UCeiiqmVK07qhY-wvg3IZiZQ

Instructions

You must follow the instructions below to get credits for this assignment.

Q1 What is the title of the screencast?

"Tidy Tuesday Screencast: Analyzing College Major & Income Data in R"

Q2 When was it published?

It was published on October 15, 2018.

Q3 Describe the data

Hint: What’s the source of the data; what does the row represent; how many observations?; what are the variables; and what do they mean?

All of the data comes from the American Community Survey 2010-2012 Public Use Microdata Series. The data set has 173 rows, and each row represents a college major. Its variables include the rank of each major by median earnings, the total number of people in each major, the number of male and female graduates, the number of people employed and unemployed from each major, and the median earnings of full-time workers from each major, among others.

Q4-Q5 Describe how Dave approached the analysis each step.

Hint: For example, importing data, understanding the data, data exploration, etc.

First, Dave imported the data into R and went through the variables to understand what each one means. He was mostly interested in median earnings, so he created a variety of plots to visualize the data and its distribution. He then plotted the 20 highest-earning majors, keeping only majors with a sample size of at least 100, since some majors in the data had very small sample sizes. Next, he looked at the most common majors and major categories by creating a bar plot and analyzing it. He then moved on to how gender relates to typical earnings. He made another bar plot, this time using the gather function so that each bar shows both men and women. He also examined the correlation between median salary and the percentage of women in each major category. He concluded that there is possibly a negative correlation between the two, meaning that as the percentage of women in a major category gets higher, the median salary for that category tends to be lower.
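The two numeric steps described above (keeping only majors with a sample size of at least 100 before ranking by median earnings, and checking the correlation between the share of women and median earnings) can be sketched as follows. Dave did this in R; the sketch below uses plain Python only to show the logic, and it is not his code. The field names (`major`, `median`, `sample_size`, `share_women`) are hypothetical, and the four rows are made up for illustration rather than taken from the real data.

```python
from math import sqrt

# Made-up rows for illustration only; these are NOT values from the real data set.
# Field names are hypothetical stand-ins for the columns described in Q3.
majors = [
    {"major": "A", "median": 70000, "sample_size": 150, "share_women": 0.20},
    {"major": "B", "median": 90000, "sample_size": 12, "share_women": 0.15},
    {"major": "C", "median": 40000, "sample_size": 300, "share_women": 0.75},
    {"major": "D", "median": 55000, "sample_size": 220, "share_women": 0.50},
]


def top_earning(rows, min_sample=100, n=20):
    """Keep majors with at least min_sample respondents, then return the n highest medians."""
    kept = [r for r in rows if r["sample_size"] >= min_sample]
    return sorted(kept, key=lambda r: r["median"], reverse=True)[:n]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Step 1: rank by median earnings after dropping small samples.
top = top_earning(majors)

# Step 2: correlation between share of women and median earnings.
r = pearson(
    [m["share_women"] for m in majors],
    [m["median"] for m in majors],
)

print([m["major"] for m in top])
print(r < 0)
```

In this toy example, major "B" has the highest median but is dropped because its sample size is only 12, which is the reason for the filter: a median based on very few respondents is unreliable. A negative value of `r` corresponds to the possible negative correlation described above.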

Q6 Did you see anything in the video that you learned in class? Discuss in a short paragraph.

There were definitely a lot of things in the video that I have learned in our class. Although his code was sometimes more specific and advanced, I recognized many of the code chunks and functions. The plots he used (histogram, box plot, bar graph, etc.) were all things that we learned how to code in this class. Finally, his analysis of the data was very similar to how we were taught to analyze and interpret data in our assignments.

Q7 What is a major finding from the analysis?

There were a lot of interesting findings in the analysis. One major finding was that engineering is the highest-earning major category. Another was that business is the most common major category. Finally, he found that engineering and business majors are predominantly male, while health and education majors are predominantly female.

Q8 What is the most interesting thing you really liked about the analysis?

I thought the whole analysis was very interesting. It is a topic I have always been interested in, so it was intriguing to see someone actually code and analyze all that data using different functions and plots. I was particularly interested in the highest-earning majors: engineering came out on top, but it also had a smaller sample size compared to other majors.

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.