# library(dslabs) loads the dslabs package, which provides many example data sets.
# library(tidyverse) loads the core tidyverse packages (dplyr, ggplot2, forcats, and others).
# data("admissions") loads the admissions data set from dslabs.
library(dslabs)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.2.1 ✔ readr 2.2.0
✔ forcats 1.0.1 ✔ stringr 1.6.0
✔ ggplot2 4.0.3 ✔ tibble 3.3.1
✔ lubridate 1.9.5 ✔ tidyr 1.3.2
✔ purrr 1.2.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
data("admissions")

# <- assigns the result to a new data frame called admissions_named.
# %>% passes the result on its left as the first argument of the next function.
# mutate() adds or overwrites columns; here it overwrites the major column.
# fct_recode() renames the levels of major from letter codes to descriptive names.
admissions_named <- admissions %>%
  mutate(major = fct_recode(major,
    "Engineering / Med" = "A",
    "Natural Sciences" = "B",
    "Humanities" = "C",
    "Social Studies" = "D",
    "Language Arts" = "E",
    "Arts / Law" = "F"
  ))

# ggplot() initializes the plot from the data frame.
# aes() maps variables to visual properties (position, color, and size).
# geom_jitter() draws the points with a small random offset so overlapping points are easier to tell apart.
ggplot(data = admissions_named, aes(x = major, y = admitted, color = gender, size = applicants)) +
  geom_jitter(width = 0, height = 0.1, alpha = 0.8) +
  # labs() sets the title, subtitle, axis labels, caption, and legend titles.
  labs(
    title = "Gender Patterns in UC Berkeley Graduate Admissions",
    subtitle = "Admission rates across multiple academic fields",
    x = "Academic department field",
    y = "Admission rate (percentage)",
    caption = "Data Source: DS Labs Package",
    color = "Gender",
    size = "Number of Applicants"
  ) +
  # scale_color_brewer() applies the ColorBrewer "Set1" palette and relabels the gender legend.
  # coord_flip() swaps the x and y axes, so the majors run along the vertical axis.
  # theme_minimal(base_size = 13) applies a minimal theme without the default gray background and raises the base font size from 11 to 13.
  scale_color_brewer(palette = "Set1", labels = c("Men", "Women")) +
  coord_flip() +
  theme_minimal(base_size = 13)
For assignment 7, I chose to evaluate UC Berkeley graduate admissions using the admissions data set from the dslabs package. This data set reports, for each major, the admission rate and the number of applicants for men and for women. To display my results, I designed a bubble scatterplot with ggplot(). I mapped major to the x-axis and admission rate to the y-axis, then flipped the axes with coord_flip() so that the major names are easier to read. To keep points for the same major from overlapping, I used geom_jitter(). I also mapped gender to color as a third variable, using the Set1 color palette, and mapped the size of each point to the size of the applicant pool.

The results show that women and men are admitted at roughly equal rates in the less technical fields, such as Language Arts, Social Studies, and Humanities. However, in the more technical and skill-based fields, such as Engineering, Medicine, and Law, women have a higher admission rate than men but a much smaller applicant pool.
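
The comparison described above can also be checked numerically. The following is a minimal sketch, not part of the original analysis, that places the men's and women's figures for each major side by side. It assumes the admissions data set has the columns major, gender, admitted, and applicants, and that gender takes the values "men" and "women"; the names admissions_wide and rate_gap are hypothetical and introduced only for illustration.

```r
library(dslabs)
library(tidyverse)

data("admissions")

# Reshape to one row per major, with men's and women's values side by side.
# pivot_wider() names the new columns <value>_<gender>, e.g. admitted_men.
admissions_wide <- admissions %>%
  pivot_wider(
    names_from = gender,
    values_from = c(admitted, applicants)
  ) %>%
  # rate_gap (hypothetical name): a positive value means women were admitted
  # at a higher rate than men in that major.
  mutate(rate_gap = admitted_women - admitted_men)

print(admissions_wide)
```

Reading rate_gap alongside applicants_men and applicants_women shows, for each major, whether a difference in admission rate coincides with a difference in applicant pool size.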