Assignment 7

Author

Jaiden Soto

Loading necessary libraries

# Loads the necessary libraries
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.0     ✔ readr     2.1.6
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dslabs)

# Loads the data and shows the first six rows of the table
data("research_funding_rates")
head(research_funding_rates)
          discipline applications_total applications_men applications_women
1  Chemical sciences                122               83                 39
2  Physical sciences                174              135                 39
3            Physics                 76               67                  9
4         Humanities                396              230                166
5 Technical sciences                251              189                 62
6  Interdisciplinary                183              105                 78
  awards_total awards_men awards_women success_rates_total success_rates_men
1           32         22           10                26.2              26.5
2           35         26            9                20.1              19.3
3           20         18            2                26.3              26.9
4           65         33           32                16.4              14.3
5           43         30           13                17.1              15.9
6           29         12           17                15.8              11.4
  success_rates_women
1                25.6
2                23.1
3                22.2
4                19.3
5                21.0
6                21.8
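
The success-rate columns in the table above appear to be the number of awards divided by the number of applications, written as a percentage and rounded to one decimal place. The sketch below checks this for the six rows shown, using values copied from the head() output. The names funding, rate_men, and rate_women are made up for this illustration, and the rounding rule is an assumption based on the printed values.

```r
# First six rows of research_funding_rates, copied from the head() output above
funding <- data.frame(
  discipline = c("Chemical sciences", "Physical sciences", "Physics",
                 "Humanities", "Technical sciences", "Interdisciplinary"),
  applications_men   = c(83, 135, 67, 230, 189, 105),
  applications_women = c(39, 39, 9, 166, 62, 78),
  awards_men         = c(22, 26, 18, 33, 30, 12),
  awards_women       = c(10, 9, 2, 32, 13, 17)
)

# Assumed definition: success rate = awards / applications, as a percentage
funding$rate_men   <- round(100 * funding$awards_men / funding$applications_men, 1)
funding$rate_women <- round(100 * funding$awards_women / funding$applications_women, 1)

# Shows the recomputed rates next to the discipline names
funding[, c("discipline", "rate_men", "rate_women")]
```

For these six rows the recomputed values match the success_rates_men and success_rates_women columns printed above.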

Multivariable Graph

# Creates a scatterplot called "multivariable_graph" from research_funding_rates,
# with the men's success rate on x, the women's success rate on y, and the
# color mapped to the discipline
multivariable_graph <- ggplot(
  research_funding_rates,
  aes(x = success_rates_men, y = success_rates_women, color = discipline)
) +

# Creates labels for the title, data source, color legend, and axes
  labs(title = "How Men's and Women's Success Rates Compare",
       caption = "Source: dslabs",
       color = "Discipline",
       x = "Men's Success Rate (%)",
       y = "Women's Success Rate (%)") +

# Plots the points at size 3
  geom_point(size = 3) +

# Changes the theme
  theme_bw()
multivariable_graph

For this assignment I used the “research_funding_rates” data set from the dslabs package. It records the number of applications, the number of awards, and the success rate for men and women across different research disciplines. The data can be used to look for gender inequalities in funding outcomes. I chose a graph that compares men's and women's success rates by discipline. I put the success rates on the axes, with men on the x-axis and women on the y-axis, so that the two genders can be compared directly. For the multivariable aspect, I mapped the color of the points to the discipline, which makes it easier to see which disciplines may favor one gender. I also increased the size of the points so that the comparisons are clearer on the graph.
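
One possible way to make the direct comparison easier to read is to add a dashed y = x reference line to the same scatterplot. Points above the line would be disciplines where the women's success rate is higher, and points below it would be disciplines where the men's rate is higher. This is only a sketch of that idea and not part of the original graph. The name graph_with_reference, the grey dashed line style, and the use of coord_equal() are choices made for this illustration.

```r
library(ggplot2)
library(dslabs)

# Loads the same data used in the graph above
data("research_funding_rates")

graph_with_reference <- ggplot(
  research_funding_rates,
  aes(x = success_rates_men, y = success_rates_women, color = discipline)
) +
  # Dashed y = x line: a point on it has equal success rates for both genders
  geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "grey50") +
  # Plots the points at size 3, as in the original graph
  geom_point(size = 3) +
  # Uses the same scale on both axes so the reference line sits at 45 degrees
  coord_equal() +
  labs(title = "Success Rates with an Equal-Rate Reference Line",
       caption = "Source: dslabs",
       color = "Discipline",
       x = "Men's Success Rate (%)",
       y = "Women's Success Rate (%)") +
  theme_bw()

graph_with_reference
```

The reference line is drawn before the points so that the points stay on top of it.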