This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr)
library(ggplot2)
setwd("/Users/briannacarbonaro/Documents/Dida325")
occupations <- read.csv("occupation_gender_race.csv")
#Drop every row that contains a missing value.
jobs <- na.omit(occupations)
#Rename the columns to shorter, consistent names for the rest of the analysis.
colnames(jobs) <- c("Job_Type", "Job_Description", "Year", "All_Employees", "Percent_Female", "Percent_Black", "Percent_Asian", "Percent_Hispanic")

#Convert each percentage into an estimated employee count, then derive the male count and share.
jobs <- jobs %>%
  mutate(
    Female_Employees = All_Employees * Percent_Female / 100,
    Black_Employees = All_Employees * Percent_Black / 100,
    Asian_Employees = All_Employees * Percent_Asian / 100,
    Hispanic_Employees = All_Employees * Percent_Hispanic / 100,
    Male_Employees = All_Employees - Female_Employees,
    Percent_Male = Male_Employees / All_Employees * 100
  )
#Summary Table
summary_info <- jobs %>%
  filter(Job_Description %in% c(" Total, 16 years and over", "Computer and mathematical occupations", "Web developers", "Computer support specialists")) %>%
  group_by(Job_Description) %>%
  summarise(
    avg_female_employees = mean(Percent_Female, na.rm = TRUE),
    avg_male_employees = mean(Percent_Male, na.rm = TRUE),
    avg_black_employees = mean(Percent_Black, na.rm = TRUE),
    avg_asian_employees = mean(Percent_Asian, na.rm = TRUE),
    avg_hispanic_employees = mean(Percent_Hispanic, na.rm = TRUE)
  )

colnames(summary_info) <- c("Job Description", "Average Percent of Female Employees", "Average Percent of Male Employees","Average Percent of Black Employees", "Average Percent of Asian Employees","Average Percent of Hispanic/Latino Employees")
summary_info
#Plot 1 wide data to long data conversion
computer_programmers<-jobs%>%filter(Job_Description=="Computer programmers")
wide_data1 <- computer_programmers %>%
  select(Year, Percent_Black, Percent_Asian, Percent_Hispanic)

colnames(wide_data1) <- c("Year", "Black_Employees", "Asian_Employees", "Hispanic_Latino_Employees")

head(wide_data1)
long_data1 <- wide_data1 %>%
  pivot_longer(-Year, names_to = "Group", values_to = "Employees")

head(long_data1)
#Plot 1 line graph
palette <- c("darkturquoise", "yellow", "hotpink")
#na.rm is an argument of the geom, not an aesthetic, so it belongs in geom_line() rather than aes().
ggplot(long_data1, aes(x = Year, y = Employees, group = Group, color = Group)) +
  geom_line(na.rm = TRUE) +
  labs(
    y = "Employees (%)", x = "Year",
    title = "Minority Representation in Computer Programming Occupations",
    color = "Group"
  ) +
  scale_color_manual(values = palette)

Summary of Key Findings: This first graph shows the percentage of computer programmers who belong to each of the minority groups included in the data (Asian, Black, and Hispanic/Latino). After converting the data to long format, with one percentage per group and year, I had a better understanding of what each number and category represented. I believe the very low percentages of Black and Hispanic/Latino workers can raise awareness of how little representation both groups have in this industry: each group stays below 7% and holds roughly steady at that level over the 2005-2020 time frame. It was also interesting to see Asian representation decrease rapidly from 2005 to 2010 and then start increasing rapidly from 2010 on, where the graph spikes up. I would be curious to find out why this happened and why it affected only this minority group. My key finding from this graph is the lack of representation of minority groups in the specific field of computer programming, which points to a lack of diversity, equity, and inclusion in these occupations.
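
The 7% ceiling and the 2010 low point described above can also be computed instead of read off the graph. The following is only a sketch: it assumes a data frame with the same columns as long_data1 (Year, Group, Employees), and the values in toy_long are invented for illustration, not taken from occupation_gender_race.csv.

```r
library(dplyr)

# Hypothetical stand-in for long_data1; these numbers are invented for illustration.
toy_long <- data.frame(
  Year = rep(c(2005, 2010, 2015), times = 3),
  Group = rep(c("Asian_Employees", "Black_Employees", "Hispanic_Latino_Employees"), each = 3),
  Employees = c(20, 15, 25, 5, 6, 6.5, 4, 5, 6)
)

# Lowest and highest percentage each group reaches, plus the year of its low point.
group_range <- toy_long %>%
  group_by(Group) %>%
  summarise(
    min_percent = min(Employees),
    max_percent = max(Employees),
    year_of_min = Year[which.min(Employees)]
  )

group_range

# TRUE for every group that never reaches 7% in any year.
below_7 <- group_range$max_percent < 7
names(below_7) <- group_range$Group
below_7
```

Running the same summarise() call on long_data1 itself would show whether the "below 7%" statement holds for every year in the real data.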

#Plot 2 conversion from wide data to long data
computer<-jobs%>%filter(Job_Description %in% c("Computer support specialists ", "Computer support specialists"))
wide_data2<-computer%>%select(Year, Percent_Female, Percent_Male)

colnames(wide_data2)<-c("Year","Females", "Males" )

head(wide_data2)
long_data2 <- wide_data2 %>%
  pivot_longer(-Year, names_to = "Gender", values_to = "Percent")

head(long_data2)
#Plot 2 bar graph
palette <- c("orange", "blue")
#geom_col() draws the values as given, the same as geom_bar(stat = "identity").
ggplot(long_data2, aes(x = Year, y = Percent, fill = Gender)) +
  geom_col(position = "dodge") +
  labs(
    y = "Employees (%)", x = "Year",
    title = "Gender Diversity in Computer Support Specialist Jobs",
    fill = "Gender"
  ) +
  scale_fill_manual(values = palette)

Summary of Key Findings: Once I converted my data, I was able to find the percentage of male employees, which let me create my bar graph. The graph, titled “Gender Diversity in Computer Support Specialist Jobs”, shows the gender gap between men and women in this occupation. I believe it reflects the stereotypical imbalance between men's and women's representation in male-dominated fields. It is alarming that this gap has not closed over time, especially now that feminist movements are more prominent and women have been advocating for equal pay and equal employment. With a gap of about 40 percentage points each year between men and women in computer support specialist jobs, the graph can show outsiders just how large the gap really is, since most people are aware that a gap exists but not how large it is. My key findings give a visual of the gap, and hopefully, as people become aware of it, reform will come and the gap will decrease. The graph also brings to light the lack of diversity, equity, and inclusion across genders, and not just across races as my other chart showed.
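
The size of the gap described above can be computed directly as a difference in percentage points for each year. This is a minimal sketch that assumes a data frame with the same columns as wide_data2 (Year, Females, Males); toy_wide and its values are invented for illustration and do not come from the data set.

```r
library(dplyr)

# Hypothetical stand-in for wide_data2; values are invented for illustration.
toy_wide <- data.frame(
  Year = c(2005, 2010, 2015),
  Females = c(30, 28, 29)
) %>%
  mutate(Males = 100 - Females)

# Gap between the male and female shares in each year, in percentage points.
gender_gap <- toy_wide %>%
  mutate(Gap_Points = Males - Females)

gender_gap

# Change in the gap from the first year to the last; a value near 0 means it has not closed.
gap_change <- last(gender_gap$Gap_Points) - first(gender_gap$Gap_Points)
gap_change
```

Applying the same mutate() step to wide_data2 would give the actual gap for each year, which is a more precise way to support the statement that the gap has not closed.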