library(tidyverse)
## -- Attaching packages ----------------------------------------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.1.0       v purrr   0.2.5  
## v tibble  2.0.1       v dplyr   0.8.0.1
## v tidyr   0.8.2       v stringr 1.4.0  
## v readr   1.3.1       v forcats 0.4.0
## -- Conflicts -------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(knitr)
library(kableExtra)
## 
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
## 
##     group_rows

Introduction

Choosing a major is one of the most important decisions in a student’s life. A good estimate of the job and salary prospects for a major helps students make an informed choice. Numerous articles advise which course to choose and which will be in demand with prospective employers. STEM, which stands for Science, Technology, Engineering and Mathematics, is often highlighted as a strong choice of major for finding high-paying jobs.
In this project, I investigate whether choosing a STEM major has an impact on a graduate’s salary.

Data

Data collection: The graduate list, compiled from census.gov data, covers graduates during 2010-2012. There are observations for 173 majors.
Cases: Each observation represents a major, with the number of graduates, their employment rate, salary ranges, etc. There are 173 observations in total, one per major.
Variables: We study two variables: the median salary, which is the response variable, and the major type (STEM or non-STEM), which is the explanatory (independent) variable. This is an observational study, since the data was collected by observation over a period of time rather than through an experiment.
The population of interest is graduates in the United States. The findings from this analysis can be generalized to that population to the extent that the underlying sample is representative of it.
Causality: We cannot use this data to establish causality between the variables of interest. Random sampling from a large population supports generalization, but because this is an observational study with no random assignment, the analysis can only show an association.

Source: https://github.com/fivethirtyeight/data/tree/master/college-majors

grads <- read.csv("https://raw.githubusercontent.com/san123i/CUNY/master/Semester1/606/Final%20Project/grad-students.csv")
head(grads)%>%
  kable() %>%
  kable_styling() %>% scroll_box(width = "800px")
Major_code Major Major_category Grad_total Grad_sample_size Grad_employed Grad_full_time_year_round Grad_unemployed Grad_unemployment_rate Grad_median Grad_P25 Grad_P75 Nongrad_total Nongrad_employed Nongrad_full_time_year_round Nongrad_unemployed Nongrad_unemployment_rate Nongrad_median Nongrad_P25 Nongrad_P75 Grad_share Grad_premium
5601 CONSTRUCTION SERVICES Industrial Arts & Consumer Services 9173 200 7098 6511 681 0.0875434 75000 53000 110000 86062 73607 62435 3928 0.0506610 65000 47000 98000 0.0963196 0.1538462
6004 COMMERCIAL ART AND GRAPHIC DESIGN Arts 53864 882 40492 29553 2482 0.0577559 60000 40000 89000 461977 347166 250596 25484 0.0683859 48000 34000 71000 0.1044198 0.2500000
6211 HOSPITALITY MANAGEMENT Business 24417 437 18368 14784 1465 0.0738668 65000 45000 100000 179335 145597 113579 7409 0.0484229 50000 35000 75000 0.1198369 0.3000000
2201 COSMETOLOGY SERVICES AND CULINARY ARTS Industrial Arts & Consumer Services 5411 72 3590 2701 316 0.0809012 47000 24500 85000 37575 29738 23249 1661 0.0528998 41600 29000 60000 0.1258782 0.1298077
2001 COMMUNICATION TECHNOLOGIES Computers & Mathematics 9109 171 7512 5622 466 0.0584106 57000 40600 83700 53819 43163 34231 3389 0.0728003 52000 36000 78000 0.1447527 0.0961538
3201 COURT REPORTING Law & Public Policy 1542 22 1008 860 0 0.0000000 75000 55000 120000 8921 6967 6063 518 0.0692051 50000 34000 75000 0.1473765 0.5000000
summary(grads)
##    Major_code                                     Major    
##  Min.   :1100   ACCOUNTING                           :  1  
##  1st Qu.:2403   ACTUARIAL SCIENCE                    :  1  
##  Median :3608   ADVERTISING AND PUBLIC RELATIONS     :  1  
##  Mean   :3880   AEROSPACE ENGINEERING                :  1  
##  3rd Qu.:5503   AGRICULTURAL ECONOMICS               :  1  
##  Max.   :6403   AGRICULTURE PRODUCTION AND MANAGEMENT:  1  
##                 (Other)                              :167  
##                    Major_category   Grad_total      Grad_sample_size
##  Engineering              :29     Min.   :   1542   Min.   :   22   
##  Education                :16     1st Qu.:  15284   1st Qu.:  314   
##  Humanities & Liberal Arts:15     Median :  37872   Median :  688   
##  Biology & Life Science   :14     Mean   : 127672   Mean   : 2251   
##  Business                 :13     3rd Qu.: 148255   3rd Qu.: 2528   
##  Health                   :12     Max.   :1184158   Max.   :21994   
##  (Other)                  :74                                       
##  Grad_employed    Grad_full_time_year_round Grad_unemployed
##  Min.   :  1008   Min.   :   770            Min.   :    0  
##  1st Qu.: 12659   1st Qu.:  9894            1st Qu.:  453  
##  Median : 28930   Median : 22523            Median : 1179  
##  Mean   : 94037   Mean   : 72861            Mean   : 3506  
##  3rd Qu.:109944   3rd Qu.: 80794            3rd Qu.: 3329  
##  Max.   :915341   Max.   :703347            Max.   :35718  
##                                                            
##  Grad_unemployment_rate  Grad_median        Grad_P25        Grad_P75     
##  Min.   :0.00000        Min.   : 47000   Min.   :24500   Min.   : 65000  
##  1st Qu.:0.02607        1st Qu.: 65000   1st Qu.:45000   1st Qu.: 93000  
##  Median :0.03665        Median : 75000   Median :50000   Median :108000  
##  Mean   :0.03934        Mean   : 76756   Mean   :52597   Mean   :112087  
##  3rd Qu.:0.04805        3rd Qu.: 90000   3rd Qu.:60000   3rd Qu.:130000  
##  Max.   :0.13851        Max.   :135000   Max.   :85000   Max.   :294000  
##                                                                          
##  Nongrad_total     Nongrad_employed  Nongrad_full_time_year_round
##  Min.   :   2232   Min.   :   1328   Min.   :    980             
##  1st Qu.:  20564   1st Qu.:  15914   1st Qu.:  11755             
##  Median :  68993   Median :  50092   Median :  38384             
##  Mean   : 214720   Mean   : 154554   Mean   : 120737             
##  3rd Qu.: 184971   3rd Qu.: 129179   3rd Qu.: 103629             
##  Max.   :2996892   Max.   :2253649   Max.   :1882507             
##                                                                  
##  Nongrad_unemployed Nongrad_unemployment_rate Nongrad_median  
##  Min.   :     0     Min.   :0.00000           Min.   : 37000  
##  1st Qu.:   880     1st Qu.:0.04198           1st Qu.: 48700  
##  Median :  3157     Median :0.05103           Median : 55000  
##  Mean   :  8486     Mean   :0.05395           Mean   : 58584  
##  3rd Qu.:  7409     3rd Qu.:0.06439           3rd Qu.: 65000  
##  Max.   :136978     Max.   :0.16091           Max.   :126000  
##                                                               
##   Nongrad_P25     Nongrad_P75       Grad_share       Grad_premium    
##  Min.   :25000   Min.   : 48000   Min.   :0.09632   Min.   :-0.0250  
##  1st Qu.:34000   1st Qu.: 72000   1st Qu.:0.26757   1st Qu.: 0.2308  
##  Median :38000   Median : 80000   Median :0.39875   Median : 0.3208  
##  Mean   :40078   Mean   : 84333   Mean   :0.40059   Mean   : 0.3285  
##  3rd Qu.:44000   3rd Qu.: 97000   3rd Qu.:0.49912   3rd Qu.: 0.4000  
##  Max.   :80000   Max.   :215000   Max.   :0.93117   Max.   : 1.6471  
## 

Exploratory Data Analysis

# Classify a major category as STEM or NON-STEM based on keywords in its name
isSTEM <- function(majorCategory) {
    result <- grepl("science|math|engineering|technology|computer", majorCategory, ignore.case = TRUE)
    # "Social Science" matches "science" but is not counted as STEM
    result <- ifelse(result, !grepl("Social", majorCategory, ignore.case = TRUE), result)
    ifelse(result, "STEM", "NON-STEM")
}
grads <- grads %>% mutate(Major_Type_STEM = isSTEM(Major_category))

x <- grads %>% select(Major, Major_category, Major_Type_STEM)
#x[order(x$Major_Type_STEM),]
x[order(x$Major_Type_STEM),] %>%
  kable() %>%
  kable_styling()
Major Major_category Major_Type_STEM
1 CONSTRUCTION SERVICES Industrial Arts & Consumer Services NON-STEM
2 COMMERCIAL ART AND GRAPHIC DESIGN Arts NON-STEM
3 HOSPITALITY MANAGEMENT Business NON-STEM
4 COSMETOLOGY SERVICES AND CULINARY ARTS Industrial Arts & Consumer Services NON-STEM
6 COURT REPORTING Law & Public Policy NON-STEM
7 MARKETING AND MARKETING RESEARCH Business NON-STEM
8 AGRICULTURE PRODUCTION AND MANAGEMENT Agriculture & Natural Resources NON-STEM
10 ADVERTISING AND PUBLIC RELATIONS Communications & Journalism NON-STEM
11 FILM VIDEO AND PHOTOGRAPHIC ARTS Arts NON-STEM
12 ELECTRICAL, MECHANICAL, AND PRECISION TECHNOLOGIES AND PRODUCTION Industrial Arts & Consumer Services NON-STEM
14 MASS MEDIA Communications & Journalism NON-STEM
15 TRANSPORTATION SCIENCES AND TECHNOLOGIES Industrial Arts & Consumer Services NON-STEM
17 MISCELLANEOUS BUSINESS & MEDICAL ADMINISTRATION Business NON-STEM
20 MISCELLANEOUS FINE ARTS Arts NON-STEM
21 CRIMINAL JUSTICE AND FIRE PROTECTION Law & Public Policy NON-STEM
22 BUSINESS MANAGEMENT AND ADMINISTRATION Business NON-STEM
23 CRIMINOLOGY Social Science NON-STEM
24 MANAGEMENT INFORMATION SYSTEMS AND STATISTICS Business NON-STEM
26 OPERATIONS LOGISTICS AND E-COMMERCE Business NON-STEM
27 GENERAL BUSINESS Business NON-STEM
28 MEDICAL TECHNOLOGIES TECHNICIANS Health NON-STEM
30 COMMUNICATIONS Communications & Journalism NON-STEM
31 ACTUARIAL SCIENCE Business NON-STEM
33 JOURNALISM Communications & Journalism NON-STEM
34 MEDICAL ASSISTING SERVICES Health NON-STEM
36 ACCOUNTING Business NON-STEM
37 FINE ARTS Arts NON-STEM
38 NURSING Health NON-STEM
41 MULTI/INTERDISCIPLINARY STUDIES Interdisciplinary NON-STEM
43 GENERAL AGRICULTURE Agriculture & Natural Resources NON-STEM
44 FORESTRY Agriculture & Natural Resources NON-STEM
45 LIBERAL ARTS Humanities & Liberal Arts NON-STEM
46 HUMAN SERVICES AND COMMUNITY ORGANIZATION Psychology & Social Work NON-STEM
47 VISUAL AND PERFORMING ARTS Arts NON-STEM
48 NATURAL RESOURCES MANAGEMENT Agriculture & Natural Resources NON-STEM
49 STUDIO ARTS Arts NON-STEM
50 FAMILY AND CONSUMER SCIENCES Industrial Arts & Consumer Services NON-STEM
51 PHYSICAL FITNESS PARKS RECREATION AND LEISURE Industrial Arts & Consumer Services NON-STEM
52 FINANCE Business NON-STEM
54 PLANT SCIENCE AND AGRONOMY Agriculture & Natural Resources NON-STEM
55 HUMAN RESOURCES AND PERSONNEL MANAGEMENT Business NON-STEM
56 INTERNATIONAL BUSINESS Business NON-STEM
57 COMPOSITION AND RHETORIC Humanities & Liberal Arts NON-STEM
58 DRAMA AND THEATER ARTS Arts NON-STEM
59 BUSINESS ECONOMICS Business NON-STEM
62 HEALTH AND MEDICAL ADMINISTRATIVE SERVICES Health NON-STEM
63 AGRICULTURAL ECONOMICS Agriculture & Natural Resources NON-STEM
65 GEOGRAPHY Social Science NON-STEM
68 INTERDISCIPLINARY SOCIAL SCIENCES Social Science NON-STEM
70 SOIL SCIENCE Agriculture & Natural Resources NON-STEM
71 PRE-LAW AND LEGAL STUDIES Law & Public Policy NON-STEM
77 EARLY CHILDHOOD EDUCATION Education NON-STEM
78 SOCIOLOGY Social Science NON-STEM
79 GENERAL SOCIAL SCIENCES Social Science NON-STEM
80 ANIMAL SCIENCES Agriculture & Natural Resources NON-STEM
81 TREATMENT THERAPY PROFESSIONS Health NON-STEM
82 MISCELLANEOUS AGRICULTURE Agriculture & Natural Resources NON-STEM
84 HUMANITIES Humanities & Liberal Arts NON-STEM
85 FOOD SCIENCE Agriculture & Natural Resources NON-STEM
88 SOCIAL PSYCHOLOGY Psychology & Social Work NON-STEM
91 ART HISTORY AND CRITICISM Humanities & Liberal Arts NON-STEM
92 MISCELLANEOUS HEALTH MEDICAL PROFESSIONS Health NON-STEM
93 GENERAL MEDICAL AND HEALTH SERVICES Health NON-STEM
94 INTERCULTURAL AND INTERNATIONAL STUDIES Humanities & Liberal Arts NON-STEM
95 NUTRITION SCIENCES Health NON-STEM
96 ECONOMICS Social Science NON-STEM
97 PHYSICAL AND HEALTH EDUCATION TEACHING Education NON-STEM
98 COMMUNITY AND PUBLIC HEALTH Health NON-STEM
100 THEOLOGY AND RELIGIOUS VOCATIONS Humanities & Liberal Arts NON-STEM
102 MISCELLANEOUS EDUCATION Education NON-STEM
104 PUBLIC ADMINISTRATION Law & Public Policy NON-STEM
105 ELEMENTARY EDUCATION Education NON-STEM
106 INDUSTRIAL AND ORGANIZATIONAL PSYCHOLOGY Psychology & Social Work NON-STEM
107 MILITARY TECHNOLOGIES Industrial Arts & Consumer Services NON-STEM
108 GENERAL EDUCATION Education NON-STEM
109 MUSIC Arts NON-STEM
110 ART AND MUSIC EDUCATION Education NON-STEM
111 LINGUISTICS AND COMPARATIVE LANGUAGE AND LITERATURE Humanities & Liberal Arts NON-STEM
113 ANTHROPOLOGY AND ARCHEOLOGY Humanities & Liberal Arts NON-STEM
114 SOCIAL WORK Psychology & Social Work NON-STEM
115 ENGLISH LANGUAGE AND LITERATURE Humanities & Liberal Arts NON-STEM
116 TEACHER EDUCATION: MULTIPLE LEVELS Education NON-STEM
118 PHARMACY PHARMACEUTICAL SCIENCES AND ADMINISTRATION Health NON-STEM
119 OTHER FOREIGN LANGUAGES Humanities & Liberal Arts NON-STEM
120 PSYCHOLOGY Psychology & Social Work NON-STEM
121 AREA ETHNIC AND CIVILIZATION STUDIES Humanities & Liberal Arts NON-STEM
126 HISTORY Humanities & Liberal Arts NON-STEM
127 MISCELLANEOUS SOCIAL SCIENCES Social Science NON-STEM
130 FRENCH GERMAN LATIN AND OTHER COMMON FOREIGN LANGUAGE STUDIES Humanities & Liberal Arts NON-STEM
131 SOCIAL SCIENCE OR HISTORY TEACHER EDUCATION Education NON-STEM
133 POLITICAL SCIENCE AND GOVERNMENT Social Science NON-STEM
134 INTERNATIONAL RELATIONS Social Science NON-STEM
137 MISCELLANEOUS PSYCHOLOGY Psychology & Social Work NON-STEM
139 SECONDARY TEACHER EDUCATION Education NON-STEM
141 UNITED STATES HISTORY Humanities & Liberal Arts NON-STEM
144 LANGUAGE AND DRAMA EDUCATION Education NON-STEM
146 PUBLIC POLICY Law & Public Policy NON-STEM
147 MATHEMATICS TEACHER EDUCATION Education NON-STEM
148 SCIENCE AND COMPUTER TEACHER EDUCATION Education NON-STEM
150 PHILOSOPHY AND RELIGIOUS STUDIES Humanities & Liberal Arts NON-STEM
151 SPECIAL NEEDS EDUCATION Education NON-STEM
158 LIBRARY SCIENCE Education NON-STEM
164 EDUCATIONAL PSYCHOLOGY Psychology & Social Work NON-STEM
168 COMMUNICATION DISORDERS SCIENCES AND SERVICES Health NON-STEM
169 COUNSELING PSYCHOLOGY Psychology & Social Work NON-STEM
170 CLINICAL PSYCHOLOGY Psychology & Social Work NON-STEM
171 HEALTH AND MEDICAL PREPARATORY PROGRAMS Health NON-STEM
172 SCHOOL STUDENT COUNSELING Education NON-STEM
173 EDUCATIONAL ADMINISTRATION AND SUPERVISION Education NON-STEM
5 COMMUNICATION TECHNOLOGIES Computers & Mathematics STEM
9 COMPUTER PROGRAMMING AND DATA PROCESSING Computers & Mathematics STEM
13 MECHANICAL ENGINEERING RELATED TECHNOLOGIES Engineering STEM
16 COMPUTER NETWORKING AND TELECOMMUNICATIONS Computers & Mathematics STEM
18 MISCELLANEOUS ENGINEERING TECHNOLOGIES Engineering STEM
19 INDUSTRIAL PRODUCTION TECHNOLOGIES Engineering STEM
25 COMPUTER ADMINISTRATION MANAGEMENT AND SECURITY Computers & Mathematics STEM
29 COMPUTER AND INFORMATION SYSTEMS Computers & Mathematics STEM
32 ELECTRICAL ENGINEERING TECHNOLOGY Engineering STEM
35 ENGINEERING TECHNOLOGIES Engineering STEM
39 INFORMATION SCIENCES Computers & Mathematics STEM
40 ARCHITECTURAL ENGINEERING Engineering STEM
42 NUCLEAR, INDUSTRIAL RADIOLOGY, AND BIOLOGICAL TECHNOLOGIES Physical Sciences STEM
53 PETROLEUM ENGINEERING Engineering STEM
60 ENGINEERING AND INDUSTRIAL MANAGEMENT Engineering STEM
61 COMPUTER SCIENCE Computers & Mathematics STEM
64 ENVIRONMENTAL SCIENCE Biology & Life Science STEM
66 MISCELLANEOUS ENGINEERING Engineering STEM
67 ECOLOGY Biology & Life Science STEM
69 ARCHITECTURE Engineering STEM
72 GENERAL ENGINEERING Engineering STEM
73 MULTI-DISCIPLINARY OR GENERAL SCIENCE Physical Sciences STEM
74 CIVIL ENGINEERING Engineering STEM
75 COMPUTER ENGINEERING Engineering STEM
76 MINING AND MINERAL ENGINEERING Engineering STEM
83 MECHANICAL ENGINEERING Engineering STEM
86 INDUSTRIAL AND MANUFACTURING ENGINEERING Engineering STEM
87 GEOLOGICAL AND GEOPHYSICAL ENGINEERING Engineering STEM
89 NAVAL ARCHITECTURE AND MARINE ENGINEERING Engineering STEM
90 MATHEMATICS AND COMPUTER SCIENCE Computers & Mathematics STEM
99 ELECTRICAL ENGINEERING Engineering STEM
101 OCEANOGRAPHY Physical Sciences STEM
103 BIOLOGICAL ENGINEERING Engineering STEM
112 MATERIALS ENGINEERING AND MATERIALS SCIENCE Engineering STEM
117 GEOLOGY AND EARTH SCIENCE Physical Sciences STEM
122 PHYSICAL SCIENCES Physical Sciences STEM
123 ATMOSPHERIC SCIENCES AND METEOROLOGY Physical Sciences STEM
124 CHEMICAL ENGINEERING Engineering STEM
125 AEROSPACE ENGINEERING Engineering STEM
128 APPLIED MATHEMATICS Computers & Mathematics STEM
129 STATISTICS AND DECISION SCIENCE Computers & Mathematics STEM
132 MATHEMATICS Computers & Mathematics STEM
135 ENVIRONMENTAL ENGINEERING Engineering STEM
136 MISCELLANEOUS BIOLOGY Biology & Life Science STEM
138 METALLURGICAL ENGINEERING Engineering STEM
140 GEOSCIENCES Physical Sciences STEM
142 ENGINEERING MECHANICS PHYSICS AND SCIENCE Engineering STEM
143 COGNITIVE SCIENCE AND BIOPSYCHOLOGY Biology & Life Science STEM
145 NUCLEAR ENGINEERING Engineering STEM
149 MICROBIOLOGY Biology & Life Science STEM
152 BOTANY Biology & Life Science STEM
153 BIOLOGY Biology & Life Science STEM
154 ASTRONOMY AND ASTROPHYSICS Physical Sciences STEM
155 CHEMISTRY Physical Sciences STEM
156 PHYSIOLOGY Biology & Life Science STEM
157 BIOMEDICAL ENGINEERING Engineering STEM
159 MOLECULAR BIOLOGY Biology & Life Science STEM
160 PHARMACOLOGY Biology & Life Science STEM
161 ZOOLOGY Biology & Life Science STEM
162 PHYSICS Physical Sciences STEM
163 NEUROSCIENCE Biology & Life Science STEM
165 BIOCHEMICAL SCIENCES Biology & Life Science STEM
166 GENETICS Biology & Life Science STEM
167 MATERIALS SCIENCE Engineering STEM

The plot below shows that the largest number of graduates fall under the ‘Education’ major category, whereas the fewest fall under the ‘Interdisciplinary’ and ‘Agriculture & Natural Resources’ categories. Also, only 4 major categories fall under STEM: ‘Biology & Life Science’, ‘Computers & Mathematics’, ‘Engineering’ and ‘Physical Sciences’.
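The count of STEM categories can be spot-checked by running the classifier over the distinct category names. The sketch below is a minimal check, not part of the original analysis: it repeats the report's `isSTEM` function so that it runs on its own, and the `categories` vector is typed in by hand from the Major_category values listed in the table above (an assumption that the list is complete).

```r
# Copy of the report's classifier, repeated so this snippet is self-contained.
isSTEM <- function(majorCategory) {
  result <- grepl("science|math|engineering|technology|computer",
                  majorCategory, ignore.case = TRUE)
  result <- ifelse(result, !grepl("Social", majorCategory, ignore.case = TRUE), result)
  ifelse(result, "STEM", "NON-STEM")
}

# Category names as they appear in the Major_category column above
# (transcribed by hand; assumed to be the full set of categories).
categories <- c(
  "Industrial Arts & Consumer Services", "Arts", "Business",
  "Computers & Mathematics", "Law & Public Policy",
  "Agriculture & Natural Resources", "Communications & Journalism",
  "Social Science", "Health", "Interdisciplinary",
  "Humanities & Liberal Arts", "Psychology & Social Work", "Education",
  "Engineering", "Physical Sciences", "Biology & Life Science"
)

category_type <- isSTEM(categories)
stem_categories <- categories[category_type == "STEM"]

print(stem_categories)       # categories classified as STEM
print(table(category_type))  # number of categories of each type
```

Because the classifier works on the category name rather than the major name, majors such as ‘ACTUARIAL SCIENCE’ (Business) or ‘FOOD SCIENCE’ (Agriculture & Natural Resources) are labelled NON-STEM, which is worth keeping in mind when reading the results.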

1. Number of graduates in each major

Let’s do an analysis on the number of graduates in each major.

ggplot(grads, aes(x=reorder(Major_category, Grad_total), y = Grad_total, fill=Major_Type_STEM)) + geom_bar(stat="identity") + theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))+ labs( title = 'Graduates by major category type')

# Total graduates per major category (named to avoid masking base::list)
grads_by_category <- aggregate(Grad_total ~ Major_category, grads, sum)
ggplot(grads_by_category, aes(x=reorder(Major_category, Grad_total), y=Grad_total, label=Grad_total)) + geom_col() + theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))+ labs(title = '# of Graduates by major category type') + 
  geom_text(size = 3, position = position_stack(vjust = 0.5), color='white')

# Total graduates per major type (STEM vs NON-STEM)
grads_by_type <- aggregate(Grad_total ~ Major_Type_STEM, grads, sum)
ggplot(grads_by_type, aes(x=reorder(Major_Type_STEM, Grad_total), y=Grad_total, label=Grad_total, fill=Major_Type_STEM)) + geom_col() + theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))+ labs(title = '# of Graduates by STEM type') + 
  geom_text(size = 3, position = position_stack(vjust = 0.5), color='white')

# label=sum(Grad_total)

2. Salary data analysis by category

The plots below show the distribution of graduate median salaries within each major category, followed by the graduate median salary of each individual major.

ggplot(grads, aes(Major_category, Grad_median, fill=Major_Type_STEM)) + geom_boxplot()  +  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))

ggplot(grads, aes(x=Major, y=Grad_median, shape=Major_Type_STEM, colour=Major_Type_STEM)) + geom_point()  + scale_shape_manual(values=c(19, 2))

3. Average ‘p25’, ‘p75’ and ‘Median’ salary of each category.

ggplot(grads, aes(x=reorder(Major_category, Grad_P25), y = Grad_P25, fill=Major_Type_STEM)) + geom_bar(stat="identity") +labs(title = 'a. 25Perc of Grad salary by Major_category') +  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))

ggplot(grads, aes(x=reorder(Major_category, Grad_P75), y = Grad_P75, fill=Major_Type_STEM)) + geom_bar(stat="identity") +labs(title = 'b. 75Perc of Grad salary by Major_category') +  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))

ggplot(grads, aes(x=reorder(Major_category, Grad_median), y = Grad_median, fill=Major_Type_STEM)) + geom_bar(stat="identity") +labs(title = 'c. Median of Grad salary by Major_category') +  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))
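Note that `geom_bar(stat="identity")` on the unaggregated data draws one stacked segment per major, so the bar heights in the three plots above are sums over the majors in a category rather than averages. A minimal sketch of computing per-category averages first is shown below. The six rows are the ones printed by `head(grads)` earlier in this report; the names `sample_rows` and `category_means` are hypothetical and used only for illustration.

```r
# The six rows shown by head(grads), reduced to the salary columns.
sample_rows <- data.frame(
  Major_category = c("Industrial Arts & Consumer Services", "Arts", "Business",
                     "Industrial Arts & Consumer Services",
                     "Computers & Mathematics", "Law & Public Policy"),
  Grad_P25    = c(53000, 40000, 45000, 24500, 40600, 55000),
  Grad_median = c(75000, 60000, 65000, 47000, 57000, 75000),
  Grad_P75    = c(110000, 89000, 100000, 85000, 83700, 120000)
)

# Mean of each salary column within each major category.
category_means <- aggregate(
  cbind(Grad_P25, Grad_median, Grad_P75) ~ Major_category,
  data = sample_rows, FUN = mean
)

print(category_means)
```

On the full data, replacing `sample_rows` with `grads` and plotting `category_means` with `geom_col()` would give bars whose heights are the category averages named in the section title.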

4. Grad Median salaries by STEM or Non-STEM

ggplot(grads, aes(Major_Type_STEM, Grad_median, fill=Major_Type_STEM)) + geom_boxplot()  +  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))

Inference

We use the Analysis of Variance (ANOVA) method to draw an inference about the hypothesis. Below are the conditions we evaluated and found to be satisfied.
a. Independence of cases - The cases in the data are independent of one another
b. Normal distribution of residuals - The normal Q-Q plot below shows that the residuals are approximately normally distributed without major outliers
c. Homogeneity of variances
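Conditions (b) and (c) can also be checked numerically alongside the diagnostic plots. The sketch below is one possible way to do so and is not part of the original analysis: it uses the Shapiro-Wilk test on the model residuals and Bartlett's test for equal variances, both from base R's stats package. The helper name `check_anova_assumptions` and the simulated `toy` data are assumptions made purely for illustration.

```r
# Hypothetical helper: fit a one-way ANOVA and return p-values for two
# assumption checks. Small p-values suggest the assumption is doubtful.
check_anova_assumptions <- function(formula, data) {
  fit <- aov(formula, data = data)
  list(
    residual_normality = shapiro.test(residuals(fit))$p.value,     # condition (b)
    equal_variances    = bartlett.test(formula, data = data)$p.value  # condition (c)
  )
}

# Simulated data (NOT the graduate data): two groups with equal spread.
set.seed(1)
toy <- data.frame(
  salary = c(rnorm(30, mean = 90000, sd = 12000),
             rnorm(50, mean = 70000, sd = 12000)),
  type   = factor(rep(c("STEM", "NON-STEM"), times = c(30, 50)))
)

checks <- check_anova_assumptions(salary ~ type, toy)
print(checks)
```

For this report the call would be `check_anova_assumptions(Grad_median ~ Major_Type_STEM, grads)`. Bartlett's test is itself sensitive to non-normality, so its result is best read together with the residual plots.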

Let’s start our inference by stating the hypotheses.
H0: The average median salary of graduates in STEM and non-STEM categories is the same.
HA: The average median salary of graduates in STEM and non-STEM categories is not the same.
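Because there are only two groups, a one-way ANOVA of these hypotheses is equivalent to a pooled-variance two-sample t-test: the F statistic equals the square of the t statistic. The sketch below illustrates this on simulated data (the `toy` data frame is an assumption, not the graduate data) and also runs Welch's t-test, which does not assume equal variances.

```r
# Simulated two-group data (NOT the graduate data), for illustration only.
set.seed(42)
toy <- data.frame(
  salary = c(rnorm(25, mean = 90000, sd = 15000),
             rnorm(40, mean = 70000, sd = 15000)),
  type   = factor(rep(c("STEM", "NON-STEM"), times = c(25, 40)))
)

# F statistic from the one-way ANOVA table.
f_value <- summary(aov(salary ~ type, data = toy))[[1]][["F value"]][1]

# Pooled-variance t-test (same assumptions as the ANOVA).
pooled <- t.test(salary ~ type, data = toy, var.equal = TRUE)
t_squared <- unname(pooled$statistic)^2

# Welch's t-test, which drops the equal-variance assumption.
welch <- t.test(salary ~ type, data = toy)

print(c(F = f_value, t_squared = t_squared))  # the two values agree
print(welch$p.value)
```

If the equal-variance condition is in doubt, comparing the pooled and Welch results is a quick way to see whether the conclusion depends on it.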

grads_anova <- aov(grads$Grad_median ~ grads$Major_Type_STEM)
summary(grads_anova)
##                        Df    Sum Sq   Mean Sq F value Pr(>F)    
## grads$Major_Type_STEM   1 1.644e+10 1.644e+10    85.8 <2e-16 ***
## Residuals             171 3.276e+10 1.916e+08                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(grads_anova)

Since the p-value is less than .05, we reject the null hypothesis (H0) and conclude that the average graduate median salary for STEM and non-STEM majors is not equal. From the box plot of median salaries by STEM type, we can see that the median salary for STEM majors is higher than for non-STEM majors.
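The ANOVA table also allows a rough effect size to be computed. The sketch below recomputes the F value from the reported sums of squares and derives eta squared, the share of the total variation in median salary associated with the STEM / non-STEM split. The inputs are the rounded values printed in the table above, so the results are approximate; the variable names are illustrative.

```r
# Rounded values copied from the ANOVA table above.
ss_between  <- 1.644e10  # Sum Sq for Major_Type_STEM
ss_residual <- 3.276e10  # Sum Sq for Residuals
df_between  <- 1
df_residual <- 171

# F = mean square between groups / mean square of residuals.
ms_between  <- ss_between / df_between
ms_residual <- ss_residual / df_residual
f_value     <- ms_between / ms_residual

# Eta squared = between-group sum of squares / total sum of squares.
eta_squared <- ss_between / (ss_between + ss_residual)

print(round(c(F = f_value, eta_squared = eta_squared), 3))
```

With these inputs the recomputed F agrees with the reported 85.8, and eta squared comes out at about 0.33, meaning roughly a third of the variation in median salary across majors is associated with major type.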

Conclusion

Using the given data, we carried out an exploratory analysis and identified a clear pattern: graduates with STEM majors have a higher median salary than graduates with non-STEM majors. Using ANOVA, we confirmed that the graduate median salaries for STEM and non-STEM majors are not equal. Therefore, we can state that graduate median salaries are higher for STEM majors.
Future research: I believe a few other factors can influence graduate salaries, such as market conditions, the number of open job opportunities, and the type of economy (developed, emerging or underdeveloped). With access to such datasets, a more realistic and robust model could be designed.