# A tibble: 10 × 2
# Groups: Major [10]
Major Median
<chr> <int>
1 PETROLEUM ENGINEERING 125000
2 PHARMACY PHARMACEUTICAL SCIENCES AND ADMINISTRATION 106000
3 NAVAL ARCHITECTURE AND MARINE ENGINEERING 97000
4 METALLURGICAL ENGINEERING 96000
5 NUCLEAR ENGINEERING 95000
6 MATHEMATICS AND COMPUTER SCIENCE 92000
7 MINING AND MINERAL ENGINEERING 92000
8 ELECTRICAL ENGINEERING 88000
9 CHEMICAL ENGINEERING 86000
10 GEOLOGICAL AND GEOPHYSICAL ENGINEERING 85000
College Data Analysis
Introduction
Since arriving at Xavier, and even before I started my college career, I have been interested in data about universities themselves. When I was a senior in high school, I wanted to find a college that would prepare me for the future and a major that would interest me and also lead to a job after graduation. I found a data set on Kaggle that describes a number of different majors: the number of students in each major, the number of graduates with that major who are unemployed, and the median salary for graduates of that major. Some of the questions I would like to answer follow below.
What majors have the highest median salary right after graduation?
What majors have the highest unemployment rate?
Which category of major produces the most employed students after graduation?
The data set with this information can be found here if you are interested in taking a peek yourself.
Data Dictionary
Here’s a quick rundown of what each variable in the data set means.
Index: Identifier for each row
Major_code: Identifier for each major
Major: Specific major a student had
Major_category: A broader category for majors
Total: Total number of students in the major
Employed: Number of students employed after graduation
Employed_full_time_year_round: Number of students employed full time year round after graduation
Unemployed: Number of students unemployed after graduation
Unemployment_rate: Ratio of unemployed students to the sum of employed and unemployed students
Median: Median income of a student after graduation
P25th: 25th percentile of income for students after graduation
P75th: 75th percentile of income for students after graduation
Descriptive Analysis
Question 1
What majors have the highest median salary right after graduation?
Just looking at the top 10 here, 8 of the 10 majors are engineering degrees which goes to show that while it is a difficult major, it certainly pays off early.
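The "8 of the 10" count can be checked with a few lines of code. The sketch below is written in Python and is not the code that produced the tables in this report. The list is copied from the top 10 table, while the helper name `count_matching` is a hypothetical one chosen for illustration.

```python
# The ten majors from the highest-median-salary table in this report.
top_majors = [
    "PETROLEUM ENGINEERING",
    "PHARMACY PHARMACEUTICAL SCIENCES AND ADMINISTRATION",
    "NAVAL ARCHITECTURE AND MARINE ENGINEERING",
    "METALLURGICAL ENGINEERING",
    "NUCLEAR ENGINEERING",
    "MATHEMATICS AND COMPUTER SCIENCE",
    "MINING AND MINERAL ENGINEERING",
    "ELECTRICAL ENGINEERING",
    "CHEMICAL ENGINEERING",
    "GEOLOGICAL AND GEOPHYSICAL ENGINEERING",
]


def count_matching(majors, keyword):
    """Count the majors whose name contains the keyword (case-insensitive)."""
    keyword = keyword.upper()
    return sum(1 for major in majors if keyword in major.upper())


engineering_count = count_matching(top_majors, "engineering")
print(f"{engineering_count} of {len(top_majors)} majors are engineering degrees")
```

Matching on the word in the major's name is a simplification; grouping by the Major_category column would be the more robust check on the full data set.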
# A tibble: 10 × 2
# Groups: Major [10]
Major Median
<chr> <int>
1 NEUROSCIENCE 35000
2 EARLY CHILDHOOD EDUCATION 35300
3 STUDIO ARTS 37600
4 HUMAN SERVICES AND COMMUNITY ORGANIZATION 38000
5 COUNSELING PSYCHOLOGY 39000
6 COMPOSITION AND RHETORIC 40000
7 COSMETOLOGY SERVICES AND CULINARY ARTS 40000
8 EDUCATIONAL PSYCHOLOGY 40000
9 ELEMENTARY EDUCATION 40000
10 LIBRARY SCIENCE 40000
Looking now at the lowest median incomes, a significant number of these majors are in education. That is unfortunate, because I know how hard teachers work to educate students, even at the elementary school level, and they certainly deserve to be paid more than they are.
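Both the highest and lowest median tables come down to sorting the majors by the Median column and keeping the first few rows. Here is a minimal sketch of that idea in Python, using a handful of rows copied from the tables in this report. The function name `rank_by_median` is hypothetical, and this is not necessarily how the original tables were produced.

```python
# A few (Major, Median) rows copied from the tables in this report.
rows = [
    ("PETROLEUM ENGINEERING", 125000),
    ("NEUROSCIENCE", 35000),
    ("NUCLEAR ENGINEERING", 95000),
    ("EARLY CHILDHOOD EDUCATION", 35300),
    ("STUDIO ARTS", 37600),
    ("METALLURGICAL ENGINEERING", 96000),
]


def rank_by_median(rows, n, highest=True):
    """Return the n rows with the highest (or lowest) median salary."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=highest)
    return ordered[:n]


top = rank_by_median(rows, 3, highest=True)
bottom = rank_by_median(rows, 3, highest=False)

for major, median in top:
    print(f"{major:<30} {median:>7}")
```

The same approach would rank majors by Unemployment_rate if the second element of each row held that column instead.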
Question 2
What majors have the highest unemployment rate?
# A tibble: 10 × 2
# Groups: Major [10]
Major Unemployment_rate
<chr> <dbl>
1 MISCELLANEOUS FINE ARTS 0.156
2 CLINICAL PSYCHOLOGY 0.103
3 MILITARY TECHNOLOGIES 0.102
4 SCHOOL STUDENT COUNSELING 0.102
5 LIBRARY SCIENCE 0.0948
6 VISUAL AND PERFORMING ARTS 0.0947
7 COMPUTER PROGRAMMING AND DATA PROCESSING 0.0903
8 SOCIAL PSYCHOLOGY 0.0873
9 ASTRONOMY AND ASTROPHYSICS 0.0860
10 ARCHITECTURE 0.0860
One of the most surprising majors to appear here, to me, is Computer Programming and Data Processing, since I know how much demand there is for students with programming or analytics backgrounds.
# A tibble: 10 × 2
# Groups: Major [10]
Major Unemployment_rate
<chr> <dbl>
1 EDUCATIONAL ADMINISTRATION AND SUPERVISION 0
2 GEOLOGICAL AND GEOPHYSICAL ENGINEERING 0
3 PHARMACOLOGY 0.0161
4 MATERIALS SCIENCE 0.0223
5 MATHEMATICS AND COMPUTER SCIENCE 0.0249
6 GENERAL AGRICULTURE 0.0261
7 TREATMENT THERAPY PROFESSIONS 0.0263
8 NURSING 0.0268
9 AGRICULTURE PRODUCTION AND MANAGEMENT 0.0286
10 AGRICULTURAL ECONOMICS 0.0302
I was surprised to see that Educational Administration and Supervision has an unemployment rate of 0, considering that education-related majors such as School Student Counseling appear in the previous table of highest unemployment rates.
Question 3
Which category of major produces the most employed students after graduation?
As can be seen from this graph, majors in the business category have significantly more students employed year round than all the other majors shown combined. This makes me feel even better about choosing my degrees in Accounting and Business Analytics, as I have learned that many business majors offer a lot of job security.
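A comparison like this one amounts to grouping the rows by Major_category and summing an employment column. The sketch below shows that aggregation in Python. It assumes the graph was based on Employed_full_time_year_round, which is a guess from the wording above, and the sample rows are made up for illustration rather than taken from the data set. The function name `employed_by_category` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only; column names follow the data dictionary.
SAMPLE_CSV = """Major,Major_category,Employed_full_time_year_round
ACCOUNTING,Business,300
FINANCE,Business,200
NURSING,Health,250
STUDIO ARTS,Arts,50
"""


def employed_by_category(csv_file):
    """Sum full-time, year-round employment per category, largest first."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["Major_category"]] += int(row["Employed_full_time_year_round"])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


totals = employed_by_category(io.StringIO(SAMPLE_CSV))
for category, employed in totals:
    print(category, employed)
```

Because categories contain different numbers of majors and students, a sum like this rewards large categories; dividing by the Total column would give a rate instead of a raw count.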
Secondary Data Source Analysis
Introduction
Something that I have been interested in all semester is information about colleges and, more specifically, how students review them. I think a lot can be learned from student reviews of colleges they have attended or even just toured, as students are the primary people who use college to help them with their future careers. GradReports is a website that compares colleges on a variety of factors, including annual tuition and salary after graduation, and it has individual pages with reviews for schools like Xavier University. What drove the following analysis was my interest in comparing Xavier University and the University of Cincinnati: what kinds of things are people saying about each university, and does one have more positive reviews than the other?
Data Dictionary
I scraped the GradReports pages for reviews of both Xavier University and the University of Cincinnati and compiled them into a CSV file to analyze. Here’s a short data dictionary that explains what each variable means.
Name: The name of the person who left a review.
Date: The date that the person left their review.
University: The specific school that the reviewer was leaving a review on.
Review: The text of their written review of the university.
If you are interested in looking at this file yourself, you can find it here.
Question 1
How many positive versus negative words are there for reviews of Xavier University and the University of Cincinnati?
Looking at this graph, it appears that UC reviews contain more positive words than Xavier reviews, and Xavier reviews appear to contain more negative words.
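Counting positive and negative words usually means splitting each review into words and looking each word up in a sentiment lexicon. The Python sketch below illustrates the idea under loud assumptions: the lexicon is a tiny stand-in, the reviews are made up rather than taken from GradReports, and the function name `count_sentiment_words` is hypothetical. It is not the code or lexicon used for the graph.

```python
import re
from collections import Counter

# Tiny stand-in lexicon (hypothetical); a real analysis would use a full one.
LEXICON = {
    "great": "positive", "helpful": "positive", "love": "positive",
    "expensive": "negative", "boring": "negative", "bad": "negative",
}

# Made-up reviews for illustration; keys follow the data dictionary.
reviews = [
    {"University": "Xavier University",
     "Review": "Great professors, but expensive."},
    {"University": "University of Cincinnati",
     "Review": "I love the co-op program. Helpful advisors!"},
    {"University": "University of Cincinnati",
     "Review": "Some classes were boring."},
]


def count_sentiment_words(reviews, lexicon):
    """Count lexicon words per (university, sentiment) pair."""
    counts = Counter()
    for review in reviews:
        for word in re.findall(r"[a-z']+", review["Review"].lower()):
            sentiment = lexicon.get(word)
            if sentiment is not None:
                counts[(review["University"], sentiment)] += 1
    return counts


counts = count_sentiment_words(reviews, LEXICON)
for (university, sentiment), n in sorted(counts.items()):
    print(university, sentiment, n)
```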
Question 2
How do the positivity scores compare over different days of the week?
Here it seems that UC beats Xavier on positivity score on just about every day except Thursday. It would be interesting to do some additional analysis on why so many people wrote positive reviews of UC on Monday and Tuesday, and why Xavier’s best review days were Wednesday through Friday.
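A day-of-week comparison can be built by turning each review's Date into a weekday name and averaging the scores within each university and day. The sketch below is a Python illustration that rests on several assumptions: dates are in ISO format (YYYY-MM-DD), the positivity score is already computed per review, and scores are averaged rather than summed. The rows are made up and the function name `mean_score_by_weekday` is hypothetical.

```python
from collections import defaultdict
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Made-up rows: (University, Date as an ISO string, positivity score).
scored = [
    ("UC", "2023-01-02", 3),       # a Monday
    ("UC", "2023-01-09", 1),       # a Monday
    ("Xavier", "2023-01-05", 2),   # a Thursday
    ("Xavier", "2023-01-02", -1),  # a Monday
]


def mean_score_by_weekday(scored):
    """Average the positivity score for each (university, weekday) pair."""
    sums = defaultdict(lambda: [0, 0])  # key -> [score total, review count]
    for university, iso_date, score in scored:
        day = DAYS[date.fromisoformat(iso_date).weekday()]
        bucket = sums[(university, day)]
        bucket[0] += score
        bucket[1] += 1
    return {key: total / n for key, (total, n) in sums.items()}


means = mean_score_by_weekday(scored)
for (university, day), mean in sorted(means.items()):
    print(university, day, mean)
```

Averaging keeps a day with many reviews from dominating one with only a few, which matters when one university has more reviews than the other.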
Question 3
How do Xavier and UC compare in terms of emotional sentiment?
From this graph it appears that UC has more words in general than Xavier, which suggests that people are writing more and leaving more reviews for UC than for Xavier. It is interesting to see how large the difference in anticipation words is, with UC having more than double the number.
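Since UC has more words overall, raw counts will tend to favor it for every emotion. One hedged way to compare the two schools on equal footing is to convert each school's emotion counts into shares of its own total. The Python sketch below uses made-up counts, not values from the graph, and the function name `emotion_shares` is hypothetical.

```python
def emotion_shares(counts):
    """Convert raw emotion word counts into each emotion's share of the total."""
    total = sum(counts.values())
    if total == 0:
        return {emotion: 0.0 for emotion in counts}
    return {emotion: n / total for emotion, n in counts.items()}


# Made-up counts for illustration only: UC has exactly double Xavier's words.
uc = {"anticipation": 40, "joy": 30, "trust": 30}
xavier = {"anticipation": 20, "joy": 15, "trust": 15}

uc_shares = emotion_shares(uc)
xavier_shares = emotion_shares(xavier)

for emotion in uc:
    print(f"{emotion:<13} UC {uc_shares[emotion]:.2f}  Xavier {xavier_shares[emotion]:.2f}")
```

In this made-up case UC has twice as many anticipation words, yet anticipation makes up the same share of both schools' emotion words, so the raw difference would reflect review volume rather than tone.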
Conclusion
It was quite interesting to dive into some analysis of what kinds of words were being said about both Xavier University and the University of Cincinnati. Something that definitely threw off the analysis a bit was the fact that UC had more reviews than Xavier, so it is hard to say whether the overall sentiment toward UC really is higher than toward Xavier. Nonetheless, this was a very interesting topic to analyze. Overall, it was fun looking into some information on college majors as well as analyzing how people are reviewing universities like Xavier and UC.