Final Project

Eve Thorner

May 4, 2023

Introduction

Hello, I am once again asking all of you to read about college_scorecard data [insert Bernie Sanders meme here]. As we have learned, the college_scorecard data set provides us with an inside look into public, private non-profit, and private for-profit universities across the United States. Through this data set, we can better understand the strengths and weaknesses of over 7,000 American colleges by assessing each institution on the following variables.

Variable    Description
ID          Unit ID for institution
INSTNM      Institution name
CITY        City of the institution
STABBR      State postcode of the institution
ZIP         Zip code of the institution
CONTROL     Control of the institution (public, private nonprofit, private for-profit)
LOCALE      Locale of the institution
LATITUDE    Latitude
LONGITUDE   Longitude
HBCU        Flag for Historically Black College and University
MENONLY     Flag for men-only college
WOMENONLY   Flag for women-only college
ADM_RATE    Admission rate
ACTCM25     25th percentile of the ACT cumulative score
ACTCM75     75th percentile of the ACT cumulative score
ACTCMMID    Midpoint of the ACT cumulative score
SAT_AVG     Average SAT equivalent score of students admitted
UGDS        Enrollment of undergraduate certificate/degree-seeking students
COSTT4_A    Average cost of attendance (academic year institutions)
AVGFACSAL   Average faculty salary
PCTPELL     Percentage of undergraduates who receive a Pell Grant
PCTFLOAN    Percent of all undergraduate students receiving a federal student loan
AGE_ENTRY   Average age of entry
FEMALE      Share of female students
MARRIED     Share of married students
DEPENDENT   Share of dependent students
VETERAN     Share of veteran students
FIRST_GEN   Share of first-generation students
FAMINC      Average family income

I wish to gain a better understanding of what factors could make a university more or less appealing for college students. To do so, I will first analyze relationships between the college_scorecard variables listed above. Specifically, I want to see whether there are any relationships between the cost of attendance, average family income, and average faculty salary, and to use descriptive analyses to compare universities with a strong presence of female, married, veteran, and first-generation students.

Then, I want to use these insights to compare student satisfaction rates that I gathered from midwestern colleges on the Rate My Professor website. Do colleges with a high cost of attendance have higher or lower rates of student satisfaction? Do current students seem more or less satisfied at institutions where the average faculty salary is high? These are a few of the questions I hope to answer in this project.

Summary Statistics

Take some time to look at the following descriptive visualizations.

## # A tibble: 3 × 3
##   CONTROL            `Average Cost of Attendance` `Average Faculty Salary`
##   <chr>                                     <dbl>                    <dbl>
## 1 Private for-profit                       26329.                    4854.
## 2 Private non-profit                       38760.                    6814.
## 3 Public                                   15695.                    7389.
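
A table like the one above could be produced with a grouped summary. The sketch below is an assumption about the approach, not necessarily the code behind this report. `toy_scorecard` is a small made-up data frame (illustrative values only) that stands in for the real data so the example runs on its own.

```r
library(dplyr)

# Made-up rows standing in for the real scorecard data (illustrative values only)
toy_scorecard <- tibble(
  CONTROL   = c("Public", "Public", "Private non-profit",
                "Private non-profit", "Private for-profit"),
  COSTT4_A  = c(15000, 17000, 40000, NA, 26000),
  AVGFACSAL = c(7000, NA, 6800, 6900, 4800)
)

# Average cost and faculty salary within each type of control,
# ignoring institutions that did not report a value
cost_salary_by_control <- toy_scorecard %>%
  group_by(CONTROL) %>%
  summarise(
    `Average Cost of Attendance` = mean(COSTT4_A, na.rm = TRUE),
    `Average Faculty Salary`     = mean(AVGFACSAL, na.rm = TRUE)
  )

cost_salary_by_control
```

Setting `na.rm = TRUE` matters here because many institutions leave cost or salary blank, and a single missing value would otherwise turn the whole group average into `NA`.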

Here are the key takeaways from the following visualizations:

  • Private Non-Profit institutions had the highest average cost of attendance of 38,760 dollars with Private For-Profit institutions having the second-highest average cost of attendance at 26,329 dollars. It is no surprise that Public schools had the lowest cost at 15,695 dollars, on average.

  • Surprisingly, faculty who worked for Public institutions received the highest average salary of 7,389 dollars while Private Non-Profit and Private For-Profit institutions had average faculty salaries of 6,814 dollars and 4,854 dollars, respectively.

  • Looking at the average cost of attendance overall, the distribution is roughly normal with a slight right skew. There is a high concentration of universities whose average cost of attendance was between 10,000 and 25,000 dollars.

  • Before we dive deeper into identifying relationships between variables, let’s first look at how many institutions fall under each classification of control. The largest group in this data set is Private For-Profit, at about 3,000 schools. Public schools make up the second-largest group at just over 2,000 schools, and there are slightly fewer Private Non-Profit institutions than Public schools.

Gaining Foundational Knowledge on Midwestern Colleges

Most of the colleges I scraped from Rate My Professor were from midwestern states, so I wanted to lay the groundwork by looking at relationships between institutions located in the following five states: Ohio, Kentucky, Illinois, Indiana, and Michigan.

Family Income

I was first interested in the distribution of Average Family Income in those states. Based on the visual below, the median of the institutions' average family income in all five states was somewhere between 25,000 and 38,000 dollars, which was significantly lower than I expected. There are a large number of outliers, which make the range of these incomes quite wide. For instance, there are three institutions in Ohio where the average family income is well above 125,000 dollars.
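
To list outliers like these by name instead of reading them off a plot, one option is the 1.5 × IQR rule that box plots commonly use. This is a minimal sketch under that assumption. `toy_scorecard` and its school names and incomes are made up for illustration, and quartile conventions can differ slightly between plotting functions.

```r
library(dplyr)

midwest_states <- c("OH", "KY", "IL", "IN", "MI")

# Made-up institutions and incomes (illustrative values only)
toy_scorecard <- tibble(
  STABBR = c("OH", "OH", "OH", "OH", "OH", "KY", "KY", "KY", "TX"),
  INSTNM = paste("School", LETTERS[1:9]),
  FAMINC = c(28000, 30000, 32000, 34000, 130000, 25000, 27000, 29000, 50000)
)

# Within each state, flag schools beyond 1.5 * IQR from the quartiles
income_outliers <- toy_scorecard %>%
  filter(STABBR %in% midwest_states, !is.na(FAMINC)) %>%
  group_by(STABBR) %>%
  mutate(
    lower_fence = quantile(FAMINC, 0.25) - 1.5 * IQR(FAMINC),
    upper_fence = quantile(FAMINC, 0.75) + 1.5 * IQR(FAMINC)
  ) %>%
  ungroup() %>%
  filter(FAMINC < lower_fence | FAMINC > upper_fence) %>%
  select(STABBR, INSTNM, FAMINC)

income_outliers
```

In this toy data only the Ohio school with an average family income of 130,000 dollars falls outside its state's fences.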

Relationship between Admission Rate and Average SAT Score

Next, I wanted to gain a better understanding of the relationship between admission rate and average SAT score for the same group of midwestern universities. It is not very surprising to see (in the scatterplot below) that schools with an average SAT score between 1000 and 1200 have higher admission rates. We can also see that schools with a lower admission rate (less than 25%) have higher average SAT scores, above 1400.
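
The direction of this relationship can also be summarized with a single number, the correlation coefficient. The sketch below uses a made-up data frame `toy_admissions` (illustrative values only), so the result says nothing about the real scorecard data.

```r
# Made-up admission rates and SAT averages (illustrative values only)
toy_admissions <- data.frame(
  ADM_RATE = c(0.90, 0.75, 0.60, 0.30, 0.10, NA),
  SAT_AVG  = c(1020, 1100, 1180, 1350, 1500, 1200)
)

# Pearson correlation, using only rows where both values are present
adm_sat_cor <- cor(toy_admissions$ADM_RATE, toy_admissions$SAT_AVG,
                   use = "complete.obs")

adm_sat_cor
```

A value close to -1 would match the pattern described above, where more selective schools have higher average SAT scores.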

Student Population Distributions

I also wanted to gain a better understanding of the distribution of the following student populations in midwestern universities: female, married, veteran, and first generation students.

Looking at female populations for the same group of midwestern states, we see that the median percentage of students that are female is well over 50%, and schools in Ohio have a fairly symmetric distribution. Michigan, Illinois, Kentucky, and Indiana are all left skewed, with multiple outliers in the low range of 10-20% female populations.

Looking at married populations for the same group of midwestern states, we see that the median percentage of students that are married is between 10% and 20%, and all five states appear to have fairly symmetric distributions. Indiana has no outliers, but the other four states do, with some colleges having married students make up 30-50% of the student population. Upon further investigation, the institution in Illinois with a married share of 66% was the Rosel School of Cosmetology, where the average age of entry is 36 years old. According to the Population Reference Bureau, the median age at first marriage in America is between 28 and 30 years old for men and women, so this checks out (https://www.prb.org/usdata/indicator/marriage-age-women/snapshot/).

The following code chunk shows how I found the outlier in Illinois:

# Illinois schools, sorted from the highest share of married students down
scorecard %>%
  filter(STABBR == "IL") %>%
  select(STABBR, MARRIED, INSTNM, AGE_ENTRY) %>%
  arrange(desc(MARRIED)) %>%
  view()

It is not surprising to see that veterans make up a very low percentage of students at most universities in the midwest. At first I was surprised to see so many outliers for the state of Ohio, but then I thought about Wright-Patterson Air Force Base, which is one of the largest military bases operated by the Air Force in the United States. This base is located to the north, in Dayton, Ohio. (Fun fact: I was born at WPAFB!)

Looking at first-generation student populations for the same group of midwestern states, we see that the median percentage of students that are first-gen is between 40% and 50%, and schools in Ohio have a fairly symmetric distribution, with a slight skew to the left. The two institutions in Illinois that have first-gen populations above 70% are the National Latino Education Institute (76%) and the Cannella School of Hair Design-Chicago (73%). The one upper outlier in Michigan is Mr Bela’s School of Cosmetology Inc, which had a first-gen population of 77%.

The following code chunks show how I found the outliers in Illinois and Michigan:

# Illinois schools, sorted from the highest first-gen share down
scorecard %>%
  filter(STABBR == "IL") %>%
  select(STABBR, FIRST_GEN, INSTNM) %>%
  arrange(desc(FIRST_GEN)) %>%
  view()

# Michigan schools, sorted from the highest first-gen share down
scorecard %>%
  filter(STABBR == "MI") %>%
  select(STABBR, FIRST_GEN, INSTNM) %>%
  arrange(desc(FIRST_GEN)) %>%
  view()

Joining college_scorecard with scraped Rate My Professor data

Introducing Rate My Professor Ethically-Scraped Data

I scraped Rate My Professor websites for the following universities to compare reviews that current and past students have posted about their respective colleges.

  • Miami University

  • Ohio University

  • Sinclair Community College

  • The Ohio State University

  • University of Cincinnati

  • University of Dayton

  • University of Kentucky

  • Xavier University

I also scraped the following variables from each university’s respective website:

Variable         Description
school name      Name of the University
review content   Content of the review
review date      Date the reviewer posted the review
review emotion   Attitude the reviewer expressed towards the university (awesome, average, or awful)
review rating    The score the reviewer gave to the University (on a scale from 1-5)


Comparing Average Cost of Attendance with Student Satisfaction

The visualization below shows the distribution of ratings for each university. As we can see, Ohio University has the widest range of scores, while the University of Kentucky has the smallest range. It seems that many college students have similar attitudes towards UK, whereas opinions about OU vary widely. Xavier was pretty average compared to the other universities: it didn’t have a surprisingly high median rating or a large IQR.

It is noteworthy, in my opinion, that Xavier's distribution could look a lot different if its outlier were excluded. After I looked into it a little more, though, the 1.7 rating was attributed to Xavier not meeting standards the reviewer had based on friends’ reviews. This person also left the review in November of 2021, which was the weird semester when the University was cracking down on social gatherings of 10 or more people both on and off campus. There could be a connection between the low rating and the date it was published.

Another interesting takeaway was that the University of Dayton had the highest median score, 4.5 out of 5, while also having the highest cost of attendance (56,370 dollars) of the 8 universities I scraped. Perhaps the university has allocated a generous share of its tuition revenue to enhancing the student experience at UD. The same cannot be said for the University of Kentucky, which had the lowest median score, about 2.9 out of 5, while also having the 4th-highest cost of attendance at 27,215 dollars. UK charges its students more than 4 other universities (UC, OU, OSU, SCC), yet all 4 of those have median ratings above 4 out of 5. There seems to be an opportunity for UK to improve student satisfaction with the funds it already collects.

## # A tibble: 8 × 3
##   INSTNM                               COSTT4_A AVGFACSAL
##   <chr>                                   <dbl>     <dbl>
## 1 University of Dayton                    56370      8687
## 2 Xavier University                       50880      8694
## 3 Miami University-Oxford                 30420      9395
## 4 University of Kentucky                  27215     10644
## 5 University of Cincinnati-Main Campus    26829      9708
## 6 Ohio University-Main Campus             26233      8978
## 7 Ohio State University-Main Campus       25498     12153
## 8 Sinclair Community College               7773      7517
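
One practical detail when joining the two sources is that the scraped school names do not always match `INSTNM` exactly (for example, "Miami University" versus "Miami University-Oxford"). The sketch below shows one way this could be handled with a small lookup table. It is an assumption about the approach, not necessarily the code behind this report. The ratings in `toy_reviews` are made up and `name_lookup` is a hypothetical helper, while the costs and salaries are copied from the table above.

```r
library(dplyr)

# Made-up ratings (illustrative values only)
toy_reviews <- tibble(
  school_name   = c("Miami University", "Miami University",
                    "University of Dayton", "University of Dayton",
                    "University of Dayton", "Xavier University",
                    "Xavier University"),
  review_rating = c(4.0, 3.5, 4.5, 4.7, 4.3, 3.5, 4.3)
)

# Cost and salary values taken from the table above
toy_scorecard <- tibble(
  INSTNM    = c("University of Dayton", "Xavier University",
                "Miami University-Oxford"),
  COSTT4_A  = c(56370, 50880, 30420),
  AVGFACSAL = c(8687, 8694, 9395)
)

# Hypothetical lookup that maps each scraped name to its scorecard name
name_lookup <- tibble(
  school_name = c("Miami University", "University of Dayton",
                  "Xavier University"),
  INSTNM      = c("Miami University-Oxford", "University of Dayton",
                  "Xavier University")
)

# Median rating per school, joined to cost and salary
school_summary <- toy_reviews %>%
  group_by(school_name) %>%
  summarise(median_rating = median(review_rating), n_reviews = n()) %>%
  left_join(name_lookup, by = "school_name") %>%
  left_join(toy_scorecard, by = "INSTNM") %>%
  arrange(desc(COSTT4_A))

school_summary
```

Using `left_join()` keeps every scraped school even if a name fails to match, so any `NA` in `COSTT4_A` is a quick signal that the lookup needs another entry.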

Comparing Average Faculty Salary with Student Satisfaction

My final question was whether students at the institutions I scraped had any positive words to say about their professors, and how the content of their reviews compares with what faculty at these universities are paid. Are students more or less satisfied with faculty who are highly paid?

Using the table from the previous section, we can see that the school with the highest-paid faculty is Ohio State University, at 12,153 dollars, on average. Let’s perform a basic emotional sentiment analysis to see if any reviewers mention the word “professor” in their reviews.

After looking at this, it is evident that the word “professor” was not a common word in the reviews. The most common words with positivity scores were “love, cute, amazing, recommend, super, pretty, nice”. Next, let’s look at a simple table showing which reviews did contain the word “professor”.
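
For readers curious how word-level scores like these can be produced, here is a minimal sketch of the general idea: split the reviews into words, count them, and keep the words that appear in a sentiment lexicon. The reviews and the tiny `toy_lexicon` below are made up for illustration. A real analysis would use a full published lexicon, and this is not necessarily the method used in this report.

```r
library(dplyr)

# Made-up review text (illustrative only)
toy_reviews <- c(
  "I love this school, the campus is amazing",
  "Nice people and I love the food",
  "Parking is awful"
)

# Hypothetical mini-lexicon: positive values are positive words
toy_lexicon <- tibble(
  word  = c("love", "amazing", "nice", "awful"),
  value = c(3, 4, 3, -3)
)

# Lowercase the text and split it on anything that is not a letter or apostrophe
words <- unlist(strsplit(tolower(toy_reviews), "[^a-z']+"))

# Count each word and keep only the words that have a sentiment score
word_scores <- tibble(word = words[words != ""]) %>%
  count(word, name = "n") %>%
  inner_join(toy_lexicon, by = "word") %>%
  arrange(desc(n), word)

word_scores
```

Because the join keeps only lexicon words, a neutral word such as “professor” would not appear in this kind of output no matter how often reviewers used it.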

## # A tibble: 26 × 3
##    school_name               review_content                              revie…¹
##    <chr>                     <chr>                                         <dbl>
##  1  Xavier University        The professors are really willing to help …     3.5
##  2  Xavier University        Pros: Most professors actually care about …     3.6
##  3  Xavier University        All the clubs get advertised in the studen…     4.3
##  4  Xavier University        As with most schools you get exactly what …     4.5
##  5  Xavier University        Would NEVER recommend the accelerated nurs…     2.8
##  6  University of Cincinnati The overall college experience is great! T…     4.3
##  7  University of Cincinnati I am in Cybersecurity program at UC and I …     4.9
##  8  University of Cincinnati As long as you actively take advantage of …     3.7
##  9  University of Cincinnati Pros: Great co op program Friendly student…     2.7
## 10  University of Cincinnati One of the best recognized University glob…     4.7
## # … with 16 more rows, and abbreviated variable name ¹​review_rating
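
A table like this can be built by filtering the reviews on their text. The sketch below assumes the column names shown in the output above. The rows in `toy_reviews` are made up, apart from short phrases that echo the reviews quoted in this report.

```r
library(dplyr)

# Made-up reviews (illustrative only)
toy_reviews <- tibble(
  school_name    = c("Xavier University", "University of Cincinnati",
                     "Xavier University"),
  review_content = c("The professors care about your success.",
                     "Great campus and food.",
                     "Almost every Professor is rude."),
  review_rating  = c(4.3, 4.0, 2.8)
)

# Keep reviews that mention "professor" in any capitalization;
# the pattern also matches "professors"
professor_reviews <- toy_reviews %>%
  filter(grepl("professor", review_content, ignore.case = TRUE))

professor_reviews
```

Ignoring case is a small design choice that matters here, since reviewers often write in all caps or capitalize words mid-sentence.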

It is evident that even though OSU has the highest-paid faculty out of the 8 schools I scraped from, the three student reviews that mention the word “professor” are all pretty mixed in terms of positive and negative attitudes.

For instance, one review reads “Excellent school with incredible professors and opportunities.” This student clearly had positive interactions with their teachers.

On the other hand, another reviewer stated “Some professors are professional and compassionate towards students, while others are total jerks with their harsh grading and condescending behaviors/attitudes.” There is obviously some gray area, which isn’t surprising. I guarantee that every college has both good and bad professors; we don’t live in a utopian society where everyone behaves the way we want them to.

To wrap up this project, I wanted to look at what Xavier students had to say about the professors they have had.

Xavier University Review Content that Included “professor”
The professors are really willing to help you and are interested in your success there are a lot of programs that are freely available to help if your struggling as well. Cleaney ave is horrible to drive on and parking is limited when you get on campus. Because of the small size it can be difficult to make friends if you aren’t very outgoing.
Pros: Most professors actually care about your success The campus looks like a park and is well kept Smaller class size BASKETBALL SEASON! Cons: High costs with or without aid CLENEAY AVE IS HORRIBLE TO DRIVE OVER Limited Parking DIFFICULT TO MAKE REAL FRIENDS AS A COMMUTER Lack of Diversity THE STAIRS THAT LEAD UP TO ELET HALL
All the clubs get advertised in the student center so you know what is going on that week. The professors care about your success. If you enjoy basketball, going to the basketball games is an amazing experience. It’s the perfect size so meeting people was not as hard as it would be at other schools.
As with most schools you get exactly what you put into your college experience. Basketball is king. Parties can be found if you look for them. The professors genuinely care about your success.
Would NEVER recommend the accelerated nursing program at this school. It is disorganized, almost every professor is rude and not willing to help you learn or do better, and the pass rate is incredibly slim due to the lack of support they provide. Save yourself time, money, and your tears and pick another school!!!!!

It was really nice to read that 4 out of the 5 reviews mentioned something along the lines of Xavier professors being very caring and prioritizing student success! I wholeheartedly agree! :)