Introduction
Hello, I am once again asking all of you to read about
college_scorecard data [insert Bernie Sanders meme here].
As we have learned, the college_scorecard data set provides us with an
inside look into public, private non-profit, and private for-profit
universities across the United States. Through this data set, we are
able to better understand the strengths and weaknesses of over 7,000
American colleges by assessing each institution on the
following variables.
| Variable | Description |
|---|---|
| ID | Unit ID for institution |
| INSTNM | Institution name |
| CITY | City of the institution |
| STABBR | State postcode of the institution |
| ZIP | Zip code of the institution |
| CONTROL | Control of the institution (public, private nonprofit, private for-profit) |
| LOCALE | Locale of the institution |
| LATITUDE | Latitude |
| LONGITUDE | Longitude |
| HBCU | Flag for Historically Black College and University |
| MENONLY | Flag for men-only college |
| WOMENONLY | Flag for women-only college |
| ADM_RATE | Admission rate |
| ACTCM25 | 25th percentile of the ACT cumulative score |
| ACTCM75 | 75th percentile of the ACT cumulative score |
| ACTCMMID | Midpoint of the ACT cumulative score |
| SAT_AVG | Average SAT equivalent score of students admitted |
| UGDS | Enrollment of undergraduate certificate/degree-seeking students |
| COSTT4_A | Average cost of attendance (academic year institutions) |
| AVGFACSAL | Average faculty salary |
| PCTPELL | Percentage of undergraduates who receive a Pell Grant |
| PCTFLOAN | Percent of all undergraduate students receiving a federal student loan |
| AGE_ENTRY | Average age of entry |
| FEMALE | Share of female students |
| MARRIED | Share of married students |
| DEPENDENT | Share of dependent students |
| VETERAN | Share of veteran students |
| FIRST_GEN | Share of first-generation students |
| FAMINC | Average family income |
I wish to gain a better understanding of what factors could make a
university more or less appealing for college students. To do so, I will
first analyze relationships between the college_scorecard
variables listed above. Specifically, I want to see whether there are
any relationships between the cost of attendance, average family income,
and average faculty salary. I will also use descriptive analyses to compare
universities with a strong presence of female, married, veteran, and
first-generation students.
Then, I want to use these insights to interpret the student satisfaction ratings that I gathered for midwestern colleges from the Rate My Professor website. Do colleges with a high cost of attendance have higher or lower rates of student satisfaction? Do current students seem more or less satisfied at institutions where the average faculty salary is high? These are a few of the questions I hope to answer in this project.
Summary Statistics
Take some time to look at the following descriptive visualizations.
## # A tibble: 3 × 3
## CONTROL `Average Cost of Attendance` `Average Faculty Salary`
## <chr> <dbl> <dbl>
## 1 Private for-profit 26329. 4854.
## 2 Private non-profit 38760. 6814.
## 3 Public 15695. 7389.
Here are the key takeaways from the following visualizations:
Private Non-Profit institutions had the highest average cost of attendance at 38,760 dollars, followed by Private For-Profit institutions at 26,329 dollars. It is no surprise that Public schools had the lowest average cost, at 15,695 dollars.
Surprisingly, faculty at Public institutions received the highest average salary at 7,389 dollars, while Private Non-Profit and Private For-Profit institutions had average faculty salaries of 6,814 dollars and 4,854 dollars, respectively. (The College Scorecard reports AVGFACSAL as an average monthly salary, which is why these figures look low.)
Looking at the average cost of attendance overall, the distribution is roughly bell-shaped but skewed to the right. There is a high concentration of universities whose average cost of attendance is between 10,000 and 25,000 dollars.
Before we dive deeper into identifying relationships between variables, let’s first look at how many institutions fall under each classification of control. Private For-Profit institutions make up the largest group in this data set, at about 3,000 schools. Public schools make up the second-largest group at just over 2,000 schools, and Private Non-Profit institutions are slightly fewer in number than Public schools.
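A minimal sketch of how a summary table and group counts like the ones above could be produced with dplyr. The data frame `scorecard_toy` and its values are invented for illustration; only the column names (CONTROL, COSTT4_A, AVGFACSAL) come from the variable table, and this is not necessarily the exact code behind the output shown.

```r
library(dplyr)

# Toy stand-in for the scorecard data; every value here is made up.
scorecard_toy <- data.frame(
  CONTROL   = c("Public", "Public", "Private non-profit",
                "Private non-profit", "Private for-profit"),
  COSTT4_A  = c(15000, 17000, 40000, NA, 26000),
  AVGFACSAL = c(7000, 7800, 6500, 7100, 4900),
  stringsAsFactors = FALSE
)

# Average cost and faculty salary per control type, skipping missing values.
control_summary <- scorecard_toy %>%
  group_by(CONTROL) %>%
  summarise(
    `Average Cost of Attendance` = mean(COSTT4_A, na.rm = TRUE),
    `Average Faculty Salary`     = mean(AVGFACSAL, na.rm = TRUE),
    .groups = "drop"
  )

# Number of institutions per control type, largest group first.
control_counts <- scorecard_toy %>%
  count(CONTROL, sort = TRUE)

print(control_summary)
print(control_counts)
```

Passing `na.rm = TRUE` matters here because many institutions do not report every field, and a single missing value would otherwise turn a group's mean into `NA`.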
Gaining Foundational Knowledge on Midwestern Colleges
Most of the colleges I scraped from Rate My Professor were in midwestern states, so I wanted to lay the groundwork by looking at relationships among institutions located in the following states: Ohio, Kentucky, Illinois, Indiana, and Michigan.
Family Income
I was first interested in seeing what the distribution of Average Family Income looked like in those states. Based on the visual below, the median of the institutions' average family income in all five states was somewhere between 25,000 and 38,000 dollars, which was significantly lower than I expected. A high number of outliers makes the range of these incomes quite wide. For instance, there are three institutions in Ohio where the average family income is well above 125,000 dollars.
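To make the idea of an "outlier" concrete, here is a rough sketch that computes the per-state median of FAMINC and flags institutions using the common 1.5 × IQR rule that boxplots typically rely on. The data frame `scorecard_toy`, the school names, and the income values are all invented, and the exact rule used by the original plot is an assumption.

```r
library(dplyr)

midwest_states <- c("OH", "KY", "IL", "IN", "MI")

# Toy data; values are invented and "CA" is included only to show the filter.
scorecard_toy <- data.frame(
  STABBR = c(rep("OH", 6), rep("KY", 4), "CA"),
  INSTNM = paste("School", 1:11),
  FAMINC = c(28000, 30000, 32000, 34000, 36000, 130000,
             25000, 27000, 29000, 31000, 90000),
  stringsAsFactors = FALSE
)

flagged <- scorecard_toy %>%
  filter(STABBR %in% midwest_states) %>%
  group_by(STABBR) %>%
  mutate(
    median_faminc = median(FAMINC, na.rm = TRUE),
    q1 = quantile(FAMINC, 0.25, na.rm = TRUE),
    q3 = quantile(FAMINC, 0.75, na.rm = TRUE),
    # Outside the fences at 1.5 * IQR below Q1 or above Q3.
    is_outlier = FAMINC < q1 - 1.5 * (q3 - q1) |
                 FAMINC > q3 + 1.5 * (q3 - q1)
  ) %>%
  ungroup()

# Institutions that would appear as individual points on a boxplot.
flagged %>%
  filter(is_outlier) %>%
  select(STABBR, INSTNM, FAMINC) %>%
  print()
```

In this toy example only the Ohio school with a 130,000 dollar average family income falls outside the fences, which mirrors the kind of high-income outliers described above.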
Relationship between Admission Rate and Average SAT Score
Next, I wanted to gain a better understanding of the relationship between admission rate and average SAT score for the same group of midwestern universities. It is not very surprising to see (in the scatterplot below) that schools with an average SAT score between 1000 and 1200 have higher admission rates. We can also see that schools with a lower admission rate (less than 25%) have higher average SAT scores, above 1400.
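One way to put a number on the pattern in the scatterplot is a correlation coefficient. The sketch below uses base R on a small invented data frame, `midwest_toy`; the values are illustrative only and the original analysis may not have computed this.

```r
# Toy admission-rate / SAT pairs; values are invented, one row has a missing rate.
midwest_toy <- data.frame(
  ADM_RATE = c(0.90, 0.80, 0.75, 0.60, 0.40, 0.20, NA),
  SAT_AVG  = c(1010, 1080, 1100, 1190, 1300, 1450, 1200)
)

# Pearson correlation using only rows where both values are present.
adm_sat_cor <- cor(midwest_toy$ADM_RATE, midwest_toy$SAT_AVG,
                   use = "complete.obs")

# Average SAT score among highly selective schools (admission rate below 25%).
selective_rows <- which(midwest_toy$ADM_RATE < 0.25)
selective_sat  <- mean(midwest_toy$SAT_AVG[selective_rows])

print(adm_sat_cor)
print(selective_sat)
```

A negative coefficient would be consistent with the observation that more selective schools tend to have higher average SAT scores.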
Student Population Distributions
I also wanted to gain a better understanding of the distribution of the following student populations in midwestern universities: female, married, veteran, and first-generation students.
Looking at female populations for the same group of midwestern states, we see that the median percentage of students who are female is well over 50% and that Ohio has a fairly symmetric distribution. Michigan, Illinois, Kentucky, and Indiana are all left skewed, with multiple outliers in the low range of 10-20% female populations.
Looking at married populations for the same group of midwestern states, we see that the median percentage of students who are married is between 10% and 20% and that all five states appear to have symmetric distributions. Indiana has no outliers, but the other four states do, with married students making up 30-50% of the student population at some colleges. Upon further investigation, the institution in Illinois with a married share of 66% was the Rosel School of Cosmetology, where the average age of entry is 36 years old. According to the Population Reference Bureau, the median age at first marriage in America is between 28 and 30 years old for men and women, so this checks out: https://www.prb.org/usdata/indicator/marriage-age-women/snapshot/
The following code chunk shows how I found the outlier in Illinois:
scorecard %>% select(STABBR, MARRIED, INSTNM, AGE_ENTRY) %>% filter(STABBR == "IL") %>% arrange(-MARRIED) %>% view
It is not surprising to see that veterans make up a very low percentage of students at most universities in the midwest. At first I was surprised to see so many outliers for the state of Ohio, but then I thought about Wright-Patterson Air Force Base, which is one of the largest military bases operated by the Air Force in the United States. This base is located just up north in Dayton, Ohio. (Fun fact: I was born at WPAFB!)
Looking at first-generation student populations for the same group of midwestern states, we see that the median percentage of students who are first-gen is between 40% and 50% and that Ohio has a fairly symmetric distribution, with a slight skew to the left. The two institutions in Illinois that have first-gen populations above 70% are the National Latino Education Institute (76%) and the Cannella School of Hair Design-Chicago (73%). The one upper outlier in Michigan is Mr Bela’s School of Cosmetology Inc, which had a first-gen population of 77%.
The following code chunks show how I found the outliers in Illinois and Michigan:
scorecard %>% select(STABBR, FIRST_GEN, INSTNM) %>% filter(STABBR == "IL") %>% arrange(-FIRST_GEN) %>% view
scorecard %>% select(STABBR, FIRST_GEN, INSTNM) %>% filter(STABBR == "MI") %>% arrange(-FIRST_GEN) %>% view
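The two lookups above could also be done in a single pass. The sketch below is one possible alternative using `slice_max()` from dplyr to pull the highest first-generation shares within each state; `scorecard_toy` and its school names are hypothetical stand-ins, not rows from the real data.

```r
library(dplyr)

# Toy data; school names and shares are placeholders for illustration.
scorecard_toy <- data.frame(
  STABBR    = c("IL", "IL", "IL", "MI", "MI"),
  INSTNM    = c("School A", "School B", "School C", "School D", "School E"),
  FIRST_GEN = c(0.76, 0.73, 0.45, 0.77, 0.40),
  stringsAsFactors = FALSE
)

# Top two first-generation shares in each state of interest.
top_first_gen <- scorecard_toy %>%
  filter(STABBR %in% c("IL", "MI")) %>%
  group_by(STABBR) %>%
  slice_max(FIRST_GEN, n = 2) %>%
  ungroup() %>%
  select(STABBR, INSTNM, FIRST_GEN)

print(top_first_gen)
```

Grouping first means the ranking restarts for every state, so the same code scales to all five midwestern states by extending the filter.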
Joining college_scorecard with scraped Rate My Professor data
Introducing Rate My Professor Ethically-Scraped Data
I scraped Rate My Professor websites for the following universities to compare reviews that current and past students have posted about their respective colleges.
- Miami University
- Ohio University
- Sinclair Community College
- The Ohio State University
- University of Cincinnati
- University of Dayton
- University of Kentucky
- Xavier University
I also scraped the following variables from each university’s respective website:
| Variable | Description |
|---|---|
| school name | Name of the University |
| review content | Content of the review |
| review date | Date the reviewer posted the review |
| review emotion | Attitude the reviewer expressed towards the university (awesome, average, or awful) |
| review rating | The score the reviewer gave to the University (on a scale from 1-5) |
Comparing Average Cost of Attendance with Student Satisfaction
The visualization below shows the distribution of ratings for each university. As we can see, Ohio University has the widest range of scores, while the University of Kentucky has the smallest range. It seems that many college students have similar attitudes towards UK, whereas the opposite is true of OU. Xavier was fairly average compared to the other universities: it did not have a surprisingly high median rating or a large IQR. It is noteworthy, in my opinion, that this could look a lot different if the outlier were excluded from Xavier’s review data. After I looked into it a little more, though, I found that the 1.7 rating was attributed to Xavier not meeting the standards the reviewer had formed based on friends’ reviews. The review was also posted in November of 2021, during the unusual semester when the University was cracking down on social gatherings of 10 or more people both on and off campus. There could be a connection between the low rating and the date it was published.
Another interesting takeaway was that the University of Dayton had the highest median score, 4.5 out of 5, while also having the highest cost of attendance (56,370 dollars) of the 8 universities I scraped. Perhaps the university has allocated a generous share of its tuition revenue to enhancing the student experience at UD. The same cannot be said for the University of Kentucky, which had the lowest median score of ~2.9 out of 5 while also charging the 4th-highest cost of attendance at 27,215 dollars. While UK charges its students a higher cost of attendance than 4 other universities, those 4 universities (UC, OU, OSU, SCC) all have median ratings above 4 out of 5. There seems to be an opportunity for UK to improve student satisfaction with the funds it already has.
## # A tibble: 8 × 3
## INSTNM COSTT4_A AVGFACSAL
## <chr> <dbl> <dbl>
## 1 University of Dayton 56370 8687
## 2 Xavier University 50880 8694
## 3 Miami University-Oxford 30420 9395
## 4 University of Kentucky 27215 10644
## 5 University of Cincinnati-Main Campus 26829 9708
## 6 Ohio University-Main Campus 26233 8978
## 7 Ohio State University-Main Campus 25498 12153
## 8 Sinclair Community College 7773 7517
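A comparison like the one above requires joining the scraped reviews to the scorecard rows. Because the school names differ between the two sources (for example "Miami University" in the scraped list versus "Miami University-Oxford" in the scorecard table), one option is a small name lookup table. In the sketch below, `reviews_toy`, `name_lookup`, and the individual ratings are invented for illustration; the cost and salary figures are copied from the table above.

```r
library(dplyr)

# Invented review ratings for two of the scraped schools.
reviews_toy <- data.frame(
  school_name   = c("Miami University", "Miami University",
                    "University of Dayton", "University of Dayton",
                    "University of Dayton"),
  review_rating = c(3.5, 4.1, 4.5, 4.8, 4.2),
  stringsAsFactors = FALSE
)

# Scorecard rows for the same schools (figures taken from the table above).
scorecard_toy <- data.frame(
  INSTNM    = c("Miami University-Oxford", "University of Dayton"),
  COSTT4_A  = c(30420, 56370),
  AVGFACSAL = c(9395, 8687),
  stringsAsFactors = FALSE
)

# Hypothetical crosswalk from the scraped name to the scorecard name.
name_lookup <- data.frame(
  school_name = c("Miami University", "University of Dayton"),
  INSTNM      = c("Miami University-Oxford", "University of Dayton"),
  stringsAsFactors = FALSE
)

# Median rating per school, then attach cost and salary through the crosswalk.
school_summary <- reviews_toy %>%
  group_by(school_name) %>%
  summarise(median_rating = median(review_rating),
            n_reviews     = n(),
            .groups = "drop") %>%
  left_join(name_lookup, by = "school_name") %>%
  left_join(scorecard_toy, by = "INSTNM") %>%
  arrange(desc(COSTT4_A))

print(school_summary)
```

Using `left_join()` keeps every scraped school in the result, so any school whose name fails to match shows up with `NA` cost values instead of silently disappearing.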
Comparing Average Faculty Salary with Student Satisfaction
My final question was whether students who attend the institutions I scraped data from had any positive words to say about professors, and how their review contents compare to what the faculty at these universities are paid. Are students more or less satisfied with faculty who are paid a high amount?
Using the table from the previous section, we can see that the school with the highest-paid faculty is Ohio State University, at 12,153 dollars, on average. Let’s perform a basic sentiment analysis to see if any reviewers mention the word “professor” in their reviews.
After looking at this, it is evident that the word “professor” was not a common word in the reviews. The most common words with positivity scores were “love, cute, amazing, recommend, super, pretty, nice”. Next, let’s look at a simple table showing which reviews did contain the word “professor”.
## # A tibble: 26 × 3
## school_name review_content revie…¹
## <chr> <chr> <dbl>
## 1 Xavier University The professors are really willing to help … 3.5
## 2 Xavier University Pros: Most professors actually care about … 3.6
## 3 Xavier University All the clubs get advertised in the studen… 4.3
## 4 Xavier University As with most schools you get exactly what … 4.5
## 5 Xavier University Would NEVER recommend the accelerated nurs… 2.8
## 6 University of Cincinnati The overall college experience is great! T… 4.3
## 7 University of Cincinnati I am in Cybersecurity program at UC and I … 4.9
## 8 University of Cincinnati As long as you actively take advantage of … 3.7
## 9 University of Cincinnati Pros: Great co op program Friendly student… 2.7
## 10 University of Cincinnati One of the best recognized University glob… 4.7
## # … with 16 more rows, and abbreviated variable name ¹review_rating
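The two steps described above (scoring words against a sentiment lexicon, then pulling out reviews that mention "professor") can be sketched as follows. The lexicon `lexicon_toy` and its scores are hypothetical, the reviews in `reviews_toy` are toy examples, and the original analysis may have used a different lexicon or package.

```r
library(dplyr)

# Toy reviews; ratings are invented.
reviews_toy <- data.frame(
  school_name    = c("School A", "School A", "School B", "School B", "School C"),
  review_content = c("Love the campus, amazing people",
                     "The professors care about your success.",
                     "Nice campus but parking is awful",
                     "Would recommend, love the programs",
                     "Helpful Professors in the co-op program"),
  review_rating  = c(4.6, 4.3, 3.1, 4.4, 4.7),
  stringsAsFactors = FALSE
)

# Hypothetical mini sentiment lexicon (word plus a positivity score).
lexicon_toy <- data.frame(
  word  = c("love", "amazing", "nice", "recommend", "awful"),
  score = c(3, 4, 3, 2, -3),
  stringsAsFactors = FALSE
)

# Tokenize: lowercase, replace punctuation with spaces, split on whitespace.
clean <- gsub("[^a-z' ]", " ", tolower(reviews_toy$review_content))
words <- unlist(strsplit(clean, "\\s+"))
words <- words[words != ""]

# Keep only words found in the lexicon and count how often each appears.
word_scores <- data.frame(word = words, stringsAsFactors = FALSE) %>%
  inner_join(lexicon_toy, by = "word") %>%
  count(word, score, sort = TRUE)

# Reviews that mention "professor" in any capitalization or plural form.
professor_reviews <- reviews_toy %>%
  filter(grepl("professor", review_content, ignore.case = TRUE))

print(word_scores)
print(professor_reviews)
```

Because "professor" carries no score in a typical sentiment lexicon, it drops out of the scored word list, which is why a separate text filter is needed to find those reviews.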
It is evident that even though OSU has the highest-paid faculty out of the 8 schools I scraped from, the three OSU student reviews that mention the word “professor” are pretty mixed in terms of positive and negative attitudes.
For instance, one review reads “Excellent school with incredible professors and opportunities.” This student clearly had positive interactions with their teachers.
On the other hand, another reviewer stated “Some professors are professional and compassionate towards students, while others are total jerks with their harsh grading and condescending behaviors/attitudes.” There is obviously some gray area, which isn’t surprising. I guarantee that every college has both good and bad professors; we don’t live in a utopian society where everyone behaves the way we want them to.
To wrap up this project, I wanted to look at what Xavier students had to say about the professors they have had.
| Xavier University Review Content that Included “professor” |
|---|
| The professors are really willing to help you and are interested in your success there are a lot of programs that are freely available to help if your struggling as well. Cleaney ave is horrible to drive on and parking is limited when you get on campus. Because of the small size it can be difficult to make friends if you aren’t very outgoing. |
| Pros: Most professors actually care about your success The campus looks like a park and is well kept Smaller class size BASKETBALL SEASON! Cons: High costs with or without aid CLENEAY AVE IS HORRIBLE TO DRIVE OVER Limited Parking DIFFICULT TO MAKE REAL FRIENDS AS A COMMUTER Lack of Diversity THE STAIRS THAT LEAD UP TO ELET HALL |
| All the clubs get advertised in the student center so you know what is going on that week. The professors care about your success. If you enjoy basketball, going to the basketball games is an amazing experience. It’s the perfect size so meeting people was not as hard as it would be at other schools. |
| As with most schools you get exactly what you put into your college experience. Basketball is king. Parties can be found if you look for them. The professors genuinely care about your success. |
| Would NEVER recommend the accelerated nursing program at this school. It is disorganized, almost every professor is rude and not willing to help you learn or do better, and the pass rate is incredibly slim due to the lack of support they provide. Save yourself time, money, and your tears and pick another school!!!!! |
It was really nice to read that 4 out of the 5 reviews mentioned something along the lines of Xavier professors being very caring and prioritizing student success! I wholeheartedly agree! :)