Final
My Area of Interest
Having been a student at Xavier for the past 4 years, I have heard one debate floating around the city the whole time: who gets the better education, students at a larger public school (like UC) or students at a smaller private school (like XU)? The question is ambiguous, hence the everlasting debate within Cincinnati. To combat the ambiguity, I will be comparing private schools to public schools on several performance factors, and I will also run a sentiment analysis of reviews for XU and UC.
First set of data
My first dataset is the College Scorecard. If you have taken Joel’s data modeling course then you should be well versed in this dataset, but for those of you who were not granted that special opportunity, I will explain it.
The College Scorecard is a data tool provided by the U.S. Department of Education that contains information about colleges and universities in the United States. It includes data on enrollment, graduation rates, financial aid, student demographics, and family income, among other things.
The full dataset can be accessed at the link below, as well as the data dictionary beneath.
http://asayanalytics.com/scorecard_clean-csv
Data Dictionary
ID - Unique ID for institution
INSTNM - Institution name
CITY - City
STABBR - State postcode
ZIP - Zip code
CONTROL - 1 if public, 2 if private nonprofit, 3 if private for-profit
LOCALE - Locale of institution - city, suburb, town, rural
LATTITUDE - Latitude of school
LONGITUDE - Longitude of school
HBCU - 0 if it is not an HBCU, 1 if it is an HBCU
MENONLY - 0 if not men only, 1 if men only institute
WOMENONLY - 0 if not women only, 1 if women only institute
ADM_RATE - Admission rate
ACTCM25 - 25th percentile of the ACT cumulative score
ACTCM75 - 75th percentile of the ACT cumulative score
ACTCMMID - Midpoint of the ACT cumulative score
SAT_AVG - Average SAT equivalent score of students admitted
UGDS - Enrollment of undergraduate certificate/degree-seeking students
COSTT4_A - Average cost of attendance (academic year institutions)
AVGFACSAL - Average faculty salary
PCTPELL - Percentage of undergraduates who receive a Pell Grant
PCTFLOAN - Percent of all undergraduate students receiving a federal student loan
AGE_ENTRY - Average age of entry
FEMALE - Share of female students
MARRIED - Share of married students
DEPENDENT - Share of dependent students
VETERAN - Share of veteran students
FIRST_GEN - Share of first-generation students
FAMINC - Average family income in real 2015 dollars
Below are some summary statistics to help understand the data better.
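As an illustration of how summary statistics like these could be computed, here is a minimal Python sketch using only the standard library. The rows are made-up values, not real College Scorecard data, and the `summarize` helper is a hypothetical name introduced for this example. It skips missing values, which matters because columns such as SAT_AVG may not be reported for every school.

```python
import statistics

# Made-up rows for illustration only; NOT real College Scorecard values.
# None stands in for a missing value in the CSV.
rows = [
    {"INSTNM": "Example College A", "SAT_AVG": 1100, "UGDS": 4500},
    {"INSTNM": "Example College B", "SAT_AVG": None, "UGDS": 22000},
    {"INSTNM": "Example College C", "SAT_AVG": 1250, "UGDS": 1800},
    {"INSTNM": "Example College D", "SAT_AVG": 980, "UGDS": 9700},
]

def summarize(rows, column):
    """Return basic summary statistics for one numeric column, skipping missing values."""
    values = [r[column] for r in rows if r[column] is not None]
    return {
        "count": len(values),
        "missing": len(rows) - len(values),
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }

sat_summary = summarize(rows, "SAT_AVG")
ugds_summary = summarize(rows, "UGDS")
print(sat_summary)
print(ugds_summary)
```

Reporting the number of missing values next to the mean and median helps show how much of the dataset each statistic actually covers.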
Second set of data
As stated in my introduction, the question at hand is quite ambiguous, and to combat that I will be using a second dataset. For this one, I scraped the reviews of UC and XU posted on niche.com. Niche.com is a website where students, faculty, or really anyone can rate a school on several factors and leave a review if they wish. I focused on the reviews rather than the ratings and turned the unstructured website into a structured dataset. Although this analysis is about private vs public and big vs small schools as a whole, the debate we are going after is about these two schools in particular. Because of that, I only scraped the reviews for these two schools, and we will compare sentiment based on those reviews.
To access this dataset, you can use the link below.
Data Dictionary
School - Either University of Cincinnati or Xavier University
Review - The content of the review
Reviewer_year - The year of the reviewer: fresh, soph, junior, senior, staff, or other
Rating - Excellent, very good, average, poor, or terrible
Descriptive Analysis
Comparing Xavier University to University of Cincinnati
To begin the analysis, it helps to start with some background on the two schools. I filtered the data down to Xavier and UC and kept about half of the available columns. Other columns, such as latitude or locale, I felt were unimportant for this comparison. Here we just want an idea of how big each school is, what it costs, the family income of its students, and other things such as financial aid.
What this table tells us is that although the size and cost of the two schools are drastically different, the academics are very similar. Using SAT and ACT scores as success metrics, both schools come in above average.
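The filtering step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exact code behind the table: the CSV text is made up, the institution names are placeholders (the real INSTNM values for XU and UC would need to be looked up in the dataset), and `compare_schools` is a hypothetical helper name.

```python
import csv
import io

# Made-up CSV text for illustration only; NOT real College Scorecard values.
SAMPLE_CSV = """INSTNM,CONTROL,UGDS,COSTT4_A,SAT_AVG,LATTITUDE
Example Private University,2,4000,50000,1150,39.1
Example Public University,1,25000,27000,1160,39.1
Another College,3,900,21000,,40.2
"""

def compare_schools(csv_text, names, columns):
    """Keep only the named institutions and only the requested columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted = set(names)
    return [
        {col: row[col] for col in columns}
        for row in reader
        if row["INSTNM"] in wanted
    ]

comparison = compare_schools(
    SAMPLE_CSV,
    names=["Example Private University", "Example Public University"],
    columns=["INSTNM", "UGDS", "COSTT4_A", "SAT_AVG"],
)
for row in comparison:
    print(row)
```

Columns left out of `columns` (here LATTITUDE) simply do not appear in the result, which mirrors dropping the columns that are unimportant for the comparison.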
Visualization 1
As number of undergrads increases, what happens to average SAT scores?
Although it is not the strongest correlation, what stands out to me in this graph is that for public schools, as the number of undergrads increases, the SAT scores also increase.
On the other hand, we do not see as strong a correlation for private schools. There does appear to be some positive correlation, but it is not nearly as strong. One explanation I can think of is that private schools are generally small and rarely reach the size of the bigger public schools, so there is less range in enrollment to show a trend.
Visualization 2
As cost for attendance increases, what happens to average SAT scores?
This visual clearly shows a positive relationship between cost and SAT scores. To me, this is a good thing: it is good to know that the more a school costs to attend, the better the education it should generally be giving its students. It is also interesting to see such a positive relationship for the private schools.
With regard to the debate in Cincy, this would be a good visual to support Xavier, as it is pricier and is also private.
Visualization 3
As family income increases, what happens to average SAT scores?
As a Xavier student, I find this visual good to see. Although I am an outlier, many students at Xavier come from wealthy families, and that ends up being a major talking point in this debate around the city. UC kids especially love to say that Xavier students are all rich kids and that the school is expensive for no reason.
What this visual shows is that, regardless of school type (private or public), the higher the average family income at a school, the higher that school's SAT scores tend to be. There could be many reasons for this, such as access to a good education, but the pattern is about schools as a whole rather than individual students: the more rich kids a school has, the higher its test scores tend to be. I know that I am not a “rich kid”, but I do go to a “rich kid” school, so it is good to see that the rich kid schools produce the better test scores.
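Another way to check the same pattern without a chart is to split schools at the median family income and compare the mean SAT score of each half. The sketch below uses made-up (FAMINC, SAT_AVG) pairs, and `mean_sat_by_income_half` is a hypothetical helper name.

```python
import statistics

# Made-up (FAMINC, SAT_AVG) pairs for illustration only.
schools = [
    (30000, 950), (45000, 1020), (60000, 1080),
    (75000, 1100), (90000, 1180), (110000, 1250),
]

def mean_sat_by_income_half(schools):
    """Split schools at the median FAMINC and return the mean SAT_AVG of each half."""
    median_income = statistics.median([income for income, _ in schools])
    lower = [sat for income, sat in schools if income <= median_income]
    higher = [sat for income, sat in schools if income > median_income]
    return {
        "median_income": median_income,
        "lower": statistics.mean(lower),
        "higher": statistics.mean(higher),
    }

halves = mean_sat_by_income_half(schools)
print(halves)
```

Splitting at the median throws away detail compared with a scatter plot, but it gives a single easy comparison between lower-income and higher-income schools.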
Visualization 4
As faculty salary increases, what happens to average SAT scores?
This visual also shows a strong positive relationship regardless of school type. The more a school pays its faculty, the higher its test scores are on average. This is an interesting talking point in the debate because, as you can see in the earlier comparison of XU to UC, UC actually pays its faculty more on average. Faculty salary is definitely something that XU students have heard their professors complain about, so it is interesting to see that UC is the higher-paying school.
If we were debating in a bar about which school is better, a UC student could point to the higher faculty salary, which tends to go along with better test scores.
Visualization 5
Does the share of students receiving federal loans affect the admission rate of a school?
As a student who does take advantage of federal loans at Xavier, I was interested to see whether this relates to how many applicants a school accepts.
There is definitely a lot going on here, but what stands out is the positive relationship we see for the private schools. This means that schools with a higher admission rate (that is, schools that are easier to get into) tend to have a larger share of students taking on federal loans.
As far as the debate in Cincy goes, this graph may support the UC side, because it suggests that schools with a low admission rate are harder to get into for students from lower-income families. Basically, the harder it is to get into a school, the less likely its students are to use federal loans, and so the more likely they are to come from wealthier families. In short, school is pay to win.
Visualization 6 - Positive Words
Using my second set of data now, we are looking at the reviews that people left on Xavier's and UC’s niche.com pages. This chart compares the number of positive words in the reviews for each school, broken down by who left the review. Although the schools have a relatively even number of positive words overall, the freshmen and sophomores stand out. Xavier gets more positive words from its freshmen, and UC gets more positive words from its sophomores. As a former Xavier freshman, the first thing that comes to mind is the orientation the school gives. That first week of college introduces you to a lot of new things and people, and Xavier does a good job of making freshmen feel at home. For UC sophomores, I can see them enjoying the night life a little more and easing into their school. It is a much bigger school, so maybe it just takes a semester or two to settle in and get comfortable.
Visualization 7 - Positive vs Negative
This is just a quick count of positive vs negative words used for each school, and to me the most telling thing is how similar the results are. Sure, Xavier has more positive words and fewer negative words, so this chart could help Xavier students in their debate. But overall, the numbers are so close that I would just say it is interesting that the reviews come out so similar.
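Counting positive words by school and reviewer year could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the reviews are invented text, not scraped niche.com content, and the five-word lexicon is a tiny hypothetical stand-in for a full published sentiment lexicon. `count_positive_words` is a hypothetical helper name.

```python
import re
from collections import Counter

# Tiny hypothetical lexicon; a real analysis would use a full sentiment lexicon.
POSITIVE_WORDS = {"great", "love", "friendly", "helpful", "amazing"}

# Invented reviews for illustration only; NOT real niche.com content.
reviews = [
    {"School": "Xavier University", "Reviewer_year": "fresh",
     "Review": "Great campus and friendly people. Love it!"},
    {"School": "Xavier University", "Reviewer_year": "senior",
     "Review": "Classes are fine."},
    {"School": "University of Cincinnati", "Reviewer_year": "soph",
     "Review": "Amazing co-op program, helpful advisors."},
    {"School": "University of Cincinnati", "Reviewer_year": "fresh",
     "Review": "Parking is terrible but I love the city."},
]

def count_positive_words(reviews, lexicon):
    """Count lexicon words per (School, Reviewer_year) group."""
    counts = Counter()
    for review in reviews:
        # Lowercase and split into words so "Great" and "great" match.
        words = re.findall(r"[a-z']+", review["Review"].lower())
        key = (review["School"], review["Reviewer_year"])
        counts[key] += sum(word in lexicon for word in words)
    return counts

positive_counts = count_positive_words(reviews, POSITIVE_WORDS)
for (school, year), n in sorted(positive_counts.items()):
    print(school, year, n)
```

Raw counts favor groups that wrote more reviews, so dividing by the number of reviews or total words in each group would make the comparison fairer.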
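A positive vs negative tally per school can be sketched the same way, this time with two word lists. As before, the review text is invented and both lexicons are tiny hypothetical examples, so the counts say nothing about the real reviews. `polarity_counts` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Tiny hypothetical lexicons for illustration only.
POSITIVE = {"great", "love", "helpful"}
NEGATIVE = {"terrible", "boring", "expensive"}

# Invented (School, Review) pairs; NOT real niche.com content.
reviews = [
    ("Xavier University", "Great professors but expensive tuition."),
    ("Xavier University", "I love the small classes."),
    ("University of Cincinnati", "Helpful advisors, terrible parking."),
    ("University of Cincinnati", "Some lectures are boring but the co-op is great."),
]

def polarity_counts(reviews):
    """Tally positive and negative lexicon words for each school."""
    totals = defaultdict(lambda: {"positive": 0, "negative": 0})
    for school, text in reviews:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in POSITIVE:
                totals[school]["positive"] += 1
            elif word in NEGATIVE:
                totals[school]["negative"] += 1
    return dict(totals)

totals = polarity_counts(reviews)
for school, tally in totals.items():
    print(school, tally)
```

Word-level counting ignores negation ("not great" still counts as positive), which is one reason close results between two schools should be read with caution.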
Conclusion
Overall, using all of this data and these charts, you can easily make a case for or against the smaller, more expensive private school compared to the larger, less expensive public school. I feel that comparing XU to UC is a good way to compare the two types of schools because the two are in about the same location and their academics are so similar. This analysis may not have ended the debate; it may have only fueled it more.