This report analyzes several aspects of college data from the CollegeScores4yr dataset, focusing on tuition, admission rates, net price variation, SAT scores, and the demographic and regional distribution of institutions. The analysis applies descriptive statistical methods, including measures of central tendency, variability, and correlation, along with visualizations, to summarize how these characteristics vary across colleges.
Some questions that will be explored are:
What is the average in-state tuition for colleges across different regions?
What is the median admission rate among colleges in this dataset?
How much variation exists in net price among colleges?
What is the standard deviation of average SAT scores across colleges?
Does the percentage of first-generation students correlate with the completion rate?
What are the quartiles for average debt among students who complete the program?
How are undergraduate enrollment numbers distributed across different institutions?
How does the percent of faculty that are full-time vary across different types of control (Private, Public, Profit)?
How are average SAT scores distributed across colleges?
What is the percentage distribution of colleges based on region?
This analysis uses the R programming language within the RStudio environment to process and interpret college data from the CollegeScores4yr dataset.
We first load the dataset directly into R using the read.csv() function, which reads the CSV file from its URL into a data frame.
To explore the dataset, we apply descriptive statistics such as mean(), median(), var(), sd(), cor(), and quantile(), along with displays such as stem(), hist(), boxplot(), and pie().
Each analysis is followed by an interpretation of the results, providing context.
In this analysis, we explore the questions above using the CollegeScores4yr dataset. By applying descriptive statistics, we examine both the financial and academic environments of colleges.
college = read.csv("https://www.lock5stat.com/datasets3e/CollegeScores4yr.csv")
head(college)
## Name State ID Main
## 1 Alabama A & M University AL 100654 1
## 2 University of Alabama at Birmingham AL 100663 1
## 3 Amridge University AL 100690 1
## 4 University of Alabama in Huntsville AL 100706 1
## 5 Alabama State University AL 100724 1
## 6 The University of Alabama AL 100751 1
## Accred
## 1 Southern Association of Colleges and Schools Commission on Colleges
## 2 Southern Association of Colleges and Schools Commission on Colleges
## 3 Southern Association of Colleges and Schools Commission on Colleges
## 4 Southern Association of Colleges and Schools Commission on Colleges
## 5 Southern Association of Colleges and Schools Commission on Colleges
## 6 Southern Association of Colleges and Schools Commission on Colleges
## MainDegree HighDegree Control Region Locale Latitude Longitude AdmitRate
## 1 3 4 Public Southeast City 34.78337 -86.56850 0.9027
## 2 3 4 Public Southeast City 33.50570 -86.79935 0.9181
## 3 3 4 Private Southeast City 32.36261 -86.17401 NA
## 4 3 4 Public Southeast City 34.72456 -86.64045 0.8123
## 5 3 4 Public Southeast City 32.36432 -86.29568 0.9787
## 6 3 4 Public Southeast City 33.21187 -87.54598 0.5330
## MidACT AvgSAT Online Enrollment White Black Hispanic Asian Other PartTime
## 1 18 929 0 4824 2.5 90.7 0.9 0.2 5.6 6.6
## 2 25 1195 0 12866 57.8 25.9 3.3 5.9 7.1 25.2
## 3 NA NA 1 322 7.1 14.3 0.6 0.3 77.6 54.4
## 4 28 1322 0 6917 74.2 10.7 4.6 4.0 6.5 15.0
## 5 18 935 0 4189 1.5 93.8 1.0 0.3 3.5 7.7
## 6 28 1278 0 32387 78.5 10.1 4.7 1.2 5.6 7.9
## NetPrice Cost TuitionIn TuitonOut TuitionFTE InstructFTE FacSalary
## 1 15184 22886 9857 18236 9227 7298 6983
## 2 17535 24129 8328 19032 11612 17235 10640
## 3 9649 15080 6900 6900 14738 5265 3866
## 4 19986 22108 10280 21480 8727 9748 9391
## 5 12874 19413 11068 19396 9003 7983 7399
## 6 21973 28836 10780 28100 13574 10894 10016
## FullTimeFac Pell CompRate Debt Female FirstGen MedIncome
## 1 71.3 71.0 23.96 1068 56.4 36.6 23.6
## 2 89.9 35.3 52.92 3755 63.9 34.1 34.5
## 3 100.0 74.2 18.18 109 64.9 51.3 15.0
## 4 64.6 27.7 48.62 1347 47.6 31.0 44.8
## 5 54.2 73.8 27.69 1294 61.3 34.3 22.1
## 6 74.0 18.0 67.87 6430 61.5 22.6 66.7
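The head() output above already contains NA entries (for example, AdmitRate and AvgSAT for Amridge University), which is why the later calls pass na.rm = TRUE. The following is a minimal sketch of how missing values could be counted per column before summarizing. The data frame `college_head` is a hypothetical stand-in built from three columns of the six rows printed above, not the full dataset.

```r
# Hypothetical stand-in: three columns copied from the six head() rows above.
college_head <- data.frame(
  AdmitRate = c(0.9027, 0.9181, NA, 0.8123, 0.9787, 0.5330),
  AvgSAT    = c(929, 1195, NA, 1322, 935, 1278),
  TuitionIn = c(9857, 8328, 6900, 10280, 11068, 10780)
)

# Count missing values in each column.
na_counts <- colSums(is.na(college_head))
print(na_counts)

# Without na.rm = TRUE, a single NA makes the summary NA.
median_raw   <- median(college_head$AdmitRate)
median_clean <- median(college_head$AdmitRate, na.rm = TRUE)
print(c(raw = median_raw, clean = median_clean))
```

On the full data, the same check would be colSums(is.na(college)).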
mean(college$TuitionIn, na.rm = TRUE)
## [1] 21948.55
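The mean in-state tuition across all colleges in the dataset is $21,948.55. Note that mean(college$TuitionIn) pools every region into a single figure rather than reporting a separate average for each region. This metric gives a baseline for comparing the tuition burden on students across colleges.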
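Since the question asks about tuition across regions, a per-group mean may be closer to what is wanted. The sketch below shows the pattern with tapply(). It is demonstrated on Control rather than Region only because the six head() rows shown earlier all fall in the Southeast region; `college_head` is a hypothetical stand-in built from those rows.

```r
# Hypothetical stand-in: Control and TuitionIn from the six head() rows above.
college_head <- data.frame(
  Control   = c("Public", "Public", "Private", "Public", "Public", "Public"),
  TuitionIn = c(9857, 8328, 6900, 10280, 11068, 10780)
)

# Mean of a numeric column within each level of a grouping column.
group_means <- tapply(college_head$TuitionIn, college_head$Control,
                      mean, na.rm = TRUE)
print(group_means)

# On the full data, the same pattern grouped by region would be:
# tapply(college$TuitionIn, college$Region, mean, na.rm = TRUE)
```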
median(college$AdmitRate, na.rm = TRUE)
## [1] 0.69505
The median admission rate among colleges in this dataset is 69.51%, indicating that half of the institutions with a reported rate admit approximately 70% or more of their applicants. This gives insight into the selectiveness of institutions.
var(college$NetPrice, na.rm = TRUE)
## [1] 61686826
The variance of net price among colleges is 61,686,826 (in squared dollars), which corresponds to a standard deviation of roughly $7,854. A variance this large indicates a wide spread in net prices, suggesting substantial differences in costs across colleges, possibly due to factors like financial aid, state funding, and institutional type.
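Because variance is measured in squared dollars, it is often easier to read after taking the square root. The sketch below illustrates the relationship between var() and sd() on the six NetPrice values from the head() output, then applies the same step to the variance reported above. The vector `net_price` is a hypothetical stand-in, not the full column.

```r
# NetPrice values copied from the six head() rows above (illustration only).
net_price <- c(15184, 17535, 9649, 19986, 12874, 21973)

v <- var(net_price)  # sample variance, in squared dollars
s <- sd(net_price)   # sample standard deviation, in dollars

# sd() is the square root of var().
same <- isTRUE(all.equal(sqrt(v), s))
print(same)

# Applying the same step to the variance reported above.
sd_full <- sqrt(61686826)
print(round(sd_full, 1))
```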
sd(college$AvgSAT, na.rm = TRUE)
## [1] 128.9077
The standard deviation of average SAT scores across colleges is 128.91 points, so a typical college's average SAT lies roughly 129 points from the overall mean. This spread may reflect differences in academic rigor or student body composition across institutions.
cor(college$FirstGen, college$CompRate, use = "complete.obs")
## [1] -0.6643909
The correlation between the percentage of first-generation students and the completion rate is -0.6644. This moderately strong negative correlation indicates that colleges with a higher percentage of first-generation students tend to have lower completion rates.
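To make the correlation coefficient less of a black box, the sketch below computes Pearson's r from its definition (covariance divided by the product of the standard deviations) and compares it with cor(). The two vectors are hypothetical stand-ins copied from the six head() rows, so the resulting value is not the dataset-wide figure.

```r
# FirstGen and CompRate copied from the six head() rows above (illustration only).
first_gen <- c(36.6, 34.1, 51.3, 31.0, 34.3, 22.6)
comp_rate <- c(23.96, 52.92, 18.18, 48.62, 27.69, 67.87)

# Pearson correlation from its definition.
r_manual <- cov(first_gen, comp_rate) / (sd(first_gen) * sd(comp_rate))

# Built-in equivalent.
r_builtin <- cor(first_gen, comp_rate)
print(c(manual = r_manual, builtin = r_builtin))

# Squaring the reported r gives the share of variance in one variable
# accounted for by a straight-line relationship with the other.
r_squared_full <- (-0.6643909)^2
print(round(r_squared_full, 2))
```

Under this reading, the reported correlation corresponds to about 44% of the variation in completion rate being linearly associated with the first-generation percentage. That describes an association, not a cause.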
quantile(college$Debt, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)
## 25% 50% 75%
## 325.00 713.50 2203.25
For average debt, the first quartile (25th percentile) is $325.00, the median (50th percentile) is $713.50, and the third quartile (75th percentile) is $2,203.25. These values show how the debt burden differs among colleges in the lower, middle, and upper parts of the distribution.
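The quartiles also give the interquartile range (IQR), the width of the middle half of the data. The sketch below shows that IQR() matches the difference between the third and first quartiles returned by quantile(), using the six Debt values from the head() output as a hypothetical stand-in, and then computes the IQR implied by the quartiles reported above.

```r
# Debt values copied from the six head() rows above (illustration only).
debt <- c(1068, 3755, 109, 1347, 1294, 6430)

q <- quantile(debt, probs = c(0.25, 0.5, 0.75))

# IQR is the third quartile minus the first quartile.
iqr_manual  <- unname(q["75%"] - q["25%"])
iqr_builtin <- IQR(debt)
print(c(manual = iqr_manual, builtin = iqr_builtin))

# IQR implied by the quartiles reported for the full dataset.
iqr_full <- 2203.25 - 325.00
print(iqr_full)
```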
hist(college$Enrollment, main = "Distribution of Undergraduate Enrollment", xlab = "Enrollment")
The histogram shows the distribution of undergraduate enrollment numbers across institutions, highlighting the variation in student body size. This visualization helps to identify if most colleges have large or small enrollments.
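The counts behind a histogram can also be inspected numerically, which helps when the figure is not at hand. The sketch below calls hist() with plot = FALSE to return the bin counts instead of drawing. The break points and the six Enrollment values (copied from the head() output) are assumptions chosen for illustration.

```r
# Enrollment values copied from the six head() rows above (illustration only).
enrollment <- c(4824, 12866, 322, 6917, 4189, 32387)

# Illustrative bin edges; these are an assumption, not the defaults used above.
edges <- c(0, 5000, 10000, 20000, 40000)

# plot = FALSE returns the histogram object without drawing it.
h <- hist(enrollment, breaks = edges, plot = FALSE)

# Label each count with its bin's upper edge.
bin_counts <- setNames(h$counts, paste0("<=", edges[-1]))
print(bin_counts)
```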
boxplot(college$FullTimeFac ~ college$Control, main = "Full-Time Faculty Percentage by Control Type", ylab = "Full-Time Faculty %")
The boxplot displays the distribution of full-time faculty percentages based on the control type of each institution (Private, Public, or Profit). This can reflect differences in staffing models or educational priorities across various institution types.
stem(college$AvgSAT)
##
## The decimal point is 2 digit(s) to the right of the |
##
## 5 | 6
## 6 |
## 6 |
## 7 |
## 7 |
## 8 | 234
## 8 | 555556677789999
## 9 | 00111122222233333334444
## 9 | 55555555555566666666666667777777777777788888888888888888899999999999
## 10 | 00000000000000000000000000011111111111111111111111111111111111111222+103
## 10 | 55555555555555555555555555555555555555555555555556666666666666666666+141
## 11 | 00000000000000000000000000000000000000000000000000011111111111111111+174
## 11 | 55555555555555555555555555555555555666666666666666666666666666666666+85
## 12 | 00000000000000000000000000000000000001111111111111111111111111122222+34
## 12 | 55555555555555555666666666667777777777777777888888888888888889999999
## 13 | 0000000000011111122222222222223333333344444444
## 13 | 555555666666667777778888899999999
## 14 | 00000111111111222233334444444
## 14 | 5555555677788888999999
## 15 | 000111222222334
## 15 | 6
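The stem-and-leaf plot shows a single-peaked distribution of average SAT scores. Most colleges fall between about 950 and 1300, and the densest row covers 1100 to 1149. One college near 560 sits well below the rest, with no other values until about 820, and the highest averages are around 1560.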
pie(table(college$Region), main="Percentage of Colleges by Region")
The pie chart illustrates the regional distribution of colleges, showing the proportion of institutions in each region.
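A pie chart shows proportions only visually, so the percentages themselves can be tabulated with table() and prop.table(). The sketch below demonstrates the pattern on the Control values from the six head() rows, used as a hypothetical stand-in because those rows contain only one region.

```r
# Control values copied from the six head() rows above (illustration only).
control <- c("Public", "Public", "Private", "Public", "Public", "Public")

counts <- table(control)                    # number of colleges per category
pct <- round(100 * prop.table(counts), 1)   # share of each category, in percent
print(pct)

# On the full data, the regional percentages would be:
# round(100 * prop.table(table(college$Region)), 1)
```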
This report examines key characteristics of colleges in the U.S. using the CollegeScores4yr dataset. The average in-state tuition is about $21,949, showing that college can be expensive. The median admission rate is 69.51%, meaning half of the colleges admit roughly 70% or more of their applicants. Net price varies widely from school to school, and average SAT scores differ across colleges with a standard deviation of about 129 points, reflecting a range of student preparation levels. There is a negative correlation (about -0.66) between the percentage of first-generation students and completion rates, which is consistent with these students facing more challenges, although correlation alone does not establish a cause. Average debt varies widely, with a median of $713.50 and a 75th percentile of $2,203.25, so the upper quartile lies much farther from the median than the lower quartile does. Enrollment ranges from small schools to very large ones. The percentage of full-time faculty differs depending on whether a school is public, private, or for-profit. The regional distribution shows where colleges are located in the U.S. Together, these results help describe patterns in college costs, student success, and types of schools.