We use the data from https://www.lock5stat.com/datapage3e.html to answer the following questions.
We pose 10 questions based on our own reading of the data and explore each one in turn.
college = read.csv("https://www.lock5stat.com/datasets3e/CollegeScores4yr.csv")
head(college)
## Name State ID Main
## 1 Alabama A & M University AL 100654 1
## 2 University of Alabama at Birmingham AL 100663 1
## 3 Amridge University AL 100690 1
## 4 University of Alabama in Huntsville AL 100706 1
## 5 Alabama State University AL 100724 1
## 6 The University of Alabama AL 100751 1
## Accred
## 1 Southern Association of Colleges and Schools Commission on Colleges
## 2 Southern Association of Colleges and Schools Commission on Colleges
## 3 Southern Association of Colleges and Schools Commission on Colleges
## 4 Southern Association of Colleges and Schools Commission on Colleges
## 5 Southern Association of Colleges and Schools Commission on Colleges
## 6 Southern Association of Colleges and Schools Commission on Colleges
## MainDegree HighDegree Control Region Locale Latitude Longitude AdmitRate
## 1 3 4 Public Southeast City 34.78337 -86.56850 0.9027
## 2 3 4 Public Southeast City 33.50570 -86.79935 0.9181
## 3 3 4 Private Southeast City 32.36261 -86.17401 NA
## 4 3 4 Public Southeast City 34.72456 -86.64045 0.8123
## 5 3 4 Public Southeast City 32.36432 -86.29568 0.9787
## 6 3 4 Public Southeast City 33.21187 -87.54598 0.5330
## MidACT AvgSAT Online Enrollment White Black Hispanic Asian Other PartTime
## 1 18 929 0 4824 2.5 90.7 0.9 0.2 5.6 6.6
## 2 25 1195 0 12866 57.8 25.9 3.3 5.9 7.1 25.2
## 3 NA NA 1 322 7.1 14.3 0.6 0.3 77.6 54.4
## 4 28 1322 0 6917 74.2 10.7 4.6 4.0 6.5 15.0
## 5 18 935 0 4189 1.5 93.8 1.0 0.3 3.5 7.7
## 6 28 1278 0 32387 78.5 10.1 4.7 1.2 5.6 7.9
## NetPrice Cost TuitionIn TuitonOut TuitionFTE InstructFTE FacSalary
## 1 15184 22886 9857 18236 9227 7298 6983
## 2 17535 24129 8328 19032 11612 17235 10640
## 3 9649 15080 6900 6900 14738 5265 3866
## 4 19986 22108 10280 21480 8727 9748 9391
## 5 12874 19413 11068 19396 9003 7983 7399
## 6 21973 28836 10780 28100 13574 10894 10016
## FullTimeFac Pell CompRate Debt Female FirstGen MedIncome
## 1 71.3 71.0 23.96 1068 56.4 36.6 23.6
## 2 89.9 35.3 52.92 3755 63.9 34.1 34.5
## 3 100.0 74.2 18.18 109 64.9 51.3 15.0
## 4 64.6 27.7 48.62 1347 47.6 31.0 44.8
## 5 54.2 73.8 27.69 1294 61.3 34.3 22.1
## 6 74.0 18.0 67.87 6430 61.5 22.6 66.7
mean(college$Cost, na.rm = TRUE)
## [1] 34277.31
The mean cost for the colleges in the data is $34,277.31.
cor(college$Cost, college$AvgSAT, use = "complete.obs")
## [1] 0.5373884
The correlation between cost and average SAT score for the colleges in the data is about 0.537.
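As a small sketch of what `use = "complete.obs"` does in the call above, the six rows printed by `head(college)` can be typed in by hand (the data frame name `toy` is our own, hypothetical choice). With the default settings, `cor()` returns `NA` as soon as one value is missing; with `use = "complete.obs"` it drops the incomplete rows first.

```r
# Six rows copied from the head(college) output above; row 3 has no AvgSAT.
toy <- data.frame(
  Cost   = c(22886, 24129, 15080, 22108, 19413, 28836),
  AvgSAT = c(929, 1195, NA, 1322, 935, 1278)
)

# Default behaviour: any missing value makes the result NA.
r_default <- cor(toy$Cost, toy$AvgSAT)

# Drop rows with a missing value in either column, then correlate.
r_complete <- cor(toy$Cost, toy$AvgSAT, use = "complete.obs")

# The same thing done by hand, to show which rows are actually used.
keep     <- complete.cases(toy)
r_manual <- cor(toy$Cost[keep], toy$AvgSAT[keep])
n_used   <- sum(keep)  # number of colleges contributing to the correlation
```

Six rows are far too few to say anything about the relationship itself; the point is only that the reported correlation is based on the colleges that have both a cost and an average SAT score.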
hist(college$Cost, main = "Histogram of Cost", xlab = "Cost", col = "red")
The distribution of cost for the colleges in the data is shown in the histogram above.
mean(college$AdmitRate, na.rm = TRUE)
## [1] 0.6702025
The average admission rate for the colleges in the data is about 67%.
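The `na.rm = TRUE` argument matters here because some colleges do not report an admission rate. A minimal illustration, using only the six `AdmitRate` values printed by `head(college)` above (the vector name `admit` is our own):

```r
# AdmitRate for the first six colleges; the third one is missing.
admit <- c(0.9027, 0.9181, NA, 0.8123, 0.9787, 0.5330)

mean_with_na <- mean(admit)                # NA, because one value is missing
mean_admit   <- mean(admit, na.rm = TRUE)  # average of the five reported rates
n_missing    <- sum(is.na(admit))          # how many colleges were skipped
```

For these six rows the mean of the reported rates is about 0.829; the 67% figure above is the same calculation over the full data set.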
cor(college$AdmitRate, college$CompRate, use = "complete.obs")
## [1] -0.3482341
The correlation between admission rate and completion rate for the colleges in the data is about -0.348.
hist(college$PartTime, main = "Distribution of Part-Time Students", xlab = "Percent Part-Time", col = "blue")
The distribution of the percentage of part-time students for the colleges in the data is shown in the histogram above.
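A histogram can also be read off as numbers. The sketch below assumes bins of width 20 (our choice, not necessarily the bins R picked for the plot above) and uses the six `PartTime` values shown in `head(college)`; `plot = FALSE` asks `hist()` to return the bin counts instead of drawing.

```r
# Percent part-time for the first six colleges in the data.
part_time <- c(6.6, 25.2, 54.4, 15.0, 7.7, 7.9)

# Compute the histogram without plotting it.
h <- hist(part_time, breaks = seq(0, 100, by = 20), plot = FALSE)

bin_edges  <- h$breaks  # 0, 20, 40, 60, 80, 100
bin_counts <- h$counts  # number of colleges falling in each bin
```

Here four of the six colleges fall in the 0 to 20 bin and one each in the next two bins. Running the same call on `college$PartTime` would give the counts behind the plot for whichever breaks are chosen.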
sd(college$Debt, na.rm = TRUE)
## [1] 5360.986
The standard deviation of average student debt among the colleges in the data is about $5,360.99.
var(college$FacSalary, na.rm = TRUE)
## [1] 6568988
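The variance in faculty salaries across the colleges in the data is about 6,568,988. Because variance is measured in squared dollars, it is easier to interpret through its square root: a standard deviation of roughly $2,563.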
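Since variance is in squared units, taking its square root puts the spread back on the dollar scale. A short check, using the variance reported above and, as a toy example, the six `FacSalary` values from `head(college)` (the name `salary` is our own):

```r
# Square root of the reported variance gives the standard deviation.
sd_from_var <- sqrt(6568988)

# The same identity on the six FacSalary values printed earlier.
salary    <- c(6983, 10640, 3866, 9391, 7399, 10016)
sd_direct <- sd(salary)         # standard deviation computed directly
sd_sqrt   <- sqrt(var(salary))  # square root of the variance
```

`sd_from_var` comes out to roughly 2563, and `sd_direct` and `sd_sqrt` agree, as they must by definition.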
tapply(college$Cost, college$Control, mean, na.rm = TRUE)
## Private Profit Public
## 41350.33 28861.96 21338.61
Private colleges have the highest average cost at $41,350.33, followed by for-profit colleges at $28,861.96 and public colleges at $21,338.61.
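Group means are easier to judge alongside the number of colleges in each group. The sketch below repeats the `tapply()` call on the six rows from `head(college)` (stored in a hypothetical data frame `toy`) and adds `table()` to count the rows per control type.

```r
# Control type and cost for the first six colleges in the data.
toy <- data.frame(
  Control = c("Public", "Public", "Private", "Public", "Public", "Public"),
  Cost    = c(22886, 24129, 15080, 22108, 19413, 28836)
)

# Mean cost within each control type, ignoring missing costs.
group_means <- tapply(toy$Cost, toy$Control, mean, na.rm = TRUE)

# How many colleges each mean is based on.
group_sizes <- table(toy$Control)
```

In this toy sample the single private college happens to be cheaper than the five public ones, which is the opposite of the full-data result and a reminder that a mean based on very few rows should not be over-read.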
cor(college$MedIncome, college$NetPrice, use = "complete.obs")
## [1] 0.5151298
The correlation between median family income and average net price for the colleges in the data is about 0.515, a moderate positive relationship.
Exploring Colleges
This report analyzes the CollegeScores4yr dataset to explore patterns among U.S. four-year colleges. Using means, variances, standard deviations, correlations, and histograms, it examines college costs, admission rates, student characteristics, and outcomes. The goal is to explore how factors such as cost, test scores, and family income relate to college affordability and student success.
The analysis shows that college costs differ substantially across the U.S. Private schools are the most expensive on average, while public schools are the most affordable. Colleges with higher average SAT scores tend to cost more (correlation about 0.537), and colleges with lower admission rates tend to have higher completion rates (correlation about -0.348). The average admission rate is about 67%, and the histogram of part-time enrollment shows how that share is distributed across colleges. Student debt and faculty pay vary widely between colleges. Colleges whose students come from higher-income families also tend to have higher net prices (correlation about 0.515). Overall, the results show moderate links between cost, selectivity, and student outcomes. These are correlations and do not by themselves show that one factor causes another.