We use the four-year college scores data (CollegeScores4yr) from https://www.lock5stat.com/datapage3e.html
I propose the following 10 questions based on my understanding of the data.
Each question is explored in detail below.
college <- read.csv("https://www.lock5stat.com/datasets3e/CollegeScores4yr.csv")
head(college)
## Name State ID Main
## 1 Alabama A & M University AL 100654 1
## 2 University of Alabama at Birmingham AL 100663 1
## 3 Amridge University AL 100690 1
## 4 University of Alabama in Huntsville AL 100706 1
## 5 Alabama State University AL 100724 1
## 6 The University of Alabama AL 100751 1
## Accred
## 1 Southern Association of Colleges and Schools Commission on Colleges
## 2 Southern Association of Colleges and Schools Commission on Colleges
## 3 Southern Association of Colleges and Schools Commission on Colleges
## 4 Southern Association of Colleges and Schools Commission on Colleges
## 5 Southern Association of Colleges and Schools Commission on Colleges
## 6 Southern Association of Colleges and Schools Commission on Colleges
## MainDegree HighDegree Control Region Locale Latitude Longitude AdmitRate
## 1 3 4 Public Southeast City 34.78337 -86.56850 0.9027
## 2 3 4 Public Southeast City 33.50570 -86.79935 0.9181
## 3 3 4 Private Southeast City 32.36261 -86.17401 NA
## 4 3 4 Public Southeast City 34.72456 -86.64045 0.8123
## 5 3 4 Public Southeast City 32.36432 -86.29568 0.9787
## 6 3 4 Public Southeast City 33.21187 -87.54598 0.5330
## MidACT AvgSAT Online Enrollment White Black Hispanic Asian Other PartTime
## 1 18 929 0 4824 2.5 90.7 0.9 0.2 5.6 6.6
## 2 25 1195 0 12866 57.8 25.9 3.3 5.9 7.1 25.2
## 3 NA NA 1 322 7.1 14.3 0.6 0.3 77.6 54.4
## 4 28 1322 0 6917 74.2 10.7 4.6 4.0 6.5 15.0
## 5 18 935 0 4189 1.5 93.8 1.0 0.3 3.5 7.7
## 6 28 1278 0 32387 78.5 10.1 4.7 1.2 5.6 7.9
## NetPrice Cost TuitionIn TuitonOut TuitionFTE InstructFTE FacSalary
## 1 15184 22886 9857 18236 9227 7298 6983
## 2 17535 24129 8328 19032 11612 17235 10640
## 3 9649 15080 6900 6900 14738 5265 3866
## 4 19986 22108 10280 21480 8727 9748 9391
## 5 12874 19413 11068 19396 9003 7983 7399
## 6 21973 28836 10780 28100 13574 10894 10016
## FullTimeFac Pell CompRate Debt Female FirstGen MedIncome
## 1 71.3 71.0 23.96 1068 56.4 36.6 23.6
## 2 89.9 35.3 52.92 3755 63.9 34.1 34.5
## 3 100.0 74.2 18.18 109 64.9 51.3 15.0
## 4 64.6 27.7 48.62 1347 47.6 31.0 44.8
## 5 54.2 73.8 27.69 1294 61.3 34.3 22.1
## 6 74.0 18.0 67.87 6430 61.5 22.6 66.7
mean(college$AdmitRate, na.rm=TRUE)
## [1] 0.6702025
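Because `na.rm=TRUE` silently drops missing values, it can help to report how many colleges a mean is actually based on. The following is a minimal sketch that uses only the six `AdmitRate` values printed by `head(college)`; the helper name `na_summary` is hypothetical and not part of the original analysis.

```r
# Hypothetical helper: summarise a numeric vector together with its missing-value count.
na_summary <- function(x) {
  c(n_total   = length(x),              # all values, including NA
    n_missing = sum(is.na(x)),          # values dropped by na.rm = TRUE
    mean      = mean(x, na.rm = TRUE))  # mean of the remaining values
}

# AdmitRate for the first six colleges, copied from the head(college) output.
admit_head <- c(0.9027, 0.9181, NA, 0.8123, 0.9787, 0.5330)

res <- na_summary(admit_head)
print(res)
```

Calling `na_summary(college$AdmitRate)` on the full data frame should give the corresponding counts for all colleges.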
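# A histogram shows the distribution of admission rates; barplot() would draw one bar per college.
hist(college$AdmitRate, main="Histogram of Admission Rates", xlab="Admission Rate")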
mean(college$AvgSAT, na.rm=TRUE)
## [1] 1135.25
median(college$AvgSAT, na.rm=TRUE)
## [1] 1121
var(college$TuitionIn, na.rm=TRUE)
## [1] 199665280
sd(college$TuitionIn, na.rm=TRUE)
## [1] 14130.3
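The two results above are linked: the standard deviation is the square root of the variance, so the square root of 199665280 is approximately 14130.3. A small sketch of that check, using only the six `TuitionIn` values shown in the `head(college)` output (the variable names are illustrative):

```r
# TuitionIn for the first six colleges, copied from the head(college) output.
tuition_head <- c(9857, 8328, 6900, 10280, 11068, 10780)

v <- var(tuition_head)  # sample variance (divides by n - 1)
s <- sd(tuition_head)   # sample standard deviation

# all.equal() compares with a small numerical tolerance.
same <- isTRUE(all.equal(sqrt(v), s))
print(same)
```

Because the variance is in squared dollars, the standard deviation is usually the easier of the two to interpret.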
sd(college$CompRate, na.rm=TRUE)
## [1] 21.12272
hist(college$TuitionIn, main="Histogram of Tuition Fees", xlab="Tuition Fees", col="lightblue", border="black", breaks=20)
correlation <- cor(college$TuitionIn, college$CompRate, use="complete.obs")
print(paste("Correlation between Tuition Fees and Graduation Rates:", correlation))
## [1] "Correlation between Tuition Fees and Graduation Rates: 0.547703932183766"
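A correlation coefficient on its own does not show how precisely it is estimated. As a hedged sketch, `cor.test()` from base R's stats package returns the same Pearson estimate together with a confidence interval and p-value. The data frame `head_rows` below is an illustrative name and holds only the six rows shown by `head(college)`, so its numbers will differ from the full-data result.

```r
# First six rows of TuitionIn and CompRate, copied from the head(college) output.
head_rows <- data.frame(
  TuitionIn = c(9857, 8328, 6900, 10280, 11068, 10780),
  CompRate  = c(23.96, 52.92, 18.18, 48.62, 27.69, 67.87)
)

# Pearson correlation test; incomplete pairs are dropped before testing.
ct <- cor.test(head_rows$TuitionIn, head_rows$CompRate)

print(ct$estimate)  # Pearson r for these six rows only
print(ct$conf.int)  # 95% confidence interval for r
print(ct$p.value)   # p-value for the null hypothesis r = 0
```

On the full data, the same call would be `cor.test(college$TuitionIn, college$CompRate)`.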
avg_sat <- mean(college$AvgSAT, na.rm=TRUE)
print(paste("Average Combined SAT Score for ALL Colleges:", avg_sat))
## [1] "Average Combined SAT Score for ALL Colleges: 1135.24980422866"
correlation <- cor(college$PartTime, college$FullTimeFac, use="complete.obs")
print(paste("Correlation between Part-Time Students and Full-Time Faculty:", correlation))
## [1] "Correlation between Part-Time Students and Full-Time Faculty: -0.299398043343412"
boxplot(NetPrice ~ State, data = college, main = "Variation of Net Price Across States", xlab = "State", ylab = "Net Price ($)", col = "lightblue", las = 2, cex.axis = 0.7)
high_completion_data <- subset(college, CompRate>80)
avg_faculty_salary <- mean(high_completion_data$FacSalary, na.rm=TRUE)
print(paste("Average Faculty Salary at Colleges with Completion Rate above 80%:", avg_faculty_salary))
## [1] "Average Faculty Salary at Colleges with Completion Rate above 80%: 11182.9473684211"
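An average over a filtered group is easier to interpret when the group size is reported with it. The sketch below assumes a hypothetical helper, `mean_salary_above`, and runs it on the six rows shown by `head(college)` with an illustrative threshold of 50, since none of those six colleges exceeds 80.

```r
# Hypothetical helper: mean FacSalary among rows whose CompRate exceeds a threshold.
mean_salary_above <- function(df, threshold) {
  # Rows with a missing CompRate are excluded, as subset() does.
  keep <- !is.na(df$CompRate) & df$CompRate > threshold
  list(n_colleges  = sum(keep),
       mean_salary = mean(df$FacSalary[keep], na.rm = TRUE))
}

# CompRate and FacSalary for the first six colleges, copied from the head(college) output.
head_rows <- data.frame(
  CompRate  = c(23.96, 52.92, 18.18, 48.62, 27.69, 67.87),
  FacSalary = c(6983, 10640, 3866, 9391, 7399, 10016)
)

res <- mean_salary_above(head_rows, 50)
print(res)
```

Calling `mean_salary_above(college, 80)` on the full data should reproduce the average reported above and also show how many colleges it covers.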
Working in R was a real learning experience, but once I had repeated something a couple of times, I had it down. I was surprised at how easy everything was once I knew what to do. I will either accept the grade as is or, if possible, submit a more thorough report. I spent a lot of time teaching myself the ins and outs, and I should have reached out for help sooner. I will reach out sooner on future projects.