We use data on four-year colleges from the CollegeScores4yr dataset.
I propose the following 10 questions based on my own understanding of the data.
Now I will ask ChatGPT for 10 questions based on the variables in the data.
We will explore the questions in more detail.
college=read.csv("https://www.lock5stat.com/datasets3e/CollegeScores4yr.csv")
head(college)
## Name State ID Main
## 1 Alabama A & M University AL 100654 1
## 2 University of Alabama at Birmingham AL 100663 1
## 3 Amridge University AL 100690 1
## 4 University of Alabama in Huntsville AL 100706 1
## 5 Alabama State University AL 100724 1
## 6 The University of Alabama AL 100751 1
## Accred
## 1 Southern Association of Colleges and Schools Commission on Colleges
## 2 Southern Association of Colleges and Schools Commission on Colleges
## 3 Southern Association of Colleges and Schools Commission on Colleges
## 4 Southern Association of Colleges and Schools Commission on Colleges
## 5 Southern Association of Colleges and Schools Commission on Colleges
## 6 Southern Association of Colleges and Schools Commission on Colleges
## MainDegree HighDegree Control Region Locale Latitude Longitude AdmitRate
## 1 3 4 Public Southeast City 34.78337 -86.56850 0.9027
## 2 3 4 Public Southeast City 33.50570 -86.79935 0.9181
## 3 3 4 Private Southeast City 32.36261 -86.17401 NA
## 4 3 4 Public Southeast City 34.72456 -86.64045 0.8123
## 5 3 4 Public Southeast City 32.36432 -86.29568 0.9787
## 6 3 4 Public Southeast City 33.21187 -87.54598 0.5330
## MidACT AvgSAT Online Enrollment White Black Hispanic Asian Other PartTime
## 1 18 929 0 4824 2.5 90.7 0.9 0.2 5.6 6.6
## 2 25 1195 0 12866 57.8 25.9 3.3 5.9 7.1 25.2
## 3 NA NA 1 322 7.1 14.3 0.6 0.3 77.6 54.4
## 4 28 1322 0 6917 74.2 10.7 4.6 4.0 6.5 15.0
## 5 18 935 0 4189 1.5 93.8 1.0 0.3 3.5 7.7
## 6 28 1278 0 32387 78.5 10.1 4.7 1.2 5.6 7.9
## NetPrice Cost TuitionIn TuitonOut TuitionFTE InstructFTE FacSalary
## 1 15184 22886 9857 18236 9227 7298 6983
## 2 17535 24129 8328 19032 11612 17235 10640
## 3 9649 15080 6900 6900 14738 5265 3866
## 4 19986 22108 10280 21480 8727 9748 9391
## 5 12874 19413 11068 19396 9003 7983 7399
## 6 21973 28836 10780 28100 13574 10894 10016
## FullTimeFac Pell CompRate Debt Female FirstGen MedIncome
## 1 71.3 71.0 23.96 1068 56.4 36.6 23.6
## 2 89.9 35.3 52.92 3755 63.9 34.1 34.5
## 3 100.0 74.2 18.18 109 64.9 51.3 15.0
## 4 64.6 27.7 48.62 1347 47.6 31.0 44.8
## 5 54.2 73.8 27.69 1294 61.3 34.3 22.1
## 6 74.0 18.0 67.87 6430 61.5 22.6 66.7
mean(college$Cost, na.rm = TRUE)
## [1] 34277.31
The mean cost across all colleges in the dataset is about $34,277.
cor(college$FacSalary, college$CompRate, use="complete.obs")
## [1] 0.577221
The correlation coefficient between average faculty salary and the completion rate (the share of students who successfully earn their degree) is about 0.577, a moderate positive relationship.
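The value returned by cor() is a coefficient r between -1 and 1 rather than a percentage. Squaring it gives the share of variance that a straight-line fit would account for; for the reported 0.577 that is roughly 0.33. The sketch below only illustrates the calculation: it uses the six rows visible in the head(college) output, so its result will not match the full-data value, and the variable names fac_salary, comp_rate, r and r_squared are chosen here for illustration.

```r
# Six rows copied from the head(college) output above.
# Illustration only: the full dataset gives a different r.
fac_salary <- c(6983, 10640, 3866, 9391, 7399, 10016)
comp_rate  <- c(23.96, 52.92, 18.18, 48.62, 27.69, 67.87)

# Pearson correlation coefficient, always between -1 and 1
r <- cor(fac_salary, comp_rate, use = "complete.obs")

# Share of variance explained by a linear fit, between 0 and 1
r_squared <- r^2

cat("r =", round(r, 3), " r^2 =", round(r_squared, 3), "\n")
```

On the full data, the same two lines with college$FacSalary and college$CompRate would reproduce the coefficient reported above.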
median(college$MidACT, na.rm = TRUE)
## [1] 23
Across all colleges in the dataset, the median of the schools' ACT scores (MidACT) is 23.
mean(college$TuitonOut, na.rm = TRUE)-mean(college$TuitionIn, na.rm = TRUE)
## [1] 3388.112
On average, out-of-state tuition costs about $3,388 more than in-state tuition.
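One caveat with subtracting two means that each use na.rm = TRUE: if a school is missing one tuition value but not the other, the two means are taken over different sets of schools. A possible alternative is to take the difference within each school first and then average. The sketch below uses the six rows from the head(college) output; the NA in the last out-of-state value is a hypothetical gap added here to show the effect and is not in the data.

```r
# In-state and out-of-state tuition for the six schools shown by head(college).
tuition_in  <- c(9857, 8328, 6900, 10280, 11068, 10780)
# ASSUMPTION: the final NA is hypothetical, inserted only to illustrate missing data.
tuition_out <- c(18236, 19032, 6900, 21480, 19396, NA)

# Difference of means: each mean may use a different set of schools
diff_of_means <- mean(tuition_out, na.rm = TRUE) - mean(tuition_in, na.rm = TRUE)

# Mean of per-school differences: only schools with both values contribute
mean_of_diffs <- mean(tuition_out - tuition_in, na.rm = TRUE)

cat("difference of means:", diff_of_means, "\n")
cat("mean of differences:", mean_of_diffs, "\n")
```

When no school is missing only one of the two values, both approaches give the same number, so comparing them on the full data is a quick check on whether missing values affect the reported gap.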
sd(college$TuitionIn, na.rm=TRUE)
## [1] 14130.3
The standard deviation of in-state tuition across all schools is about $14,130.
hist(college$White, main="Distribution of percent White students across colleges", xlab="Percent of students who are White")
cor(college$CompRate, college$FirstGen , use="complete.obs")
## [1] -0.6643909
The correlation coefficient between a school's percentage of first-generation students and its completion rate is about -0.664, a fairly strong negative relationship.
mean(college$Online, na.rm = TRUE)/mean(college$Enrollment, na.rm = TRUE)
## [1] 3.103015e-06
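This ratio is about 3.1e-06, which is 0.00031% when written as a percentage. However, Online is a 0/1 flag marking online-only schools and Enrollment is a count of students, so the ratio does not measure the percent of students who are online only.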
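In the head(college) output, Online takes the values 0 and 1 for each school, so it reads as a per-school indicator rather than a student count. Under that reading, two quantities that can be computed are the share of schools that are online only and the share of enrolled students who attend such schools. The sketch below shows both on the six rows from head(college); the names schools, share_schools and share_students are introduced here for illustration, and the six-row result says nothing about the full dataset.

```r
# Online flag and enrollment for the six schools shown by head(college).
schools <- data.frame(
  Online     = c(0, 0, 1, 0, 0, 0),
  Enrollment = c(4824, 12866, 322, 6917, 4189, 32387)
)

# Keep only schools where both values are present
complete <- schools[complete.cases(schools), ]

# Share of schools that are online only
share_schools <- mean(complete$Online == 1)

# Share of enrolled students who attend an online-only school
share_students <- sum(complete$Enrollment[complete$Online == 1]) /
  sum(complete$Enrollment)

cat("share of schools online only: ", round(100 * share_schools, 2), "%\n")
cat("share of students at those schools:", round(100 * share_students, 2), "%\n")
```

Replacing schools with the relevant columns of college would give the corresponding figures for all schools in the dataset.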
boxplot(college$Debt, ylab="Dollars of debt")
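A box plot is drawn from a five-number summary, so printing those numbers gives a written counterpart to the picture. The sketch below, using only the six Debt values visible in the head(college) output, shows how the summary and any points flagged as outliers could be pulled out with base R; the names debt, five and outliers are chosen here for illustration.

```r
# Debt values for the six schools shown by head(college); illustration only.
debt <- c(1068, 3755, 109, 1347, 1294, 6430)

# Tukey five-number summary: the values a box plot is built from
five <- fivenum(debt, na.rm = TRUE)
names(five) <- c("min", "lower hinge", "median", "upper hinge", "max")
print(five)

# Points beyond 1.5 times the box length from the hinges are drawn as outliers
outliers <- boxplot.stats(debt)$out
cat("number of outliers:", length(outliers), "\n")
```

Running the same calls on college$Debt would describe the box plot for the full dataset.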
cor(college$Female, college$FullTimeFac, use="complete.obs")
## [1] -0.1777673
The correlation coefficient between the percentage of female students and the percentage of full-time faculty is about -0.178, a weak negative relationship.
Four-year college statistics
After doing some statistical analysis on four-year colleges throughout the United States, I found some very interesting things. The first statistic was the average cost of room, board, and tuition, which comes to approximately $34,277. I find that crazy, but that is because I don't have to pay for housing. Following that, I was curious about how much more out-of-state tuition is than in-state tuition. The data shows that it costs on average about $3,400 more to go to an out-of-state college.
The next thing I wondered about was whether there is any correlation between average faculty salary and students' success in completing their degrees. The correlation coefficient is 0.577, a moderate positive relationship: schools that pay faculty more tend to have higher completion rates, although a correlation alone does not show that higher pay causes more students to finish. I was also curious about the correlation between the share of first-generation students at a school and its completion rate. The coefficient is -0.664, meaning that schools with more first-generation students tend to have lower completion rates.
I also wondered what percent of college students in the US are fully online. My calculation gave a ratio of about 3.1e-06 (0.00031%), but because Online is a 0/1 flag for online-only schools rather than a count of students, this ratio does not actually measure the share of students who are online. I had expected a higher percentage, especially in today's age where almost everything we do is online. The final thing I wondered about was the amount of debt that students have after completing their degrees; the box plot of Debt shows how that debt is distributed across schools.