1.) Introduction

We will use the CollegeScores4yr data set from https://www.lock5stat.com/datapage3e.html to examine statistics on four-year colleges and universities.

Here are the 10 questions I will ask about this data:

1.) What is the mean enrollment of all the colleges in the data set?
2.) What is the median average total cost (Cost) amongst all the colleges in the data set?
3.) What is the standard deviation of median ACT scores (MidACT) amongst the colleges?
4.) What is the correlation between average SAT score and average total cost amongst all the colleges in the data set?
5.) What is the minimum net tuition revenue per FTE student amongst all the colleges in the data set?
6.) What is the maximum average net price amongst all the colleges in the data set?
7.) What is the distribution of average monthly salary for full-time faculty amongst all the colleges in the data set?
8.) What is the correlation between the average debt for students who complete their program and the average total cost amongst all the colleges in the data set?
9.) What is the mean percent of faculty that are full-time amongst all the colleges in the data set?
10.) What is the range of percent of undergraduates who report being white?

2.) Analysis

Let’s explore these questions in detail.

college = read.csv('https://www.lock5stat.com/datasets3e/CollegeScores4yr.csv')
head(college)
##                                  Name State     ID Main
## 1            Alabama A & M University    AL 100654    1
## 2 University of Alabama at Birmingham    AL 100663    1
## 3                  Amridge University    AL 100690    1
## 4 University of Alabama in Huntsville    AL 100706    1
## 5            Alabama State University    AL 100724    1
## 6           The University of Alabama    AL 100751    1
##                                                                Accred
## 1 Southern Association of Colleges and Schools Commission on Colleges
## 2 Southern Association of Colleges and Schools Commission on Colleges
## 3 Southern Association of Colleges and Schools Commission on Colleges
## 4 Southern Association of Colleges and Schools Commission on Colleges
## 5 Southern Association of Colleges and Schools Commission on Colleges
## 6 Southern Association of Colleges and Schools Commission on Colleges
##   MainDegree HighDegree Control    Region Locale Latitude Longitude AdmitRate
## 1          3          4  Public Southeast   City 34.78337 -86.56850    0.9027
## 2          3          4  Public Southeast   City 33.50570 -86.79935    0.9181
## 3          3          4 Private Southeast   City 32.36261 -86.17401        NA
## 4          3          4  Public Southeast   City 34.72456 -86.64045    0.8123
## 5          3          4  Public Southeast   City 32.36432 -86.29568    0.9787
## 6          3          4  Public Southeast   City 33.21187 -87.54598    0.5330
##   MidACT AvgSAT Online Enrollment White Black Hispanic Asian Other PartTime
## 1     18    929      0       4824   2.5  90.7      0.9   0.2   5.6      6.6
## 2     25   1195      0      12866  57.8  25.9      3.3   5.9   7.1     25.2
## 3     NA     NA      1        322   7.1  14.3      0.6   0.3  77.6     54.4
## 4     28   1322      0       6917  74.2  10.7      4.6   4.0   6.5     15.0
## 5     18    935      0       4189   1.5  93.8      1.0   0.3   3.5      7.7
## 6     28   1278      0      32387  78.5  10.1      4.7   1.2   5.6      7.9
##   NetPrice  Cost TuitionIn TuitonOut TuitionFTE InstructFTE FacSalary
## 1    15184 22886      9857     18236       9227        7298      6983
## 2    17535 24129      8328     19032      11612       17235     10640
## 3     9649 15080      6900      6900      14738        5265      3866
## 4    19986 22108     10280     21480       8727        9748      9391
## 5    12874 19413     11068     19396       9003        7983      7399
## 6    21973 28836     10780     28100      13574       10894     10016
##   FullTimeFac Pell CompRate Debt Female FirstGen MedIncome
## 1        71.3 71.0    23.96 1068   56.4     36.6      23.6
## 2        89.9 35.3    52.92 3755   63.9     34.1      34.5
## 3       100.0 74.2    18.18  109   64.9     51.3      15.0
## 4        64.6 27.7    48.62 1347   47.6     31.0      44.8
## 5        54.2 73.8    27.69 1294   61.3     34.3      22.1
## 6        74.0 18.0    67.87 6430   61.5     22.6      66.7

Q1: What is the mean enrollment of all the colleges in the data set?

mean(college$Enrollment, na.rm=TRUE)
## [1] 4484.831

The mean enrollment of all the colleges in the data set is about 4,485 students (4,484.831).
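Every summary in this section passes na.rm = TRUE because several columns contain missing values. As a minimal sketch of what that argument does, the vector below copies the six MidACT values visible in the head(college) output above (row 3 is NA). The name act_head is only for illustration and is not part of the original analysis.

```r
# MidACT values for the first six rows, copied from the head(college) output
act_head <- c(18, 25, NA, 28, 18, 28)

# Without na.rm, a single missing value makes the whole result NA
mean_with_na <- mean(act_head)

# With na.rm = TRUE, the NA is dropped before averaging
mean_without_na <- mean(act_head, na.rm = TRUE)

# Count how many values were dropped
n_missing <- sum(is.na(act_head))

print(mean_with_na)     # NA
print(mean_without_na)  # 23.4
print(n_missing)        # 1
```

Checking sum(is.na(...)) on a column of the full data set shows how many colleges were left out of a given summary.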

Q2: What is the median average total cost (Cost) amongst all the colleges in the data set?

median(college$Cost, na.rm = TRUE)
## [1] 30699

The median average total cost amongst all the colleges in the data set is $30,699.

Q3: What is the standard deviation of median ACT scores (MidACT) amongst the colleges?

sd(college$MidACT, na.rm = TRUE)
## [1] 3.653612

The standard deviation of median ACT scores amongst the colleges is about 3.65 points.

Q4: What is the correlation between average SAT score and average total cost amongst all the colleges in the data set?

cor(college$AvgSAT, college$Cost, use = 'complete.obs')
## [1] 0.5373884

The correlation between average SAT score and average total cost amongst all the colleges in the data set is about 0.54, a moderate positive linear relationship.
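The use = 'complete.obs' argument matters here because AvgSAT is missing for some colleges. The sketch below, which assumes only the six rows shown in the head(college) output, illustrates that cor() returns NA by default and that 'complete.obs' gives the same answer as filtering out incomplete rows by hand. The names sat_head, cost_head, and keep are illustrative, and a correlation from six rows says nothing about the full data set.

```r
# AvgSAT and Cost for the first six rows, copied from the head(college) output
sat_head  <- c(929, 1195, NA, 1322, 935, 1278)
cost_head <- c(22886, 24129, 15080, 22108, 19413, 28836)

# Default behaviour: an NA in either vector gives NA
r_default <- cor(sat_head, cost_head)

# use = "complete.obs" keeps only rows where both values are present
r_complete <- cor(sat_head, cost_head, use = "complete.obs")

# The same result, computed by filtering the rows by hand
keep <- complete.cases(sat_head, cost_head)
r_manual <- cor(sat_head[keep], cost_head[keep])

print(r_default)                        # NA
print(all.equal(r_complete, r_manual))  # TRUE
```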

Q5: What is the minimum net tuition revenue per FTE student amongst all the colleges in the data set?

min(college$TuitionFTE, na.rm = TRUE)
## [1] 0

The minimum net tuition revenue per FTE student amongst all the colleges in the data set is $0.
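A minimum of $0 invites a follow-up: min() reports the value but not which colleges attain it. One way to look that up is sketched below on a toy data frame built from the six rows in the head(college) output (the name college_head is hypothetical). On the full data, the same indexing could be applied to college instead.

```r
# Name and TuitionFTE for the first six rows, copied from the head(college) output
college_head <- data.frame(
  Name = c("Alabama A & M University",
           "University of Alabama at Birmingham",
           "Amridge University",
           "University of Alabama in Huntsville",
           "Alabama State University",
           "The University of Alabama"),
  TuitionFTE = c(9227, 11612, 14738, 8727, 9003, 13574)
)

# which.min() returns the position of the first minimum, skipping NAs
lowest <- college_head[which.min(college_head$TuitionFTE), ]

# Count every row tied at the minimum (there may be more than one)
min_value <- min(college_head$TuitionFTE, na.rm = TRUE)
n_at_minimum <- sum(college_head$TuitionFTE == min_value, na.rm = TRUE)

print(lowest$Name)   # "University of Alabama in Huntsville"
print(n_at_minimum)  # 1
```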

Q6: What is the maximum average net price amongst all the colleges in the data set?

max(college$NetPrice, na.rm = TRUE)
## [1] 55775

The maximum average net price amongst all the colleges in the data set is $55,775.

Q7: What is the distribution of average monthly salary for full-time faculty amongst all the colleges in the data set?

hist(college$FacSalary, main = 'Histogram of Average Monthly Salary for Full-Time Faculty', xlab = 'Average monthly salary ($)', col = 'yellow')

This shows the distribution of average monthly salary for full-time faculty amongst all the colleges in the data set.
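A histogram can be backed up with numbers. As a rough sketch, calling hist() with plot = FALSE returns the bin edges and counts without drawing anything, and summary() gives the quartiles. The vector below holds only the six FacSalary values from the head(college) output, so the bins are illustrative rather than a description of the full distribution. The names salary_head and bins are assumptions for this example.

```r
# FacSalary for the first six rows, copied from the head(college) output
salary_head <- c(6983, 10640, 3866, 9391, 7399, 10016)

# plot = FALSE computes the histogram bins without drawing the chart
h <- hist(salary_head, plot = FALSE)

# Pair each bin's lower and upper edge with its count
bins <- data.frame(
  lower = head(h$breaks, -1),
  upper = tail(h$breaks, -1),
  count = h$counts
)

print(bins)
print(summary(salary_head))
```

Replacing salary_head with college$FacSalary would give the counts behind the plot above.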

Q8: What is the correlation between the average debt for students who complete their program and the average total cost amongst all the colleges in the data set?

cor(college$Cost, college$Debt, use = 'complete.obs')
## [1] -0.2144525

The correlation between the average debt for students who complete their program and the average total cost amongst all the colleges in the data set is about -0.21, a weak negative linear relationship.

Q9: What is the mean percent of faculty that are full-time amongst all the colleges in the data set?

mean(college$FullTimeFac, na.rm = TRUE)
## [1] 64.8313

The mean percent of faculty that are full-time amongst all the colleges in the data set is about 64.83%.

Q10: What is the range of percent of undergraduates who report being white?

range(college$White, na.rm = TRUE)
## [1]   0 100

The range of percent of undergraduates who report being white is from 0% to 100%.
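Note that range() returns the minimum and maximum as a pair, not a single number. If the width of the interval is wanted, diff() can be applied to that pair. A small sketch, assuming only the six White values from the head(college) output (white_head is an illustrative name):

```r
# White (percent) for the first six rows, copied from the head(college) output
white_head <- c(2.5, 57.8, 7.1, 74.2, 1.5, 78.5)

# range() returns c(minimum, maximum)
limits <- range(white_head, na.rm = TRUE)

# diff() of that pair gives the width of the interval
spread <- diff(limits)

print(limits)  # 1.5 78.5
print(spread)  # 77
```

For the full data set the width is 100 - 0 = 100 percentage points.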

3.) Summary

Q1.) About 4,485 students (4,484.831)
Q2.) $30,699
Q3.) 3.653612 points
Q4.) 0.5373884
Q5.) $0
Q6.) $55,775
Q7.) See histogram
Q8.) -0.2144525
Q9.) 64.8313%
Q10.) 0% to 100%