Introduction

I will use the data from the CollegeScores4yr dataset at https://www.lock5stat.com/datasets3e/CollegeScores4yr.csv.

I chose the following 10 questions based on my own understanding of the data:

  1. What is the mean of the college ACT scores?
  2. What is the correlation between average SAT and ACT scores?
  3. What is the average cost for in state tuition at the college?
  4. What is the standard deviation of in-state tuition?
  5. What is the correlation between in state tuition and median family income?
  6. What is the range for in state tuition?
  7. What is the mean of students who are online only?
  8. What is the correlation between full time and part time students?
  9. What is the median debt for students who complete their program?
  10. What is the correlation between male and female students?

Analysis

college = read.csv("https://www.lock5stat.com/datasets3e/CollegeScores4yr.csv")
head(college)
##                                  Name State     ID Main
## 1            Alabama A & M University    AL 100654    1
## 2 University of Alabama at Birmingham    AL 100663    1
## 3                  Amridge University    AL 100690    1
## 4 University of Alabama in Huntsville    AL 100706    1
## 5            Alabama State University    AL 100724    1
## 6           The University of Alabama    AL 100751    1
##                                                                Accred
## 1 Southern Association of Colleges and Schools Commission on Colleges
## 2 Southern Association of Colleges and Schools Commission on Colleges
## 3 Southern Association of Colleges and Schools Commission on Colleges
## 4 Southern Association of Colleges and Schools Commission on Colleges
## 5 Southern Association of Colleges and Schools Commission on Colleges
## 6 Southern Association of Colleges and Schools Commission on Colleges
##   MainDegree HighDegree Control    Region Locale Latitude Longitude AdmitRate
## 1          3          4  Public Southeast   City 34.78337 -86.56850    0.9027
## 2          3          4  Public Southeast   City 33.50570 -86.79935    0.9181
## 3          3          4 Private Southeast   City 32.36261 -86.17401        NA
## 4          3          4  Public Southeast   City 34.72456 -86.64045    0.8123
## 5          3          4  Public Southeast   City 32.36432 -86.29568    0.9787
## 6          3          4  Public Southeast   City 33.21187 -87.54598    0.5330
##   MidACT AvgSAT Online Enrollment White Black Hispanic Asian Other PartTime
## 1     18    929      0       4824   2.5  90.7      0.9   0.2   5.6      6.6
## 2     25   1195      0      12866  57.8  25.9      3.3   5.9   7.1     25.2
## 3     NA     NA      1        322   7.1  14.3      0.6   0.3  77.6     54.4
## 4     28   1322      0       6917  74.2  10.7      4.6   4.0   6.5     15.0
## 5     18    935      0       4189   1.5  93.8      1.0   0.3   3.5      7.7
## 6     28   1278      0      32387  78.5  10.1      4.7   1.2   5.6      7.9
##   NetPrice  Cost TuitionIn TuitonOut TuitionFTE InstructFTE FacSalary
## 1    15184 22886      9857     18236       9227        7298      6983
## 2    17535 24129      8328     19032      11612       17235     10640
## 3     9649 15080      6900      6900      14738        5265      3866
## 4    19986 22108     10280     21480       8727        9748      9391
## 5    12874 19413     11068     19396       9003        7983      7399
## 6    21973 28836     10780     28100      13574       10894     10016
##   FullTimeFac Pell CompRate Debt Female FirstGen MedIncome
## 1        71.3 71.0    23.96 1068   56.4     36.6      23.6
## 2        89.9 35.3    52.92 3755   63.9     34.1      34.5
## 3       100.0 74.2    18.18  109   64.9     51.3      15.0
## 4        64.6 27.7    48.62 1347   47.6     31.0      44.8
## 5        54.2 73.8    27.69 1294   61.3     34.3      22.1
## 6        74.0 18.0    67.87 6430   61.5     22.6      66.7

Q1: What is the mean of the college ACT scores?

mean(college$MidACT, na.rm=TRUE)
## [1] 23.53514
hist(college$MidACT, main = "ACT Scores Across Schools", xlab = "ACT Score", ylab = "Number of Schools")

The mean ACT score across schools is about 23.54.
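The `na.rm = TRUE` argument matters here because some schools have no ACT score recorded. A minimal sketch of its effect, using the six MidACT values shown in the `head(college)` output above (the vector name `scores` is only for illustration):

```r
# MidACT values from the first six rows of the data, including one NA
scores <- c(18, 25, NA, 28, 18, 28)

mean_with_na    <- mean(scores)                # NA: one missing value makes the whole result NA
mean_without_na <- mean(scores, na.rm = TRUE)  # mean of the five non-missing values
n_used          <- sum(!is.na(scores))         # number of values that went into the mean

mean_with_na
mean_without_na
n_used
```

Checking `sum(!is.na(...))` on the full column is a quick way to see how many schools the reported mean is actually based on.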

Q2: What is the correlation between average SAT and average ACT scores?

cor(college$AvgSAT, college$MidACT, use = "complete.obs")
## [1] 0.9820588
# Plot the two scores against each other, one point per school
scatter.smooth(college$AvgSAT, college$MidACT, main = "Average SAT vs. ACT Scores", xlab = "Average SAT Score", ylab = "ACT Score")

The correlation is 0.982, very close to 1, which indicates a very strong positive linear relationship.
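The `use = "complete.obs"` option drops any school that is missing either score before computing the correlation. The sketch below illustrates this on a small toy data frame; the SAT values come from the `head(college)` output, while the extra `NA` in `act` is added purely for illustration.

```r
# Toy data: row 3 is missing both scores, row 5 is missing ACT only (illustrative)
toy <- data.frame(
  sat = c(929, 1195, NA, 1322, 935, 1278),
  act = c(18, 25, NA, 28, NA, 28)
)

# Keep only the rows where both values are present
both <- complete.cases(toy$sat, toy$act)

r_manual  <- cor(toy$sat[both], toy$act[both])
r_builtin <- cor(toy$sat, toy$act, use = "complete.obs")

sum(both)                       # number of schools used in the correlation
all.equal(r_manual, r_builtin)  # the two approaches agree
```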

Q3: What is the average cost for in state tuition?

mean(college$TuitionIn, na.rm=TRUE)
## [1] 21948.55
hist(college$TuitionIn, main = "In-State Tuition Cost", xlab = "Cost (USD)", ylab = "Number of Schools")

The average in-state tuition is $21,948.55.

Q4: What is the standard deviation of in-state tuition?

sd(college$TuitionIn, na.rm = TRUE)
## [1] 14130.3
boxplot(college$TuitionIn, ylab = "In-State Tuition (USD)", main = "Distribution of In-State Tuition")

The standard deviation of in-state tuition is about $14,130.
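For reference, `sd()` computes the sample standard deviation, which divides by n - 1 rather than n. A small sketch that reproduces it by hand, using the six TuitionIn values from the `head(college)` output (so the result will not match the full-data figure):

```r
# TuitionIn for the first six schools in the data
tuition <- c(9857, 8328, 6900, 10280, 11068, 10780)

n <- length(tuition)

# Sample standard deviation: square root of the sum of squared deviations over (n - 1)
sd_manual <- sqrt(sum((tuition - mean(tuition))^2) / (n - 1))

sd_manual
sd(tuition)
all.equal(sd_manual, sd(tuition))
```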

Q5: What is the correlation between in state tuition and median family income?

cor(college$TuitionIn, college$MedIncome, use="complete.obs")
## [1] 0.5740957
scatter.smooth(college$MedIncome, college$TuitionIn, main = "In-State Tuition vs. Median Family Income", xlab = "Median Family Income ($1,000s)", ylab = "In-State Tuition (USD)")

The correlation between in-state tuition and median family income is 0.574, a moderate positive relationship.

Q6: What is the range for in state tuition?

range(college$TuitionIn, na.rm=TRUE)
## [1]   480 88000

In-state tuition ranges from $480 to $88,000, a spread of $87,520.
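Note that `range()` returns the minimum and maximum rather than a single number. If the range is wanted as one value (maximum minus minimum), `diff()` can be applied to the result. A short sketch on the six TuitionIn values from the `head(college)` output:

```r
# TuitionIn for the first six schools in the data
tuition <- c(9857, 8328, 6900, 10280, 11068, 10780)

r <- range(tuition)   # two values: minimum, then maximum
spread <- diff(r)     # maximum minus minimum

r
spread
```

On the full column the same pattern would be `diff(range(college$TuitionIn, na.rm = TRUE))`.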

Q7: What is the mean of students who are online only?

mean(college$Online, na.rm = TRUE)
## [1] 0.0139165

The mean of Online is 0.0139. Because Online is coded 0 or 1 for each school, this means about 1.4% of schools are online only.
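The mean of a 0/1 column is the same as the proportion of rows equal to 1, which is why this result can be read as a share of schools. A quick sketch using the six Online values from the `head(college)` output:

```r
# Online indicator for the first six schools (1 = online only)
online <- c(0, 0, 1, 0, 0, 0)

prop_mean  <- mean(online)                 # mean of the indicator
prop_count <- sum(online) / length(online) # schools flagged 1, divided by all schools

prop_mean
prop_count
table(online)                              # counts of 0s and 1s
```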

Q8: What is the correlation between full time and part time students?

mean(college$PartTime, na.rm =TRUE)
## [1] 16.46559

On average, 16.5% of a school's students are part-time, leaving about 83.5% full-time.
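This figure is an unweighted mean of per-school percentages, so a small school counts as much as a large one. To estimate the share across all students instead, one option is to weight each school by its enrollment. The sketch below uses the PartTime and Enrollment values from the `head(college)` output; treating Enrollment as the right weight is an assumption.

```r
# PartTime (%) and Enrollment for the first six schools in the data
part_time  <- c(6.6, 25.2, 54.4, 15.0, 7.7, 7.9)
enrollment <- c(4824, 12866, 322, 6917, 4189, 32387)

unweighted <- mean(part_time)                          # each school counts equally
weighted   <- weighted.mean(part_time, w = enrollment) # each student counts equally

unweighted
weighted
```

In this small sample the weighted figure is lower, because the largest school has one of the smallest part-time percentages.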

Q9: What is the median debt for students who complete their program?

median(college$Debt, na.rm =TRUE)
## [1] 713.5
hist(college$Debt, main = "Student Debt After Graduation", xlab = "Debt (USD)", ylab = "Number of Schools")

The median amount of debt a student finishes with is $713.50.
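A median ending in .5 is consistent with an even number of non-missing values, since `median()` then averages the two middle values. Whether that is the case in the full column is an assumption here. The sketch uses the six Debt values from the `head(college)` output:

```r
# Debt for the first six schools in the data
debt <- c(1068, 3755, 109, 1347, 1294, 6430)

sorted <- sort(debt)
middle_avg <- mean(sorted[3:4])  # average of the two middle values (positions 3 and 4 of 6)

sorted
middle_avg
median(debt)                     # same value as middle_avg
```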

Q10: What is the correlation between male and female students?

mean(college$Female, na.rm =TRUE)
## [1] 59.29588
scatter.smooth(college$Female, main = "Percentage of Female Students by School", ylab = "Percentage of Female Students", xlab = "School (row index)")

On average, 59.3% of a school's students are female, leaving about 40.7% male.
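The question asks for a correlation, while the code above reports a mean. The dataset has no separate male column, so a male percentage has to be derived. Assuming it is simply 100 minus Female, the correlation between the two is -1 by construction, as this sketch on the six Female values from the `head(college)` output shows:

```r
# Female (%) for the first six schools in the data
female <- c(56.4, 63.9, 64.9, 47.6, 61.3, 61.5)

# Assumption: every student is recorded as either male or female
male <- 100 - female

r_fm <- cor(female, male)
r_fm   # perfectly negative: each extra point of female share removes one point of male share
```

Because the result is fixed by the definition of `male`, the mean percentage is the more informative summary here.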