In this assignment, we will use various statistical methods to examine data from the CollegeScores4yr data set.
We will now explore the above questions in detail.
The average cost of college is $34,277.31.
The correlation between faculty salary and cost (the Cost variable) is 0.424201.
The variance of college debt is 28740171.
The mean SAT score at public colleges is 1118.91. The mean SAT score at private colleges is 1145.839.
The standard deviation of the percentage of female students is 12.34421.
The average admission rate at public colleges is 0.7008028. The average admission rate at private colleges is 0.6499581. The average admission rate at for-profit colleges is 0.7448919.
Across schools, the variance of faculty salary is 6568988.
The standard deviation of the highest degree offered is 0.4724913.
Private colleges make up 61.8% of all colleges within the database. Public colleges make up 29.8% of all colleges within the database. For-profit colleges make up 8.4% of all colleges within the database.
The correlation between net-price and in-state tuition is 0.7371491.
#Loading dplyr (needed for filter()) and the database:
library(dplyr)
db = read.csv("https://www.lock5stat.com/datasets3e/CollegeScores4yr.csv")
#Question 1 code:
mean(db$Cost, na.rm = TRUE)
## [1] 34277.31
#Question 2 code:
cor(db$FacSalary, db$Cost, use = "complete.obs")
## [1] 0.424201
#Question 3 code:
var(db$Debt, na.rm = TRUE)
## [1] 28740171
#Question 4 code:
publicColleges <- filter(db, Control == "Public")
privateColleges <- filter(db, Control == "Private")
meanOfPublicSAT <- mean(publicColleges$AvgSAT, na.rm = TRUE)
meanOfPublicSAT
## [1] 1118.91
meanOfPrivateSAT <- mean(privateColleges$AvgSAT, na.rm = TRUE)
meanOfPrivateSAT
## [1] 1145.839
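The filter-then-average steps above can also be written as a single grouped mean. The following is only a minimal sketch using base R's `tapply` on a small made-up data frame; the `toy` values are hypothetical and are not taken from CollegeScores4yr.

```r
# Hypothetical toy data, NOT values from CollegeScores4yr
toy <- data.frame(
  Control = c("Public", "Public", "Private", "Private", "Profit"),
  AvgSAT  = c(1100, 1200, 1150, NA, 1000)
)

# Mean of AvgSAT within each Control group, ignoring missing values
groupMeans <- tapply(toy$AvgSAT, toy$Control, mean, na.rm = TRUE)
groupMeans
```

On the real data, the same idea would presumably be `tapply(db$AvgSAT, db$Control, mean, na.rm = TRUE)`, which returns one mean per college type.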
#Question 5 code:
sd(db$Female, na.rm = TRUE)
## [1] 12.34421
#Question 6 code:
publicColleges <- filter(db, Control == "Public")
privateColleges <- filter(db, Control == "Private")
profitColleges <- filter(db, Control == "Profit")
publicAdmission <- mean(publicColleges$AdmitRate, na.rm = TRUE)
privateAdmission <- mean(privateColleges$AdmitRate, na.rm = TRUE)
profitAdmission <- mean(profitColleges$AdmitRate, na.rm = TRUE)
publicAdmission
## [1] 0.7008028
privateAdmission
## [1] 0.6499581
profitAdmission
## [1] 0.7448919
barplot(
  c(publicAdmission, privateAdmission, profitAdmission),
  names.arg = c("Public Colleges", "Private Colleges", "Profit Colleges"),
  xlab = "College Type",
  ylab = "Average Admission Rate",
  main = "Average Admission Rate Between College Types"
)
#Question 7 code:
var(db$FacSalary, na.rm = TRUE)
## [1] 6568988
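A variance is expressed in squared units, so it can be hard to interpret directly. Taking the square root gives the standard deviation, which is in the original units of the variable. A quick check using the two variances reported in Questions 3 and 7 (the variable names below are just for illustration):

```r
# Variances reported in Questions 3 and 7
debtVariance      <- 28740171
facSalaryVariance <- 6568988

# The standard deviation is the square root of the variance
debtSD      <- sqrt(debtVariance)
facSalarySD <- sqrt(facSalaryVariance)

round(debtSD)       # about 5361
round(facSalarySD)  # about 2563
```

These should agree with `sd(db$Debt, na.rm = TRUE)` and `sd(db$FacSalary, na.rm = TRUE)` on the full data set.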
#Question 8 code:
sd(db$HighDegree, na.rm = TRUE)
## [1] 0.4724913
#Question 9 code:
# Count the colleges of each type
publicCount <- nrow(filter(db, Control == "Public"))
privateCount <- nrow(filter(db, Control == "Private"))
profitCount <- nrow(filter(db, Control == "Profit"))
collegeTypes <- c(publicCount, privateCount, profitCount)
# Avoid naming variables after the base functions names() and labels()
typeNames <- c("Public Colleges", "Private Colleges", "Profit Colleges")
percentages <- round((collegeTypes / sum(collegeTypes)) * 100, 1)
pieLabels <- paste0(typeNames, " ", percentages, "%")
pie(collegeTypes, labels = pieLabels, main = "Distribution of College Types")
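The percentages for each college type can also be obtained without filtering three times. This is a minimal sketch using base R's `table` and `prop.table` on a made-up vector; the `control` values are hypothetical and are not the real Control column.

```r
# Hypothetical toy data, NOT values from CollegeScores4yr
control <- c("Private", "Private", "Private", "Public",
             "Public", "Profit", "Private", "Public")

# Count each category, then convert the counts to percentages
counts <- table(control)
shares <- round(prop.table(counts) * 100, 1)
shares
```

On the real data, `table(db$Control)` would presumably take the place of `table(control)`.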
#Question 10 code:
cor(db$NetPrice,db$TuitionIn, use = "complete.obs")
## [1] 0.7371491
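Both correlations in this assignment use `use = "complete.obs"`, which drops any row where either variable is missing before computing the correlation. The short sketch below illustrates that behavior on made-up vectors (`x` and `y` are hypothetical, not columns from the data set).

```r
# Hypothetical toy vectors with missing values
x <- c(1, 2, 3, NA, 5)
y <- c(2, 4, NA, 8, 10)

# Without handling missing values, the result is NA
cor(x, y)

# With complete.obs, only the pairs (1, 2), (2, 4), (5, 10) are used
r <- cor(x, y, use = "complete.obs")
r
```

Since `y` is exactly twice `x` in the remaining pairs, `r` equals 1 up to floating-point rounding.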