This lab explores data collected from a random sample of 236 undergraduate students.
Each observation is one undergraduate, and the data set contains the following variables:
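# Load the survey data; the .RData file provides an object named `data`
load("drinking.RData")
x <- data
# Preview the first six rows
head(x)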
##   Gender Alcohol Height Cheat
## 1 Female      15     64     0
## 2   Male      14     69     0
## 3 Female      NA     66     0
## 4 Female      10     63     0
## 5   Male      30     72     0
## 6 Female      20     67     0
summary(x$Alcohol)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.000   0.000   1.000   4.539   7.000  36.000      30 
Most of the observations fall in the 0-10 range, and the distribution is heavily right-skewed: frequencies drop off as the number of drinks rises. The skewness shows in the mean of about 4.5, which is well above the median of 1.
The data suggests that drinking is a problem at the university, specifically among the heaviest drinkers. Because the first quartile is 0, at least 25% of the students who responded do not drink at all, and because the median is 1, half of them have at most one drink per week. At the upper end, however, the top 25% of students have between 7 and 36 alcoholic beverages per week. This is an issue, but only for a subset of the university population rather than all of it.
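The quartile-based statements above can be checked directly from the raw values. The following is a minimal sketch that uses a small made-up vector named `alcohol` (hypothetical, not the survey data) so that it runs on its own; with the real data, `x$Alcohol` would presumably take its place.

```r
# Hypothetical drinks-per-week values, for illustration only
# (NOT the survey data); NA marks a missing response.
alcohol <- c(0, 0, 0, 1, 1, 2, 7, 10, 20, NA)

# Quartiles, ignoring missing responses
q <- quantile(alcohol, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)

# Share of respondents who report zero drinks
share_zero <- mean(alcohol == 0, na.rm = TRUE)

# Share of respondents at or above the third quartile
share_heavy <- mean(alcohol >= q[["75%"]], na.rm = TRUE)

q
share_zero
share_heavy
```

Setting `na.rm = TRUE` matters here: without it, `quantile()` stops with an error and `mean()` returns `NA` whenever a response is missing.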
# Tabulate responses; the first level (0) is "would not report", the second is "would report"
tbl <- table(x$Cheat)
names(tbl) <- c("% Would not report cheating", "% Would report cheating")
# Convert counts to percentages of all responses
pct <- 100 * tbl / sum(tbl)
# Label each slice with its rounded percentage followed by the category name
pie(tbl, labels = paste(round(pct), names(tbl)),
    main = "Would Students Report Cheating?")
round(pct, 2)
## % Would not report cheating % Would report cheating
## 91.53 8.47
The data shows that around 92% of students would not report cheating during an exam. This is reason enough for the university to reconsider any honor-system rules it has in place. A professor would be unlikely to learn about cheating unless almost every non-reporter cheated, in which case at least one student who is willing to report would probably witness an instance and report it.
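To make this argument concrete, here is a rough sketch of how the chance of a report grows with the number of students who witness an incident. It takes the 8.47% figure from the table above and assumes, purely for illustration, that each witness decides independently whether to report; the witness counts in `witnesses` are hypothetical and do not come from the survey.

```r
# Share of students who say they would report cheating (from the table above)
p_report <- 0.0847

# Hypothetical numbers of students who witness a given incident
witnesses <- c(1, 5, 10, 25)

# Probability that at least one witness reports, ASSUMING each witness
# reports independently with probability p_report (an illustrative
# assumption, not a result from the survey)
p_at_least_one <- 1 - (1 - p_report)^witnesses

data.frame(witnesses, p_at_least_one = round(p_at_least_one, 2))
```

Under these assumptions a single witness rarely leads to a report, and it takes many witnesses before a report becomes likely, which is consistent with the concern that isolated cheating would go unnoticed.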