Raw Data

The data set records survey responses from individual students at a state university who were enrolled in one of four selected introductory statistics courses. Each variable corresponds to a student's answer to one of the survey questions.

Math: Math SAT score
Verbal: Verbal SAT score
Credits: Number of credits the student is registered for
Year: Year in college (1=Freshman, 2=Sophomore, 3=Junior, 4=Senior)
Exer: Time (in minutes) spent exercising in a typical day
Sleep: Time (in hours) spent sleeping in a typical day
Veg: Are you a vegetarian (yes, no, some)
Cell: Do you own a cell phone (yes, no)

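# Start from a clean workspace, then load the survey data
rm(list=ls())
load("cell_phones.RData")
# Work on a copy with incomplete responses (rows containing NA) removed
x<-data
x<-na.omit(x)
head(x)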

##   Math Verbal Credits Year Exer Sleep Veg. Cell
## 1  640    470      15    1   60   7.0   no  yes
## 2  660    650      14    1   20   7.5   no  yes
## 3  550    580      15    2    0   9.0   no   no
## 4  560    660      16    1   30   7.0   no  yes
## 5  600    790      15    4   45   6.5 some   no
## 6  560    640      16    2   75   4.5  yes   no

Q1. The mean verbal SAT score of all the students in this university is 580. Is this also the case for all stat students at this university? Note that verbal SAT scores in the U.S. have a standard deviation of 111.

hist(x$Verbal, main="Distribution of sample verbal SAT scores", xlab="Verbal SAT Score")

summary(x$Verbal)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   400.0   542.5   590.0   596.7   650.0   800.0

sd(x$Verbal)

## [1] 77.69806

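We see that the sampled students’ verbal SAT scores are approximately normally distributed, with a mean of about 597 and a standard deviation of about 78. We will now test whether the mean verbal score of stat students is statistically distinct from that of the general university population, using the known U.S. standard deviation of 111.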

sigma = 111
u0 = 580

z = (mean(x$Verbal) - u0) / ( (sigma) / (sqrt(length(x$Verbal))))

paste("Z = ", round(z,2))

## [1] "Z =  2.47"

paste("p = ", round(2*pnorm(abs(z), lower.tail=FALSE),2))

## [1] "p =  0.01"

We performed a one-sample Z test, with the following results:

Null Hypothesis: mean verbal SAT score of all stat students == 580
Alternative Hypothesis: mean verbal SAT score of all stat students =/= 580
Test Statistic: z = 2.47
P Value: p = .01

Therefore, we can reject the null hypothesis and conclude that the mean verbal SAT score of stat students is significantly different from the mean score for the entire student body at this university.
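The Z test above can also be reproduced from summary statistics alone. The sketch below is only an illustration: the helper name z_test_summary is hypothetical, the sample mean 596.7 is the rounded value from summary(), and n = 270 is an assumption inferred from the 269 degrees of freedom reported by the t test in Q3. It also reports a 95% confidence interval for the mean, which the original analysis does not compute.

```r
# One-sample z test from summary statistics.
# z_test_summary is a hypothetical helper, not part of the original analysis.
z_test_summary <- function(xbar, mu0, sigma, n) {
  se <- sigma / sqrt(n)                                  # standard error of the mean
  z <- (xbar - mu0) / se                                 # test statistic
  p_value <- 2 * pnorm(abs(z), lower.tail = FALSE)       # two-sided p value
  conf_int <- xbar + c(-1, 1) * qnorm(0.975) * se        # 95% confidence interval
  list(z = z, p_value = p_value, conf_int = conf_int)
}

# xbar is the rounded sample mean; n = 270 is assumed (df = 269 in the Q3 t test)
res <- z_test_summary(xbar = 596.7, mu0 = 580, sigma = 111, n = 270)
round(res$z, 2)
round(res$p_value, 3)
round(res$conf_int, 1)
```

Under these assumptions the interval lies entirely above 580, which agrees with the decision to reject the null hypothesis.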

Q2. Based on a recent study, roughly 80% of college students in the U.S. own a cell phone. Do the data provide evidence that the proportion of students who own cell phones in this university is lower than the national figure?

We start by calculating the relative frequency of phone ownership among the sampled students.

# Counts of "no"/"yes" answers, converted to percentages
tbl = table(x$Cell)
tbl2 = round(100*tbl/sum(tbl),2)
y = c(tbl2[1], tbl2[2])
names(y) <- c("% Doesn't Own", "% Owns")
paste("Phone ownership amongst sample")
y

## [1] "Phone ownership amongst sample"

## % Doesn't Own        % Owns 
##         21.85         78.15

Then, we can visualize these results.

pie(tbl2, labels=c(paste(tbl2[1], "%  Do not own a phone"), 
                  paste(tbl2[2], "%  Own a phone")),
    main="Phone Ownership Amongst Students")

Now we can compare these results with the ownership proportion among university students in the U.S., to see whether this university’s students own phones at a lower rate.

p0 = .8
n = length(x$Cell)
np = length(x$Cell[x$Cell == "yes"])
prop.test(np, n, p0, alternative="less", correct=FALSE)

## 
##  1-sample proportions test without continuity correction
## 
## data:  np out of n, null probability p0
## X-squared = 0.5787, df = 1, p-value = 0.2234
## alternative hypothesis: true p is less than 0.8
## 95 percent confidence interval:
##  0.0000000 0.8199443
## sample estimates:
##         p 
## 0.7814815

z = (0.7814815 - p0)/sqrt(p0*(1-p0)/n)
z

## [1] -0.760725

We performed a one-sample Z test for proportions, with the following results:

Null Hypothesis: proportion of phone owners among this university's students == .8
Alternative Hypothesis: proportion of phone owners among this university's students < .8
Test Statistic: z = -.76
P Value: p = .22

Given our high p value, we fail to reject the null hypothesis. Even though the sample proportion (.78) is lower than .8, the difference is not large enough to be considered statistically significant.
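The link between the hand-computed z statistic and the prop.test output can be checked directly: without continuity correction, the reported X-squared is the square of z, and the one-sided p value is the lower-tail normal probability of z. The sketch below assumes counts of 211 owners out of 270 students; these are not stated in the original and are inferred from the sample estimate 0.7814815 and the 269 degrees of freedom in Q3.

```r
# Assumed counts (inferred, not given in the original): 211 owners out of 270
n <- 270
np <- 211
p0 <- 0.8

# Hand-computed one-sample z statistic for a proportion
p_hat <- np / n
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Lower-tail p value, matching the alternative "less"
p_value <- pnorm(z)

# Same test via prop.test, without continuity correction
res <- prop.test(np, n, p = p0, alternative = "less", correct = FALSE)

# z^2 should agree with X-squared, and the two p values should agree
c(z_squared = z^2, x_squared = unname(res$statistic))
c(by_hand = p_value, prop_test = res$p.value)
```

Computing p_hat from the counts, rather than typing in the rounded estimate, avoids small rounding differences in z.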

Q3. Adults in the U.S. average 7 hours of sleep a night. Is this also the mean for all stat students at this university?

hist(x$Sleep, main="Distribution of sample hours slept", 
     xlab="Hours slept in a typical day")

summary(x$Sleep)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   3.000   6.500   7.000   7.239   8.000  15.000

sd(x$Sleep)

## [1] 1.434884

u0 = 7
t.test(x$Sleep, alternative="two.sided", mu=u0)

## 
##  One Sample t-test
## 
## data:  x$Sleep
## t = 2.7357, df = 269, p-value = 0.00664
## alternative hypothesis: true mean is not equal to 7
## 95 percent confidence interval:
##  7.066963 7.410815
## sample estimates:
## mean of x 
##  7.238889

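We performed a one-sample t test:

Null Hypothesis: mean hours slept by all stat students == 7
Alternative Hypothesis: mean hours slept by all stat students =/= 7
Test Statistic: t = 2.74
P Value: p = .007

Given our small p value, we may reject the null hypothesis and conclude that the mean number of hours slept by stat students at this university is statistically significantly different from the U.S. adult average of 7 hours. The 95% confidence interval (7.07 to 7.41 hours) lies entirely above 7, so the sampled students sleep slightly more on average.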
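As a check on the t.test output, the statistic, p value, and confidence interval can be rebuilt from the summary statistics printed above. This is a sketch under one assumption: n = 270, inferred from the reported 269 degrees of freedom. The mean and standard deviation are the printed (rounded) values.

```r
# Summary statistics copied from the output above; n = 270 is inferred from df = 269
xbar <- 7.238889
s <- 1.434884
n <- 270
mu0 <- 7

se <- s / sqrt(n)                  # estimated standard error of the mean
t_stat <- (xbar - mu0) / se        # one-sample t statistic
df <- n - 1                        # degrees of freedom

# Two-sided p value and 95% confidence interval from the t distribution
p_value <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
ci <- xbar + c(-1, 1) * qt(0.975, df) * se

round(t_stat, 4)
round(p_value, 5)
round(ci, 3)
```

Unlike the Z test in Q1, the standard error here uses the sample standard deviation, which is why the t distribution with n - 1 degrees of freedom is used.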

Cell Phones, SAT Scores, and Sleep Habits of University Students

Meilad Imanian

11/10/2020
