Load the data set mtcars in the datasets R package. Calculate a 95% confidence interval to the nearest MPG for the variable mpg.
data("mtcars")
head(mtcars)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
attach(mtcars)
mean(mpg)
## [1] 20.09062
sd(mpg)
## [1] 6.026948
hist(mpg, 15)
# t-intervals
t.test(mpg)
##
## One Sample t-test
##
## data: mpg
## t = 18.857, df = 31, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## 17.91768 22.26357
## sample estimates:
## mean of x
## 20.09062
# confidence intervals
round(t.test(mpg)$conf.int)
## [1] 18 22
## attr(,"conf.level")
## [1] 0.95
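As a cross-check, the same interval can be computed by hand from the formula mean +/- t_(.975, n-1) * s/sqrt(n). This is only a minimal sketch; the names n, xbar, s and ci are illustrative and do not come from the original solution.

```r
data("mtcars")

n    <- length(mtcars$mpg)  # sample size
xbar <- mean(mtcars$mpg)    # sample mean
s    <- sd(mtcars$mpg)      # sample standard deviation

# 95% t interval: mean +/- t quantile * standard error
ci <- xbar + c(-1, 1) * qt(0.975, df = n - 1) * s / sqrt(n)
round(ci)
## [1] 18 22
```

The result should agree with the interval reported by t.test(mpg) above.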
Suppose that the standard deviation of 9 paired differences is \(1\). What value would the average difference have to be so that the lower endpoint of a 95% Student’s t confidence interval touches zero?
Answer: We have \(n = 9\) and \(s = 1\). The interval is \(\bar{X} \pm t_{.975,\,n-1}\, s/\sqrt{n}\). Setting the lower endpoint of the 95% confidence interval to zero gives \(\bar{X} - t_{.975,\,n-1}\, s/\sqrt{n} = 0\). With \(n - 1 = 8\) degrees of freedom, \(s = 1\) and \(\sqrt{n} = 3\), this gives \(\bar{X} = t_{.975,\,8}/3\).
round(qt(.975, 8)/3, 2)
## [1] 0.77
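To confirm the answer, one can plug the unrounded mean back into the interval formula and check that the lower endpoint is zero. This is a sketch under the assumptions of the question (n = 9, s = 1); the names xbar, half_width, lower and upper are illustrative.

```r
n <- 9
s <- 1

# Half-width of the 95% t interval with n - 1 degrees of freedom
half_width <- qt(0.975, df = n - 1) * s / sqrt(n)

# The mean that places the lower endpoint exactly at zero
xbar <- half_width

lower <- xbar - half_width
upper <- xbar + half_width
c(lower, upper)
```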
An independent-group Student’s t interval is used instead of a paired t interval when:
- The observations are paired between the groups.
- The observations between the groups are naturally assumed to be statistically independent.
- As long as you do it correctly, either is fine.
Answer: The second option. When the observations in one group are independent of those in the other group (as well as independent within each group), there is no natural pairing, so a paired interval cannot be formed and the independent-group interval is used.
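The difference between the two kinds of interval can be illustrated with R's built-in sleep data set, which is not part of the original question and is used here only as an assumed example. It records the same 10 subjects under two drugs, so the observations are naturally paired; treating the groups as independent ignores that pairing and gives a different interval.

```r
data("sleep")

g1 <- sleep$extra[sleep$group == 1]
g2 <- sleep$extra[sleep$group == 2]

# Paired interval: uses the within-subject differences g2 - g1
paired_ci <- t.test(g2, g1, paired = TRUE)$conf.int

# Independent-group interval: treats the two groups as unrelated samples
indep_ci <- t.test(g2, g1, paired = FALSE, var.equal = TRUE)$conf.int

paired_ci
indep_ci
```

For these data the paired interval is the appropriate one. The independent-group interval would only be appropriate if the two groups consisted of different, unrelated subjects.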
Consider the mtcars dataset. Construct a 95% t interval for MPG comparing 4-cylinder to 6-cylinder cars (subtracting in the order 4 - 6), assuming a constant variance.
In other words, the question asks for a two-sample t interval between two groups in R.
data("mtcars")
head(mtcars)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
c4 <- mtcars$mpg[mtcars$cyl==4]
c6 <- mtcars$mpg[mtcars$cyl==6]
par(mfrow=c(2,1))
hist(c4)
hist(c6)
t.test(c4, c6, var.equal = TRUE)
##
## Two Sample t-test
##
## data: c4 and c6
## t = 3.8952, df = 16, p-value = 0.001287
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 3.154286 10.687272
## sample estimates:
## mean of x mean of y
## 26.66364 19.74286
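The interval reported by t.test can also be reproduced from the pooled-variance formula, which makes the constant-variance assumption explicit. This is a minimal sketch; the names n4, n6, sp and ci are illustrative.

```r
data("mtcars")
c4 <- mtcars$mpg[mtcars$cyl == 4]
c6 <- mtcars$mpg[mtcars$cyl == 6]

n4 <- length(c4)
n6 <- length(c6)

# Pooled standard deviation under the constant-variance assumption
sp <- sqrt(((n4 - 1) * var(c4) + (n6 - 1) * var(c6)) / (n4 + n6 - 2))

# Difference in means (4 - 6) +/- t quantile * pooled standard error
ci <- mean(c4) - mean(c6) +
  c(-1, 1) * qt(0.975, df = n4 + n6 - 2) * sp * sqrt(1 / n4 + 1 / n6)
round(ci, 2)
## [1]  3.15 10.69
```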
Suppose that 18 obese subjects were randomized, 9 each, to a new diet pill and a placebo. Subjects’ body mass indices (BMIs) were measured at a baseline and again after having received the treatment or placebo for four weeks. The average difference from follow-up to the baseline (followup - baseline) was -3 kg/m2 for the treated group and 1 kg/m2 for the placebo group. The corresponding standard deviations of the differences were 1.5 kg/m2 for the treatment group and 1.8 kg/m2 for the placebo group. The study aims to answer whether the change in BMI over the four-week period appears to differ between the treated and placebo groups. What is the pooled variance estimate?
- Answer: The two groups have the same number of subjects, so the pooled variance is simply the average of the two group variances.
round((1.5^2 + 1.8^2)/2, 2)
## [1] 2.75
Alternatively, we can use the formula on page 79, in the section “Confidence interval”:
n1 <- n2 <- 9
x1 <- -3 ##treated
x2 <- 1 ##placebo
s1 <- 1.5 ##treated
s2 <- 1.8 ##placebo
So the pooled variance is
spsq <- ( (n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)
round(spsq,2)
## [1] 2.75
In addition, the pooled standard deviation is
round(sqrt(spsq),2)
## [1] 1.66
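The shortcut of averaging the two variances only works because the group sizes are equal. The sketch below wraps the general formula in a helper, pooled_var, which is a hypothetical name introduced here for illustration, and compares it against an assumed unequal-size case (20 placebo subjects, which is not part of the question).

```r
# Pooled variance: each group variance weighted by its degrees of freedom
pooled_var <- function(n1, s1, n2, s2) {
  ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)
}

# Equal group sizes: identical to the simple average (1.5^2 + 1.8^2) / 2
equal_n <- pooled_var(9, 1.5, 9, 1.8)

# Unequal group sizes (assumed for illustration): pulled toward the
# variance of the larger group, so the simple average no longer applies
unequal_n <- pooled_var(9, 1.5, 20, 1.8)

c(equal_n, unequal_n)
```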