These commands help you visualize the \(t\) distribution, calculate \(t\) probabilities (e.g., p-values), and run \(t\) tests for a single sample mean, for paired data, and for comparing means from two independent samples. Some commands require the mosaic package (noted in comments), so install it before you begin.
library(mosaic)
Visualizing the \(t\) distribution using plotDist():
# plotDist() is in the mosaic package
plotDist(dist='t', df=8, col="blue")
plotDist(dist='t', df=2, col="red", add=TRUE)
# add standard normal for comparison
plotDist(dist='norm', col="black", lty="dashed", add=TRUE)
Making \(t\) cumulative probability calculations using pt():
# Note that we usually use positive t values and
# are interested in upper-tail probabilities,
# so use lower.tail=FALSE
T.prob <- pt(q=2.82, df=8, lower.tail=FALSE)
T.prob
## [1] 0.01124694
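For a two-sided test, the p-value counts the area in both tails. The following is a minimal base R sketch that reuses the illustrative values from the example above (t = 2.82 with 8 degrees of freedom); the variable names are only for illustration.

```r
# Illustrative values taken from the pt() example above
t_stat <- 2.82
deg_f <- 8

# Upper-tail probability, P(T > t)
upper <- pt(q = t_stat, df = deg_f, lower.tail = FALSE)

# Lower-tail probability, P(T <= t); the two tails sum to 1
lower <- pt(q = t_stat, df = deg_f)

# Two-sided p-value: double the tail area beyond |t|
p_two_sided <- 2 * pt(q = abs(t_stat), df = deg_f, lower.tail = FALSE)

upper
p_two_sided
```

Using abs() keeps the calculation correct when the observed t statistic is negative.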
Making and visualizing \(t\) cumulative probability calculations using xpt():
# xpt() is in the mosaic package
xpt(q=2.82, df=9, lower.tail=FALSE)
## [1] 0.0100235
Finding critical \(t^*\) values using qt():
# for a 95% confidence level, we want 0.025 in the upper tail
T.crit <- qt(p=0.025, df=8, lower.tail=FALSE)
T.crit
## [1] 2.306004
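The same idea extends to other confidence levels: the upper-tail area is always (1 - level) / 2. A short base R sketch, assuming 8 degrees of freedom as above and three illustrative confidence levels, also shows that pt() reverses qt():

```r
# Confidence levels to look up (illustrative choices)
conf_levels <- c(0.90, 0.95, 0.99)

# Upper-tail area for each level
tail_area <- (1 - conf_levels) / 2

# Critical t* values with 8 degrees of freedom
t_crit <- qt(p = tail_area, df = 8, lower.tail = FALSE)
t_crit

# pt() undoes qt(): this recovers the tail areas
recovered <- pt(q = t_crit, df = 8, lower.tail = FALSE)
recovered
```

Higher confidence levels give larger critical values and therefore wider intervals.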
Finding and visualizing critical \(t^*\) values using xqt():
# xqt() is in the mosaic package
# for a 95% confidence level, we want 0.025 in the upper tail,
# which is the same as 0.975 in the lower tail
xqt(p=0.975, df=8)
## [1] 2.306004
# We will use the dataset "KidsFeet" from the 'mosaicData' package
# Foot measurements for 39 fourth graders
data(KidsFeet)
head(KidsFeet) # preview the first six rows
## name birthmonth birthyear length width sex biggerfoot domhand
## 1 David 5 88 24.4 8.4 B L R
## 2 Lars 10 87 25.4 8.8 B L L
## 3 Zach 12 87 24.5 9.7 B R R
## 4 Josh 1 88 25.2 9.8 B L R
## 5 Lang 2 88 25.1 8.9 B L R
## 6 Scotty 3 88 25.7 9.7 B R R
Run a \(t\)-test to get a confidence interval for a single sample mean and test the null hypothesis that \(\mu=0\) using t.test():
# t.test() runs a bit differently depending on whether mosaic is loaded or not
# one way to do it
t.test(x=KidsFeet$length, conf.level=0.95) # 0.95 is the default; change it for other confidence levels
##
## One Sample t-test
##
## data: KidsFeet$length
## t = 117.18, df = 38, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## 24.29597 25.15019
## sample estimates:
## mean of x
## 24.72308
# another way
t.test(~length, data=KidsFeet)
##
## One Sample t-test
##
## data: length
## t = 117.18, df = 38, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## 24.29597 25.15019
## sample estimates:
## mean of x
## 24.72308
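To see where the confidence interval comes from, it can be rebuilt by hand as the sample mean plus or minus \(t^*\) times the standard error. The sketch below uses only base R and, to stay self-contained, only the six lengths shown in the head() output above rather than the full dataset, so its numbers will differ from the output above.

```r
# The six foot lengths shown in the head(KidsFeet) output
x <- c(24.4, 25.4, 24.5, 25.2, 25.1, 25.7)

n <- length(x)
x_bar <- mean(x)
se <- sd(x) / sqrt(n)                 # standard error of the mean
t_crit <- qt(p = 0.975, df = n - 1)   # critical value for 95% confidence

# Interval: mean plus or minus t* times the standard error
ci_manual <- c(x_bar - t_crit * se, x_bar + t_crit * se)
ci_manual

# The interval reported by t.test() for the same data
ci_ttest <- as.numeric(t.test(x)$conf.int)
ci_ttest
```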
Test the null hypothesis \(H_0: \mu=24\) using t.test():
t.test(x=KidsFeet$length, mu=24)
##
## One Sample t-test
##
## data: KidsFeet$length
## t = 3.4272, df = 38, p-value = 0.001479
## alternative hypothesis: true mean is not equal to 24
## 95 percent confidence interval:
## 24.29597 25.15019
## sample estimates:
## mean of x
## 24.72308
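The test statistic is the distance between the sample mean and the hypothesized mean, measured in standard errors: \(t = (\bar{x} - \mu_0)/(s/\sqrt{n})\). A base R sketch of that calculation, again using only the six lengths from the head() output so that it runs without the dataset:

```r
# The six foot lengths shown in the head(KidsFeet) output
x <- c(24.4, 25.4, 24.5, 25.2, 25.1, 25.7)
mu0 <- 24   # hypothesized mean under the null

n <- length(x)

# t statistic: (sample mean - hypothesized mean) / standard error
t_manual <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 df
p_manual <- 2 * pt(q = abs(t_manual), df = n - 1, lower.tail = FALSE)

# Compare with t.test()
fit <- t.test(x, mu = mu0)
c(manual = t_manual, t.test = unname(fit$statistic))
c(manual = p_manual, t.test = fit$p.value)
```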
Test the null hypothesis above with a one-sided alternative, \(H_A: \mu \gt 24\), using t.test():
t.test(x=KidsFeet$length, mu=24, alternative="greater")
##
## One Sample t-test
##
## data: KidsFeet$length
## t = 3.4272, df = 38, p-value = 0.0007397
## alternative hypothesis: true mean is greater than 24
## 95 percent confidence interval:
## 24.36737 Inf
## sample estimates:
## mean of x
## 24.72308
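The one-sided p-value is half the two-sided one here because the observed t statistic lies in the direction of the alternative. This can be checked with pt(), using the rounded values t = 3.4272 and df = 38 reported in the output above:

```r
# Rounded values copied from the t.test() output above
t_stat <- 3.4272
deg_f <- 38

# One-sided (greater): area in the upper tail only
p_greater <- pt(q = t_stat, df = deg_f, lower.tail = FALSE)

# Two-sided: area in both tails
p_two_sided <- 2 * pt(q = abs(t_stat), df = deg_f, lower.tail = FALSE)

# One-sided (less): area in the lower tail, which is large here
p_less <- pt(q = t_stat, df = deg_f)

c(greater = p_greater, two.sided = p_two_sided, less = p_less)
```

Because the inputs are rounded, the results match the reported p-values only to the displayed precision.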
Paired data are data in which two variables are measured on the same individuals, such as before vs. after some treatment. Though it does not really make sense, we can compare the width of kids’ feet to their lengths. The data here are paired because both measurements come from the same foot of the same kid.
Compare two paired means using t.test():
# t.test() runs a bit differently depending on whether mosaic is loaded or not
t.test(x=KidsFeet$length, y=KidsFeet$width, paired=TRUE)
##
## Paired t-test
##
## data: KidsFeet$length and KidsFeet$width
## t = 92.219, df = 38, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 15.38545 16.07609
## sample estimates:
## mean of the differences
## 15.73077
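A paired \(t\)-test is equivalent to a one-sample \(t\)-test on the within-pair differences. The base R sketch below illustrates this with the six length and width values shown in the head() output above, so its numbers will differ from the full-data output.

```r
# Lengths and widths for the six kids shown in head(KidsFeet)
len <- c(24.4, 25.4, 24.5, 25.2, 25.1, 25.7)
wid <- c(8.4, 8.8, 9.7, 9.8, 8.9, 9.7)

# Paired t-test on the two measurements
paired_fit <- t.test(x = len, y = wid, paired = TRUE)

# One-sample t-test on the within-pair differences
diff_fit <- t.test(x = len - wid)

# Both approaches give the same t statistic and p-value
c(paired = unname(paired_fit$statistic), differences = unname(diff_fit$statistic))
c(paired = paired_fit$p.value, differences = diff_fit$p.value)
```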
We can compare kids whose left foot is bigger to those whose right foot is bigger. First, let’s plot the data. The black points are the jittered data, the red point is the mean, and the red bar shows the standard error.
ggplot(KidsFeet, aes(x=biggerfoot, y=length)) +
geom_jitter(width=0.1) +
stat_summary(fun.data="mean_se", col="red")
Perform a \(t\)-test to test the null hypothesis \(H_0: \mu_{Lbig}=\mu_{Rbig}\) against the alternative \(H_A: \mu_{Lbig} \ne \mu_{Rbig}\) using t.test():
# t.test() runs a bit differently depending on whether mosaic is loaded or not
t.test(length~biggerfoot, data=KidsFeet)
##
## Welch Two Sample t-test
##
## data: length by biggerfoot
## t = 2.1122, df = 31.734, p-value = 0.04264
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.03088997 1.71937741
## sample estimates:
## mean in group L mean in group R
## 25.10455 24.22941
Note that when comparing two means, the confidence interval is for the difference between the means (here, the mean in group L minus the mean in group R).
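By default, t.test() runs Welch's version of the two-sample test, which does not assume equal variances and so reports fractional degrees of freedom. The sketch below rebuilds the Welch statistic, degrees of freedom, and confidence interval by hand in base R. The two groups are hypothetical values made up for illustration, not taken from KidsFeet.

```r
# Hypothetical measurements for two independent groups (illustration only)
g1 <- c(25.1, 24.4, 26.0, 25.4, 24.9, 25.7, 24.6)
g2 <- c(24.0, 24.8, 23.5, 24.3, 25.0)

# Squared standard error contributed by each group
v1 <- var(g1) / length(g1)
v2 <- var(g2) / length(g2)

# Difference in means and its standard error
diff_means <- mean(g1) - mean(g2)
se_diff <- sqrt(v1 + v2)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_welch <- diff_means / se_diff
df_welch <- (v1 + v2)^2 /
  (v1^2 / (length(g1) - 1) + v2^2 / (length(g2) - 1))

# 95% confidence interval for the difference between means
ci_manual <- diff_means + c(-1, 1) * qt(p = 0.975, df = df_welch) * se_diff

# Compare with t.test(), which uses the Welch test by default
fit <- t.test(g1, g2)
c(t = t_welch, df = df_welch)
ci_manual
```

Passing var.equal=TRUE to t.test() would instead run the pooled-variance test with n1 + n2 - 2 degrees of freedom.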