I. Confidence Intervals
When conducting exploratory data analysis, we often wish to estimate parameters like the mean and standard deviation for numeric data, or a proportion for categorical data. Statisticians generally prefer interval estimates to point estimates. Though a confidence interval appears less precise than a single number, the range of values can be constructed so that there is a high likelihood the actual parameter lies within it.
Traditional confidence intervals use a theoretical distribution like the normal distribution or \(t\)-distribution. For a 95% confidence interval, two cutoff values are calculated, one left and one right, that will capture exactly 95% of the theoretical distribution between them.
Initializing RStudio
The data set we will use primarily is Data3350, which was produced in 2015 during an undergraduate research project about personality and humor. The VarsData3350 PDF file has descriptions of each variable in the Data3350 file. Both are available for download in D2L. Be sure to put the Data3350 file in your R folder in Documents, and make sure your working directory is set to that folder (Session menu). The code block below uses the library function to ensure that the Mosaic package is loaded, then imports the data frame used in this module: Data3350.
library(mosaic)
library(readxl)
Data3350 = read_excel("Data3350.xlsx")
Example: IQ Scores
Question. Human IQ’s have the \(N(100,15)\) distribution. Find a symmetric interval about the mean that captures exactly 95% of the IQ distribution.
The mosaic function cdist finds the "center" of the distribution. We can specify the normal distribution and the level of confidence.
cdist("norm", 0.95)
[1] -1.959964 1.959964

We know that the two cutoff values will lie approximately 1.96 standard deviations above or below the mean. We can have the cdist function do the arithmetic for us by multiplying the cutoff values by the standard deviation and adding the population mean.
cdist("norm", 0.95 , plot = FALSE)*15+100
[1] 70.60054 129.39946
We know that 95% of human IQ’s will fall between 70.6 and 129.4.
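As a cross-check that does not depend on mosaic, the same cutoffs can be obtained from base R's qnorm function, which returns quantiles of a normal distribution. This is only a minimal sketch; the variable names (level, tails, z_cut, iq_cut) are illustrative and not part of the module's code.

```r
# Middle 95% of a distribution: leave 2.5% in each tail.
level <- 0.95
tails <- c((1 - level) / 2, 1 - (1 - level) / 2)   # 0.025 and 0.975

z_cut  <- qnorm(tails)                        # cutoffs on the standard normal scale
iq_cut <- qnorm(tails, mean = 100, sd = 15)   # same cutoffs on the IQ scale

print(round(z_cut, 6))    # matches the cdist output: -1.959964 1.959964
print(round(iq_cut, 5))   # matches the rescaled output: 70.60054 129.39946
```

Passing mean and sd directly to qnorm does the same "multiply by 15, add 100" arithmetic in one step.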
Example 2: Sample of Student IQ’s
In a sample of 25 college students, the average IQ was 111. Find a 95% confidence interval for the mean IQ.
Solution. We can solve the \(z\)-statistic formula for \(\mu\) and then substitute the cutoff values \(z^*\).
\[z = \frac{\overline x - \mu}{\sigma / \sqrt{n}}\] \[\overline x - \mu = z\cdot\frac{\sigma}{\sqrt n} \] The two \(z^*\) cutoff values are identical distances from the mean, so we add the \(\pm\) and simplify.
\[\mu = \overline x \pm z^*\cdot\frac{\sigma}{\sqrt n} \] Here, \(z^* = 1.96\), so we can substitute
\[\mu = 111 \pm 1.96\cdot\frac{15}{\sqrt {25}} \] and find the two cutoff values.
\[\mu = 111 \pm 1.96 \cdot 3 =111 \pm 5.88 \] Thus, the 95% confidence interval for the mean IQ, based on this sample of 25 college students, is \[(105.12,116.88)\]
We can accomplish the same thing by dividing the standard deviation by the square root of the sample size, which gives the standard error \(15/\sqrt{25} = 3\), and using cdist like before.
cdist("norm", 0.95, plot = FALSE)*3 +111
[1] 105.1201 116.8799
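The by-hand formula can also be written out directly in base R, which makes each piece of the calculation visible. The sketch below assumes only the numbers given in this example; the names xbar, sigma, n, z_star, and margin are illustrative.

```r
xbar  <- 111   # sample mean
sigma <- 15    # population standard deviation (assumed known for a z-interval)
n     <- 25    # sample size

z_star <- qnorm(0.975)               # upper cutoff for 95% confidence, about 1.96
margin <- z_star * sigma / sqrt(n)   # margin of error: z* times the standard error

z_interval <- xbar + c(-1, 1) * margin
print(round(z_interval, 4))          # 105.1201 116.8799
```

The c(-1, 1) multiplier plays the role of the \(\pm\) sign in the formula.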
II. A Randomization Approach
We can also simply generate random values from the normal distribution and find the 2.5th and 97.5th percentiles. Try re-running the code block below several times and observe how close the estimated cutoff values are to \(\pm 1.96\).
qdata(rnorm(500), p=c(0.025, 0.975))
2.5% 97.5%
-1.873852 1.885386
We can create a randomized 95% confidence interval for 25 students. Again, execute the code block several times to see how closely it mirrors the theoretical distribution.
qdata(rnorm(500)*3+111, p=c(0.025, 0.975))
2.5% 97.5%
104.9984 116.1012
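One way to see where the factor of 3 comes from is to simulate the sampling process itself: repeatedly draw 25 IQ scores and record each sample mean. The sketch below uses base R only and assumes, for illustration, that the scores come from a normal distribution with mean 111 and standard deviation 15; the seed and the names sim_means, sim_sd, and sim_q are arbitrary.

```r
set.seed(3350)   # arbitrary seed, used only so the run is repeatable

# Each replicate: draw 25 scores from N(111, 15) and keep the sample mean.
sim_means <- replicate(5000, mean(rnorm(25, mean = 111, sd = 15)))

sim_sd <- sd(sim_means)                               # should land near 15 / sqrt(25) = 3
sim_q  <- quantile(sim_means, probs = c(0.025, 0.975))  # should land near 105.1 and 116.9

print(sim_sd)
print(sim_q)
```

The spread of the simulated means, rather than the spread of individual scores, is what the interval endpoints describe.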
III. Why \(t\)-intervals and not \(z\)-intervals
Recall the assumptions for hypothesis tests for the mean of a distribution.
- Normality
- Equal Variances
- Independence of the Observations
The \(z\)-test and \(z\)-interval are often used to show simple examples of hypothesis testing or creating confidence intervals. Procedures based on the \(z\)-statistic are rarely used in practice because they are quite brittle. In 1908, William Sealy Gosset, publishing under the pseudonym Student, introduced the \(t\)-distribution, which provides a more robust statistic, so much so that \(t\)-intervals are almost always used in modern research reports for applications involving normal distributions.
Example 3: \(t\)-interval
Returning to the hypothetical sample of 25 college students with average IQ of 111, let’s assume the sample standard deviation was 9. Find a 95% confidence \(t\)-interval for the mean IQ.
Solution. We can solve for the cutoff values similar to what we did for \(z\), but here \(\sigma\) will be replaced by \(s\). Using the estimate \(s\) rather than the parameter \(\sigma\) makes sense because the population parameters are rarely known in practice.
Degrees of Freedom
The \(t\)-distribution is a family of approximately bell-shaped curves that converge to the normal distribution as the degrees of freedom increase, so the chosen degrees of freedom select the particular curve appropriate to our sample size. For this example, \(df = n -1 =24\). For a one-sample \(t\)-procedure, the degrees of freedom are one less than the sample size.
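The convergence toward the normal distribution can be checked numerically with base R's qt and qnorm functions. This is a small illustrative sketch; the particular degrees of freedom listed in dfs are arbitrary choices, apart from 24, which matches this example.

```r
dfs    <- c(2, 5, 24, 100, 1000)   # arbitrary degrees of freedom; 24 matches this example
t_star <- qt(0.975, df = dfs)      # upper 95% cutoff for each curve in the family
z_star <- qnorm(0.975)             # normal cutoff, about 1.96

# The t cutoffs shrink toward the z cutoff as df grows.
print(data.frame(df = dfs,
                 t_star = round(t_star, 3),
                 excess_over_z = round(t_star - z_star, 3)))
```

Small samples pay a noticeable penalty in interval width, while for large samples the \(t\) and \(z\) cutoffs are nearly indistinguishable.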
The calculations are very similar.
\[t = \frac{\overline x - \mu}{s / \sqrt{n}}\]
\[\overline x - \mu = t\cdot\frac{s}{\sqrt n} \] The two \(t^*\) cutoff values are identical distances from the mean, so we add the \(\pm\) and simplify.
\[\mu = \overline x \pm t^*\cdot\frac{s}{\sqrt n} \]
With 24 degrees of freedom, we can use the cdist function to show the situation for the \(t\)-statistic.
cdist("t", df = 24, 0.95)
[1] -2.063899 2.063899

With substitution, we find \[\mu = \overline x \pm 2.06 \cdot \frac{9}{\sqrt{25}} \] which simplifies to \[\mu =111 \pm 2.06 \cdot 1.8 \]
and thus the 95% confidence interval for the mean IQ is \[(107.29,114.71)\]
We can rescale the results of the cdist function like before, which confirms our by-hand calculations.
cdist("t", 0.95, df = 24, plot = FALSE)*1.8 + 111
[1] 107.285 114.715
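To see what the \(t\) cutoff changes, the sketch below computes the \(t\)-interval for this example in base R and, purely for comparison, the interval that would result from plugging the same standard error into the \(z\) formula. The names se, t_interval, and z_interval are illustrative, and the \(z\) version is shown only to highlight the difference, not as a recommended procedure.

```r
xbar <- 111   # sample mean
s    <- 9     # sample standard deviation
n    <- 25    # sample size
se   <- s / sqrt(n)   # standard error, 1.8

t_interval <- xbar + c(-1, 1) * qt(0.975, df = n - 1) * se
z_interval <- xbar + c(-1, 1) * qnorm(0.975) * se   # comparison only: treats s as if it were sigma

print(round(t_interval, 3))   # 107.285 114.715
print(round(z_interval, 3))   # slightly narrower than the t-interval
```

The \(t\)-interval is a little wider because it accounts for the extra uncertainty from estimating \(\sigma\) with \(s\).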
Example 4: Confidence Intervals for the Mean from a Data Frame
Let’s find an estimate for the mean of the Sleep variable in the Data3350 data frame. The assumptions for using \(z\)- and \(t\)-procedures include the normality assumption, so we should first check a histogram.
histogram(~Sleep, data = Data3350)

We have an approximately normal shape but slight skew to the left. Fortunately, the \(t\)-distribution is robust with respect to violations of the normality assumption when the sample size is larger than 40.
favstats(~Sleep, data = Data3350)
Our sample size is 146, so the \(t\)-interval estimate should be reasonably accurate.
The Mosaic function confint extracts confidence intervals from hypothesis tests which explains why we’re wrapping it around a \(t\)-test. The particulars of the alternative hypothesis and value for \(\mu_0\) don’t affect the confidence interval estimate, so we leave them off.
confint(t.test(~ Sleep, data = Data3350))
We find that the 95% confidence interval for average Sleep was \((6.03,6.67)\) hours over the previous 48 hours (including naps). College students in this sample are not getting the recommended dosage!
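For readers who want to see what confint(t.test(...)) is doing, the sketch below builds the same kind of interval by hand and compares it with base R's t.test. Because Data3350 is not bundled here, the vector sleep_hours is a hypothetical stand-in generated at random; its values are not the real Sleep data, and the names manual_ci and builtin_ci are illustrative.

```r
set.seed(1)   # arbitrary seed for repeatability

# HYPOTHETICAL stand-in for Data3350$Sleep; these are simulated values, not the real data.
sleep_hours <- rnorm(146, mean = 6.35, sd = 2)

x  <- sleep_hours[!is.na(sleep_hours)]   # drop missing values, as t.test does
n  <- length(x)
se <- sd(x) / sqrt(n)                    # standard error of the mean

manual_ci  <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * se
builtin_ci <- as.numeric(t.test(x)$conf.int)

print(manual_ci)
print(builtin_ci)   # agrees with the by-hand interval
```

Replacing sleep_hours with the actual Data3350$Sleep column should give the same agreement between the two approaches.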
Mosaic’s Randomization Options
- Shuffle. Permutes the values in the sample data.
- Sample. Draws a sub-sample from the sample data without replacement.
- Resample. Draws a sub-sample from the sample data with replacement.
For examples of permutation tests and bootstrapping, see the Randomization Tutorial.
IV. Bootstrapping Approach
Bootstrapping is a randomization technique that allows us to create confidence intervals without assuming we know anything about the distribution. We take random draws from the data set itself using a process called resampling. For details on Bootstrapping see Chapter 3 in Sonderegger & Buscaglia (2020).
Let’s try to understand resampling using the Sleep variable from the Data3350 data frame. We’ll first draw a sample from the observations in Sleep. Note that the draws are with replacement.
temp = resample(Data3350$Sleep, size = 10)
temp
[1] 3.0 9.0 5.5 6.0 6.5 3.0 3.5 10.0 4.0 9.0
We can then take the mean.
mean(temp)
[1] 5.95
Our plan is to do this hundreds of times while allowing the resample function to choose its own sample size, which defaults to the size of the original sample. Below, we create a new data frame called bootstrap which will keep the means from 500 resamplings.
bootstrap = do(500) * mean(resample(Data3350$Sleep))
bootstrap
Let’s check a density plot to see how our bootstrapping distribution turned out. Remember to re-execute the code blocks directly above and below several times to see what is happening with the bootstrap sampling distribution.
densityplot(~mean, data=bootstrap)

Finally, we use the Mosaic qdata function to find the percentiles of the bootstrapping distribution (or any other distribution we have available). The argument “p=c(0.025, 0.975)” asks for the 2.5th and 97.5th percentiles, so we will have the middle 95% of the distribution.
qdata(~mean, p=c(0.025, 0.975), data=bootstrap)
2.5% 97.5%
6.049924 6.663788
The bootstrapping confidence interval we created using 500 resamples closely approximates what we calculated using the theoretical distribution, and bootstrapping can be useful in many situations where we cannot trust the \(t\)-interval. In practice, we use \(t\)-intervals to estimate means whenever the assumptions are met.
The only assumption required for bootstrapping confidence intervals to work is that the bootstrapping distribution be symmetric. For data which do not meet the requirements of \(z\)- and \(t\)-procedures due to small sample size or lack of normality, we can bootstrap a confidence interval.
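The resampling loop can also be sketched in base R, where sample(..., replace = TRUE) stands in for mosaic's resample and replicate stands in for do. The example below uses a small, skewed, simulated sample to mimic a case where a \(t\)-interval might be doubtful; the data in x are hypothetical, and the names boot_means and boot_ci are illustrative.

```r
set.seed(2015)   # arbitrary seed for repeatability

# HYPOTHETICAL small, right-skewed sample (simulated, not from Data3350).
x <- rexp(20, rate = 1 / 6)

# Resample with replacement, same size as the original sample, 500 times.
boot_means <- replicate(500, mean(sample(x, size = length(x), replace = TRUE)))

# Percentile interval: middle 95% of the bootstrap means.
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))
print(boot_ci)

# A quick look at the shape of the bootstrap distribution.
hist(boot_means, main = "Bootstrap means", xlab = "mean")
```

Checking the histogram for rough symmetry is a simple way to judge whether the percentile interval is reasonable.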
Example 5: Confidence Intervals for Proportions.
Consider the variable SitClass from the Data3350 data frame. We see that the options relate to students who prefer sitting in the Front, Middle or Back of class.
tally(~SitClass, data = Data3350)
SitClass
B F M
35 58 72
We can create a 95% confidence interval for, say, those who prefer seats in the front of class. Note that we specify which proportion to estimate by defining “success.” We can use \(z\)-proportion procedures because we have at least 10 successes and 10 failures in our sample (and would for any of these proportions). However, the prop.test function uses \(\chi^2\) for all proportion testing which is even more robust than \(z\). In sum, these data are quite appropriate for this procedure.
confint(prop.test(~SitClass, success = "F", data = Data3350))
We are 95% confident that the actual proportion of students who prefer sitting in front is between 28.0% and 43.0%.
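The interval can be reproduced from the tally counts alone using base R's prop.test, and compared with the simpler textbook \(z\)-interval \(\hat p \pm z^*\sqrt{\hat p(1-\hat p)/n}\). The sketch below assumes the counts shown in the tally (58 of 165 responses are "F"); the names wald_ci and score_ci are illustrative.

```r
successes <- 58             # "F" responses from the tally
n         <- 35 + 58 + 72   # 165 responses in total
p_hat     <- successes / n

# Textbook z-interval for a proportion (often called the Wald interval).
wald_ci <- p_hat + c(-1, 1) * qnorm(0.975) * sqrt(p_hat * (1 - p_hat) / n)

# Interval reported by prop.test (continuity correction is on by default).
score_ci <- as.numeric(prop.test(successes, n)$conf.int)

print(round(wald_ci, 3))
print(round(score_ci, 3))   # about 0.280 and 0.430
```

The two intervals are close here because the sample is large and the proportion is not near 0 or 1.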
V. Bootstrapping for Proportions
We will resample the SitClass observations and place them in temporary storage in the variable seat to show the commands we will use. Notice that the prop function can count a specific level of the category variable.
seat = resample(Data3350$SitClass)
seat
[1] "F" "F" "B" "B" "F" "M" "B" "B" "B" "M" "F" "B" "M" "M" "M" "B" "M"
[18] "B" "M" "F" "F" "M" "M" "B" "M" "F" "F" "B" "M" "M" "B" "M" "F" "M"
[35] "F" "M" "M" "B" "B" "M" "B" "M" "F" "F" "M" "M" "M" "M" "M" "F" "M"
[52] "M" "M" "F" "B" "M" "F" "M" "F" "M" "M" "B" "M" "F" "B" "F" "M" "M"
[69] "B" "M" "M" "B" "M" "M" "M" "F" "F" "M" "F" "M" "M" "M" "M" "F" "M"
[86] "M" "B" "B" "M" "M" "M" "F" "F" "F" "B" "F" "F" "F" "M" "M" "M" "M"
[103] "M" "B" "F" "B" "M" "B" "M" "M" "F" "F" "M" "M" "M" "F" "B" "B" "F"
[120] "M" "F" "M" "F" "B" "B" "M" "F" "F" "M" "M" "F" "M" "F" "F" "F" "M"
[137] "F" "B" "F" "M" "M" "F" "F" "M" "M" "B" "M" "F" "F" "M" "B" "M" "F"
[154] "M" "M" "M" "M" "M" "F" "M" "M" "M" "F" "M" "M"
prop(seat , success = "F")
prop_F
0.3030303
The code block below generates the resamples and immediately checks the proportion that are “F”. The results are stored in a data frame boots with the variable name “prop_F”. The boots distribution is a sample of proportions (or percentages). Execute the code block below several times to visualize what is happening.
boots = do(500) * prop(resample(Data3350$SitClass),success = "F")
histogram(~prop_F, data = boots)

The 95% confidence interval is given by the 2.5th and 97.5th percentiles of the boots distribution.
qdata(~prop_F, p=c(0.025, 0.975), data=boots)
2.5% 97.5%
0.2816667 0.4121212
Once again, the bootstrapping yields a similar confidence interval. For this specific example, the theoretical approach yielded an interval of \((28.0\%, 43.0\%)\).
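The same bootstrap can be sketched without mosaic by rebuilding the responses from the tally counts, since the order of the observations does not matter for resampling. This is an illustrative base R version; it assumes the 165 tallied responses are the full set being resampled, and the names seats, boot_props, and boot_ci are not part of the module's code.

```r
set.seed(58)   # arbitrary seed for repeatability

# Rebuild the SitClass responses from the tally: 35 B, 58 F, 72 M.
seats <- rep(c("B", "F", "M"), times = c(35, 58, 72))

# Each replicate: resample all 165 responses with replacement,
# then compute the proportion that are "F".
boot_props <- replicate(500, mean(sample(seats, replace = TRUE) == "F"))

boot_ci <- quantile(boot_props, probs = c(0.025, 0.975))
print(boot_ci)
```

Taking the mean of a logical vector gives the proportion of TRUE values, which is what mosaic's prop function reports for the chosen success level.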
VI. Hypothesis Testing using Confidence Intervals
Recall the research question from the first example in Module 5: do younger females use less aggressive humor than the overall population? The hypothesis was \[\begin{align*}H_0 &: \mu = 29\\H_a &: \mu < 29\end{align*}\]
Instead of performing a \(t\)-test and generating a \(p\)-value, we can use bootstrapping to estimate the mean of the young female sample and see if \(\mu=29\) is in the interval.
The following two-step subsetting returns the young female sample (\(n = 43\)) in a data frame with only the variable HSAG. These commands are exactly as before except for the na.omit command. Several empty cells left some ages unknown in the data frame. The na.omit function removes any rows of the data frame that contain a missing value (NA).
fem = subset(Data3350, Sex == "F", c(Age,HSAG))
yF = subset(fem, Age < 20, HSAG)
yFem = na.omit(yF)
yFem
To create a bootstrap distribution, we repeatedly resample the yFem observations, in this case, 500 times.
bootstrap = do(500) * mean (resample(yFem$HSAG))
bootstrap
We need a 90% confidence interval so that the 5th and 95th percentiles of the bootstrap distribution will be the confidence interval endpoints.
qdata(~mean, p=c(0.05, 0.95), data=bootstrap)
5% 95%
24.63026 28.10526
Because \(\mu = 29 \notin (24.6, 28.1)\) for the run shown above, we reject the null hypothesis. We have evidence that young females do indeed use less aggressive humor than the overall population at the \(\alpha = 0.05\) level.
For two-tailed hypothesis tests, we can set \[\alpha = 1 - \text{Confidence Level}\]
For one-tailed hypothesis tests, we must set \[\alpha = \frac{1 - \text{Confidence Level}}{2}\]
so that the area of the one-sided rejection region equals the area in a single tail outside the confidence interval. Here, the 90% interval gives \(\alpha = (1 - 0.90)/2 = 0.05\).
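The pairing between significance level and confidence level can be captured in a small helper. A two-tailed test at level \(\alpha\) pairs with a \(1-\alpha\) interval, while a one-tailed test pairs with a \(1-2\alpha\) interval, because the interval leaves area \(\alpha\) in each tail. The function name matching_conf_level is hypothetical and is not part of mosaic or base R.

```r
# Confidence level whose interval corresponds to a test at significance level alpha.
# (Hypothetical helper written for illustration.)
matching_conf_level <- function(alpha, tails = c("one", "two")) {
  tails <- match.arg(tails)
  if (tails == "two") {
    1 - alpha        # alpha is split across both tails of the interval
  } else {
    1 - 2 * alpha    # each tail outside the interval has area alpha
  }
}

print(matching_conf_level(0.05, "one"))   # 0.9
print(matching_conf_level(0.05, "two"))   # 0.95
```

For the aggressive humor example, a one-tailed test at \(\alpha = 0.05\) corresponds to the 90% interval built from the 5th and 95th percentiles.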
VII. Supplement: How to Change Confidence Level
If we do not specify a confidence level, R defaults to 95%. However, if we specify the parameter conf.level within the \(t\)-test itself, we can change the confidence level. Let’s return to the Sleep variable from the Data3350 data frame. For a baseline, let’s show the 95% confidence interval.
confint(t.test(~ Sleep, data = Data3350))
The 90% confidence interval should be narrower.
confint(t.test(~ Sleep, data = Data3350, conf.level = .9))
We can create a 98% confidence interval. Since our confidence is higher, this interval should be wider than either of the two above.
confint(t.test(~ Sleep, data = Data3350, conf.level = .98))
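The reason the interval widens with the confidence level is that the \(t^*\) multiplier grows while the standard error stays the same. The sketch below lists the multipliers for the three levels used above, assuming the sample size of 146 reported for Sleep; the names levels and t_star are illustrative.

```r
n      <- 146                    # sample size reported for Sleep
levels <- c(0.90, 0.95, 0.98)    # confidence levels used above

# Upper cutoff for each level: leave (1 - level) / 2 in the upper tail.
t_star <- qt(1 - (1 - levels) / 2, df = n - 1)

print(data.frame(conf_level = levels, t_star = round(t_star, 3)))
```

Each interval is the sample mean plus or minus its multiplier times the standard error, so a larger multiplier gives a wider interval.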
VIII. Exercises
- How many resamplings should we use when bootstrapping? Try re-running the code blocks from the Sleep example with 50, 100, 500, and 1000 resamplings. How does the accuracy compare to the theoretical confidence interval as number of resamplings increases? Explain why about 500 resamplings is usually good enough.
Code Block: Question 1
bootstrap = do(50) * mean(resample(Data3350$Sleep))
qdata(~mean, p=c(0.025, 0.975), data=bootstrap)
- Use the Corps variable in Data3350 where Y / N responses indicate whether the participant is in the UNG Corps of Cadets. Assuming the data frame is representative of the UNG Dahlonega campus, create a 90% confidence interval estimate for the percentage of students who are members of the Corps and interpret your findings. Hint: set success = “Y”.
- Use the VarsAth variable in Data3350 where Y / N responses indicate whether the participant is a varsity UNG athlete. Assuming the data frame is representative of the UNG Dahlonega campus, create a 95% confidence interval estimate for the percentage of students who are varsity athletes and interpret your findings. Hint: set success = “Y”.
- Use the TexRel variable in Data3350 where numeric scores represent scores on the Toxic Relationship Beliefs Scale. (Higher scores equate to more toxic beliefs.) Assuming the data frame is representative of the UNG Dahlonega campus, create a 99% confidence interval estimate for the mean TxRel score and interpret your findings.
- Use the CHS variable in Data3350 where numeric scores represent scores on the Coping Humor Scale. Assuming the data frame is representative of the UNG Dahlonega campus, create a 95% confidence interval estimate for the mean CHS score and interpret your findings.
- Use the CHS variable in Data3350 to create a bootstrap confidence interval at the 95% level. Compare and contrast it with the results from the theoretical confidence interval. Use 500 resamplings.
