Part 9: How Confident Can We Be In Our Estimates?

With hypothesis testing, we used our point estimate (sample proportion or mean) to test a value of the population parameter.

Those point estimates gave us a good idea of whether a certain value of the population parameter made sense (or whether we should reject it as the true value), but they didn't tell us the whole story.

To get a more complete picture, we need to also think about the precision of our estimate!

Interval estimates are designed to contain the population parameter with a certain probability, say 90% or 95%. The greater the probability, the more likely the true population parameter is in our interval estimate.

Confidence Interval:

Confidence intervals contain the population parameter with a certain degree of confidence.
We want the confidence level to be high - the higher our confidence level, the more confident we are that the true population parameter is somewhere in the confidence interval.
- 95% is the most common confidence level, but 90% and 99% are also used.

Confidence intervals are created based on the sampling distribution. Given a sample statistic, we can build a sampling distribution that treats that statistic as the parameter estimate. That sampling distribution will then tell us what values are most believable as the true population parameter.

Confidence Intervals about a Population Parameter

Example: Suppose we want to know $ \rho $, the true proportion of Nebraskans who watched the Big 12 title game in December 2010. A survey conducted by the Lincoln Journal Star reports that 993 out of 1358 people surveyed had watched the football game. What can we say about $ \rho $, the population proportion of Nebraskans who watched this game?

Is $ \rho $ equal to the sample proportion, 0.7312? Why or why not?
What is the sampling distribution of $ \hat{p} $?
Approximately where will we find the central 95% of all the sample proportions? (Use the Empirical Rule.)

So 95% of all samples we take will be “lucky enough” that their $ \hat{p} $ will be within about 2 standard errors of $ \rho $. We can take that idea and flip it: $ \rho $ should be within about 2 standard errors of $ \hat{p} $ for 95% of all samples.

Reminder: the Empirical Rule is just an approximation!

This is the basic idea behind a confidence interval. Our interval extends a certain number of standard errors on either side of the sample statistic. How many standard errors we go out, and so how wide the interval is, depends on the confidence level we want.

Margin of Error:
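
Consistent with how the margin of error is described later in these notes (a multiple of the standard error), it can be written as:

\[ \text{margin of error} = \text{multiplier}\times s.e. \]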

The multiplier that we choose depends on the confidence level we want. Usually we are interested in a 90%, 95%, or 99% confidence level. When we want to find a confidence interval about a proportion, the multiplier for a given confidence level is always the same.

Confidence level	Multiplier
90%	1.645
95%	1.96
99%	2.58

The multiplier is the number of standard errors above and below the mean such that __% of the area in a normal distribution is in that range.
For example, 90% of the area in the middle of a normal distribution is within 1.645 standard errors of the mean.
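
The multipliers in the table can be reproduced in R with qnorm, the normal quantile function. This is a minimal sketch; the variable names conf_levels and multipliers are our own.

```r
# Confidence levels from the table above
conf_levels <- c(0.90, 0.95, 0.99)

# For a central area of CL, the upper cutoff sits at the 1 - (1 - CL)/2 quantile
multipliers <- qnorm(1 - (1 - conf_levels) / 2)
round(multipliers, 3)

# Area within exactly 2 standard errors (the Empirical Rule's "about 95%")
pnorm(2) - pnorm(-2)
```

The 99% multiplier comes out as 2.576, which the table rounds to 2.58. The area within exactly 2 standard errors is a little more than 95%, which is why the 95% multiplier is 1.96 rather than 2.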

Once we know the multiplier, we need to find the standard error. The exact standard error of a sample proportion depends on the true population proportion, $ \rho $. We don't know what the true parameter is; all we have is our point estimate $ \hat{p} $.

Example: Back to the Big 12 title game example.

When we were looking at the sampling distribution of a proportion (Part 6), what did we use for the standard error? What can we use to estimate the standard error for confidence intervals?

For a population proportion, our best guess of the population parameter is our sample proportion, $ \hat{p} $. Since that's our best guess, we'll use it again in the standard error formula.

Standard Error for a Proportion:
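
Plugging $ \hat{p} $ in for the unknown $ \rho $, as described above, the estimated standard error would be written as follows, where $ n $ is the sample size:

\[ s.e. = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

For the Big 12 survey, with $ \hat{p} = 0.7312 $ and $ n = 1358 $, this works out to roughly 0.012.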

Now that we have the multiplier and the standard error, we can find the margin of error. Finally, use that to compute the confidence interval.

Confidence Interval for a Proportion:

\[ \hat{p}\pm z_{CL}\times s.e. \]

Example: Using what we've done above, compute the 95% confidence interval for the proportion of Nebraskans who watched the 2010 Big 12 title game.

What does this interval tell us about the true proportion?
- Does $ \rho = 0.72 $?
- Does $ \rho = 0.75 $?
- Does $ \rho = 0.69 $?
Based on this interval, what is the true population proportion?

This method of constructing a confidence interval will produce an interval that contains the true proportion $ \rho $ for about 95% of all similar random samples of Nebraskans. In the long run, about 5% of all intervals will “miss” $ \rho $.
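
The long-run idea can be checked with a small simulation. This is only a sketch: the "true" proportion of 0.73, the seed, and the number of repetitions are all made up for illustration, since in practice we never know $ \rho $.

```r
set.seed(218)              # arbitrary seed so the simulation is repeatable

true_p <- 0.73             # hypothetical "true" proportion, chosen for illustration
n      <- 1358             # same sample size as the survey
reps   <- 10000            # number of simulated samples

# Simulate sample proportions, then build a 95% interval around each one
p_hat <- rbinom(reps, size = n, prob = true_p) / n
se    <- sqrt(p_hat * (1 - p_hat) / n)
lower <- p_hat - 1.96 * se
upper <- p_hat + 1.96 * se

# Fraction of intervals that contain the true proportion
coverage <- mean(lower <= true_p & true_p <= upper)
coverage
```

The value of coverage should land close to 0.95, with the remaining intervals missing the true proportion.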

Example: Find the 99% confidence interval for the proportion of Nebraskans who watched the Big 12 title game. How does this relate to the 95% confidence interval? Plot both intervals on a number line below.

How does the margin of error change depending on the confidence level?

In R, we can use the same exact functions that we used to do hypothesis tests (binom.test and t.test). Instead of specifying a hypothesis test, we can specify a confidence level.

library(mosaic)
binom.test(x = 993, n = 1358, conf.level = 0.99)

## 
##  Exact binomial test
## 
## data:  x and n
## number of successes = 993, number of trials = 1358, p-value <
## 2.2e-16
## alternative hypothesis: true probability of success is not equal to 0.5
## 99 percent confidence interval:
##  0.6991 0.7617
## sample estimates:
## probability of success 
##                 0.7312

Why doesn't our answer in R exactly match the answer we found by hand above?
Two major reasons:
- Rounding!!
- The sampling distribution we used to find probabilities in Part 6 is an approximation. It doesn't give the exact probabilities, but it's pretty darn close. R's binom.test runs an exact binomial test, so it is using exact values!
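
To see the size of the gap, the by-hand calculation can be repeated in R next to the exact interval. This is a sketch that uses base R's binom.test and the unrounded multiplier from qnorm instead of 2.58; the names by_hand and exact are our own.

```r
x <- 993
n <- 1358
p_hat <- x / n

# By-hand (normal approximation) 99% interval: p_hat +/- multiplier * s.e.
se   <- sqrt(p_hat * (1 - p_hat) / n)
z_99 <- qnorm(0.995)
by_hand <- c(p_hat - z_99 * se, p_hat + z_99 * se)

# Exact binomial interval, as reported by binom.test
exact <- as.numeric(binom.test(x = x, n = n, conf.level = 0.99)$conf.int)

round(by_hand, 4)
round(exact, 4)
```

The by-hand interval comes out near (0.7002, 0.7622), which is close to, but not the same as, the exact interval (0.6991, 0.7617) in the output above.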

Example: In 1994 (the most recent year asked), the General Social Survey asked, “During the last year, did anyone take something from you by using force - such as a stickup, mugging, or threat?” Of 1223 subjects, 31 answered yes and 1192 answered no.

Find the sample proportion of subjects who were victims.
Find the standard error of this estimate.
Find the margin of error for a 95% confidence interval.
Construct the 95% confidence interval for the population proportion by hand.
Use R to verify your results. Your intervals should be close, but they may not match exactly.

binom.test(x = 31, n = 1223, conf.level = 0.95)

Can you conclude that fewer than 10% of all adults in the United States were victims? Justify why or why not.

Example: Many people consider themselves “green”, meaning they are supportive of environmental issues. But how many people are truly “green” in practice? For instance, Americans' per capita use of energy is roughly double that of people living in Western Europe. Despite lower use, Western Europeans often pay twice as much for gasoline as Americans do (roughly $6 per gallon). A 2011 Gallup poll of 1,003 Americans found that 41% of Americans believe that the government should prioritize protection of the environment, even at the risk of limiting U.S. energy production. That is down from a high of 57.93% in 2007.

Use R to find a 90% confidence interval for the population proportion of Americans who believed environmental protection should be a priority in 2007.

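# Convert the 2007 percentage to a count (this uses the 2011 sample size, n = 1003)
0.5793 * 1003
# Round 581.04 to 581 "successes" and ask for a 90% confidence interval
binom.test(x = 581, n = 1003, conf.level = 0.9)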

Use R to find a 90% confidence interval for the population proportion of Americans who believed environmental protection should be a priority in 2011.
Graph the two intervals on the same number line. Do they overlap?
Based on the intervals, does it look like there has been a shift in the public's attitude toward “green” policies? Why or why not?

Example: Consider two surveys about college tuition prices, one with 100 respondents and one with 500 respondents. In both surveys, 85% of respondents said that they believed college tuition was rising too quickly. For each survey, find the 95% confidence interval for the true proportion of Americans who believe college tuition is rising too quickly. Graph both intervals below. How does the sample size change your interval?
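
The effect of sample size can be sketched in R with the by-hand formula $ \hat{p}\pm z_{CL}\times s.e. $ applied to both surveys at once. The variable names here are our own, and the multiplier 1.96 is taken from the table earlier in these notes.

```r
p_hat <- 0.85
n     <- c(100, 500)            # the two survey sizes

se     <- sqrt(p_hat * (1 - p_hat) / n)
margin <- 1.96 * se             # 95% multiplier from the table

intervals <- data.frame(n = n,
                        lower = p_hat - margin,
                        upper = p_hat + margin)
intervals

# Ratio of the two margins of error
margin[1] / margin[2]
```

Multiplying the sample size by 5 shrinks the margin of error by a factor of $ \sqrt{5} $ (about 2.24), not by a factor of 5.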

Interpreting Confidence Intervals

Confidence intervals are statements about the ____________________________________________________. They are not about the sample mean or the individual observations. We can only talk about “probability” BEFORE we take the sample. After the sample is taken, we use the term “confidence”.

Basic Interpretation: We are ___________ confident that the true ________________________ lies between _________________________.

Example: Suppose the 95% confidence interval for the average number of hours students study for each class in one week is (2.3, 5.4). Determine what (if anything) is wrong with each interpretation below.

We are 95% confident that the true average of hours students spend studying for each course in one week is between 2.3 hours and 5.4 hours.
We are 95% confident that all students study between 2.3 and 5.4 hours for one course in one week.
The probability that a student studies between 2.3 and 5.4 hours for one course in one week is 0.95.
The probability that the true mean $ \mu $ is in the 95% confidence interval is 0.95.
We are 95% confident that the sample mean number of hours that students study is between 2.3 hours and 5.4 hours for one course in one week.

Example: When the General Social Survey asked in 2004 (most recent data available), “About how many hours per week do you spend sending and answering e-mail?” the 95% confidence interval was 5.31 hours to 6.75 hours. Interpret this confidence interval.

Confidence Intervals about a Single Population Mean

Confidence intervals for population means have the same form as confidence intervals for a population proportion:

\[ estimate\pm multiplier\times s.e. \]

When we are interested in the population mean, the point estimate is the sample mean, $ \bar{x} $.

Just like before, the margin of error is a multiple of the standard error, depending on which confidence level we are interested in. In Part 7, we used $ s $ to estimate the population standard deviation, $ \sigma $. We'll do the same thing here.

Standard Error for a Mean:
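
Using $ s $ in place of $ \sigma $, as described above, the standard error of the sample mean would be written as follows, where $ n $ is the sample size:

\[ s.e. = \frac{s}{\sqrt{n}} \]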

When we started doing hypothesis tests about a mean, we had to switch over to the t score. Confidence intervals also require us to use the t distribution. The t table in the back of your textbook is specifically set up to give us confidence interval multipliers.

Multiplier for a Mean: the t score such that the appropriate area under the t distribution is within “t” standard errors of the mean

The multiplier for a confidence interval about a mean changes depending on both the sample size and the confidence level we want for our estimates. If we wanted to find the multiplier by hand, we would need to use a table. However, we'll let R handle finding the appropriate t score for us.

Confidence Interval for a Mean:

\[ \bar{x} \pm t_{CL, n} \times s.e. \]

What's changed between a confidence interval for a proportion and a confidence interval for a mean?

Example: You take a survey of fifteen recent college graduates, and you find that their average starting salary was $40,000 with a standard deviation of $5,600. Find and interpret, by hand, the 90% confidence interval for the average starting salary after graduation.

In this case, the appropriate multiplier is t = 1.761.

Example: A hospital administrator wants to estimate the mean length of stay for all inpatients using that hospital. Using a random sample of 100 records of patients for the previous year, she reports that the sample mean length of hospital stay was 5.3 days. The sample standard deviation was 2.1 days. Find and interpret the 95% confidence interval for the average length of hospital stays.

In this case, the appropriate multiplier is t = 1.984.
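
The two multipliers quoted in these examples can be reproduced with qt, the t quantile function, using n - 1 degrees of freedom. The helper functions t_multiplier and mean_ci below are hypothetical names written for this sketch, not functions from any package.

```r
# Multiplier for a given confidence level and sample size (df = n - 1)
t_multiplier <- function(conf_level, n) {
  qt(1 - (1 - conf_level) / 2, df = n - 1)
}

round(t_multiplier(0.90, n = 15), 3)    # starting salary example
round(t_multiplier(0.95, n = 100), 3)   # hospital stay example

# Interval from summary statistics: x_bar +/- t * s / sqrt(n)
mean_ci <- function(x_bar, s, n, conf_level) {
  margin <- t_multiplier(conf_level, n) * s / sqrt(n)
  c(lower = x_bar - margin, upper = x_bar + margin)
}

mean_ci(x_bar = 40000, s = 5600, n = 15, conf_level = 0.90)
```

The same mean_ci call with the hospital numbers (5.3, 2.1, 100, 0.95) can be used to check the second example.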

While finding confidence intervals by hand isn't hard, it is unrealistic. With real data we don't usually have summary statistics (although we can get them), nor do we want to look up a t score for every single study. Instead, we'll let a software package do the work.

Example: An investor was interested in determining the average monthly change in her investment portfolio. She randomly selected 7 monthly statements and recorded the change in her investments for each month. The values she recorded are: $675.37, $423.39, -$214.95, $342.85, $359.58, $243.57, $893.19.

What does it mean when she has a negative monthly change in her investment?
Find the 95% confidence interval for her average monthly returns in the output below. Why would this be important for an investor to know?

invest <- c(675.37, 423.39, -214.95, 342.85, 359.58, 243.57, 893.19)
t.test(x = invest, conf.level = 0.95)

## 
##  One Sample t-test
## 
## data:  invest
## t = 2.958, df = 6, p-value = 0.02534
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##   67.23 710.77
## sample estimates:
## mean of x 
##       389

Find the 99% confidence interval for the investor's average monthly returns. Plot both intervals on a number line below. How does increasing the confidence level affect the interval?
What if the investor added more months to her sample? How would the interval change?
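
The 95% interval in the output above can also be rebuilt step by step, which shows what t.test is doing behind the scenes. This is a sketch; the names by_hand and from_r are our own.

```r
invest <- c(675.37, 423.39, -214.95, 342.85, 359.58, 243.57, 893.19)

n     <- length(invest)
x_bar <- mean(invest)
se    <- sd(invest) / sqrt(n)          # standard error of the mean
t_95  <- qt(0.975, df = n - 1)         # 95% multiplier with 6 degrees of freedom

by_hand <- c(x_bar - t_95 * se, x_bar + t_95 * se)
from_r  <- as.numeric(t.test(x = invest, conf.level = 0.95)$conf.int)

round(by_hand, 2)
round(from_r, 2)
```

Both lines should agree with the interval reported in the output, 67.23 to 710.77.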

Note: The R function that we're using to do a confidence interval directly corresponds to the function we used for the hypothesis test!

For a single proportion, use binom.test.
For a single mean (quantitative variable), use t.test.
For two means (two groups), use t.test. Don't forget to define your groups!

Example: A local internet service provider (ISP) created two new versions of its software, with alternative ways to implement a new feature. To do this, they need to know if their users have above-average Internet demands. CNet reported in May 2013 that the average Internet user is online 13 hours per week. Use the Computers data set to see if this company's customers are online more than average.

Change the code below from the Part 7 notes to find the 95% confidence interval for the average amount of time this company's customers spend online per week.

t.test(Computers$comp, alternative = "greater", mu = 13)

## 
##  One Sample t-test
## 
## data:  Computers$comp
## t = -36.36, df = 20782, p-value = 1
## alternative hypothesis: true mean is greater than 13
## 95 percent confidence interval:
##  11.01   Inf
## sample estimates:
## mean of x 
##      11.1

If you were working for this company, would you want to know the results of the hypothesis test or the confidence interval? Explain your choice.
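
Notice that the output above ends in Inf. When alternative = "greater" is specified, t.test reports a one-sided lower bound instead of a two-sided interval. The sketch below shows the difference on a small made-up vector; the hours values are hypothetical and are not the Computers data.

```r
# Hypothetical weekly hours online; NOT the Computers data set
hours <- c(9.5, 12.0, 10.5, 14.0, 8.0, 11.5, 13.0, 10.0, 12.5, 9.0)

# One-sided test: the "interval" is a lower bound with Inf as the upper limit
one_sided <- t.test(hours, alternative = "greater", mu = 13)$conf.int

# Default two-sided alternative: an ordinary 95% confidence interval
two_sided <- t.test(hours, conf.level = 0.95)$conf.int

one_sided
two_sided
```

Dropping the alternative argument (or setting it to "two.sided") gives an interval with two finite endpoints around the sample mean.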

Example: The data set Time contains the results of an experiment done by a health teacher at a small college. This instructor chose 4 random samples of male college students of size 20, and recorded the time they spend exercising in a typical week.

The code below estimates a 95% confidence interval for the average time spent exercising in a typical week for Sample 1 (the time1 variable). Interpret this confidence interval.

data(Time)
t.test(Time$time1, conf.level = 0.95)

## 
##  One Sample t-test
## 
## data:  Time$time1
## t = 13.95, df = 19, p-value = 1.97e-11
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##   8.287 11.213
## sample estimates:
## mean of x 
##      9.75

For Samples 2-4, record the sample mean and the 95% confidence interval for the average time spent exercising in a typical week.
How do you think these confidence intervals would change if we combined the four samples of 20 observations into one large sample of 80 observations?

largesample <- c(Time$time1, Time$time2, Time$time3, Time$time4)
t.test(largesample, conf.level = 0.95)

Confidence Intervals for Two Population Means

Confidence intervals can be very informative about the difference between two groups. The confidence interval for the difference in two population means is centered at $ \bar{x}_1 - \bar{x}_2 $, the difference between the two sample means, which is our point estimate of $ \mu_1 - \mu_2 $.

We already know that a confidence interval gives a range of plausible values for the true population parameter, in this case the true difference in population means. There are three possibilities.

The entire confidence interval is positive:
The entire confidence interval is negative:
The confidence interval contains zero:

The magnitude of the values in the confidence interval tells you how large the true difference could plausibly be. If all the values in the confidence interval are near 0, the true difference may be relatively small in practical terms.

Example: The data set Drinking records the number of alcoholic beverages consumed per week (Alcohol) and the gender (Gender) of 236 college students at a large state university. Let group 1 be the females and group 2 be the males. Find and interpret a 95% confidence interval for the difference in average number of alcoholic beverages consumed. How much more do male college students tend to drink than female college students?

data(Drinking)
t.test(Alcohol ~ Gender, data = Drinking, conf.level = 0.95)

## 
##  Welch Two Sample t-test
## 
## data:  Alcohol by Gender
## t = -4.287, df = 96.21, p-value = 4.307e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -7.005 -2.571
## sample estimates:
## mean in group Female   mean in group Male 
##                2.842                7.630

Example: A study was conducted at a large state university in order to compare the sleeping habits of undergraduate students to those of graduate students. Random samples of 75 undergraduate students and 50 graduate students were chosen and each of the subjects was asked to report the number of hours he or she sleeps in a typical day. The hypothesis is that since undergraduate students are generally younger and party more during their years in school, they sleep less, on average, than graduate students.

Is this true? Find the 90% confidence interval for the difference in average hours slept in a typical day for graduate students (group 1) and undergraduate students (group 2).

data(Sleep2)
head(Sleep2)

##   hours    status
## 1     6 Undergrad
## 2     5 Undergrad
## 3     6 Undergrad
## 4     6 Undergrad
## 5     8 Undergrad
## 6     7 Undergrad
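
One way the interval might be requested is with the same formula interface used for the Drinking data. The sketch below runs on simulated stand-in data with the same two columns as Sleep2, so its numbers mean nothing. The group label "Grad" is an assumption, since only "Undergrad" appears in the rows shown above.

```r
set.seed(1)   # arbitrary seed; the data below are simulated, NOT the real Sleep2

# Hypothetical stand-in with the same two columns shown by head(Sleep2).
# The label "Grad" is an assumption; check unique(Sleep2$status) for the real one.
fake_sleep <- data.frame(
  hours  = c(rnorm(50, mean = 7, sd = 1), rnorm(75, mean = 6.5, sd = 1)),
  status = rep(c("Grad", "Undergrad"), times = c(50, 75))
)

# Groups are ordered alphabetically, so the interval is for Grad minus Undergrad
fit <- t.test(hours ~ status, data = fake_sleep, conf.level = 0.90)
fit$conf.int
fit$estimate
```

With the real data, replacing fake_sleep with Sleep2 should give the 90% interval for the difference, provided the group labels sort so that graduate students come first.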

Example: The Pioneer Valley Planning Commission (PVPC) collected data north of Chestnut Street in Florence, MA for ninety days from April 5, 2005 to November 15, 2005. Data collectors set up a laser sensor, with breaks in the laser beam recording when a rail-trail user passed the data collection station. The data is stored in the RailTrail data set. The estimated volume of trail users is recorded in the volume variable. Whether or not it was a weekday is recorded in the weekday variable (0 is a weekend, 1 is a weekday).

Do you think that the trails will be busier on a weekday or weekend? Why?
Why would the PVPC want to know about when the trails are being used?
Find and interpret the 95% confidence interval for the difference in volume of trail users on weekdays vs. weekends.
What other variables could impact daily trail volume?