STA 111 Lab 5

How to Submit

Complete all questions, and submit your final document in HTML or PDF form on Canvas.

Due date: 8 November 2023

Goal

In the last lab, we explored the sampling variability of a sample proportion. We saw that different samples from the same population can yield different sample proportions. We also saw that we can use the sampling distribution to estimate how much we expect a sample proportion to change if we took a different sample from the same population. We discussed how understanding the sampling variability allows us to report not just our sample proportion but a range of plausible values for the population proportion. Today we are going to formalize this concept with confidence intervals.

The Data

Gallup is an organization that conducts extensive polls aimed at exploring a variety of facets about societal opinions, political views, and more. In December of 2021, the Gallup organization released an article called The 2021 Update on Americans and Religion (https://news.gallup.com/poll/358364/religious-americans.aspx). In this lab, we are going to use the survey results.

This article is an example of a news article that uses statistics. In this course, we have talked about the need to confirm the reliability of data before trusting any conclusions an article draws from that data. One thing to look for to assess the validity of supplied data is information like margins of error, sampling methods, and sample sizes. If this information is provided, this lends more support to the claim that the data presented can be safely used. It also allows us to assess any potential biases that may result from the data collection methods.

You have to open the article and take a look to find the answers to the questions below. Pay special attention to the Survey Methods section at the end - most of the answers are there.

Question 1

What is the population of interest for this survey?

Ans: American adults

Question 2

How many people were interviewed for this survey, i.e., what is the sample size?

Ans: 1837 people

Question 3

What methods were used to gather information? (Example: mailed in survey, phone survey, in person interviews, etc.)

Ans: Gallup gathered the information through telephone interviews conducted in May 2021 and December 2021 with a random sample of American adults aged 18 and older.

Question 4

According to the survey, what proportion of Americans in 2021 say religion is “very important” in their life?

Ans: 49%.

Question 5

Is this value a sample statistic or population parameter?

Ans: Sample statistic.

Question 6

Based on the collection technique for this Gallup poll, what is one potential source of bias?

Ans: Because the survey method was phone calls, individuals that do not have phones or access to phones could not participate. This potentially could leave out a group of people and bias the results.

Confidence Interval for a Population Proportion

Rather than relying on just the proportion reported in the article, we are going to use a confidence interval to answer the researcher’s question. Remember that a confidence interval is a range of plausible values for a population parameter.

Last time, we used simulations to explore sampling variability. Specifically, we built our sampling distribution of the sample proportion by drawing many samples from a population. However, we can’t do that today, as we do not have population data.

Luckily, we have a beautiful mathematical result that allows us to describe the sampling distribution of the sample proportion without needing to run a simulation. When certain conditions are met, we can assume that the sampling distribution of \(\hat{p}\) is approximately a normal distribution with mean p, and we can approximate its standard error (the standard deviation of the sampling distribution) with the formula from lecture, \(SE = \sqrt{\hat{p}(1-\hat{p})/n}\), where n is the sample size.

Question 7

What is the standard error for the sampling distribution of \(\hat{p}\)?

Ans: SE = \(\sqrt{0.49(1-0.49)/1837}\) ≈ 0.01166
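As a quick check on the arithmetic, here is a minimal Python sketch of the standard error calculation. It assumes the usual formula for a sample proportion, \(SE = \sqrt{\hat{p}(1-\hat{p})/n}\), and uses the values from the answers above (0.49 and 1837). The function name `standard_error` is only illustrative.

```python
import math

def standard_error(p_hat: float, n: int) -> float:
    """Approximate standard error of a sample proportion."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Values taken from the answers above: p-hat = 0.49, n = 1837
se = standard_error(0.49, 1837)
print(round(se, 5))  # prints 0.01166
```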

Once we have computed the standard error, we need to make sure we understand what it tells us.

Question 8

Based on properties of the normal distribution, 95% of all values of \(\hat{p}\) should be about how far away (±) from p?

Ans: Within about 1.96 standard errors of p, that is, ± 1.96 × 0.01166 ≈ ± 0.0229.

The value you have computed in Question 8 is called the margin of error (ME). Basically, if we start from our sample proportion \(\hat{p}\), and stretch out a distance equal to the margin of error above and below \(\hat{p}\), we will catch p in this interval for about 95% of our samples.

We call the range of values that this distance spans a confidence interval.
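The claim that the interval catches p for about 95% of samples can be illustrated with a small simulation. The sketch below is only an illustration: it assumes a hypothetical population proportion of 0.49 (we do not know the real one), reuses the sample size from this lab, and draws each sample by simple random sampling with a fixed seed.

```python
import math
import random

def covers(p_true, n, z, rng):
    """Draw one sample of size n; report whether p_hat +/- ME contains p_true."""
    successes = sum(rng.random() < p_true for _ in range(n))
    p_hat = successes / n
    me = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me <= p_true <= p_hat + me

rng = random.Random(111)  # fixed seed so the run is repeatable
# p_true is a made-up population value used only for this illustration
p_true, n, z, reps = 0.49, 1837, 1.96, 2000
coverage = sum(covers(p_true, n, z, rng) for _ in range(reps)) / reps
print(coverage)  # share of simulated intervals that contain p_true
```

The printed share should land close to 0.95, though it will not equal 0.95 exactly because the simulation itself is subject to sampling variability.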

Question 9

Adapt the following code to construct a 95% confidence interval for the proportion of American adults who identified as believing religion is very important in 2021. Hint: This means replacing all the quantities below like phat and ME with numeric values.

# The Lower Bound: phat - ME, where ME = 1.96 * SE
0.49 - 1.96 * 0.01166
# The Upper Bound: phat + ME
0.49 + 1.96 * 0.01166
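The same calculation can be wrapped in a small reusable function. This is a sketch in Python rather than the lab's own template, and the name `proportion_ci` is hypothetical. It assumes the normal approximation described earlier and takes the critical value z as an input.

```python
import math

def proportion_ci(p_hat, n, z):
    """Return (lower, upper) for the interval p_hat +/- z * SE."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    me = z * se  # margin of error
    return p_hat - me, p_hat + me

# 95% interval using p-hat = 0.49, n = 1837, z = 1.96
lower, upper = proportion_ci(0.49, 1837, 1.96)
print(round(lower, 3), round(upper, 3))  # prints 0.467 0.513
```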

Question 10

Interpret the CI from Question 9.

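Ans: (0.467, 0.513). We are 95% confident that the true proportion of American adults who said religion was very important in their life in 2021 is between about 46.7% and 51.3%.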

Question 11

Use your confidence interval from Question 9 to answer the researcher: Does the evidence given in the article suggest that less than 50% of all Americans feel religion is very important? Explain your answer.

Ans: Not conclusively. We are 95% confident that the true proportion of Americans who think religion is very important falls between about 47% and 51%. This interval contains values both below and above 50%, so the data are consistent with a proportion under 50% but also with one at or slightly above 50%. We cannot conclude that less than half of all Americans feel religion is very important.

Question 12

Will a 97% confidence interval be narrower or wider than our 95% confidence interval? Explain.

Ans: A 97% confidence interval would be wider than a 95% confidence interval. Because we are increasing our confidence, we need to expand the interval to include more values so that we are more likely to capture the true proportion.

Question 13

What is the critical value for a 97% confidence interval?

Ans: 2.17
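One way to look up a critical value without a table is to ask for the matching quantile of the standard normal distribution. The sketch below uses `NormalDist` from Python's standard library. The helper name `critical_value` is illustrative. For a confidence level C, it leaves (1 - C)/2 in each tail.

```python
from statistics import NormalDist

def critical_value(confidence):
    """z such that the central `confidence` share of N(0, 1) lies within +/- z."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

z95 = critical_value(0.95)
z97 = critical_value(0.97)
print(round(z95, 2), round(z97, 2))  # prints 1.96 2.17
```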

Question 14

Construct and interpret a 97% confidence interval for the proportion of American adults who identified as believing religion is very important in 2021.

Ans: 0.49 ± 2.17(0.01166) = 0.49 ± 0.0253, which gives (0.4647, 0.5153). Note that the critical value multiplies the standard error, not the 95% margin of error. We are 97% confident that the true proportion of American adults who believed religion is very important in 2021 is between about 46.5% and 51.5%.

Question 15

Use your confidence interval from Question 14 to answer the researcher: Does the evidence given in the article suggest that less than 50% of all Americans feel religion is very important? Has your answer changed from Question 11? Did we expect it to?

Ans: No, the interval still contains values both above and below 50%. With the 97% CI, the proportion is between about 0.465 and 0.515, and 0.515 is above 0.5, so it could be that more than 50% feel religion is very important. My answer did not change from Question 11 because both CIs indicate that the true proportion could be above or below 0.5. We did not expect it to change: the 95% interval already contained values above 0.5, and the 97% interval is wider, so it must also include values above and below 0.5.
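The 97% interval, and whether it contains 0.5, can be checked with a few lines of Python. This is a sketch under the same assumptions as before: the normal approximation, a sample proportion of 0.49, and n = 1837. The variable names are illustrative.

```python
import math
from statistics import NormalDist

p_hat, n = 0.49, 1837
z = NormalDist().inv_cdf(1 - (1 - 0.97) / 2)  # critical value for 97%
me = z * math.sqrt(p_hat * (1 - p_hat) / n)   # critical value times SE
lower97, upper97 = p_hat - me, p_hat + me
contains_half = lower97 <= 0.5 <= upper97
print(round(lower97, 4), round(upper97, 4), contains_half)
```

Because 0.5 sits inside the interval, the data do not rule out a population proportion at or above 50%.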

Question 16

Why don’t we always create 99% or 100% confidence intervals? Why do we bother with different confidence levels?

Ans: As the confidence level rises toward 99% and 100%, the interval gets wider, and it becomes hard to make precise statements about the population proportion. An interval that is too wide is less informative, so we need a balance between confidence and precision. Different confidence levels let us choose that balance.
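To see the trade-off in numbers, the sketch below computes the interval width (twice the margin of error) at several confidence levels for the sample in this lab. The list of levels is an arbitrary choice for illustration. It assumes the same normal approximation and values as before (0.49 and 1837).

```python
import math
from statistics import NormalDist

p_hat, n = 0.49, 1837
se = math.sqrt(p_hat * (1 - p_hat) / n)

widths = {}
for level in (0.90, 0.95, 0.97, 0.99, 0.999):
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value
    widths[level] = 2 * z * se                     # full width of the interval
    print(f"{level:.1%} CI width: {widths[level]:.4f}")
```

The widths grow as the level rises. Under the normal model, a 100% interval would need an infinite critical value, so it would cover every possible value and tell us nothing.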

Wrapping it Up

Confidence intervals are extremely powerful tools. They allow us to use just one sample to estimate a range of plausible values for a population proportion. We will learn to make confidence intervals for other parameters, like population means, as we move through this course.