STA 111 Lab 5

How to Submit

Complete all questions and submit your final document in HTML or PDF format on Canvas.

Due date: 8 November 2023

Goal

In the last lab, we explored the sampling variability of a sample proportion. We saw that different samples from the same population can yield different sample proportions. We also saw that we can use the sampling distribution to estimate how much we expect a sample proportion to change if we took a different sample from the same population. We discussed how understanding sampling variability allows us to report not just our sample proportion but a range of plausible values for the population proportion. Today we are going to formalize this concept with confidence intervals.

The Data

Gallup is an organization that conducts extensive polls on societal opinions, political views, and more. In December 2021, Gallup released an article called The 2021 Update on Americans and Religion (https://news.gallup.com/poll/358364/religious-americans.aspx). In this lab, we are going to use the survey results reported in that article.

This article is an example of a news article that uses statistics. In this course, we have talked about the need to confirm the reliability of data before trusting any conclusions an article draws from that data. One thing to look for to assess the validity of supplied data is information like margins of error, sampling methods, and sample sizes. If this information is provided, this lends more support to the claim that the data presented can be safely used. It also allows us to assess any potential biases that may result from the data collection methods.

Open the article to find the answers to the questions below. Pay special attention to the Survey Methods section at the end, where most of the answers can be found.

Question 1

What is the population of interest for this survey?

Question 2

How many people were interviewed for this survey, i.e., what is the sample size?

Question 3

What methods were used to gather information? (Example: mailed in survey, phone survey, in person interviews, etc.)

Question 4

According to the survey, what proportion of Americans in 2021 say religion is “very important” in their life?

Question 5

Is this value a sample statistic or population parameter?

Question 6

Based on the collection technique for this Gallup poll, what is one potential source of bias?

Confidence Interval for a Population Proportion

Rather than relying on just the proportion reported in the article, we are going to use a confidence interval to answer a researcher’s question: do less than 50% of all Americans feel religion is very important? Remember that a confidence interval is a range of plausible values for a population parameter.

Last time, we used simulations to explore sampling variability. Specifically, we built our sampling distribution of the sample proportion by drawing many samples from a population. However, we can’t do that today, as we do not have population data.

Luckily, we have a beautiful mathematical result that allows us to describe the sampling distribution of the sample proportion without needing to run a simulation. When certain conditions are met, we can treat the sampling distribution of \(\hat{p}\) as approximately normal with mean p, and we can approximate its standard deviation, called the standard error, using the standard error formula from lecture.
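
For reference, the usual textbook form of this result can be sketched as follows. This assumes the formula given in lecture is the standard plug-in version, and it uses \(n\) for the sample size and \(SE\) for the standard error, symbols that do not appear elsewhere in this handout. The \(\sim\) below should be read as "is approximately distributed as".

\[
\hat{p} \sim N\!\left(p,\ \sqrt{\frac{p(1-p)}{n}}\right), \qquad SE(\hat{p}) \approx \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
\]

Because we do not know p, the second expression replaces p with \(\hat{p}\). This is why the standard error is an approximation of the true standard deviation rather than its exact value.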

Question 7

What is the standard error for the sampling distribution of \(\hat{p}\)?

Once we have computed the standard error, we need to make sure we understand what it tells us.

Question 8

Based on properties of the normal distribution, 95% of all values of \(\hat{p}\) should fall within about what distance (±) of p?

The value you computed in Question 8 is called the margin of error (ME). Basically, if we start from our sample proportion \(\hat{p}\) and stretch out a distance equal to the margin of error above and below \(\hat{p}\), we will catch p in this interval for about 95% of our samples.

We call the range of values that this distance spans a confidence interval.
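
In symbols, this can be sketched as follows. The notation \(z^*\) for the critical value and \(SE\) for the standard error is an assumption here and may differ from the notation used in lecture.

\[
\hat{p} \pm ME, \qquad ME = z^* \times SE(\hat{p})
\]

The interval runs from the lower bound \(\hat{p} - ME\) to the upper bound \(\hat{p} + ME\), and the critical value \(z^*\) is determined by the confidence level.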

Question 9

Adapt the following code to construct a 95% confidence interval for the proportion of American adults who said religion is very important in their life in 2021. Hint: This means replacing the quantities phat and ME below with numeric values.

# The Lower Bound 
phat - ME
# The Upper Bound 
phat + ME

Question 10

Interpret the confidence interval from Question 9.

Question 11

Use your confidence interval from Question 9 to answer the researcher: Does the evidence given in the article suggest that less than 50% of all Americans feel religion is very important? Explain your answer.

Question 12

Will a 97% confidence interval be narrower or wider than our 95% confidence interval? Explain.

Question 13

What is the critical value for a 97% confidence interval?
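
As a reminder of what a critical value is, here is a sketch using a different confidence level. It assumes the notation \(z^*\) for the critical value, \(Z\) for a standard normal random variable, and \(C\) for the confidence level, none of which is defined elsewhere in this handout.

\[
P(-z^* < Z < z^*) = C
\]

For example, for a 90% confidence interval, the middle 90% of the standard normal distribution leaves 5% in each tail, so \(z^* \approx 1.645\). In R, this value can be found with qnorm(0.95).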

Question 14

Construct and interpret a 97% confidence interval for the proportion of American adults who said religion is very important in their life in 2021.

Question 15

Use your confidence interval from Question 14 to answer the researcher: Does the evidence given in the article suggest that less than 50% of all Americans feel religion is very important? Has your answer changed from Question 11? Did we expect it to?

Question 16

Why don’t we always create 99% or 100% confidence intervals? Why do we bother with different confidence levels?

Wrapping it Up

Confidence intervals are extremely powerful tools. They allow us to use just one sample to estimate a range of plausible values for a population proportion. We will learn to make confidence intervals for other parameters, like population means, as we move through this course.