Exercise 1

A survey of 800 randomly selected adults in a country found that 656 of them believed that protecting the rights of those with unpopular views is a very important component of a strong democracy. Find and interpret a 95% confidence interval for the proportion of adults in the country who believe that protecting the rights of those with unpopular views is a very important component of a strong democracy.

Step 1: Summary

# From the question, write down the sample size (n) and the sample proportion (p hat)
n <- 800
p_hat <- 656/n

Step 2: Checking the Central Limit Theorem

# The adults are "randomly selected", so we assume Independent & Random
# Thus, we only need to check Large Sample aka Successes & Failures
# We do not have the population proportion (p), so use p hat!

# Expected No. Successes is AT LEAST 10: n * p >= 10 
n * p_hat >= 10
## [1] TRUE
# Expected No. Failures is AT LEAST 10: n * (1 - p) >= 10 
n * (1 - p_hat) >= 10
## [1] TRUE
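
As a quick sketch of what the two checks above compare, the expected counts can be computed directly from the survey numbers in the question (656 of 800). The names `successes` and `failures` are introduced here only for illustration.

```r
# Counts taken from the survey in the question
n <- 800
successes <- 656            # adults who agreed
failures  <- n - successes  # adults who did not

# Both counts should be at least 10 for the normal approximation
c(successes = successes, failures = failures)
successes >= 10 & failures >= 10
```

With 656 successes and 144 failures, both counts clear the threshold of 10 by a wide margin.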

Step 3: Construct the CI

# CI = [p hat - z crit * SE, p hat + z crit * SE]

# z crit is the critical value, which depends on the CONFIDENCE LEVEL
# For the 90% CI, the critical value is 1.645
# For the 95% CI, the critical value is 1.960
# For the 99% CI, the critical value is 2.576
z_crit95 <- 1.96

# For the one sample proportion case, the Standard Error (SE) is:
# SE = sqrt(p * (1 - p) / n)
# Again, we do not have p, so approximate with p hat instead:
# SE hat = sqrt(p hat * (1 - p hat) / n)
SE_hat <- sqrt(p_hat * (1 - p_hat) / n)

# Margin of Error (MoE): z_crit * SE
MoE95 <- z_crit95 * SE_hat

# Lower Bound: p hat - z_crit * SE = p hat - MoE
lower95 <- p_hat - MoE95

# Upper Bound: p hat + z_crit * SE = p hat + MoE
upper95 <- p_hat + MoE95

# Use c() to combine the bounds together
ci95 <- c(lower95, upper95)
ci95
## [1] 0.7933772 0.8466228
# The CI's width is the upper bound minus the lower bound
upper95 - lower95
## [1] 0.05324566
# The CI's width is also always equal to twice the MoE
2 * MoE95
## [1] 0.05324566
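
The critical values listed above do not need to be memorized. As a minimal sketch, they can be recovered with base R's `qnorm()`. The helper name `z_crit_for` is hypothetical and not part of the lab.

```r
# Two-sided critical value for a given confidence level:
# half of the leftover area (1 - level) goes in each tail
z_crit_for <- function(level) {
  qnorm(1 - (1 - level) / 2)
}

# Rounded to three decimals, these match the listed critical values
round(z_crit_for(c(0.90, 0.95, 0.99)), 3)
```

Rounded to three decimals, the results are 1.645, 1.960, and 2.576 for the 90%, 95%, and 99% levels.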

Answer: Delete this text and type in your answer.

Exercise 2

Would a confidence interval with a confidence level of 90% be wider or narrower than one with a confidence level of 95%? Explain why or why not. Create both CIs using the previous question’s information to support your answer.

# With the approximated SE hat = sqrt(p hat * (1 - p hat) / n):
# CI = [p hat - z_crit * SE hat, p hat + z_crit * SE hat]
# We use the same sample. Thus, p hat, n, & SE hat remain unchanged!

# NOTE: DECREASING THE CONFIDENCE LEVEL DECREASES THE CRITICAL VALUE:
# For the 90% CI, the critical value is 1.645
# For the 95% CI, the critical value is 1.960
z_crit90 <- 1.645

# The steps to calculate the 90% CI are the same as above.
# The only exception is that we are using z crit = 1.645!
# Again, p hat and SE hat are the same since we use the same sample!

# Margin of Error (MoE): z_crit * SE
MoE90 <- z_crit90 * SE_hat

# Lower Bound: p hat - z_crit * SE = p hat - MoE
lower90 <- p_hat - MoE90

# Upper Bound: p hat + z_crit * SE = p hat + MoE
upper90 <- p_hat + MoE90

# Use c() to combine the bounds together
ci90 <- c(lower90, upper90)
ci90
## [1] 0.7976558 0.8423442
# Once again, we calculate the new CI's width in both ways
upper90 - lower90
## [1] 0.04468833
2 * MoE90
## [1] 0.04468833
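
To see the pattern across several confidence levels at once, the steps above can be wrapped in a small function. This is only a sketch: `prop_ci` and `conf_levels` are hypothetical names, and the function uses `qnorm()` for the critical value, so its results may differ from the hand-rounded 1.645 and 1.960 calculations in the later decimal places.

```r
# Hypothetical helper: z-interval for one proportion
# x = number of successes, n = sample size, level = confidence level
prop_ci <- function(x, n, level = 0.95) {
  p_hat  <- x / n
  se_hat <- sqrt(p_hat * (1 - p_hat) / n)
  z_crit <- qnorm(1 - (1 - level) / 2)
  moe    <- z_crit * se_hat
  c(lower = p_hat - moe, upper = p_hat + moe, width = 2 * moe)
}

# Same sample (656 of 800), three confidence levels
conf_levels <- c(0.90, 0.95, 0.99)
widths <- sapply(conf_levels, function(l) prop_ci(656, 800, l)["width"])
names(widths) <- paste0(conf_levels * 100, "%")
widths
```

Because p hat, n, and SE hat are fixed by the sample, the width changes only through the critical value, so it grows as the confidence level rises.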

Answer: Delete this text and type in your answer.

Exercise 3

Download and read births.csv. From the dataset, sample the weights of 100 randomly selected births with replacement. Based on the sample, create and interpret a 95% confidence interval for the proportion of births with a baby weight strictly less than 115 ounces.

Read in the births.csv data:

births_df <- read.csv("births.csv")
head(births_df)
##   Gender Premie weight Apgar1 Fage Mage Feduc Meduc TotPreg Visits   Marital
## 1   Male     No    124      8   31   25    13    14       1     13   Married
## 2 Female     No    177      8   36   26     9    12       2     11 Unmarried
## 3   Male     No    107      3   30   16    12     8       2     10 Unmarried
## 4 Female     No    144      6   33   37    12    14       2     12 Unmarried
## 5   Male     No    117      9   36   33    10    16       2     19   Married
## 6 Female     No     98      4   31   29    14    16       3     20   Married
##   Racemom Racedad Hispmom Hispdad Gained     Habit MomPriorCond BirthDef
## 1   White   White NotHisp NotHisp     40 NonSmoker         None     None
## 2   White   White Mexican Mexican     20 NonSmoker         None     None
## 3   White Unknown Mexican Unknown     70 NonSmoker At Least One     None
## 4   White   White NotHisp NotHisp     50 NonSmoker         None     None
## 5   White   Black NotHisp NotHisp     40 NonSmoker At Least One     None
## 6   White   White NotHisp NotHisp     21 NonSmoker         None     None
##      DelivComp BirthComp
## 1 At Least One      None
## 2 At Least One      None
## 3 At Least One      None
## 4 At Least One      None
## 5         None      None
## 6         None      None

Sampling:

# IMPORTANT: DO NOT CHANGE THIS SEED NUMBER!
set.seed(12)

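# From the weight column, randomly sample 100 values
# NOTE: sample() defaults to replace = FALSE, so this call draws WITHOUT replacement.
# Pass replace = TRUE to sample with replacement as the exercise asks.
# The output shown below reflects the call as written.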
sample_weights <- sample(x = births_df$weight, size = 100)
head(sample_weights)
## [1]  91 147 113 121  77 142
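
The `replace` argument of `sample()` controls whether a value can be drawn more than once. The sketch below illustrates the difference on a made-up vector. `toy_weights` is hypothetical data, not taken from births.csv, and no seed is set because the checks hold for any draw.

```r
# Hypothetical toy data (not from births.csv)
toy_weights <- c(98, 107, 117, 124, 144)

# Without replacement (the default): each element appears at most once
no_repl <- sample(x = toy_weights, size = 5)
anyDuplicated(no_repl) == 0   # TRUE, since the inputs are distinct

# With replacement: the same element may be drawn repeatedly,
# so the sample can even be larger than the data
with_repl <- sample(x = toy_weights, size = 8, replace = TRUE)
anyDuplicated(with_repl) > 0  # TRUE, since 8 draws from 5 values must repeat
```

Asking for 8 values from these 5 without `replace = TRUE` would raise an error, which is one quick way to tell which mode a call is using.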

Construct the CI:

# Follow the steps above to construct the 95% CI with your sample!
# You can skip checking the conditions!

# You are given n = 100. Calculate the sample proportion p hat.
n <- 100
p_hat <- mean(sample_weights < 115)

# For the 95% CI, the critical value is 1.960
z_crit95 <- 1.96

# Approximate the Standard Error (SE) with p hat:
# SE hat = sqrt(p hat * (1 - p hat) / n)
SE_hat <- sqrt(p_hat * (1 - p_hat) / n)

# Margin of Error (MoE): z_crit * SE
MoE95 <- z_crit95 * SE_hat

# Lower Bound: p hat - z_crit * SE = p hat - MoE
lower95 <- p_hat - MoE95

# Upper Bound: p hat + z_crit * SE = p hat + MoE
upper95 <- p_hat + MoE95

# Use c() to combine the bounds together
ci95 <- c(lower95, upper95)
ci95
## [1] 0.4423141 0.6376859
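
The conditions check was skipped above, but it is quick to do after the fact. As a sketch, the sample proportion can be recovered as the midpoint of the printed interval (the bounds are copied from the output above), and the success and failure counts follow from it. The names `ci95_printed`, `p_hat_mid`, `successes`, and `failures` are introduced here for illustration only.

```r
# Bounds copied from the printed 95% CI
ci95_printed <- c(0.4423141, 0.6376859)
n <- 100

# The interval is symmetric around p hat, so its midpoint is p hat
p_hat_mid <- mean(ci95_printed)

# Expected successes and failures, rounded to whole births
successes <- round(n * p_hat_mid)
failures  <- n - successes
c(successes = successes, failures = failures)

# Large Sample condition: both counts at least 10
successes >= 10 & failures >= 10
```

Here the midpoint is 0.54, which corresponds to 54 births under 115 ounces and 46 at or above it, so the Large Sample condition holds for this sample.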

Answer: Delete this text and type in your answer.