# Question 1.
Imagine you are a public health researcher tasked with conducting a study to estimate the prevalence of a specific disease (e.g., diabetes, heart disease, cancer) in a given community. Your goal is to determine how large your sample should be to ensure that your estimates are reliable and accurate. Address the following tasks:
For this assignment, we will use diabetes, with a general prevalence rate of 10%, as an example.
# Set the estimated prevalence rate of the disease to 10%
prev <- 0.1
# Set the significance level to 0.05 for a 95% confidence level
alpha <- 0.05
# Calculate the z-score for a 95% confidence level (1 point)
z_score <- qnorm(1 - alpha / 2)
# Display the z-score
print(z_score)
## [1] 1.959964
# Set the desired margin of error to 5%
d <- 0.05
# Calculate the required sample size using the formula for proportions (1 point)
sample_size <- z_score^2 * prev * (1 - prev) / (d^2)
sample_size
## [1] 138.2925
# Round up the calculated sample size to the nearest whole number
ceiling(sample_size)
## [1] 139
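The calculation above follows the formula n = z^2 * p * (1 - p) / d^2. As a minimal sketch, it could be wrapped in a small helper so that later scenarios reuse one formula. The function name `n_for_proportion` and its argument names are hypothetical and not part of the assignment.

```r
# Hypothetical helper (name and arguments are illustrative, not from the assignment).
# Returns the unrounded sample size needed to estimate a proportion.
n_for_proportion <- function(p, d, conf_level = 0.95) {
  alpha <- 1 - conf_level
  z <- qnorm(1 - alpha / 2)  # two-sided critical value
  z^2 * p * (1 - p) / d^2
}

n_base <- n_for_proportion(p = 0.1, d = 0.05)
n_base           # about 138.29, matching the result above
ceiling(n_base)  # 139
```

Returning the unrounded value and applying `ceiling()` afterwards keeps the helper consistent with the step-by-step code.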
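1.1 Explain clearly what a 5% margin of error means. (1 point) It means that, at the chosen 95% confidence level, the sample estimate of the prevalence is expected to lie within 5 percentage points of the true population prevalence. With an estimated prevalence of 10%, the interval would run from 5% to 15%.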
1.2 If you estimated that 20% of adults in a community have hypertension with a 5% margin of error, does it mean that the actual percentage of the population with hypertension could range from 5% to 20%? (1 point) No. The margin of error is added to and subtracted from the estimate, so the plausible range is 15% to 25% (20% ± 5 percentage points), not 5% to 20%.
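The arithmetic behind question 1.2 can be checked with a short sketch. The variable names `p_hat`, `moe`, and `interval` are hypothetical and chosen so they do not overwrite `prev` or `d` used elsewhere.

```r
# Illustrative values taken from question 1.2 (variable names are hypothetical)
p_hat <- 0.20  # estimated prevalence of hypertension
moe   <- 0.05  # margin of error, in proportion units

# The margin of error is applied symmetrically around the estimate
interval <- c(lower = p_hat - moe, upper = p_hat + moe)
interval  # lower 0.15, upper 0.25
```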
# Sample size calculation for 0.5X prevalence (1 point)
prev.5 <- prev * 0.5
sample_size.5 <- z_score^2 * prev.5 * (1 - prev.5) / (d^2)
sample_size.5
## [1] 72.98772
# Sample size calculation for 2X prevalence (1 point)
prev2 <- prev * 2
sample_size2 <- z_score^2 * prev2 * (1 - prev2) / (d^2)
sample_size2
## [1] 245.8534
# Sample size calculation for 3X prevalence (1 point)
prev3 <- prev * 3
sample_size3 <- z_score^2 * prev3 * (1 - prev3) / (d^2)
sample_size3
## [1] 322.6825
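2.1 What happens to the required sample size when the prevalence increases to 2x and 3x of the original 10% prevalence rate? (1 point) The required sample size increases, from about 139 at 10% prevalence to 246 at 20% and 323 at 30%. The term p(1 - p) in the formula grows as the prevalence moves toward 50%, so more people are needed to reach the same margin of error.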
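To see the pattern across prevalence values in one place, the formula can be applied to a vector. This is a sketch under the same settings as above (95% confidence, 5% margin of error). The names `z95`, `moe`, `prev_grid`, and `n_grid` are hypothetical, and the 50% row is added only to show where p(1 - p) peaks.

```r
z95 <- qnorm(0.975)  # z-score for a 95% confidence level
moe <- 0.05          # margin of error (hypothetical name, avoids overwriting d)

# Prevalence values: 0.5x, 1x, 2x, 3x of 10%, plus 50% as the worst case
prev_grid <- c(0.05, 0.10, 0.20, 0.30, 0.50)
n_grid <- z95^2 * prev_grid * (1 - prev_grid) / moe^2

data.frame(
  prevalence    = prev_grid,
  variance_term = prev_grid * (1 - prev_grid),  # p(1 - p), largest at p = 0.5
  n_required    = ceiling(n_grid)
)
```

Because p(1 - p) is largest at 50%, using p = 0.5 gives the most conservative sample size when the prevalence is unknown.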
# Sample size calculation for a 2.5% margin of error (1 point)
d <- 0.025
sample_size2.5d <- z_score^2 * prev * (1 - prev) / (d^2)
sample_size2.5d
## [1] 553.1701
# Sample size calculation for a 7.5% margin of error (1 point)
d <- 0.075
sample_size7.5d <- z_score^2 * prev * (1 - prev) / (d^2)
sample_size7.5d
## [1] 61.46334
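3.1 What happens to the required sample size when we decrease the margin of error? Why? (2 points) The required sample size increases. A smaller margin of error demands a more precise estimate, and because the margin of error enters the formula as 1/d^2, halving it from 5% to 2.5% quadruples the sample size (from about 139 to 554), while relaxing it to 7.5% lowers it to about 62.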
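The inverse-square relationship between margin of error and sample size can be illustrated with a short sketch. It assumes the same 10% prevalence and 95% confidence level. The names `z95`, `p0`, `margins`, and `n_by_margin` are hypothetical.

```r
z95 <- qnorm(0.975)  # z-score for a 95% confidence level
p0  <- 0.10          # assumed prevalence (hypothetical name, avoids overwriting prev)

margins <- c(0.025, 0.05, 0.075)
n_by_margin <- z95^2 * p0 * (1 - p0) / margins^2
n_by_margin  # about 553.17, 138.29, 61.46

# Halving the margin of error (5% to 2.5%) multiplies the sample size by 4
n_by_margin[1] / n_by_margin[2]
```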
# Sample size calculation for a 90% confidence level (1 point)
alpha <- 0.1
# Reset the margin of error to 5% (it was changed to 7.5% above)
d <- 0.05
z_score <- qnorm(1 - alpha / 2)
sample_size90CI <- z_score^2 * prev * (1 - prev) / (d^2)
sample_size90CI
## [1] 97.39956
# Sample size calculation for a 99% confidence level (1 point)
alpha <- 0.01
d <- 0.05
z_score <- qnorm(1 - alpha / 2)
sample_size99CI <- z_score^2 * prev * (1 - prev) / (d^2)
sample_size99CI
## [1] 238.8563
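Report how changing the confidence level affects the required sample size. (1 point) A higher confidence level requires a larger sample: about 98 at 90%, 139 at 95%, and 239 at 99%, because the z-score grows as the confidence level rises.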
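The effect of the confidence level can be compared side by side with a vectorized sketch. It assumes the same 10% prevalence and 5% margin of error used above. The names `conf_levels`, `z_by_level`, and `n_by_level` are hypothetical.

```r
conf_levels <- c(0.90, 0.95, 0.99)

# Two-sided z-score for each confidence level
z_by_level <- qnorm(1 - (1 - conf_levels) / 2)

# Sample size at 10% prevalence and a 5% margin of error
n_by_level <- z_by_level^2 * 0.10 * (1 - 0.10) / 0.05^2

data.frame(
  conf_level = conf_levels,
  z_score    = round(z_by_level, 3),
  n_required = ceiling(n_by_level)
)
```

The sample size scales with z^2, so moving from 95% to 99% confidence adds far more participants than moving from 90% to 95%.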