library(tidyverse)
library(openintro)
library(infer)
us_adults <- tibble(
  climate_change_affects = c(rep("Yes", 62000), rep("No", 38000))
)
ggplot(us_adults, aes(x = climate_change_affects)) +
  geom_bar() +
  labs(
    x = "", y = "",
    title = "Do you think climate change is affecting your local community?"
  ) +
  coord_flip()

us_adults %>%
  dplyr::count(climate_change_affects) %>%
  dplyr::mutate(p = n / sum(n))
## # A tibble: 2 x 3
## climate_change_affects n p
## <chr> <int> <dbl>
## 1 No 38000 0.38
## 2 Yes 62000 0.62
n <- 60
samp <- us_adults %>%
  sample_n(size = n)
Exercise 1
What percent of the adults in your sample think climate change affects their local community? Hint: Just like we did with the population, we can calculate the proportion of those in this sample who think climate change affects their local community.
In the sample shown below, 41 of the 60 adults (68.3%) think climate change affects their local community. The sample is drawn at random without a fixed seed, so this figure changes each time the code is rerun.
samp %>%
  dplyr::count(climate_change_affects) %>%
  dplyr::mutate(p = n / sum(n))
## # A tibble: 2 x 3
## climate_change_affects n p
## <chr> <int> <dbl>
## 1 No 19 0.317
## 2 Yes 41 0.683
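Because the sample is random, the proportion above will differ on every run unless a seed is fixed. The following is a minimal base R sketch of how the draw could be made reproducible. The seed value 1234 and the names `population`, `samp_vec`, and `p_hat` are assumptions for illustration and are not part of the lab code.

```r
# Hypothetical seed; any integer works. Not part of the original lab code.
set.seed(1234)

# Rebuild the population from the lab: 62,000 "Yes" and 38,000 "No".
population <- c(rep("Yes", 62000), rep("No", 38000))

# Draw 60 adults without replacement.
samp_vec <- sample(population, size = 60)

# Sample proportion of "Yes" answers.
p_hat <- mean(samp_vec == "Yes")
p_hat
```

Rerunning the whole block, including `set.seed()`, should give the same `p_hat` each time, which keeps the written answer and the printed output in agreement.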
Exercise 2
Would you expect another student’s sample proportion to be identical to yours? Would you expect it to be similar? Why or why not?
I would expect another student’s sample proportion to be similar to mine but not identical. Each student draws a different random sample of 60 adults, so the proportion depends on which 60 adults happen to be selected. Because every sample comes from the same population, the proportions should all fall fairly close to the population value of 0.62.
samp %>%
  specify(response = climate_change_affects, success = "Yes") %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "prop") %>%
  get_ci(level = 0.95)
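The sample-to-sample variation discussed in this exercise can be illustrated with a short simulation. This is a sketch in base R, assuming 200 hypothetical students who each draw their own sample of 60; the seed and the names `population` and `props` are illustrative only.

```r
set.seed(42)  # hypothetical seed, for reproducibility only

# Population from the lab: 62% "Yes", 38% "No".
population <- c(rep("Yes", 62000), rep("No", 38000))

# Each of 200 hypothetical students draws 60 adults and computes
# the proportion answering "Yes".
props <- replicate(200, mean(sample(population, size = 60) == "Yes"))

summary(props)          # centered near 0.62
length(unique(props))   # more than one distinct value: samples differ
sd(props)               # typical spread between students
```

The proportions cluster around the population value but are not all equal, which is the "similar but not identical" behavior described above.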
## # A tibble: 1 x 2
## lower_ci upper_ci
## <dbl> <dbl>
## 1 0.567 0.8
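To make the pipeline above less of a black box, here is a hand-rolled percentile bootstrap in base R. It is only a sketch of the general technique, not infer's actual implementation. It assumes a stand-in sample with the counts printed earlier (41 "Yes", 19 "No"), and the seed and variable names are hypothetical.

```r
set.seed(2024)  # hypothetical seed

# Stand-in for `samp`, using the counts shown in the output above.
samp_vec <- c(rep("Yes", 41), rep("No", 19))
n <- length(samp_vec)
reps <- 1000

# Resample the 60 observations with replacement and record the
# proportion of "Yes" in each bootstrap sample.
boot_props <- replicate(
  reps,
  mean(sample(samp_vec, size = n, replace = TRUE) == "Yes")
)

# Percentile interval: the middle 95% of the bootstrap proportions.
ci_95 <- unname(quantile(boot_props, probs = c(0.025, 0.975)))
ci_95
```

The bounds will not match the infer output exactly, since the resamples are random, but they should land in the same neighborhood.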
Exercise 3
In the interpretation above, we used the phrase “95% confident”. What does “95% confidence” mean?
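“95% confidence” describes the method rather than any single interval: if we took many random samples of 60 adults and built a 95% confidence interval from each one in the same way, about 95% of those intervals would contain the true population proportion. It does not mean there is a 95% probability that the true proportion falls inside the one interval computed here; the true proportion is a fixed value, and a given interval either contains it or does not.
In this case, you have the rare luxury of knowing the true population proportion (62%) since you have data on the entire population.
Exercise 4
Does your confidence interval capture the true population proportion of US adults who think climate change affects their local community? If you are working on this lab in a classroom, does your neighbor’s interval capture this value?
Yes. The 95% interval shown above, (0.567, 0.8), contains the true population proportion of 0.62. A neighbor’s interval would have slightly different bounds because it comes from a different random sample, but it would most likely capture 0.62 as well.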
Exercise 5
Each student should have gotten a slightly different confidence interval. What proportion of those intervals would you expect to capture the true population mean? Why?
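Each student gets a slightly different confidence interval because each draws a different random sample of US adults. I would expect about 95% of those intervals to capture the true population proportion, because a 95% confidence level means the method produces an interval containing the true value in about 95% of repeated samples.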
Exercise 6
Given a sample size of 60, 1000 bootstrap samples for each interval, and 50 confidence intervals constructed (the default values for the above app), what proportion of your confidence intervals include the true population proportion? Is this proportion exactly equal to the confidence level? If not, explain why. Make sure to include your plot in your answer.
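A large majority of the 50 confidence intervals include the true population proportion; the observed share was close to 0.9. This is not exactly equal to the 95% confidence level, and it does not have to be: the confidence level describes coverage over many repeated samples, and with only 50 intervals the observed share varies from run to run. Over a much larger number of intervals, I would expect the share to settle near 95%.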
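The app's experiment can be approximated in a few lines of base R. This is a sketch under stated assumptions: it uses a percentile bootstrap interval, which may differ from what the app does internally, and the function name `one_interval`, the seed, and the other variable names are hypothetical.

```r
set.seed(7)  # hypothetical seed

population <- c(rep("Yes", 62000), rep("No", 38000))
true_p <- 0.62

# Draw one sample of size n and return a percentile bootstrap interval.
one_interval <- function(n = 60, reps = 1000, level = 0.95) {
  samp_vec <- sample(population, size = n)
  boot_props <- replicate(
    reps,
    mean(sample(samp_vec, size = n, replace = TRUE) == "Yes")
  )
  alpha <- 1 - level
  unname(quantile(boot_props, probs = c(alpha / 2, 1 - alpha / 2)))
}

# 50 intervals, one per row: column 1 is the lower bound, column 2 the upper.
intervals <- t(replicate(50, one_interval()))

# Share of intervals that contain the true proportion.
covered <- intervals[, 1] <= true_p & true_p <= intervals[, 2]
mean(covered)
```

Changing the seed changes the share of covering intervals, which shows why 50 intervals rarely give exactly 95%.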
Exercise 7
Choose a different confidence level than 95%. Would you expect a confidence interval at this level to be wider or narrower than the confidence interval you calculated at the 95% confidence level? Explain your reasoning.
I chose a confidence level of 85% and left the defaults in place (sample size: 60, number of resamples: 1000, number of confidence intervals: 50). Since the confidence level is lower, I would expect the interval to be narrower, because it only needs to cover the middle 85% of the bootstrap distribution instead of the middle 95%. The trade-off is that about 15% of such intervals will miss the true population proportion: a narrower interval is more precise but less likely to capture the true value.
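The link between confidence level and width can be checked on a single bootstrap distribution. The sketch below is base R and assumes a stand-in sample with 41 "Yes" and 19 "No" answers, matching the counts printed earlier; `interval_width` and the other names are hypothetical helpers.

```r
set.seed(99)  # hypothetical seed

# Stand-in sample: 41 "Yes", 19 "No".
samp_vec <- c(rep("Yes", 41), rep("No", 19))
boot_props <- replicate(
  1000,
  mean(sample(samp_vec, size = 60, replace = TRUE) == "Yes")
)

# Width of the percentile interval at a given confidence level.
interval_width <- function(level) {
  alpha <- 1 - level
  bounds <- quantile(boot_props, probs = c(alpha / 2, 1 - alpha / 2))
  unname(bounds[2] - bounds[1])
}

conf_levels <- c(0.85, 0.95, 0.99)
widths <- sapply(conf_levels, interval_width)
data.frame(level = conf_levels, width = widths)
```

Because all three intervals are cut from the same bootstrap distribution, the only thing that changes is how much of it each interval has to cover, so the width grows with the level.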
Exercise 8
Using code from the infer package and data from the one sample you have (samp), find a confidence interval for the proportion of US Adults who think climate change is affecting their local community with a confidence level of your choosing (other than 95%) and interpret it.
I chose 85%. Based on the output below, we are 85% confident that the true proportion of US adults who think climate change affects their local community is between 0.6 and 0.767.
samp %>%
  specify(response = climate_change_affects, success = "Yes") %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "prop") %>%
  get_ci(level = 0.85)
## # A tibble: 1 x 2
## lower_ci upper_ci
## <dbl> <dbl>
## 1 0.6 0.767
Exercise 9
Using the app, calculate 50 confidence intervals at the confidence level you chose in the previous question, and plot all intervals on one plot, and calculate the proportion of intervals that include the true population proportion. How does this percentage compare to the confidence level selected for the intervals?
Using the app with an 85% confidence level, the percentage of intervals that include the true population proportion is lower than it was at 95%, and it should be close to 85%. The intervals become narrower as the confidence level decreases, so more of them miss the true value.
Exercise 10
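Lastly, try one more (different) confidence level. First, state how you expect the width of this interval to compare to previous ones you calculated. Then, calculate the bounds of the interval using the infer package and data from samp and interpret it. Finally, use the app to generate many intervals and calculate the proportion of intervals that capture the true population proportion.
I expect this interval to be wider than the 95% and 85% intervals, because a higher confidence level has to cover more of the bootstrap distribution. Running the app with a 98% confidence level, the percentage of intervals that include the true population proportion is greater than it was at the 85% level. The interval computed below from samp uses a 99% level: we are 99% confident that the true proportion of US adults who think climate change affects their local community is between 0.517 and 0.833. This is wider than both the 95% interval (0.567, 0.8) and the 85% interval (0.6, 0.767).
samp %>%
  specify(response = climate_change_affects, success = "Yes") %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "prop") %>%
  get_ci(level = 0.99)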
## # A tibble: 1 x 2
## lower_ci upper_ci
## <dbl> <dbl>
## 1 0.517 0.833
Exercise 11
Using the app, experiment with different sample sizes and comment on how the widths of intervals change as sample size changes (increases and decreases).
Using sample sizes of 50, 60, and 70 with the same confidence level, the outputs were very similar, because these sample sizes are close to one another.
In general, when the sample size increases the width of the confidence intervals decreases, and when the sample size decreases the width of the confidence intervals increases.
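A rough way to see why sample sizes of 50, 60, and 70 look so alike is the standard error formula for a proportion, sqrt(p(1 - p) / n). The sketch below uses the normal approximation for the interval width, which is a swapped-in shortcut and not what the app's bootstrap does. The sample size 240 is a hypothetical extra value added to show a larger change.

```r
# True population proportion from the lab.
p <- 0.62

# 50, 60, 70 are the sizes tried in the app; 240 is a hypothetical
# extra value (four times 60) added for contrast.
sample_sizes <- c(50, 60, 70, 240)

# Standard error of a sample proportion.
se <- sqrt(p * (1 - p) / sample_sizes)

# Approximate width of a 95% interval under the normal approximation.
approx_width <- 2 * qnorm(0.975) * se

data.frame(
  n = sample_sizes,
  se = round(se, 4),
  approx_width = round(approx_width, 3)
)
```

Because the standard error shrinks with the square root of n, moving from 50 to 70 changes the width only modestly, while quadrupling the sample size from 60 to 240 cuts it in half.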
Exercise 12
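Finally, given a sample size (say, 60), how does the width of the interval change as you increase the number of bootstrap samples? Hint: Does changing the number of bootstrap samples affect the standard error?
As an example, I increased the number of bootstrap samples to 2000. The width of the interval stays roughly the same, because the standard error depends on the sample size (60), not on the number of bootstrap samples. A larger number of bootstrap samples only reduces the simulation noise in the interval's endpoints, so the bounds are more stable from run to run.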
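The hint can be checked directly with a small simulation. The sketch below is base R and assumes a stand-in sample of 41 "Yes" and 19 "No" answers, matching the counts printed earlier; `boot_se`, `reps_grid`, and the seed are hypothetical names and values. It compares the bootstrap standard error at several numbers of resamples with the formula value sqrt(p_hat(1 - p_hat) / n).

```r
set.seed(2718)  # hypothetical seed

# Stand-in sample: 41 "Yes", 19 "No".
samp_vec <- c(rep("Yes", 41), rep("No", 19))
n <- length(samp_vec)

# Standard deviation of the bootstrap proportions for a given
# number of bootstrap samples.
boot_se <- function(reps) {
  boot_props <- replicate(
    reps,
    mean(sample(samp_vec, size = n, replace = TRUE) == "Yes")
  )
  sd(boot_props)
}

reps_grid <- c(500, 1000, 2000, 5000)
se_by_reps <- sapply(reps_grid, boot_se)

# Formula-based standard error, which depends only on p_hat and n.
p_hat <- mean(samp_vec == "Yes")
formula_se <- sqrt(p_hat * (1 - p_hat) / n)

data.frame(
  reps = reps_grid,
  bootstrap_se = round(se_by_reps, 4),
  formula_se = round(formula_se, 4)
)
```

If the number of bootstrap samples drove the standard error, `bootstrap_se` would shrink steadily down the table; instead it should hover near `formula_se` for every row.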
