HW 6

Part 1 - Chapter 8: Posterior Inference & Prediction

.9 (Hypothesis tests: Part I) For parameter π, suppose you have a Beta(1,0.8) prior model and a Beta(4,3) posterior. You wish to test the null hypothesis that π ≤ 0.4 versus the alternative that π > 0.4.

a) What is the posterior probability for the alternative hypothesis?
# Posterior probability that π > 0.4
post_prob_1 <- 1 - pbeta(0.4, 4, 3)
cat("Posterior probability for the alternative hypothesis is:",post_prob_1)
## Posterior probability for the alternative hypothesis is: 0.8208
b) Calculate and interpret the posterior odds.
# Posterior odds
post_odds_1 <- post_prob_1 / (1 - post_prob_1)
cat("Posterior odds are:", post_odds_1,"that the posterior odds of 4.58 to 1 is in favor of the alternative, meaning ~4.6x more likely that π > 0.4 than π ≤ 0.4.")
## Posterior odds are: 4.580357 that the posterior odds of 4.58 to 1 is in favor of the alternative, meaning ~4.6x more likely that π > 0.4 than π ≤ 0.4.
c) Calculate and interpret the prior odds.
# Posterior odds
post_odds_1 <- post_prob_1 / (1 - post_prob_1)
cat("Posterior odds are:", post_odds_1,"that the posterior odds of 4.58 to 1 is in favor of the alternative, meaning ~4.6x more likely that π > 0.4 than π ≤ 0.4.")
## Posterior odds are: 4.580357 that the posterior odds of 4.58 to 1 is in favor of the alternative, meaning ~4.6x more likely that π > 0.4 than π ≤ 0.4.
# Prior probability that pi π ≤ 0.4
prior_prob_1 <- pbeta(0.4, 1, 0.8)
cat("Prior prob is:", prior_prob_1,"\n")
## Prior prob is: 0.3354602
# Prior odds
prior_odds_1 <- prior_prob_1 / (1 - prior_prob_1)
cat("Prior odds is:", prior_odds_1,"meaning that before observing the data, the null hypothesis π ≤ 0.4 was about half as likely as the alternative π > 0.4.")
## Prior odds is: 0.5048008 meaning that before observing the data, the null hypothesis π ≤ 0.4 was about half as likely as the alternative π > 0.4.
d) Calculate and interpret the Bayes Factor.
# Bayes factor
BF_1 <- post_odds_1 / prior_odds_1
cat("The Bayes Factor is:",BF_1,"meaning that the data increased the odds in favor of the alternative hypothesis (Ï€ > 0.4) by a factor of approx. 9.")
## The Bayes Factor is: 9.073594 meaning that the data increased the odds in favor of the alternative hypothesis (Ï€ > 0.4) by a factor of approx. 9.
e) Putting this together, explain your conclusion about these hypotheses to someone who is unfamiliar with Bayesian statistics.
cat("Before the observation of the data, it was thought to  be about 2 to 1 that π > 0.4.\n\nAfter the data observation, the belief rose to ~4.6 to 1, alternatively an 82% that π > 0.4.\n\nThe Bayes Factor being approx. 9 meant that the data made the case for π > 0.4 9x stronger than before.\n\nWe can reasonably proceed assuming π > 0.4, but more data is better if the decision needs more precision.")
## Before the observation of the data, it was thought to  be about 2 to 1 that π > 0.4.
## 
## After the data observation, the belief rose to ~4.6 to 1, alternatively an 82% that π > 0.4.
## 
## The Bayes Factor being approx. 9 meant that the data made the case for π > 0.4 9x stronger than before.
## 
## We can reasonably proceed assuming π > 0.4, but more data is better if the decision needs more precision.

.14 (Climate change: estimation) Let π denote the proportion of U.S. adults that do not believe in climate change. To learn about π, we’ll use survey data on n adults and count up the number of these that don’t believe in climate change, Y.

a) Explain which Bayesian model is appropriate for this analysis: Beta-Binomial, Gamma-Poisson, or Normal-Normal.
cat("Given π as a proportion between 0 and 1, and Y as the count of 'non-believer,' so Y|π~Binomial(n,π). \n\nA beta prior for π is conjugate to the binomial likelihood, to which the Beta-Binomial model would be appropriate for this analysis.")
## Given π as a proportion between 0 and 1, and Y as the count of 'non-believer,' so Y|π~Binomial(n,π). 
## 
## A beta prior for π is conjugate to the binomial likelihood, to which the Beta-Binomial model would be appropriate for this analysis.
b) Specify and discuss your own prior model for π.
cat("Going with a neutral prior, so I think π~Beta(1,1) would be best, not favoring low or high values, a neutral stance prior to observing the data in order to not create a strong prior belief about the proportion of U.S. adults who do not believe in climate change.\n\n")
## Going with a neutral prior, so I think π~Beta(1,1) would be best, not favoring low or high values, a neutral stance prior to observing the data in order to not create a strong prior belief about the proportion of U.S. adults who do not believe in climate change.
alpha_1 <- 1; beta_1 <- 1

prior_mean_1 <- alpha_1/(alpha_1+beta_1)
prior_ess_1 <- alpha_1+beta_1
prior_q_1 <- qbeta(c(0.025, 0.5, 0.975), alpha_1, beta_1)

cat("Using a neutral prior Beta(1,1):\n\nPrior mean is", prior_mean_1,".","\nEffective sample size is", prior_ess_1,".","\n95% prior interval is", prior_q_1[1], "to",prior_q_1[3],".")
## Using a neutral prior Beta(1,1):
## 
## Prior mean is 0.5 . 
## Effective sample size is 2 . 
## 95% prior interval is 0.025 to 0.975 .
c) For the remainder of the exercise, we’ll utilize the authors’ Beta(1,2) prior for π. How does your prior understanding differ from that of the authors?
alpha_2 <- 1; beta_2 <- 2

prior_mean_2 <- alpha_2/(alpha_2+beta_2)
prior_ess_2 <- alpha_2+beta_2
prior_q_2 <- qbeta(c(0.025, 0.5, 0.975), alpha_2, beta_2)

cat("Using a neutral prior Beta(1,1):\n\nPrior mean is", prior_mean_2,".","\nEffective sample size is", prior_ess_2,".","\n95% prior interval is", prior_q_2[1], "to",prior_q_2[3],".")
## Using a neutral prior Beta(1,1):
## 
## Prior mean is 0.3333333 . 
## Effective sample size is 3 . 
## 95% prior interval is 0.01257912 to 0.8418861 .
cat("\n\nThe author's Beta(1,2) prior assumes that low non-belief rates are more plausible, whereas my Beta(1,1) prior attempts assume complete uncertainty.")
## 
## 
## The author's Beta(1,2) prior assumes that low non-belief rates are more plausible, whereas my Beta(1,1) prior attempts assume complete uncertainty.
d) Using the pulse_of_the_nation data from the bayesrules package, report the sample proportion of surveyed adults with the opinion that climate_change is Not Real At All.
data("pulse_of_the_nation")

summary(pulse_of_the_nation$climate_change)
##               Not Real At All     Real and Caused by People 
##                           150                           655 
## Real but not Caused by People 
##                           195
not_real_at_all <- pulse_of_the_nation$climate_change == "Not Real At All"

sample_prop <- mean(not_real_at_all)
cat("\nThe sample proportion of surveyed adults with the opinion that climate change is not real at all is :", sample_prop)
## 
## The sample proportion of surveyed adults with the opinion that climate change is not real at all is : 0.15
e) In light of the Beta(1,2) prior and data, calculate and interpret a (middle) 95% posterior credible interval for π. NOTE: You’ll first need to specify your posterior model of π.
Y <- 150; n <- 1000

post_a <- alpha_2 + Y
post_b <- beta_2 + (n-Y)

post_ci <- qbeta(c(0.025, 0.975), post_a, post_b)
post_mean <- post_a / (post_a+post_b)

cat("Posterior model is Beta(", post_a, ",", post_b,")\n\nPosterior mean is", round(post_mean, 4), "\n\n95% posterior credible interval for π:", post_ci[1], "to", post_ci[2])
## Posterior model is Beta( 151 , 852 )
## 
## Posterior mean is 0.1505 
## 
## 95% posterior credible interval for π: 0.1291 to 0.1733164

.15 (Climate change: hypothesis testing) Continuing the analysis from exercise 8.14, suppose you with to test a researcher’s claim that more that 10% of people believe in climate change: H0: π ≤ 0.1 versus Ha: π > 0.1.

a) What decision might you make about these hypotheses utilizing the credible interval from the previous exercise?
cat("Observing that the 95% posterior credible interval for π: 0.1291 to 0.1733164, then the alternative hypothesis π > 0.1 is more likely than H0: π ≤ 0.1 as the 95% posterior credible interval seems to contain no values below 0.1.")
## Observing that the 95% posterior credible interval for π: 0.1291 to 0.1733164, then the alternative hypothesis π > 0.1 is more likely than H0: π ≤ 0.1 as the 95% posterior credible interval seems to contain no values below 0.1.
b) Calculate and interpret the posterior probability of Ha.
post_prob_Ha <- 1 - pbeta(0.1, post_a, post_b)
cat("Posterior probability for π > 0.1 is", post_prob_Ha, "meaning there's essentially 99.99997% close to 100% posterior probability that more than 10% of adults do not believe in climate change.")
## Posterior probability for π > 0.1 is 0.9999997 meaning there's essentially 99.99997% close to 100% posterior probability that more than 10% of adults do not believe in climate change.
c) Calculate and interpret the Bayes Factor for your hypothesis test.
prior_prob_Ha <- pbeta(0.1, 1,2, lower.tail = FALSE)
prior_prob_Ha
## [1] 0.81
prior_odds_2 <- prior_prob_Ha / (1-prior_prob_Ha)
prior_odds_2
## [1] 4.263158
post_odds_2 <- post_prob_Ha / (1-post_prob_Ha)
post_odds_2
## [1] 3197444
BF_2 <- post_odds_2 / prior_odds_2
cat("Bayes Factor (Ha vs H0) is", BF_2, "or 750k to 1 in favor of the Ha: π > 0.1, becoming decisive evidence.")
## Bayes Factor (Ha vs H0) is 750017.6 or 750k to 1 in favor of the Ha: π > 0.1, becoming decisive evidence.
d) Putting this together, explain your conclusion about π.
cat("The data strongly supports the conclusion that π > 0.1, from the confidence interval given in solution 14 as the 95% posterior credible interval seems to contain no values below 0.1, to the posterior probability being close to 100%, and high favor all support the conlusion reached at the beginning of this sentence.")
## The data strongly supports the conclusion that π > 0.1, from the confidence interval given in solution 14 as the 95% posterior credible interval seems to contain no values below 0.1, to the posterior probability being close to 100%, and high favor all support the conlusion reached at the beginning of this sentence.

.16 (Climate change with MCMC: simulation) In the next exercises, you’ll repeat and build upon your climate change analysis using MCMC simulation.

a) Simulate the posterior model of , the proportion of U.S. adults that do not believe in climate change, with rstan using 4 chains and 10000 iterations per chain.
climate_model <-"
data {
  real<lower=0> post_a;
  real<lower=0> post_b;
}
parameters {
  real<lower=0, upper=1> pi;
}
model {
  pi ~ beta(post_a, post_b);
}
"

climate_sim <- stan(model_code = climate_model, data = list(post_a = post_a, post_b = post_b), chains = 4, iter = 10000*2, seed = 36501)
## Trying to compile a simple C file
## Running /opt/R/4.5.1/lib/R/bin/R CMD SHLIB foo.c
## using C compiler: ‘gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0’
## gcc -I"/opt/R/4.5.1/lib/R/include" -DNDEBUG   -I"/cloud/lib/x86_64-pc-linux-gnu-library/4.5/Rcpp/include/"  -I"/cloud/lib/x86_64-pc-linux-gnu-library/4.5/RcppEigen/include/"  -I"/cloud/lib/x86_64-pc-linux-gnu-library/4.5/RcppEigen/include/unsupported"  -I"/cloud/lib/x86_64-pc-linux-gnu-library/4.5/BH/include" -I"/cloud/lib/x86_64-pc-linux-gnu-library/4.5/StanHeaders/include/src/"  -I"/cloud/lib/x86_64-pc-linux-gnu-library/4.5/StanHeaders/include/"  -I"/cloud/lib/x86_64-pc-linux-gnu-library/4.5/RcppParallel/include/"  -I"/cloud/lib/x86_64-pc-linux-gnu-library/4.5/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -D_HAS_AUTO_PTR_ETC=0  -include '/cloud/lib/x86_64-pc-linux-gnu-library/4.5/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include    -fpic  -g -O2  -c foo.c -o foo.o
## In file included from /cloud/lib/x86_64-pc-linux-gnu-library/4.5/RcppEigen/include/Eigen/Core:19,
##                  from /cloud/lib/x86_64-pc-linux-gnu-library/4.5/RcppEigen/include/Eigen/Dense:1,
##                  from /cloud/lib/x86_64-pc-linux-gnu-library/4.5/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22,
##                  from <command-line>:
## /cloud/lib/x86_64-pc-linux-gnu-library/4.5/RcppEigen/include/Eigen/src/Core/util/Macros.h:679:10: fatal error: cmath: No such file or directory
##   679 | #include <cmath>
##       |          ^~~~~~~
## compilation terminated.
## make: *** [/opt/R/4.5.1/lib/R/etc/Makeconf:202: foo.o] Error 1
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 5e-06 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:     1 / 20000 [  0%]  (Warmup)
## Chain 1: Iteration:  2000 / 20000 [ 10%]  (Warmup)
## Chain 1: Iteration:  4000 / 20000 [ 20%]  (Warmup)
## Chain 1: Iteration:  6000 / 20000 [ 30%]  (Warmup)
## Chain 1: Iteration:  8000 / 20000 [ 40%]  (Warmup)
## Chain 1: Iteration: 10000 / 20000 [ 50%]  (Warmup)
## Chain 1: Iteration: 10001 / 20000 [ 50%]  (Sampling)
## Chain 1: Iteration: 12000 / 20000 [ 60%]  (Sampling)
## Chain 1: Iteration: 14000 / 20000 [ 70%]  (Sampling)
## Chain 1: Iteration: 16000 / 20000 [ 80%]  (Sampling)
## Chain 1: Iteration: 18000 / 20000 [ 90%]  (Sampling)
## Chain 1: Iteration: 20000 / 20000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.041 seconds (Warm-up)
## Chain 1:                0.05 seconds (Sampling)
## Chain 1:                0.091 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 2e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.02 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:     1 / 20000 [  0%]  (Warmup)
## Chain 2: Iteration:  2000 / 20000 [ 10%]  (Warmup)
## Chain 2: Iteration:  4000 / 20000 [ 20%]  (Warmup)
## Chain 2: Iteration:  6000 / 20000 [ 30%]  (Warmup)
## Chain 2: Iteration:  8000 / 20000 [ 40%]  (Warmup)
## Chain 2: Iteration: 10000 / 20000 [ 50%]  (Warmup)
## Chain 2: Iteration: 10001 / 20000 [ 50%]  (Sampling)
## Chain 2: Iteration: 12000 / 20000 [ 60%]  (Sampling)
## Chain 2: Iteration: 14000 / 20000 [ 70%]  (Sampling)
## Chain 2: Iteration: 16000 / 20000 [ 80%]  (Sampling)
## Chain 2: Iteration: 18000 / 20000 [ 90%]  (Sampling)
## Chain 2: Iteration: 20000 / 20000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.041 seconds (Warm-up)
## Chain 2:                0.047 seconds (Sampling)
## Chain 2:                0.088 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 2e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.02 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:     1 / 20000 [  0%]  (Warmup)
## Chain 3: Iteration:  2000 / 20000 [ 10%]  (Warmup)
## Chain 3: Iteration:  4000 / 20000 [ 20%]  (Warmup)
## Chain 3: Iteration:  6000 / 20000 [ 30%]  (Warmup)
## Chain 3: Iteration:  8000 / 20000 [ 40%]  (Warmup)
## Chain 3: Iteration: 10000 / 20000 [ 50%]  (Warmup)
## Chain 3: Iteration: 10001 / 20000 [ 50%]  (Sampling)
## Chain 3: Iteration: 12000 / 20000 [ 60%]  (Sampling)
## Chain 3: Iteration: 14000 / 20000 [ 70%]  (Sampling)
## Chain 3: Iteration: 16000 / 20000 [ 80%]  (Sampling)
## Chain 3: Iteration: 18000 / 20000 [ 90%]  (Sampling)
## Chain 3: Iteration: 20000 / 20000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.044 seconds (Warm-up)
## Chain 3:                0.048 seconds (Sampling)
## Chain 3:                0.092 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 2e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.02 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:     1 / 20000 [  0%]  (Warmup)
## Chain 4: Iteration:  2000 / 20000 [ 10%]  (Warmup)
## Chain 4: Iteration:  4000 / 20000 [ 20%]  (Warmup)
## Chain 4: Iteration:  6000 / 20000 [ 30%]  (Warmup)
## Chain 4: Iteration:  8000 / 20000 [ 40%]  (Warmup)
## Chain 4: Iteration: 10000 / 20000 [ 50%]  (Warmup)
## Chain 4: Iteration: 10001 / 20000 [ 50%]  (Sampling)
## Chain 4: Iteration: 12000 / 20000 [ 60%]  (Sampling)
## Chain 4: Iteration: 14000 / 20000 [ 70%]  (Sampling)
## Chain 4: Iteration: 16000 / 20000 [ 80%]  (Sampling)
## Chain 4: Iteration: 18000 / 20000 [ 90%]  (Sampling)
## Chain 4: Iteration: 20000 / 20000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.042 seconds (Warm-up)
## Chain 4:                0.046 seconds (Sampling)
## Chain 4:                0.088 seconds (Total)
## Chain 4:
print(climate_sim, pars = "pi", probs = c(0.025, 0.5, 0.975))
## Inference for Stan model: anon_model.
## 4 chains, each with iter=20000; warmup=10000; thin=1; 
## post-warmup draws per chain=10000, total post-warmup draws=40000.
## 
##    mean se_mean   sd 2.5%  50% 97.5% n_eff Rhat
## pi 0.15       0 0.01 0.13 0.15  0.17 15637    1
## 
## Samples were drawn using NUTS(diag_e) at Wed Nov 12 04:55:46 2025.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).
as.array(climate_sim, pars = "pi") %>% 
  head(4)
## , , parameters = pi
## 
##           chains
## iterations   chain:1   chain:2   chain:3   chain:4
##       [1,] 0.1382014 0.1553439 0.1516677 0.1459827
##       [2,] 0.1537943 0.1547467 0.1501377 0.1468035
##       [3,] 0.1472950 0.1451355 0.1442579 0.1406576
##       [4,] 0.1472586 0.1470572 0.1553130 0.1460884
b) Produce and discuss trace plots, overlaid density plots, and autocorrelation plots for the four chains.
mcmc_trace(climate_sim, pars = "pi", size = 0.1)

# Density plots of individual chains *(from bayesrulesbook, overlaid with climate_sim)
mcmc_dens_overlay(climate_sim, pars = "pi") + 
  ylab("density")

# Histogram of the Markov chain values *(from bayesrulesbook, overlaid with climate_sim)
plot_1 <- mcmc_hist(climate_sim, pars = "pi") + 
  yaxis_text(TRUE) + 
  ylab("count")

# Density plot of the Markov chain values *(from bayesrulesbook, overlaid with climate_sim)
plot_2 <- mcmc_dens(climate_sim, pars = "pi") + 
  yaxis_text(TRUE) + 
  ylab("density")

plot_1 + plot_2
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

mcmc_acf(climate_sim, pars = "pi") #*(from bayesrulesbook, overlaid with climate_sim)

cat("The trace, density, and autocorrelation diagnostics confirm that all four chains for π mixed well and converged around the same posterior distribution with the data seeming to fall in between 0.125 and 0.175 close to the interval calculated in exercise 14 e and with the data peaking close to .15 aligns with the posterior mean in this part as well.")
## The trace, density, and autocorrelation diagnostics confirm that all four chains for π mixed well and converged around the same posterior distribution with the data seeming to fall in between 0.125 and 0.175 close to the interval calculated in exercise 14 e and with the data peaking close to .15 aligns with the posterior mean in this part as well.
c) Report the effective sample size ratio and R-hat values for your simulation, explaining what these values mean in context.
size_ratio <- neff_ratio(climate_sim, pars = "pi") #*(from bayesrulesbook, overlaid with climate_sim)
size_ratio
## [1] 0.3909137
r_hat <- rhat(climate_sim, pars = "pi") #*(from bayesrulesbook, overlaid with climate_sim)
r_hat
## [1] 1.000023
cat("\nEffective sample size ratio ~0.39, means that the effective # of independent samples is about 39% of the total # of posterior draws, being 0.39 * 40k = 15.6k. Mixing well and autocorellation between consecutive draws is minimal.\n\nR-hat ~ 1.00 means that all chains are samlping from the same stationary distribution and shows excellent convergence.")
## 
## Effective sample size ratio ~0.39, means that the effective # of independent samples is about 39% of the total # of posterior draws, being 0.39 * 40k = 15.6k. Mixing well and autocorellation between consecutive draws is minimal.
## 
## R-hat ~ 1.00 means that all chains are samlping from the same stationary distribution and shows excellent convergence.

.17 (Climate change with MCMC: estimation and hypothesis testing)

a) Utilize your MCMC simulation to approximate a (middle) 95% posterior credible interval for .Do so using the tidy() shortcut function as well as a direct calculation from your chain values.
tidy(climate_sim, pars = "pi", conf.int = TRUE, conf.level = 0.95) #*(from bayesrulesbook, overlaid with climate_sim)
## # A tibble: 1 × 5
##   term  estimate std.error conf.low conf.high
##   <chr>    <dbl>     <dbl>    <dbl>     <dbl>
## 1 pi       0.151    0.0113    0.129     0.174
climate_chains_df <- as.data.frame(climate_sim, pars = "pi", include = TRUE)
dim(climate_chains_df)
## [1] 40000     1
# Calculate posterior summaries of pi *(from bayesrulesbook, overlaid with climate_sim)
climate_chains_df %>% 
  summarize(post_mean = mean(pi), 
            post_median = median(pi),
            post_mode = sample_mode(pi),
            lower_95 = quantile(pi, 0.025),
            upper_95 = quantile(pi, 0.975))
##   post_mean post_median post_mode  lower_95 upper_95
## 1 0.1506123   0.1504179 0.1486488 0.1290131  0.17352
b) Utilize your MCMC simulation to approximate the posterior probability that π > 0.1.
post_draws <- rstan::extract(climate_sim)$pi

post_prob_approx <- mean(post_draws > 0.1)

cat("posterior probability that π > 0.1 is", post_prob_approx)
## posterior probability that π > 0.1 is 1
c) How close are the approximations in parts a and b to the actual corresponding posterior values you calculated in Exercises 8.14 and 8.15?
cat("Results for the 95% posterior credible interval ~0.13 to ~0.17 and mean being 0.15 are practically identical approximation in part a above and the results in 8.14.\n\nAs for part b, the posterior probability being 1 is close to the 99.99997% ~100% shown in exercise 8.15.\n\nDifferences are well within Monte Carlo error.")
## Results for the 95% posterior credible interval ~0.13 to ~0.17 and mean being 0.15 are practically identical approximation in part a above and the results in 8.14.
## 
## As for part b, the posterior probability being 1 is close to the 99.99997% ~100% shown in exercise 8.15.
## 
## Differences are well within Monte Carlo error.

.18 (Climate change with MCMC: prediction)

a) Suppose you were to survey 100 more adults. Use your MCMC simulation to approximate the posterior predictive model of Y′, the number that don’t believe in climate change. Construct a histogram visualization of this model.
set.seed(36501)
n_update <- 100; y_rep <- rbinom(length(post_draws), size = n_update, prob = post_draws)

predict_mean <- mean(y_rep)
predict_ci <- quantile(y_rep, c(0.025, 0.975))

cat("Posterior predictive E[Y′] is ", predict_mean,"\n\n90% predictive interval for Y′ is", predict_ci[1], "to", predict_ci[2])
## Posterior predictive E[Y′] is  15.0622 
## 
## 90% predictive interval for Y′ is 8 to 23
ggplot(data.frame(y_rep = y_rep), aes(x=y_rep)) +
  geom_histogram(binwidth = 1) + labs(
    title = "Posterior Predictive Distribution of Y′ (n′=100)", x = "# who state 'Not Real At All' among 100 new adults(Y′)", y = "Count of posterior draws"
  )

##### b) Summarize your observations of the posterior predictive model of Y′.

cat("The mean falls around ~15.1 which is still close to the previous 0.15 accepted probability scaled up instead from 1 to 100. The data seems to most be between 10 and 20 which is the equivalent of being between 0.1 and 0.2. The data seems to however be more inclusive of values below 10 and and above 20, meaning pi below .1 is possible but still most reasonable to be 15 people or 0.15 probability as observed in earlier exercises.")
## The mean falls around ~15.1 which is still close to the previous 0.15 accepted probability scaled up instead from 1 to 100. The data seems to most be between 10 and 20 which is the equivalent of being between 0.1 and 0.2. The data seems to however be more inclusive of values below 10 and and above 20, meaning pi below .1 is possible but still most reasonable to be 15 people or 0.15 probability as observed in earlier exercises.
c) Approximate the probability that at least 20 of the 100 people don’t believe in climate change.
prob_20 <- mean(y_rep >= 20)

cat("Probability of Y′ >= 20 is approx:", prob_20)
## Probability of Y′ >= 20 is approx: 0.121875

Part 2 - Chapter 9: Simple Normal Regression

.9 (How humid is too humid: model building) Throughout this chapter, we explored how bike ridership fluctuates with temperature. But what about humidity? In the next exercises, you will explore the Normal regression model of rides ( ) by humidity ( ) using the bikes dataset. Based on past bikeshare analyses, suppose we have the following prior understanding of this relationship:

head(bikes)
##         date season year month day_of_week weekend holiday temp_actual
## 1 2011-01-01 winter 2011   Jan         Sat    TRUE      no    57.39952
## 2 2011-01-03 winter 2011   Jan         Mon   FALSE      no    46.49166
## 3 2011-01-04 winter 2011   Jan         Tue   FALSE      no    46.76000
## 4 2011-01-05 winter 2011   Jan         Wed   FALSE      no    48.74943
## 5 2011-01-07 winter 2011   Jan         Fri   FALSE      no    46.50332
## 6 2011-01-08 winter 2011   Jan         Sat    TRUE      no    44.17700
##   temp_feel humidity windspeed weather_cat rides
## 1  64.72625  80.5833  10.74988      categ2   654
## 2  49.04645  43.7273  16.63670      categ1  1229
## 3  51.09098  59.0435  10.73983      categ1  1454
## 4  52.63430  43.6957  12.52230      categ1  1518
## 5  50.79551  49.8696  11.30464      categ2  1362
## 6  46.60286  53.5833  17.87587      categ2   891
a) Tune the Normal regression model (9.6) to match our prior understanding. Use careful notation to write out the complete Bayesian structure of this model.
bikes <- bikes %>%
  mutate(humidity_b = humidity - mean(humidity, na.rm=TRUE))

bike_model_humid <- stan_glm(
  rides ~ humidity_b, data = bikes, family = gaussian(), 
  prior_intercept = normal(5000, 2000),
  prior = normal(-10, 5),
  prior_aux = exponential(rate = 1/2000),
  chains =4, iter=10000, seed=36501)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 4.5e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.45 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.333 seconds (Warm-up)
## Chain 1:                0.299 seconds (Sampling)
## Chain 1:                0.632 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 1.1e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 2: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 2: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 2: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 2: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 2: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 2: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 2: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 2: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 2: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 2: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 2: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.462 seconds (Warm-up)
## Chain 2:                0.308 seconds (Sampling)
## Chain 2:                0.77 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 1.2e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 3: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 3: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 3: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 3: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 3: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 3: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 3: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 3: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 3: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 3: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 3: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.265 seconds (Warm-up)
## Chain 3:                0.31 seconds (Sampling)
## Chain 3:                0.575 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 1.2e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 4: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 4: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 4: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 4: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 4: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 4: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 4: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 4: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 4: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 4: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 4: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.453 seconds (Warm-up)
## Chain 4:                0.303 seconds (Sampling)
## Chain 4:                0.756 seconds (Total)
## Chain 4:
prior_summary(bike_model_humid)
## Priors for model 'bike_model_humid' 
## ------
## Intercept (after predictors centered)
##  ~ normal(location = 5000, scale = 2000)
## 
## Coefficients
##  ~ normal(location = -10, scale = 5)
## 
## Auxiliary (sigma)
##  ~ exponential(rate = 5e-04)
## ------
## See help('prior_summary.stanreg') for more details
b) To explore our combined prior understanding of the model parameters, simulate the Normal regression prior model with 5 chains run for 8000 iterations each. HINT: You can use the same stan_glm() syntax that you would use to simulate the posterior, but include prior_PD = TRUE .
bike_prior <- stan_glm(
  rides ~ humidity_b, data = bikes, family = gaussian(), 
  prior_intercept = normal(5000, 2000),
  prior = normal(-10, 5),
  prior_aux = exponential(rate = 1/2000),
  prior_PD = TRUE,
  chains =5, iter=8000*2, seed=36501)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.8e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:     1 / 16000 [  0%]  (Warmup)
## Chain 1: Iteration:  1600 / 16000 [ 10%]  (Warmup)
## Chain 1: Iteration:  3200 / 16000 [ 20%]  (Warmup)
## Chain 1: Iteration:  4800 / 16000 [ 30%]  (Warmup)
## Chain 1: Iteration:  6400 / 16000 [ 40%]  (Warmup)
## Chain 1: Iteration:  8000 / 16000 [ 50%]  (Warmup)
## Chain 1: Iteration:  8001 / 16000 [ 50%]  (Sampling)
## Chain 1: Iteration:  9600 / 16000 [ 60%]  (Sampling)
## Chain 1: Iteration: 11200 / 16000 [ 70%]  (Sampling)
## Chain 1: Iteration: 12800 / 16000 [ 80%]  (Sampling)
## Chain 1: Iteration: 14400 / 16000 [ 90%]  (Sampling)
## Chain 1: Iteration: 16000 / 16000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.608 seconds (Warm-up)
## Chain 1:                0.185 seconds (Sampling)
## Chain 1:                0.793 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 9e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.09 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:     1 / 16000 [  0%]  (Warmup)
## Chain 2: Iteration:  1600 / 16000 [ 10%]  (Warmup)
## Chain 2: Iteration:  3200 / 16000 [ 20%]  (Warmup)
## Chain 2: Iteration:  4800 / 16000 [ 30%]  (Warmup)
## Chain 2: Iteration:  6400 / 16000 [ 40%]  (Warmup)
## Chain 2: Iteration:  8000 / 16000 [ 50%]  (Warmup)
## Chain 2: Iteration:  8001 / 16000 [ 50%]  (Sampling)
## Chain 2: Iteration:  9600 / 16000 [ 60%]  (Sampling)
## Chain 2: Iteration: 11200 / 16000 [ 70%]  (Sampling)
## Chain 2: Iteration: 12800 / 16000 [ 80%]  (Sampling)
## Chain 2: Iteration: 14400 / 16000 [ 90%]  (Sampling)
## Chain 2: Iteration: 16000 / 16000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.532 seconds (Warm-up)
## Chain 2:                0.17 seconds (Sampling)
## Chain 2:                0.702 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 9e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.09 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:     1 / 16000 [  0%]  (Warmup)
## Chain 3: Iteration:  1600 / 16000 [ 10%]  (Warmup)
## Chain 3: Iteration:  3200 / 16000 [ 20%]  (Warmup)
## Chain 3: Iteration:  4800 / 16000 [ 30%]  (Warmup)
## Chain 3: Iteration:  6400 / 16000 [ 40%]  (Warmup)
## Chain 3: Iteration:  8000 / 16000 [ 50%]  (Warmup)
## Chain 3: Iteration:  8001 / 16000 [ 50%]  (Sampling)
## Chain 3: Iteration:  9600 / 16000 [ 60%]  (Sampling)
## Chain 3: Iteration: 11200 / 16000 [ 70%]  (Sampling)
## Chain 3: Iteration: 12800 / 16000 [ 80%]  (Sampling)
## Chain 3: Iteration: 14400 / 16000 [ 90%]  (Sampling)
## Chain 3: Iteration: 16000 / 16000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.615 seconds (Warm-up)
## Chain 3:                0.262 seconds (Sampling)
## Chain 3:                0.877 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 9e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.09 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:     1 / 16000 [  0%]  (Warmup)
## Chain 4: Iteration:  1600 / 16000 [ 10%]  (Warmup)
## Chain 4: Iteration:  3200 / 16000 [ 20%]  (Warmup)
## Chain 4: Iteration:  4800 / 16000 [ 30%]  (Warmup)
## Chain 4: Iteration:  6400 / 16000 [ 40%]  (Warmup)
## Chain 4: Iteration:  8000 / 16000 [ 50%]  (Warmup)
## Chain 4: Iteration:  8001 / 16000 [ 50%]  (Sampling)
## Chain 4: Iteration:  9600 / 16000 [ 60%]  (Sampling)
## Chain 4: Iteration: 11200 / 16000 [ 70%]  (Sampling)
## Chain 4: Iteration: 12800 / 16000 [ 80%]  (Sampling)
## Chain 4: Iteration: 14400 / 16000 [ 90%]  (Sampling)
## Chain 4: Iteration: 16000 / 16000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.468 seconds (Warm-up)
## Chain 4:                0.166 seconds (Sampling)
## Chain 4:                0.634 seconds (Total)
## Chain 4: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 5).
## Chain 5: 
## Chain 5: Gradient evaluation took 9e-06 seconds
## Chain 5: 1000 transitions using 10 leapfrog steps per transition would take 0.09 seconds.
## Chain 5: Adjust your expectations accordingly!
## Chain 5: 
## Chain 5: 
## Chain 5: Iteration:     1 / 16000 [  0%]  (Warmup)
## Chain 5: Iteration:  1600 / 16000 [ 10%]  (Warmup)
## Chain 5: Iteration:  3200 / 16000 [ 20%]  (Warmup)
## Chain 5: Iteration:  4800 / 16000 [ 30%]  (Warmup)
## Chain 5: Iteration:  6400 / 16000 [ 40%]  (Warmup)
## Chain 5: Iteration:  8000 / 16000 [ 50%]  (Warmup)
## Chain 5: Iteration:  8001 / 16000 [ 50%]  (Sampling)
## Chain 5: Iteration:  9600 / 16000 [ 60%]  (Sampling)
## Chain 5: Iteration: 11200 / 16000 [ 70%]  (Sampling)
## Chain 5: Iteration: 12800 / 16000 [ 80%]  (Sampling)
## Chain 5: Iteration: 14400 / 16000 [ 90%]  (Sampling)
## Chain 5: Iteration: 16000 / 16000 [100%]  (Sampling)
## Chain 5: 
## Chain 5:  Elapsed Time: 0.379 seconds (Warm-up)
## Chain 5:                0.177 seconds (Sampling)
## Chain 5:                0.556 seconds (Total)
## Chain 5:
prior_summary(bike_prior)
## Priors for model 'bike_prior' 
## ------
## Intercept (after predictors centered)
##  ~ normal(location = 5000, scale = 2000)
## 
## Coefficients
##  ~ normal(location = -10, scale = 5)
## 
## Auxiliary (sigma)
##  ~ exponential(rate = 5e-04)
## ------
## See help('prior_summary.stanreg') for more details
c) Plot 100 prior plausible model lines (β0+β1X) and 4 datasets simulated under the priors.
hum_bar <- mean(bikes$humidity, na.rm=TRUE)

grid_h <- data.frame(
  humidity = seq(min(bikes$humidity, na.rm = TRUE),
                 max(bikes$humidity, na.rm = TRUE), length.out=200))

grid_h$humidity_b <- grid_h$humidity - hum_bar

lines_df <- add_epred_draws(
  bike_prior, newdata = grid_h, ndraws = 100)

ggplot(lines_df, aes(humidity, .epred, group = .draw)) + geom_line() + labs(
  title = "100 Prior-Plausible Regression 
  lines", x = "Humidity %", y = "Rides")

set.seed(36501)
pred_df <- add_predicted_draws(bike_prior, newdata = bikes, ndraws = 4)

ggplot(pred_df, aes(humidity, .prediction)) + geom_point() + facet_wrap(~ .draw) + labs(
  title = "Four Datasets Simmed Under The Priors", x = "Humidity %", y = "Rides")

##### d) Describe our overall prior understanding of the relationship between ridership and humidity.

cat("Observing the 1st plot, we can see much uncertainty between 1000 and 9000, with data centered around 5000, but with a visibly small negative but uncertain linear relationship between humidity and ridership, with lots of noise around the mean trend.")
## Observing the 1st plot, we can see much uncertainty between 1000 and 9000, with data centered around 5000, but with a visibly small negative but uncertain linear relationship between humidity and ridership, with lots of noise around the mean trend.

.10 (How humid is too humid: data)

a) Plot and discuss the observed relationship between ridership and humidity in the bikes data.
b) Does simple Normal regression seem to be a reasonable approach to modeling this relationship? Explain

.11 (How humid is too humid: posterior)

a)
b)
c)

.12 (How much humid is too humid: posterior interpretation)

a)
b)
c)
d)

.13 (How humid is too humid: prediction)

a)
b)
c)
d)