Untitled

a.p-hat= 47/80 = 0.5875 is the point estimate for p. The estimate of the true proportion or percentage of adults who are lactose intolerant from the population is 58.75%.

b.p-hat= 0.6 is the bayesian estimate for p.

Variance = 0.1536057

There is 0.0125 difference between the point estimate in (a) and Bayesian estimate in (b).

library(gt)
library(dplyr)
p <- seq(0.4, 0.7, by= 0.1)
prior <- c(0.1,0.2,0.44,0.26)
likelihood <- p^{47}*(1-p)^{33}
PL <- prior*likelihood
posterior <- PL/sum(PL)
posteriordist <- as.data.frame(cbind(p,prior,likelihood,posterior))
posteriordist %>%
  gt() %>% 
  cols_align(align="center", columns = everything())

p	prior	likelihood	posterior
0.4	0.10	9.458251e-27	0.0006491288
0.5	0.20	8.271806e-25	0.1135403807
0.6	0.44	2.761140e-24	0.8337986003
0.7	0.26	2.914804e-25	0.0520118903

var(posteriordist$posterior)

## [1] 0.1536057

The posterior probabilities in (b) becomes smaller in (c) except for p=0.6 which its posterior probabilty becomes bigger(from approxiamtely 0.83 to 0.99). Though the Bayesian estimate of p= 0.6 is still the same, again the probability of occurring is much bigger in (c). Variance= 0.2499982 The variance of p in (b) becomes bigger in (c).

p <- seq(0.4, 0.7, by= 0.1)
prior <- c(0.1,0.2,0.44,0.26)
likelihood <- p^{470}*(1-p)^{330}
PL <- prior*likelihood
posterior <- PL/sum(PL)
posteriordist <- as.data.frame(cbind(p,prior,likelihood,posterior))
posteriordist %>%
  gt() %>% 
  cols_align(align="center", columns = everything())

p	prior	likelihood	posterior
0.4	0.10	5.729392e-261	5.055567e-26
0.5	0.20	1.499697e-241	2.646640e-06
0.6	0.44	2.575638e-236	9.999974e-01
0.7	0.26	4.426830e-246	1.015611e-10

var(posteriordist$posterior)

## [1] 0.2499982

mean = 0.5999999 variance = 0.2499995

The above calculations show that sample size matters as well as the prior probabilities as it affects the posterior distribution. The bigger the sample size, the distribution will be more concentrated to the p with the biggest prior probability(p=0.6) given the same sample proportion.Problem C and D also shows that prior probabilties will greatly affect the posterior distribution; the bigger probability given in prior, the bigger its corresponding posterior probabilty compared to other p which has smaller prior probability.

p <- seq(0.4, 0.7, by= 0.1)
prior <- c(0.1,0.2,0.44,0.26)
likelihood <- p^{47}*(1-p)^{33}
PL <- prior*likelihood
posterior1 <- PL/sum(PL)

prior1 <- posterior1
likelihood <- p^{470}*(1-p)^{330}
PL <- prior1*likelihood
posterior <- PL/sum(PL)
posteriordist <- as.data.frame(cbind(p,prior1,likelihood,posterior))
posteriordist %>%
  gt() %>% 
  cols_align(align="center", columns = everything())

p	prior1	likelihood	posterior
0.4	0.0006491288	5.729392e-261	1.731781e-28
0.5	0.1135403807	1.499697e-241	7.928800e-07
0.6	0.8337986003	2.575638e-236	9.999992e-01
0.7	0.0520118903	4.426830e-246	1.072134e-11

mean = sum(p*posterior)
mean

## [1] 0.5999999

var(posteriordist$posterior)

## [1] 0.2499995

Untitled

Trixia Gonzales

2023-03-09