I want you to start familiarizing yourself with R Quarto, a medium for compiling text, code, and output in one reproducible document. For this problem, use this .qmd file as a starting point for writing your simulation studies.
When finished with your solutions:
Click the Publish button
Select RPubs
Follow the prompts for creating an account on RPubs and publishing your solution
Submit BOTH your .qmd file and a link to your published HTML.
Question 1
Recall the following problem. Let \(X\) represent the number of cars driving past a parking ramp in any given hour, and suppose that \(X\sim POI(\lambda)\). Given \(X=x\) cars driving past the ramp, let \(Y\) represent the number that decide to park in the ramp, with \(Y|X=x\sim BIN(x,p)\). We’ve shown analytically that unconditionally, \(Y\sim POI(\lambda p)\). Verify this result via a simulation study over a grid of \(\lambda \in \{10, 20, 30\}\) and \(p\in \{0.2, 0.4, 0.6\}\) using the framework presented in your notes, with 10,000 simulated outcomes per \((\lambda, p)\) combination. Create a faceted plot of the overlaid analytic and empirical CDFs, as well as a faceted p-p plot.
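The plotting code in this solution expects a data frame named `pois_binom` with columns `lambda`, `p`, `Y`, `Fhat_Y`, and `F_Y`, but the step that builds it is not shown. The following is a minimal base-R sketch of one way to construct it, not necessarily the framework from the notes. The helper `simulate_one`, the seed, and the choice to evaluate both CDFs on `0:max(y)` are assumptions made for illustration.

```r
# Sketch only: simulate_one() is a hypothetical helper, not from the course notes.
set.seed(1)      # arbitrary seed, for reproducibility
n_sim <- 10000   # simulated outcomes per (lambda, p) combination

simulate_one <- function(lambda, p, n = n_sim) {
  x <- rpois(n, lambda)                # X ~ POI(lambda)
  y <- rbinom(n, size = x, prob = p)   # Y | X = x ~ BIN(x, p)
  y_grid <- 0:max(y)                   # support points at which to compare CDFs
  data.frame(
    lambda = lambda,
    p      = p,
    Y      = y_grid,
    Fhat_Y = ecdf(y)(y_grid),            # empirical CDF of the simulated Y
    F_Y    = ppois(y_grid, lambda * p)   # analytic CDF of POI(lambda * p)
  )
}

# All nine (lambda, p) combinations, stacked into one long data frame
grid <- expand.grid(lambda = c(10, 20, 30), p = c(0.2, 0.4, 0.6))
pois_binom <- do.call(rbind, Map(simulate_one, grid$lambda, grid$p))
```

If the analytic result holds, `Fhat_Y` and `F_Y` should agree closely in every panel, which is what the CDF overlay and the p-p plot check visually.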
# Overlaid simulated and analytic CDFs, one panel per (lambda, p) combination
(ggplot(aes(x = Y), data = pois_binom) +
  geom_step(aes(y = Fhat_Y, color = 'Simulated CDF')) +
  geom_step(aes(y = F_Y, color = 'Analytic CDF')) +
  labs(y = expression(P(Y <= y)), x = 'y', color = '', title = 'CDF plots') +
  theme_classic(base_size = 12) +
  facet_grid(lambda ~ p, labeller = label_both))

# p-p plot: analytic CDF against simulated CDF, with the 45-degree reference line
(ggplot(aes(x = Fhat_Y, y = F_Y), data = pois_binom) +
  geom_point(shape = '.') +
  labs(y = expression(P(Y <= y)), x = expression(hat(P)(Y <= y)), title = 'p-p plot') +
  geom_abline(intercept = 0, slope = 1, color = 'red') +
  theme_classic(base_size = 10) +
  facet_grid(lambda ~ p, labeller = label_both))
Question 2
In this problem we will study the relationship between \(\alpha\) and \(Var(Y)\) for fixed means.
Recall that if \(Y\sim BETA(\alpha,\beta)\) with \(\mu \equiv E(Y)\), then \(\beta = \alpha \cdot \frac{1-\mu}{\mu}\).
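For reference, this relation follows in two steps from the mean of the Beta distribution, \(E(Y) = \frac{\alpha}{\alpha+\beta}\):

\[
\mu = \frac{\alpha}{\alpha+\beta}
\;\Longrightarrow\;
\mu\beta = \alpha(1-\mu)
\;\Longrightarrow\;
\beta = \alpha \cdot \frac{1-\mu}{\mu}.
\]

For example, \(\alpha = 2\) and \(\mu = 0.1\) give \(\beta = 2 \cdot \frac{0.9}{0.1} = 18\).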
Use this fact to simulate 10,000 realizations of \(Y\sim BETA(\alpha,\beta)\) for each combination of \(\alpha \in \{2,4,8,16\}\) and \(\mu\in\{0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7\}\).
Create a plot of the simulated variances as a function of \(\alpha\). Use lines as your geometry, and color-code each line by \(\mu\).
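The plot uses a data frame named `simmed_vars` with columns `alpha`, `mu`, and `var_y`, but the simulation step that builds it is not shown. The following is a minimal base-R sketch of one way to construct it, assuming the relation \(\beta = \alpha \cdot \frac{1-\mu}{\mu}\) from above. The seed and the intermediate `beta` column are illustrative choices, not part of the original solution.

```r
# Sketch only: one possible way to build simmed_vars; the seed is arbitrary.
set.seed(1)
n_sim <- 10000   # realizations per (alpha, mu) combination

# All 28 combinations of alpha and mu
simmed_vars <- expand.grid(
  alpha = c(2, 4, 8, 16),
  mu    = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7)
)

# Solve for beta so that E(Y) = alpha / (alpha + beta) equals mu
simmed_vars$beta <- simmed_vars$alpha * (1 - simmed_vars$mu) / simmed_vars$mu

# Sample variance of n_sim Beta(alpha, beta) draws for each row
simmed_vars$var_y <- mapply(
  function(a, b) var(rbeta(n_sim, shape1 = a, shape2 = b)),
  simmed_vars$alpha, simmed_vars$beta
)
```

Each row holds one simulated variance, so grouping the lines by `mu` gives one curve per mean across the four values of `alpha`.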
# Simulated variance against alpha, one line per value of mu
ggplot(data = simmed_vars) +
  geom_line(aes(x = alpha, y = var_y, col = mu, group = mu), linewidth = 1) +
  labs(y = 'Simulated variances', x = expression(alpha), color = expression(mu)) +
  scale_color_gradient(low = 'cornflowerblue', high = 'goldenrod') +
  theme_classic(base_size = 14)
D)
Comment on how the combination of \(\alpha\) and \(\mu\) affects the variance.
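For a fixed \(\mu\), the variance decreases as \(\alpha\) increases. For a fixed \(\alpha\), the variance increases with \(\mu\) over most of this grid: it rises steadily from \(\mu = 0.1\) to about \(\mu = 0.6\) and then levels off, so the curves for \(\mu = 0.6\) and \(\mu = 0.7\) nearly coincide. This agrees with the analytic variance \(Var(Y) = \frac{\mu^2(1-\mu)}{\alpha+\mu}\), which for these values of \(\alpha\) peaks at \(\mu\) between roughly 0.64 and 0.66.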
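As a cross-check on the simulated pattern, the analytic variances can be tabulated on the same grid. This is a small sketch that uses the standard Beta variance formula with \(\beta = \alpha \cdot \frac{1-\mu}{\mu}\) substituted in. The names `analytic_var` and `var_table` are illustrative and do not come from the original solution.

```r
# Analytic Beta variance with beta = alpha * (1 - mu) / mu substituted in:
#   Var(Y) = alpha * beta / ((alpha + beta)^2 * (alpha + beta + 1))
#          = mu^2 * (1 - mu) / (alpha + mu)
analytic_var <- function(alpha, mu) mu^2 * (1 - mu) / (alpha + mu)

alpha_grid <- c(2, 4, 8, 16)
mu_grid    <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7)

# Rows index alpha, columns index mu
var_table <- outer(alpha_grid, mu_grid, analytic_var)
dimnames(var_table) <- list(alpha = alpha_grid, mu = mu_grid)
round(var_table, 4)
```

Reading down a column shows the variance shrinking as \(\alpha\) grows. Reading across a row shows it rising with \(\mu\) up to 0.6 and then staying nearly flat to 0.7. With 10,000 draws per cell, the simulated variances should track this table closely.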