I want you to start familiarizing yourself with R Quarto, a medium for compiling text, code, and output in one reproducible document. For this problem, use this .qmd file as a starting point for writing your simulation studies.
When finished with your solutions:
Click the Publish button
Select RPubs
Follow the prompts for creating an account on RPubs and publishing your solution
Submit BOTH your .qmd file and a link to your published HTML.
Question 1
Recall the following problem. Let \(X\) represent the number of cars driving past a parking ramp in any given hour, and suppose that \(X\sim POI(\lambda)\). Given \(X=x\) cars driving past the ramp, let \(Y\) represent the number that decide to park in the ramp, with \(Y|X=x\sim BIN(x,p)\). We’ve shown analytically that unconditionally, \(Y\sim POI(\lambda p)\). Verify this result via a simulation study over a grid of \(\lambda \in \{10, 20, 30\}\) and \(p\in \{0.2, 0.4, 0.6\}\) using the framework presented in your notes, with 10,000 simulated outcomes per \((\lambda, p)\) combination. Create a faceted plot of the overlaid analytic and empirical CDFs as well as the p-p plot.
(ggplot(aes(x = X), data = simstudy) +
  geom_step(aes(y = Fhat_X, color = 'Simulated CDF')) +
  geom_step(aes(y = F_X, color = 'Analytic CDF')) +
  labs(y = expression(P(X <= x)), x = 'x', color = '') +
  theme_classic(base_size = 12) +
  facet_wrap(~lambda, labeller = label_both))
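The plot above takes a prepared data frame as given. A minimal base-R sketch of how the simulation itself could be set up follows. It is not necessarily the framework from the notes, and the names `sim_q1`, `Fhat_Y`, `F_Y`, `gap`, and `max_gap`, as well as the seed, are assumptions made for illustration.

```r
set.seed(1)      # arbitrary seed (assumption), only for reproducibility
n_sim <- 10000   # simulated outcomes per (lambda, p) combination
grid  <- expand.grid(lambda = c(10, 20, 30), p = c(0.2, 0.4, 0.6))

# One block of rows per (lambda, p): support points with both CDFs.
sim_q1 <- do.call(rbind, lapply(seq_len(nrow(grid)), function(i) {
  lambda <- grid$lambda[i]
  p      <- grid$p[i]
  x <- rpois(n_sim, lambda)                # cars passing the ramp
  y <- rbinom(n_sim, size = x, prob = p)   # cars that park, given x
  support <- 0:max(y)
  data.frame(
    lambda = lambda,
    p      = p,
    y      = support,
    Fhat_Y = ecdf(y)(support),             # empirical CDF of simulated Y
    F_Y    = ppois(support, lambda * p)    # analytic CDF of POI(lambda * p)
  )
}))

# Largest absolute gap between the two CDFs within each (lambda, p) cell.
sim_q1$gap <- abs(sim_q1$Fhat_Y - sim_q1$F_Y)
max_gap <- aggregate(gap ~ lambda + p, data = sim_q1, FUN = max)
```

Plotting `Fhat_Y` and `F_Y` against `y` with `facet_grid(lambda ~ p)` would give the overlaid CDFs for all nine combinations, and plotting `Fhat_Y` against `F_Y` with a 45-degree reference line would give the p-p plot. Small values in `max_gap` are consistent with \(Y\sim POI(\lambda p)\).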
Question 2
In this problem we will study the relationship between \(\alpha\) and \(Var(Y)\) for fixed \(\mu\) for the beta distribution.
Recall that if \(Y\sim BETA(\alpha,\beta)\) with \(\mu \equiv E(Y) = \frac{\alpha}{\alpha+\beta}\), then \(\beta = \alpha \cdot \frac{1-\mu}{\mu}\).
Use this fact to simulate 10,000 realizations of \(Y\sim BETA(\alpha,\beta)\) for each combination of \(\alpha \in \{2,4,8,16\}\) and \(\mu\in\{0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7\}\).
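The simulation described above can be sketched in base R as follows. This is one possible setup under stated assumptions, not the notes' framework: the data frame name `beta_sim`, the column names `beta`, `meanY`, and `varY`, and the seed are all chosen here for illustration.

```r
set.seed(2)      # arbitrary seed (assumption), only for reproducibility
n_sim <- 10000   # realizations per (alpha, mu) combination

beta_sim <- expand.grid(
  alpha = c(2, 4, 8, 16),
  mu    = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7)
)

# Second shape parameter implied by the target mean: beta = alpha * (1 - mu) / mu.
beta_sim$beta <- with(beta_sim, alpha * (1 - mu) / mu)

# One vector of n_sim draws per row of the grid.
draws <- Map(
  function(a, b) rbeta(n_sim, shape1 = a, shape2 = b),
  beta_sim$alpha, beta_sim$beta
)

# Simulated mean and variance for each (alpha, mu) combination.
beta_sim$meanY <- vapply(draws, mean, numeric(1))
beta_sim$varY  <- vapply(draws, var, numeric(1))
```

Comparing `meanY` with `mu` is a quick sanity check that the reparameterization behaves as intended before any plotting.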
A)
Create a faceted plot of the empirical densities overlaid with the analytic densities for each \((\alpha, \mu)\) combination.
Create a plot of the simulated variances as a function of \(\alpha\). Use lines as your geometry, and color-code each line by \(\mu\).
(ggplot(beta_var, aes(x = alpha, y = varY, group = mu, color = factor(mu))) +
  geom_line(linewidth = 1.1) +
  geom_point(size = 2) +
  scale_color_manual(values = c("#F4A7B9", "#F6C272", "#C3D676", "#9ACD97", "#F59BAF", "#FFB570", "#A5E3B9")) +
  labs(x = expression(alpha), y = "Simulated Var(Y)", color = expression(mu)) +
  theme_classic(base_size = 12) +
  theme(legend.position = "right", panel.grid = element_blank()))
D)
Comment on how the combination of \(\alpha\) and \(\mu\) impacts the variance.
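As α increases, the variance of Y shrinks: the distribution becomes tighter and more concentrated around its mean. When α is small, the distribution is wider and more variable. For any fixed α, the variance also changes with μ: it approaches zero as μ nears 0 or 1, and on this grid it is largest at the upper end, around μ ≈ 0.6 to 0.7, rather than at μ = 0.5. The peak is off-center because β = α(1 − μ)/μ shrinks as μ grows, so the total α + β = α/μ falls and the distribution loosens. Together, α controls how concentrated the distribution is, while μ sets both where it sits on the 0–1 range and, through β, how much spread remains.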
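The dependence of the variance on α and μ can also be checked without simulation. Substituting \(\beta = \alpha \cdot \frac{1-\mu}{\mu}\) into the standard beta variance \(\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}\) gives \(Var(Y) = \frac{\mu^2(1-\mu)}{\alpha+\mu}\). The sketch below evaluates that closed form on the grid from the problem; the names `beta_var_analytic`, `var_table`, and `mu_at_max` are hypothetical and introduced only for this illustration.

```r
# Closed-form Var(Y) for BETA(alpha, beta) with beta = alpha * (1 - mu) / mu.
beta_var_analytic <- function(alpha, mu) {
  mu^2 * (1 - mu) / (alpha + mu)
}

alpha_grid <- c(2, 4, 8, 16)
mu_grid    <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7)

# Rows index alpha, columns index mu.
var_table <- outer(alpha_grid, mu_grid, beta_var_analytic)
dimnames(var_table) <- list(alpha = alpha_grid, mu = mu_grid)

# Grid value of mu with the largest variance, for each alpha.
mu_at_max <- mu_grid[apply(var_table, 1, which.max)]

print(round(var_table, 4))
print(mu_at_max)
```

Reading down a column of `var_table` shows how the variance responds to α at fixed μ, and reading across a row shows how it responds to μ at fixed α. The simulated variances should track these values closely with 10,000 draws per combination.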