Practice Set 8.1

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   4.0.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(purrrfect)

Attaching package: 'purrrfect'

The following objects are masked from 'package:base':

    replicate, tabulate
library(knitr)


Problem 9

Creating a table that compares the bias, variance, and MSE of the three estimators of the population median \(\theta\): \(\hat\theta_1 = \bar{Y}\), \(\hat\theta_2 = \bar{Y}\ln 2\), and \(\hat\theta_3\), the sample median.

B <- 5                    # exponential mean, so rate = 1/B
theta <- qexp(.5, 1/B)    # true median of the distribution: B*log(2)

# Map estimator column names to LaTeX labels for the table
lax <- \(x){
  y <- case_when(x == "theta1" ~ "$\\hat\\theta_1$",
                 x == "theta2" ~ "$\\hat\\theta_2$",
                 x == "theta3" ~ "$\\hat\\theta_3$"
              )
  return(y)
}
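
The target parameter is the population median of an exponential distribution with mean \(B = 5\): solving \(1 - e^{-\theta/B} = 1/2\) gives \(\theta = B\ln 2 \approx 3.47\). As a quick sanity check (illustrative only, not part of the original script), the quantile function and the closed form agree:

# closed-form median vs. qexp(); both are about 3.466
c(qexp(0.5, rate = 1/B), B * log(2))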
p9df <- (parameters(~n,c(3,11,51))
    %>%add_trials(10000)
    %>%mutate(Yn = map(n,\(n) rexp(n,1/B)),Y_hat = map_dbl(Yn, mean),Y_med = map_dbl(Yn,median))
    %>%mutate(theta1 = Y_hat,theta2 = Y_hat*log(2),theta3 = Y_med)
)
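
For intuition, each row of p9df holds one simulated sample and the three estimates computed from it. Below is a minimal base-R sketch of a single Monte Carlo draw (the object names here are illustrative, not taken from the script above):

# one sample of size n = 11 from an exponential with mean B = 5
set.seed(1)
y  <- rexp(11, rate = 1/B)
t1 <- mean(y)            # theta-hat 1: sample mean
t2 <- mean(y) * log(2)   # theta-hat 2: rescaled sample mean
t3 <- median(y)          # theta-hat 3: sample median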

tab9 <- (p9df %>%
  # reshape so each row holds one estimate, labeled by estimator
  pivot_longer(theta1:theta3,
               names_to = 'estimator',
               values_to = 'estimate') %>%
  select(-Yn) %>%
  # Monte Carlo bias and variance for each (n, estimator) pair
  summarize(bias = mean(estimate - theta),
            variance = var(estimate),
            .by = c(n, estimator)) %>%
  mutate(mse = bias^2 + variance) %>%
  pivot_longer(bias:mse,
               names_to = "statistic",
               values_to = "estimate") %>%
  pivot_wider(names_from = n, values_from = estimate) %>%
  mutate(estat = factor(statistic, c("bias", "variance", "mse")),
         latx = map_chr(estimator, lax),
         stat_est = paste(estat, latx, sep = " ")) %>%
  arrange(estat) %>%
  select(-estimator, -statistic, -estat)
)
tab9 <- tab9[, c("stat_est", "3", "11", "51")]   # keep the label column plus the three n columns, in order
kable(tab9, digits = 2, col.names = c(" ", "n=3", "n=11", "n=51"))
                               n=3     n=11    n=51
bias \(\hat\theta_1\)          1.55    1.52    1.53
bias \(\hat\theta_2\)          0.01   -0.01   -0.01
bias \(\hat\theta_3\)          0.68    0.21    0.04
variance \(\hat\theta_1\)      8.26    2.28    0.48
variance \(\hat\theta_2\)      3.97    1.09    0.23
variance \(\hat\theta_3\)      8.74    2.34    0.49
mse \(\hat\theta_1\)          10.65    4.60    2.81
mse \(\hat\theta_2\)           3.97    1.09    0.23
mse \(\hat\theta_3\)           9.20    2.38    0.49

Discussion:

Overall, the results match what we would expect. \(\hat\theta_1\) appears to be the worst estimator: its bias of roughly \(B(1-\ln 2) = \theta(1/\ln 2 - 1) \approx 1.53\) agrees with our analytic results and does not shrink as \(n\) grows, and its variance is only slightly lower than that of \(\hat\theta_3\), so its MSE is the largest of the three at every \(n\). As for the other two, \(\hat\theta_3\) has the highest variance among the three estimators but makes up for it with its relatively low bias, which keeps its MSE reasonably close to that of \(\hat\theta_2\) for large \(n\). \(\hat\theta_2\) has the lowest bias and the lowest variance at every \(n\), and therefore also the lowest MSE. Thus \(\hat\theta_2\) is most likely the most efficient estimator of \(\theta\).
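
For reference, the pattern in the table can be checked analytically, since \(E[\bar Y] = B\), \(\operatorname{Var}(\bar Y) = B^2/n\), and \(\theta = B\ln 2\):

\[
\operatorname{bias}(\hat\theta_1) = B - B\ln 2 = 5(1-\ln 2) \approx 1.53,
\qquad
\operatorname{bias}(\hat\theta_2) = B\ln 2 - B\ln 2 = 0,
\]
\[
\operatorname{Var}(\hat\theta_1) = \frac{B^2}{n},
\qquad
\operatorname{Var}(\hat\theta_2) = (\ln 2)^2\,\frac{B^2}{n} \approx 0.48\,\frac{B^2}{n},
\]

which matches the simulated values above (for example, at \(n = 3\), \(B^2/3 \approx 8.33\) and \((\ln 2)^2 B^2/3 \approx 4.00\)).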