Data

We’re looking at deaths as a proportion of all hospital discharges.

Binomial experiment assumptions:

  • Each discharge is a “trial” with two possible outcomes: death or no death
  • Trials are independent (this may be questionable)
  • The probability of “success” [1] is constant across trials (this may be questionable)
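
Under these assumptions, the number of deaths among n discharges follows a Binomial(n, p) distribution. As a rough sketch of what that implies, the snippet below computes the expected death count and its standard deviation, then simulates a few years. The values of `n_discharges` and `p_death` are illustrative assumptions, not estimates from this analysis.

```r
# Illustrative values only (assumptions, not fitted estimates)
n_discharges <- 4300    # hypothetical number of discharges in one year
p_death      <- 0.0075  # hypothetical per-discharge probability of death

# Mean and standard deviation of the death count under Binomial(n, p)
expected_deaths <- n_discharges * p_death
sd_deaths       <- sqrt(n_discharges * p_death * (1 - p_death))

# Simulate five independent years under the same assumptions
set.seed(1)
simulated_deaths <- rbinom(5, size = n_discharges, prob = p_death)
simulated_p_hat  <- simulated_deaths / n_discharges
```

If the independence or constant-probability assumption fails, the real year-to-year variation could be larger than `sd_deaths` suggests.
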
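library(dplyr)       # mutate() and the %>% pipe
library(knitr)       # kable()
library(kableExtra)  # kable_styling()

# Discharges and in-hospital deaths by fiscal year
df1.data <- 
  tibble::tribble(
    ~fiscal_year, ~site, ~discharges, ~in_hospital_deaths,
         "16/17", "VGH",        4288,                  32,
         "17/18", "VGH",        4476,                  31,
         "18/19", "VGH",        4450,                  39
    )

# Observed proportion of discharges that ended in death
df1.data <- 
  df1.data %>% 
  mutate(p_hat = in_hospital_deaths/discharges)  

# Render as a styled table
df1.data %>% 
  kable() %>% 
  kable_styling(bootstrap_options = c("striped",
              "condensed", 
              "responsive"))
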
fiscal_year  site  discharges  in_hospital_deaths      p_hat
16/17        VGH         4288                  32  0.0074627
17/18        VGH         4476                  31  0.0069258
18/19        VGH         4450                  39  0.0087640

Confidence interval for binomial proportion:

See Brown, Cai, and DasGupta (2001), “Interval Estimation for a Binomial Proportion”, Statistical Science 16(2).

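The standard “textbook” (Wald) confidence interval for a binomial proportion, p_hat ± z * sqrt(p_hat * (1 - p_hat) / n), has poor coverage when the true proportion is close to 0 or 1. That is the situation here, since every observed p_hat is below 0.01.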
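
To make the coverage problem concrete, here is a minimal sketch that computes the exact coverage probability of the nominal 95% Wald interval for a given n and true p. The helper `wald_coverage()` and the inputs (n = 20, p = 0.01 and p = 0.5) are hypothetical choices made for illustration; they do not come from the hospital data.

```r
# Exact coverage probability of the nominal Wald interval
#   p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)
# wald_coverage() is a hypothetical helper written for this illustration.
wald_coverage <- function(n, p, conf_level = 0.95) {
  z <- qnorm(1 - (1 - conf_level) / 2)
  x <- 0:n                                    # every possible success count
  p_hat <- x / n
  half_width <- z * sqrt(p_hat * (1 - p_hat) / n)
  covers <- (p_hat - half_width <= p) & (p <= p_hat + half_width)
  sum(dbinom(x, size = n, prob = p)[covers])  # P(interval contains true p)
}

coverage_small_p <- wald_coverage(n = 20, p = 0.01)  # proportion near 0
coverage_mid_p   <- wald_coverage(n = 20, p = 0.5)   # proportion near 0.5
```

With these toy inputs, `coverage_small_p` should come out far below the nominal 95%, while `coverage_mid_p` should be close to it. One reason is that when zero successes are observed the Wald interval collapses to the single point 0.
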

Better option: use the Wilson interval.
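
For reference, the Wilson (score) interval in its usual form is written out below. Here \hat{p} is the observed proportion, n the number of trials, and z the standard normal quantile for the chosen confidence level (about 1.96 for 95%). The notation is mine rather than the source's.

```latex
\[
\frac{\hat{p} + \dfrac{z^{2}}{2n}}{1 + \dfrac{z^{2}}{n}}
\;\pm\;
\frac{z}{1 + \dfrac{z^{2}}{n}}
\sqrt{\frac{\hat{p}\,(1 - \hat{p})}{n} + \frac{z^{2}}{4n^{2}}}
\]
```

Unlike the Wald interval, the centre is pulled slightly toward 1/2 and the width does not shrink to zero when \hat{p} is 0 or 1.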

Reference: http://math.furman.edu/~dcs/courses/math47/R/library/Hmisc/html/binconf.html

“Following Agresti and Coull, the Wilson interval is to be preferred and so is the default.”
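
As a sanity check on what `binconf(..., method = "wilson")` is expected to return, the Wilson interval can also be computed by hand in base R. This is only a sketch: `wilson_ci()` is a hypothetical helper, not part of Hmisc, and it assumes the usual score-interval formula.

```r
# Hand-rolled Wilson score interval (hypothetical helper, base R only)
wilson_ci <- function(x, n, conf_level = 0.95) {
  z      <- qnorm(1 - (1 - conf_level) / 2)
  p_hat  <- x / n
  denom  <- 1 + z^2 / n
  centre <- (p_hat + z^2 / (2 * n)) / denom
  half   <- (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2))
  c(lower = centre - half, upper = centre + half)
}

# 16/17 fiscal year: 32 deaths out of 4288 discharges
ci_1617 <- wilson_ci(x = 32, n = 4288)
```

For the 16/17 row, `ci_1617` should agree with the `ci_lower` and `ci_upper` values reported in the table further down, up to rounding.
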

Updating the data

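library(purrr)  # map2_dbl()
library(Hmisc)  # binconf()

# binconf() returns PointEst, Lower, Upper (in that order),
# so element 2 is the lower bound and element 3 the upper bound.
df2.data_with_ci <-
  df1.data %>% 
  mutate(ci_lower = map2_dbl(in_hospital_deaths, 
                             discharges, 
                             function(x, y){
                               binconf(x, y, 
                                       method = "wilson")[2]
                             }), 
         ci_upper = map2_dbl(in_hospital_deaths, 
                             discharges, 
                             function(x, y){
                               binconf(x, y, 
                                       method = "wilson")[3]
                             }))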

df2.data_with_ci %>% 
  kable() %>% 
  kable_styling(bootstrap_options = c("striped",
              "condensed", 
              "responsive"))
fiscal_year  site  discharges  in_hospital_deaths      p_hat   ci_lower   ci_upper
16/17        VGH         4288                  32  0.0074627  0.0052912  0.0105158
17/18        VGH         4476                  31  0.0069258  0.0048836  0.0098137
18/19        VGH         4450                  39  0.0087640  0.0064178  0.0119576
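
If Hmisc is not available, the same bounds can be cross-checked with base R: `prop.test()` with `correct = FALSE` reports the Wilson score interval for a single proportion. The sketch below re-enters the counts from the table by hand. The object name `wilson_base` is an assumption for illustration.

```r
# Counts copied from the table above
deaths     <- c(32, 31, 39)
discharges <- c(4288, 4476, 4450)

# One Wilson interval per fiscal year, without continuity correction
wilson_base <- t(mapply(function(x, n) {
  as.numeric(prop.test(x, n, correct = FALSE)$conf.int)
}, deaths, discharges))
colnames(wilson_base) <- c("ci_lower", "ci_upper")
rownames(wilson_base) <- c("16/17", "17/18", "18/19")
```

Each row of `wilson_base` should match the corresponding `ci_lower` and `ci_upper` values up to rounding.
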

     


  1. Yes, it’s morbid, but a death is a “success” here.