We’re looking at deaths as a proportion of all hospital discharges.
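Binomial experiment assumptions: each discharge is treated as an independent trial with two possible outcomes (death or survival), and the probability of a “success” is assumed to be the same for every discharge within a fiscal year.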
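library(Hmisc)       # binconf()
library(dplyr)       # mutate(), %>%
library(purrr)       # map2_dbl()
library(knitr)       # kable()
library(kableExtra)  # kable_styling()

# Discharges and in-hospital deaths by fiscal year
df1.data <-
  tibble::tribble(
    ~fiscal_year, ~site, ~discharges, ~in_hospital_deaths,
    "16/17", "VGH", 4288, 32,
    "17/18", "VGH", 4476, 31,
    "18/19", "VGH", 4450, 39
  )

# Observed proportion of discharges that ended in death
df1.data <-
  df1.data %>%
  mutate(p_hat = in_hospital_deaths / discharges)

df1.data %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "condensed", "responsive"))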
| fiscal_year | site | discharges | in_hospital_deaths | p_hat |
|---|---|---|---|---|
| 16/17 | VGH | 4288 | 32 | 0.0074627 |
| 17/18 | VGH | 4476 | 31 | 0.0069258 |
| 18/19 | VGH | 4450 | 39 | 0.0087640 |
See Brown, Cai, and DasGupta (2001), “Interval Estimation for a Binomial Proportion”, Statistical Science 16(2), 101-133.
The standard “textbook” (Wald) CI for a binomial proportion has poor coverage when the true proportion is close to 0 or 1. That is the situation here: every observed proportion is below 0.01.
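One way to check the coverage claim at a sample size like the ones here is to compute the exact coverage of the nominal 95% textbook interval. The sketch below uses base R only. The helper `wald_coverage` and the grid of "true" proportions are illustrative assumptions, not part of the original analysis.

```r
# Exact coverage of the nominal 95% textbook (Wald) interval: add up
# P(X = x) over every count x whose interval contains the true proportion p.
wald_coverage <- function(n, p, conf_level = 0.95) {
  z <- qnorm(1 - (1 - conf_level) / 2)
  x <- 0:n
  p_hat <- x / n
  se <- sqrt(p_hat * (1 - p_hat) / n)
  covers <- (p_hat - z * se <= p) & (p <= p_hat + z * se)
  sum(dbinom(x[covers], size = n, prob = p))
}

n <- 4288                                   # discharges in 16/17
p_grid <- c(0.0005, 0.001, 0.005, 0.0075)   # assumed true proportions near 0
coverage <- sapply(p_grid, function(p) wald_coverage(n, p))
data.frame(p = p_grid, coverage = round(coverage, 4))
```

When x = 0 the estimated standard error is zero, so the textbook interval collapses to the single point 0 and cannot contain any positive p. This is one source of under-coverage near 0.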
Better option: use the Wilson interval.
Reference: http://math.furman.edu/~dcs/courses/math47/R/library/Hmisc/html/binconf.html
“Following Agresti and Coull, the Wilson interval is to be preferred and so is the default.”
binconf(32, 4288, method = "wilson")
## PointEst Lower Upper
## 0.007462687 0.005291247 0.01051583
# binconf(32, 4288, method = "wilson")[2] # lower est
# binconf(32, 4288, method = "wilson")[3] # upper est
# is this a symmetric CI? No.
# binconf(32, 4288, method = "wilson")[1] - binconf(32, 4288, method = "wilson")[2]
# binconf(32, 4288, method = "wilson")[3] - binconf(32, 4288, method = "wilson")[1]
# for comparison, using the "textbook interval"
binconf(32, 4288, method = "asymptotic")
## PointEst Lower Upper
## 0.007462687 0.004886711 0.01003866
# is this a symmetric CI? Yes.
# binconf(32, 4288, method = "asymptotic")[1] - binconf(32, 4288, method = "asymptotic")[2]
# binconf(32, 4288, method = "asymptotic")[3] - binconf(32, 4288, method = "asymptotic")[1]
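To see where the asymmetry comes from, both intervals can be reproduced by hand for the 16/17 row. This is a minimal base R sketch. It assumes that `binconf`'s "wilson" method is the standard Wilson score interval without continuity correction, which is consistent with the limits printed above. The variable names are illustrative.

```r
# Data from the 16/17 row: 32 deaths out of 4288 discharges.
x <- 32
n <- 4288
z <- qnorm(0.975)   # critical value for a two-sided 95% interval
p_hat <- x / n

# Textbook (Wald) interval: centred on p_hat.
wald_half <- z * sqrt(p_hat * (1 - p_hat) / n)
wald <- c(lower = p_hat - wald_half, upper = p_hat + wald_half)

# Wilson interval: centred on a value pulled from p_hat towards 0.5.
denom <- 1 + z^2 / n
wilson_centre <- (p_hat + z^2 / (2 * n)) / denom
wilson_half <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2)) / denom
wilson <- c(lower = wilson_centre - wilson_half,
            upper = wilson_centre + wilson_half)

# Distance from p_hat to each limit: equal for Wald, unequal for Wilson.
rbind(
  wald   = c(below = p_hat - wald[["lower"]],   above = wald[["upper"]] - p_hat),
  wilson = c(below = p_hat - wilson[["lower"]], above = wilson[["upper"]] - p_hat)
)
```

The Wilson interval is symmetric about its own centre, not about p_hat. Because that centre sits slightly above p_hat when p_hat is near 0, the upper limit lies further from p_hat than the lower limit does.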
df2.data_with_ci <-
  df1.data %>%
  mutate(ci_lower = map2_dbl(in_hospital_deaths,
                             discharges,
                             function(x, y) {
                               binconf(x, y, method = "wilson")[2]
                             }),
         ci_upper = map2_dbl(in_hospital_deaths,
                             discharges,
                             function(x, y) {
                               binconf(x, y, method = "wilson")[3]
                             }))

df2.data_with_ci %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "condensed", "responsive"))
| fiscal_year | site | discharges | in_hospital_deaths | p_hat | ci_lower | ci_upper |
|---|---|---|---|---|---|---|
| 16/17 | VGH | 4288 | 32 | 0.0074627 | 0.0052912 | 0.0105158 |
| 17/18 | VGH | 4476 | 31 | 0.0069258 | 0.0048836 | 0.0098137 |
| 18/19 | VGH | 4450 | 39 | 0.0087640 | 0.0064178 | 0.0119576 |
Note: yes, it’s morbid, but a death is a “success” here.