In “Unlucky Numbers” (Science, 20 January 2023), an infographic (p. 231) demonstrates the effect of investigative bias in a case of suspected serial murder. By slightly changing how many deaths are attributed to a nurse under suspicion, biases could have a drastic effect on how suspicious a sequence of deaths appears to be.
This script shows how the probabilities in the infographic were calculated. The method is adapted from Appendices 5 and 8 in the 2022 Royal Statistical Society report “Healthcare serial killer or coincidence? Statistical issues in investigation of suspected medical misconduct.” In these appendices, the authors describe hypothetical examples that show how small investigative biases can accumulate.
The Science story adapted these examples to create the case of Nurse X, who comes under suspicion of murder after a string of unexplained deaths. The case focuses on “unexplained” deaths because some deaths in a hospital will be unambiguously due to natural causes. Real-world medical murder cases have centred around more ambiguous deaths. For instance, these might be deaths that were unexpected, or that had anomalous autopsy results.
The case of Nurse X is not a real murder case, but draws on real-world investigative mistakes.
This section sets up the basic parameters of the investigation. The script is written so that these basic parameters can easily be changed. This means that it’s possible to explore how the results are affected by changing a few crucial variables:
Set the time period of the investigation (in years), as well as how many shifts the hospital has per day.
The infographic uses a period of 2 years, with 3 shifts per day at the hospital.
n_years = 2
shifts_per_day = 3
Set how many shifts Nurse X works, and what proportion of these shifts are morning shifts.
The infographic uses 250 working days per year for Nurse X, with 80% of her shifts in the morning.
working_days_per_year = 250
nurse_x_proportion_morning = 0.8
This is a high number of working days, assuming 50 working weeks of 5 working days each. Some nurses work longer shifts (such as 10 or 12 hours) for fewer days per week (such as 3 or 4 days), but some nurses in some countries do work shorter shifts (8 hours) 5 days per week. A lower number of working days for Nurse X results in lower p-values.
Based on the parameters set above, we can calculate how many shifts Nurse X worked over the investigation time period:
nurse_x_shifts <- working_days_per_year * n_years
nurse_x_shifts_morning <- nurse_x_proportion_morning * nurse_x_shifts
nurse_x_shifts_not_morning <- nurse_x_shifts - nurse_x_shifts_morning
Nurse X works 500 shifts over 2 years. 400 of these are morning shifts, and 100 are shifts at other times.
We can also calculate how many shifts did not have Nurse X present:
n_days <- n_years * 365
not_x_shifts <- n_days * shifts_per_day - nurse_x_shifts
not_x_shifts_morning <- n_days - nurse_x_shifts_morning
not_x_shifts_not_morning <- not_x_shifts - not_x_shifts_morning
There are 1690 shifts without Nurse X present. 330 of these are morning shifts, and 1360 are shifts at other times.
Set the true number of unexplained deaths that occurred on Nurse X’s shifts, in the morning and at other times. This is the number of deaths that would be uncovered by an unbiased investigation.
nurse_x_deaths_morning = 7
nurse_x_deaths_not_morning = 2
nurse_x_deaths_unbiased = nurse_x_deaths_morning + nurse_x_deaths_not_morning
9 deaths occur on Nurse X’s shifts.
Set the true number of unexplained deaths that occurred during other shifts, in the morning and at other times. This is the number of deaths that would be uncovered by an unbiased investigation.
not_x_deaths_morning = 2
not_x_deaths_not_morning = 4
not_x_deaths_unbiased = not_x_deaths_morning + not_x_deaths_not_morning
6 deaths occur on shifts without Nurse X present.
In an unblinded investigation, deaths may be assigned to Nurse X’s shifts because of bias. For instance, investigators might decide that a death occurring shortly after her shift ended was, for all intents and purposes, on Nurse X’s shift. Or investigators might reassess deaths on Nurse X’s shifts that were not considered suspicious at the time, looking for any evidence that points to the death being less innocent than previously thought.
Set the number of deaths that are assigned to Nurse X in a biased investigation: those that did not occur on her shifts, but are assigned to her shifts anyway (reassigned_to_x), and those that were previously not considered suspicious, but are reconsidered on the basis that Nurse X was present (reassigned_as_suspicious).
Add these deaths to those that truly occurred during Nurse X’s shifts, to calculate the total deaths attributed to her in a biased investigation.
The infographic added only two deaths to Nurse X’s count in a biased investigation. Because the total number of deaths is low, even a small change like this can have a big impact. If there were hundreds of deaths in each category, small changes in the totals would make less of a difference.
reassigned_to_x = 1
reassigned_as_suspicious = 1
nurse_x_deaths_biased = nurse_x_deaths_unbiased +
reassigned_to_x + reassigned_as_suspicious
11 deaths are assigned to Nurse X’s shifts in a biased investigation.
If deaths are reassigned to Nurse X’s shifts, they are no longer counted in the total for shifts without Nurse X present. Subtract the reassigned deaths from the death count for other shifts:
not_x_deaths_biased = not_x_deaths_unbiased - reassigned_to_x
5 deaths are assigned to other shifts in a biased investigation.
This section uses the investigation parameters defined above to explore how four possible investigative “pathways” would play out.
In this pathway, investigators examine deaths that took place on Nurse X’s shifts. Investigators know which nurses were present on each shift, and that Nurse X is under suspicion. With close scrutiny on Nurse X’s cases, they are vulnerable to cognitive biases. The total number of deaths they find is the biased death count calculated above.
They do not reassess any deaths that took place without Nurse X present, or compare Nurse X’s incident rate with any broader datasets or trends. This may seem like an unrealistically blinkered investigative process, but it is what happened in the early stages of the Lucia de Berk case: After the Juliana Children’s Hospital became suspicious, other hospitals where De Berk had worked were asked to examine deaths and resuscitations on or just after De Berk’s shifts.
Produce a simple data table showing the deaths on Nurse X’s shifts, compared to those on other shifts (in this case, zero):
severely_biased <- data.frame(deaths = c(nurse_x_deaths_biased, 0),
nurse_present = as.factor(c("yes", "no")),
number_of_shifts = c(nurse_x_shifts, not_x_shifts))
severely_biased
## deaths nurse_present number_of_shifts
## 1 11 yes 500
## 2 0 no 1690
A statistical model tells us about the relationship between the variables in the table. How strongly are unexplained deaths associated with the presence of Nurse X?
If Nurse X killed the patients, we should find a relationship between her presence and deaths. There should be more deaths when Nurse X is present.
There may be other methods that could reasonably be used here; this method (log-linear model and chi-sq test) is the one used in the RSS report.
fit_severely_biased <- glm(deaths~nurse_present+offset(log(number_of_shifts)),
family = poisson(),
severely_biased)
anova_severely_biased <- anova(fit_severely_biased, test='Chisq')
anova_severely_biased
## Analysis of Deviance Table
##
## Model: poisson, link: log
##
## Response: deaths
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 1 32.495
## nurse_present 1 32.495 0 0.000 0.00000001195 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The prosecutors’ hypothesis is that Nurse X killed the patients. The null hypothesis, or defense hypothesis, is that Nurse X did not kill the patients.
The p-value in the statistical test tells us the probability of finding the pattern of deaths that we see (or an even more extreme pattern) if the null hypothesis is true. In other words, if Nurse X did not kill the patients, how improbable would this pattern of deaths be?
anova_severely_biased[[5]][2]
## [1] 0.00000001194952
Expressing the p-value differently may make it easier to interpret:
severely_biased_p <- 1/anova_severely_biased[[5]][2]
The p-value is equivalent to 1 in 83685397.8589264.
With the investigation parameters used in the infographic, this is 1 in 83,685,398. This figure will be different if you have used different parameters.
The p-value tells us about the probability of the pattern of deaths if Nurse X is innocent, but it cannot tell us how probable it is that Nurse X is innocent. To calculate an approximate probability of her innocence or guilt, we would have to factor in all the other available evidence.
In this pathway, investigators reexamine all deaths over the past 2 years, not just those on Nurse X’s shifts. But because investigators know which nurses were present on which shift, they are still vulnerable to cognitive biases. The death counts they find for Nurse X and other shifts are the biased death counts calculated above.
less_biased <- data.frame(deaths = c(nurse_x_deaths_biased, not_x_deaths_biased),
nurse_present = as.factor(c("yes", "no")),
number_of_shifts=c(nurse_x_shifts, not_x_shifts))
less_biased
## deaths nurse_present number_of_shifts
## 1 11 yes 500
## 2 5 no 1690
How strongly are unexplained deaths associated with the presence of Nurse X?
fit_less_biased <- glm(deaths~nurse_present+offset(log(number_of_shifts)),
family = poisson(),
less_biased)
anova_less_biased <- anova(fit_less_biased, test='Chisq')
anova_less_biased
## Analysis of Deviance Table
##
## Model: poisson, link: log
##
## Response: deaths
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 1 15.212
## nurse_present 1 15.212 0 0.000 0.00009609 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
If Nurse X did not kill the patients, how probable would this pattern of deaths (or a more extreme pattern) be?
anova_less_biased[[5]][2]
## [1] 0.00009608893
Expressing the p-value differently may make it easier to interpret:
less_biased_p <- 1/anova_less_biased[[5]][2]
The p-value is equivalent to 1 in 10407.0261731.
With the investigation parameters used in the infographic, this is 1 in 10,407.
Blinded investigators are not unintentionally steered by knowing which nurse worked each shift. They find the unbiased death counts calculated above.
blinded <- data.frame(deaths = c(nurse_x_deaths_unbiased, not_x_deaths_unbiased),
nurse_present = as.factor(c("yes", "no")),
number_of_shifts=c(nurse_x_shifts, not_x_shifts))
blinded
## deaths nurse_present number_of_shifts
## 1 9 yes 500
## 2 6 no 1690
How strongly are unexplained deaths associated with the presence of Nurse X?
fit_blinded <- glm(deaths~nurse_present+offset(log(number_of_shifts)),
family = poisson(),
blinded)
anova_blinded <- anova(fit_blinded, test='Chisq')
anova_blinded
## Analysis of Deviance Table
##
## Model: poisson, link: log
##
## Response: deaths
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 1 9.5066
## nurse_present 1 9.5066 0 0.0000 0.002047 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
If Nurse X did not kill the patients, how probable would this pattern of deaths (or a more extreme pattern) be?
anova_blinded[[5]][2]
## [1] 0.002047338
Expressing the p-value differently may make it easier to interpret:
blinded_p <- 1/anova_blinded[[5]][2]
The p-value is equivalent to 1 in 488.4391279.
With the investigation parameters used in the infographic, this is 1 in 488.
In this investigative pathway, not only are investigators blinded, but they try to consider other factors that may influence the pattern of deaths. They know that hospital patients are more likely to die—and have their deaths recorded—in the mornings. They also know that busy morning shifts often have higher staffing rates, meaning that more nurses are present when these deaths are recorded. They find that Nurse X worked mainly in the mornings. They check to see whether accounting for this changes the results.
careful <- data.frame(deaths = c(nurse_x_deaths_morning, nurse_x_deaths_not_morning,
not_x_deaths_morning, not_x_deaths_not_morning),
nurse_present = as.factor(c("yes", "yes", "no", "no")),
morning = as.factor(c("yes", "no", "yes", "no")),
number_of_shifts=c(nurse_x_shifts_morning, nurse_x_shifts_not_morning,
not_x_shifts_morning, not_x_shifts_not_morning))
careful
## deaths nurse_present morning number_of_shifts
## 1 7 yes yes 400
## 2 2 yes no 100
## 3 2 no yes 330
## 4 4 no no 1360
How strongly are unexplained deaths associated with the presence of Nurse X?
fit_careful <- glm(deaths~morning+nurse_present+offset(log(number_of_shifts)),
family = poisson(),
careful)
anova_careful <- anova(fit_careful, test='Chisq')
anova_careful
## Analysis of Deviance Table
##
## Model: poisson, link: log
##
## Response: deaths
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3 10.1670
## morning 1 4.4503 2 5.7167 0.03490 *
## nurse_present 1 5.2238 1 0.4929 0.02228 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
If Nurse X did not kill the patients, how probable would this pattern of deaths (or a more extreme pattern) be?
anova_careful[[5]][3]
## [1] 0.02227937
Expressing the p-value differently may make it easier to interpret:
careful_p <- 1/anova_careful[[5]][3]
The p-value is equivalent to 1 in 44.8845675.
With the investigation parameters used in the infographic, this is 1 in 45.
This would still be considered “statistically significant” because the p-value is less than 1 in 20 (0.05), but it is nonetheless substantially less worrying than the p-values generated by other investigative pathways. In a thorough investigation, other factors could be considered, such as how Nurse X’s rates compare to other nurses with similar shift patterns; how the hospital’s death rates over this period compare to historical trends; and whether there are any other changes at the hospital that should be taken into account, such as increasing admission rates or understaffing.
This R Markdown file can be downloaded at github.com/cathleen-ogrady/science-magazine-unlucky-numbers.
Contact the author, Cathleen O’Grady, at unlucky-numbers@cathleenogrady.com.
This R Markdown script is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).