A recently published study on 15 risk factors associated with sudden death in racing thoroughbreds has received attention in the press:
JAVMA Paper: Lasix Among Risk Factors in Sudden Death - BloodHorse
Although fifteen risk factors made it into the final multivariable model, the focus has been on the relationship between furosemide administration and sudden death. One of the authors is quoted in the BloodHorse article:
“We identified for the first time a relationship between the use of Lasix in the racehorse and sudden death,” Parkin said. “The use of Lasix in a race increases the risk by about 62%.”
The 62% figure is taken from the adjusted odds ratio of 1.62 in the final model, which compares horses that received Lasix (furosemide) with horses that did not. It is the point estimate for the odds of sudden death in horses receiving furosemide relative to those that did not, adjusted for the 14 other risk factors. The full result (point estimate, 95% CI, and p-value) is 1.62 (1.01 to 2.61), p = 0.047.
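As a rough consistency check on these figures, the reported interval can be converted back to a standard error and an approximate p-value. The sketch below assumes the interval is a Wald interval, symmetric on the log-odds scale. That is the usual default for logistic regression but is not stated in the source, and the variable names are illustrative only.

```r
# Reported adjusted odds ratio and 95% CI for furosemide (from the final model)
or_hat  <- 1.62
ci_low  <- 1.01
ci_high <- 2.61

# Percentage increase in the odds implied by the point estimate
pct_increase <- (or_hat - 1) * 100

# ASSUMPTION: Wald interval, i.e. log(OR) +/- qnorm(0.975) * SE
se_log_or <- (log(ci_high) - log(ci_low)) / (2 * qnorm(0.975))
z         <- log(or_hat) / se_log_or
p_approx  <- 2 * pnorm(-abs(z))

round(c(pct_increase = pct_increase, se_log_or = se_log_or, z = z, p = p_approx), 3)
```

Under that assumption the recovered p-value lands close to the reported 0.047. The lower confidence limit of 1.01 shows how little room separates the estimate from no association.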
The two points of interest in this note are the method of inference and the method of presenting results.
The model is based on collecting a large number of variables and associating them with a binary outcome: sudden death or not. Such models can serve to describe, predict, or explain, and each purpose requires a different approach. The public and the press are likely most interested in the explanatory model: does variable x cause variable y? This differs from asking whether variable x predicts variable y, although a variable can do both. This paper was designed neither to predict nor to explain.
Variable selection requires domain expertise, which matters more than statistical testing of variables. Traditionally, investigators select a large number of potential predictor or explanatory variables, evaluate each in a univariable fashion, and keep those that “pass” a prespecified threshold. The variables that pass are then analyzed in a “stepwise” fashion until the optimal model “fit” is attained. Medical statisticians consider this algorithmic approach suboptimal, and a vast literature of theory and simulations backs this up.
In addition, if the investigation or its interpretation is causal in nature, then interpreting every adjusted odds ratio in the model as if it belonged to the primary exposure rather than to a control variable is problematic. This is known as the Table 2 fallacy, since these parameters are frequently presented in Table 2 of such manuscripts.
Ideally, the question of interest, “Does furosemide increase sudden death in thoroughbred racehorses?”, would be answered with a randomized controlled trial, so that both known and unknown confounding variables are balanced and exchangeability between exposed and unexposed groups is achieved. In lieu of this, an observational study must account for confounders that affect both the administration of Lasix and the probability of sudden death. Including variables that are not confounders can distort the causal relationship, particularly those known as colliders. For complex problems this may require drafting a causal graph that identifies the relationships among the variables of interest; variable selection based solely on statistical testing cannot accomplish this. Commonly known as DAGs (directed acyclic graphs), these graphs can be generated in R or online using:
DAGitty - drawing and analyzing causal diagrams (DAGs)
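The warning about colliders can be illustrated with a small simulation. This is a minimal sketch on made-up data, not the study's data: `exposure` and `outcome` are generated independently, and `collider` is a hypothetical variable caused by both. Adjusting for it manufactures an association that does not exist.

```r
set.seed(1)
n <- 10000

# Exposure and outcome are independent by construction (true effect = 0)
exposure <- rnorm(n)
outcome  <- rnorm(n)

# Hypothetical collider: caused by both exposure and outcome
collider <- exposure + outcome + rnorm(n)

# Unadjusted model: coefficient on exposure should be near zero
b_unadjusted <- coef(lm(outcome ~ exposure))["exposure"]

# Adjusting for the collider induces a spurious negative association
b_adjusted <- coef(lm(outcome ~ exposure + collider))["exposure"]

round(c(unadjusted = unname(b_unadjusted), adjusted = unname(b_adjusted)), 3)
```

A purely statistical selection procedure would likely retain `collider`, since it is strongly associated with the outcome. Only knowledge of the causal structure tells the analyst to leave it out.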
For the paper of interest, there are likely many known and unknown confounders.
It is possible that the level or severity of exercise-induced pulmonary hemorrhage (EIPH) is related both to whether a horse is treated with furosemide and to the horse’s probability of sudden death, since EIPH itself has been shown to be one of the most common non-musculoskeletal causes of sudden death in racehorses. Similarly, a horse may have an underlying condition that leads to poor performance, which in turn leads a trainer to treat the horse with furosemide. Those with domain expertise can draw a much more complex graph based on their subject knowledge. It is unlikely that all confounders can be accounted for, but an attempt should be made.
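A graph along these lines might be written with the dagitty package as follows. This is a deliberately simplified sketch: the node names are illustrative, and the arrow from `UnderlyingCondition` to `SuddenDeath` is an assumption added here so that the underlying condition acts as a confounder. A domain expert would draw something richer.

```r
library(dagitty)

# Simplified, hypothetical DAG for the furosemide question
g <- dagitty("dag {
  Lasix [exposure]
  SuddenDeath [outcome]
  EIPH -> Lasix
  EIPH -> SuddenDeath
  UnderlyingCondition -> PoorPerformance
  PoorPerformance -> Lasix
  UnderlyingCondition -> SuddenDeath
  Lasix -> SuddenDeath
}")

# Minimal sets of variables that block every backdoor path from Lasix to SuddenDeath
sets <- adjustmentSets(g)
print(sets)
```

Calling `plot(graphLayout(g))` draws the graph. Every adjustment set returned for this toy graph includes EIPH severity, which is exactly the kind of variable a purely statistical screen can miss if it was never measured.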
Before making a causal statement, it is important to rule out alternative causes that could explain a statistical relationship. Statistical relationships alone cannot determine causality. No matter how much data is available, excessive p-value testing will lead to spurious associations, a practice that has been referred to as “data dredging”.
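The effect of testing many candidate variables can be seen in a short simulation. The sketch below uses entirely made-up data, and the sample size, number of candidates, and outcome prevalence are arbitrary choices. It screens 100 pure-noise predictors against a binary outcome one at a time, in the univariable style described earlier.

```r
set.seed(42)
n <- 2000   # hypothetical number of observations
k <- 100    # hypothetical number of candidate variables, all pure noise

outcome <- rbinom(n, 1, 0.1)
X <- matrix(rnorm(n * k), nrow = n, ncol = k)

# Univariable logistic regression for each candidate; keep the Wald p-value
pvals <- apply(X, 2, function(x) {
  fit <- glm(outcome ~ x, family = binomial)
  summary(fit)$coefficients["x", "Pr(>|z|)"]
})

# Candidates that would "pass" a 0.05 screen despite having no real relationship
n_hits <- sum(pvals < 0.05)
n_hits
```

With a 0.05 threshold, roughly 5 of every 100 unrelated variables are expected to pass by chance alone, and each would look like a risk factor in a results table.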
Additionally, the presentation of information can greatly affect the public's reaction. In this case an odds ratio was presented as “furosemide increases the risk of sudden death in horses by 62%”. This is fairly accurate, since when the outcome has low prevalence, as in this case, the odds ratio and risk ratio are virtually identical. Based on the data from this study, the incidence of sudden death was:
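# Overall incidence: 536 sudden deaths in 4,198,073 starts
overall_ir <- 536 / 4198073
print(overall_ir)
#> [1] 0.0001276776
# Incidence without furosemide
nolasix <- 18 / 233276
print(nolasix)
#> [1] 7.716182e-05
# Incidence with furosemide
lasix <- 518 / 3964797
print(lasix)
#> [1] 0.0001306498
# Absolute risk difference, expressed in percent
absolute <- lasix - nolasix
absolute_risk <- absolute * 100
absolute_risk
#> [1] 0.0053488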
Expressed as an absolute risk increase of about 0.005 percentage points, the effect seems far less dramatic than 62%. In frequency terms, about 5 more horses per 100,000 starts would experience sudden death among those receiving furosemide than among those not receiving it. Some authors, such as Gigerenzer, consider the presentation of absolute risk increase or reduction to be more transparent than ratios.
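The same counts can be used to check the claim that the odds ratio and risk ratio nearly coincide for a rare outcome, and to express the difference per 100,000 starts. Note that these are crude (unadjusted) figures computed from the raw counts, so they will not match the adjusted odds ratio of 1.62 exactly. The variable names are illustrative.

```r
# Counts reported in the study
deaths_lasix   <- 518; starts_lasix   <- 3964797
deaths_nolasix <- 18;  starts_nolasix <- 233276

risk_lasix   <- deaths_lasix / starts_lasix
risk_nolasix <- deaths_nolasix / starts_nolasix

# Crude risk ratio and crude odds ratio
risk_ratio <- risk_lasix / risk_nolasix
odds_ratio <- (deaths_lasix / (starts_lasix - deaths_lasix)) /
              (deaths_nolasix / (starts_nolasix - deaths_nolasix))

# Absolute difference expressed per 100,000 starts
extra_per_100k <- (risk_lasix - risk_nolasix) * 1e5

round(c(risk_ratio = risk_ratio, odds_ratio = odds_ratio,
        extra_per_100k = extra_per_100k), 4)
```

The two ratios agree to about three decimal places, which is why reading the odds ratio as a relative risk does little harm here. The per-100,000 figure is the one that conveys how rare the outcome is.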