This is an update on my look at the Jung et al paper Female hurricanes are deadlier than male hurricanes: see my previous look, and the blogpost. This is partly a respose to the authors’ comments, and also some observations by Jeremy Freese.
First, read in the data, and do some simple arrangements.
library(gdata)
library(MASS)
library(knitr)
# Read in the data
Data=read.xls("http://www.pnas.org/content/suppl/2014/05/30/1402786111.DCSupplemental/pnas.1402786111.sd01.xlsx", nrows=92, as.is=TRUE)
# Data=read.xls("~/git-repos/Hurricanes/pnas.1402786111.sd01.xlsx", nrows=92, as.is=TRUE)
Data$Category=factor(Data$Category)
Data$Gender_MF=factor(Data$Gender_MF)
#Data$ColourMF=c("lightblue", "pink")[as.numeric(Data$Gender_MF)]
Data$ColourMF=c("blue", "red")[as.numeric(Data$Gender_MF)]
BigH=which(Data$alldeaths>100) # Select hurricanes with > 100 deaths
# scale the covariates
Data$Minpressure.2014.sc=scale(Data$Minpressure_Updated.2014)
Data$NDAM.sc=scale(Data$NDAM)
Data$MasFem.sc=scale(Data$MasFem)
This is the claim of the title of the paper, but it is never actually tested, and I haven’t seen anyone really look at it. What we do know is that there is no effect of femininity at average normalised damage and minimum pressure, but if the effect of normalised damage is linear on the log scale, then when we back-transform, the effect will curve up, so adding the deaths will mean that the effect of hurricanes with more deaths is larger. Thus, overall we might expect to see an effect, averaged over all hurricanes.
I looked at this with a thought experiment. Use all of the hurricanes in the data, and imagine they all had the same feminity of name. Add up the total number of deaths. Then do this for different femininities.
modJSVH=glm.nb(alldeaths~MasFem*(Minpressure_Updated.2014+NDAM), data=Data) # run the authors' model
# Function to simulate deaths
SimAllDeaths=function(masfem, GLM, dat) {
#ndam=0; GLM=modJSVH; dat=Data
dat$MasFem=rep(masfem,nrow(dat))
pred=predict(GLM, newdata=dat, se.fit=TRUE, type="link")
Mean=exp(rnorm(length(pred$fit), pred$fit, pred$se.fit))
Sim=rnegbin(Mean, theta=GLM$theta)
sum(Sim)
}
# SimAllDeaths(masfem=mean(Data$MasFem), GLM=modJSVH, dat=Data)
PredMasFem=seq(min(Data$MasFem), max(Data$MasFem), length=10)
library(parallel)
PredDeaths.lst=mclapply(PredMasFem, function(masfem, GLM,dat, nsim) {
sims=replicate(nsim, SimAllDeaths(masfem, GLM=GLM, dat=dat))
c(Mean=mean(sims), LCI=quantile(sims, 0.025), UCI=quantile(sims,0.975))
}, GLM=modJSVH, dat=Data, nsim=1e3)
PredDeaths=data.frame(
Mean=unlist(lapply(PredDeaths.lst, function(lst) lst["Mean"])),
LCI=unlist(lapply(PredDeaths.lst, function(lst) lst["LCI.2.5%"])),
UCI=unlist(lapply(PredDeaths.lst, function(lst) lst["UCI.97.5%"]))
)
When we look at the plot, we see that there is indeed an increasing number of deaths when the femininity is increased:
par(mar=c(4.1,4.1,1,1), mfrow=c(1,1))
plot(PredMasFem, log(PredDeaths$Mean), type="l", ylim=range(log(PredDeaths)),
xlab="Feminity Index", ylab="log(Deaths)")
polygon(c(PredMasFem,rev(PredMasFem)), log(c(PredDeaths$LCI, rev(PredDeaths$UCI))), col="grey70", border=NA)
lines(PredMasFem, log(PredDeaths$Mean))
Note that hte y axis is on the log scale, but of course if we back-transform, we still get an increase.
It is pleasing to know that the big claim that made the press is substantiated (at least assuming the model is correct etc. etc).
The authors respond to me with this statement:
Bob O’Hara: [http://rpubs.com/oharar/19171] claims that we should have fit a quadratic to the normalized damage predictor. His model involves taking standardized continuous predictors and then further transforming them by smoothing or by squaring them. Linearizing an already standardized continuous predictor is not the appropriate approach as it makes the predictor quite difficult to interpret if not uninterpretable. In the Appendix we show that although there may “look” to be a quadratic effect, statistically there is not. It should also be noted that there is no conceptual reason to expect a quadratic effect of normalized damage.
The second sentence is wrong: what I did was take a square root of the unstandardised continuous predictors, and then add that as an extra term in the analysis. In fairness, I didn’t state this clearly in my RPubs document, but still (a) I didn’t work transform standardised predictors, (b) I didn’t smooth them (I did look at smoothing the effects later: see my updates, but that got messy).
Then, looking at the Appendix, this is what they wrote:
Should we have fit a quadratic to standardized normalized damage, as Bob O’Hara argues? First, it should be noted that O’Hara did not use the same model we used. His major reason was that he felt that he had to smooth the standardized predictors used in the analysis. We noticed that three outlying hurricanes on the far right of the graph pulled down the curve, causing the apparent quadratic effect he mentions. The other values were linear in effect. We used the “ladder” test to determine the optimal transform for both ZMasFem and ZNDAM
First, I didn’t say “fit a quadratic to standardized normalized damage”, although I can understand some of the confusion. Second, I thought I had fitted the same model. This is how it is described in the text:
A series of negative binomial regression analyses (26, 27) were performed to investigate effects of perceived masculinity-femininity of hurricane names (MFI), minimum pressure, normalized damage (NDAM) (28), and the interactions among them on the number of deaths caused by the hurricanes
which suggests a model of
Deaths ~ MFI + minimum pressure + NDAM + MFI:minimum pressure + MFI:NDAM + minimum pressure:NDAM + MFI:minimum pressure:NDAM
assuming a negative binomial response, and a log link. But the results (Table S1) only report effects for a model
Deaths ~ MFI + minimum pressure + NDAM + MFI:minimum pressure + MFI:NDAM
which is the model I used.
Then note that I did not “smooth the standardized predictors used in the analysis”. I don’t even know what that means in this context! What I felt had to be done was to fit a non-linear effect of NDAM, and that was because the residual plots showed that the linear assumption was wrong.
Anyway, the authors use a “ladder test” to look for transformations, which they describe like this:
The ladder test uses Stata’s sktest to determine normality transforms, based on the work of D’Agostino, Belanger, and D’Agostino Jr (1990), with an adjustment made by Patrick Royston, (1991), one of the leading statisticians worldwide in smoothing and transforms. He is the original author of fractional polynomials, which is much better than GAM at smoothing on complex situations. The results are the same with the Shapiro-Wilk and Shapiro-Francia tests for normality.
So I think they transformed the predictors to make them more normal. But this baffles me: it’s nothing to do with what I was saying, so why do it? I had used a square root transformation in my residual plots, because that made it easier to see the mis-fit, but I don’t care if the predictors are normally distributed or not. What I was arguing was that the model should describe the relationship between the predictors and the response, and if normalised damage is used untransformed in the model, the relationship is wrong. Thus, we need to look at how transformations affect teh relationship with the response, the number of deaths.
As it happens, John Tukey did suggest using this ladder idea for regression, but he suggested transforming the covariate, and then running the regression with the transformed covariate. So let’s do this, and see if the models are any better. To compare teh models, I used the differene between the AIC for the authors’ model, and the models with the same transformations of NDAM as the authors tried. A lower AIC is better, so negative values suggest the model is better than the original.
mod3=glm.nb(alldeaths~MasFem*(Minpressure_Updated.2014+I(NDAM^3)), data=Data)$aic
mod2=glm.nb(alldeaths~MasFem*(Minpressure_Updated.2014+I(NDAM^2)), data=Data, init.theta=0.7)$aic
mod1=glm.nb(alldeaths~MasFem*(Minpressure_Updated.2014+NDAM), data=Data)$aic
mod02=glm.nb(alldeaths~MasFem*(Minpressure_Updated.2014+sqrt(NDAM)), data=Data)$aic
modlog=glm.nb(alldeaths~MasFem*(Minpressure_Updated.2014+log(NDAM)), data=Data, init.theta=0.9)$aic
mod02Inv=glm.nb(alldeaths~MasFem*(Minpressure_Updated.2014+I(NDAM^-0.5)), data=Data, init.theta=0.9)$aic
mod2Inv=glm.nb(alldeaths~MasFem*(Minpressure_Updated.2014+I(NDAM^-2)), data=Data)$aic
mod3Inv=glm.nb(alldeaths~MasFem*(Minpressure_Updated.2014+I(NDAM^-3)), data=Data)$aic
AICs=matrix(c(mod3, mod2, mod1, mod02, modlog, mod02Inv, mod2Inv, mod3Inv)-mod1, ncol=1)
rownames(AICs)=c("Original", "Cube", "Square", "Square root", "Log", "Inverse Square root", "Inverse Square", "Inverse Cube")
colnames(AICs)=c("AIC")
Note that some models spew out lots of warnings, but these are the models that fit worse than the original model. We see that both the square root and log transformations give better fits:
kable(AICs, align=c('l','l'))
| AIC | |
|---|---|
| Original | 28.68 |
| Cube | 25.15 |
| Square | 0.00 |
| Square root | -18.65 |
| Log | -14.85 |
| Inverse Square root | 22.17 |
| Inverse Square | 26.97 |
| Inverse Cube | 27.40 |
The authors go on:
Only 3 outliers pull the line down more than warranted. The other ZNDAM values were quite normal. The statistical test balances kurtosis and skewness to access normality, and does not base an entire transform on three values out of 92. Readers should be able to see this by looking at O’Hara’s graph.
I had already looked at these three ‘outliers’, in response to a comment on the blog post, and removing them didn’t remove the curvature.
Overall, I think the authors have not really answered my objections.
On his blog, Jeremy Freese pointed out that the effects of rpessure and damage went in opposite directions, and thus the authors could have argued in the opposite direction. The authors’ response was, basically, to argue that the effect wasn’t robust, so it could be overlooked, which Jeremy didn’t find terribly convincing.
A problem is that damage and minimum pressure are correlated (r=-0.56):
par(mar=c(4.1,4.1,1,1), mfrow=c(1,1))
plot(Data$NDAM, Data$Minpressure_Updated.2014, xlab="Normalised damage", ylab="Minimum pressure")
so, even though a higher damage means that the model suggests that more feminine names lead to more deaths, they are associated with lower pressure, which suggests that more feminine names lead to fewer deaths.
In the battle of damage and pressure who wins? Strangely, neither Freese or the authors answered this, which means it’s up to me. I simply calculate the slope of the effect of femininity for each hurricane, i.e. each combination of pressure and damage. Then I plot them against pressure and damage (and add a straight regression line), to see the effect:
MasFemEff=coef(modJSVH)["MasFem"]+as.matrix(Data[,c("Minpressure_Updated.2014", "NDAM")])%*%
as.matrix(coef(modJSVH)[c("MasFem:Minpressure_Updated.2014", "MasFem:NDAM")])
par(mfrow=c(1,2), mar=c(4.1,2,3,1), oma=c(0,2,0,0))
plot(Data$Minpressure_Updated.2014, MasFemEff, xlab="Minimum pressure", ylab="", main="Pressure")
abline(lm(MasFemEff~Data$Minpressure_Updated.2014))
plot(Data$NDAM[Data$NDAM<4e4], MasFemEff[Data$NDAM<4e4], xlab="Normalised damage", ylab="", main="Damage")
abline(lm(MasFemEff[Data$NDAM<4e4]~Data$NDAM[Data$NDAM<4e4]))
abline(lm(MasFemEff~Data$NDAM), col=2)
mtext("Femininity index", 2, outer=TRUE, line=0.5)
I think this is quite striking. The effect of pressure isn’t obvious, although that’s partly because the data have a bit of an odd shape. But the damage effect is clearly positive, and that is even without the three point that have a large damage (adding these three gives the regression line in red). So, the opposite effect of pressure may be real, but it isn’t strong enough to counter the damage effect.
I haven’t changed my overall view of the paper: I don’t think my criticisms were really addressed. There are a couple of other points of interest where I think the authors could haev been more effective in answering their critics.