Because the zero-inflated negative binomial model has both a count model and a logit model, each of the two parts should have good predictors. The two parts do not need to use the same predictors. Here we use the variables child and camper in the negative binomial (count) part of the model, and the variable persons in the logit part, which models whether a subject has gone fishing or not.
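To make the two-part structure concrete, here is a minimal base R sketch of the zero-inflated negative binomial probability mass function. The function name `dzinb` and the parameter values are hypothetical, chosen only for illustration; they are not estimates from the data.

```r
# Zero-inflated negative binomial probability mass function (sketch).
# pi_zero: probability of an excess zero, from the logit part
# mu, theta: mean and dispersion of the negative binomial count part
dzinb <- function(y, pi_zero, mu, theta) {
  nb <- dnbinom(y, size = theta, mu = mu)
  # A zero can come from either part; positive counts only from the count part
  ifelse(y == 0, pi_zero + (1 - pi_zero) * nb, (1 - pi_zero) * nb)
}

# Illustrative (hypothetical) parameter values, not estimates from the data
probs <- dzinb(0:2000, pi_zero = 0.3, mu = 2, theta = 0.5)

sum(probs)  # should be very close to 1
probs[1]    # P(Y = 0): excess zeros plus zeros from the count process
```

The probability of a zero is larger than either part alone would give, which is what the logit part of the model is meant to capture.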
All of the predictors in both the count and inflation portions of the model are statistically significant:
. negative binomial regression model part predicting number of fish caught (count):
.. child and camper are both significant predictors.
. logit model part predicting excessive zeros:
.. The predictor “persons” is statistically significant.
Interpretation:
. The expected change in log(count) for a one-unit increase in child is -1.515255, holding other variables constant.
. A camper (camper = 1) has an expected log(count) that is 0.879051 higher than that of a non-camper (camper = 0), holding other variables constant.
. The log odds of being an excessive zero decrease by 1.67 for every additional person in the group. In other words, the more people in the group, the less likely it is that a zero is due to not having gone fishing. Put plainly, the larger the group a person was in, the more likely it is that the person went fishing.
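Coefficients on the log and log-odds scales can be easier to read after exponentiating. The sketch below uses the rounded point estimates from the model summary in this section; the variable names are illustrative.

```r
# Point estimates copied from the model summary in this section
count_coef <- c(child = -1.5153, camper1 = 0.8791)
zero_coef  <- c(persons = -1.6666)

# Count part: multiplicative change in the expected count
irr <- exp(count_coef)

# Logit part: multiplicative change in the odds of being an excess zero
odds_ratio <- exp(zero_coef)

round(irr, 3)
round(odds_ratio, 3)
```

On this scale, each additional child multiplies the expected count by roughly 0.22, being a camper multiplies it by roughly 2.4, and each additional person multiplies the odds of an excess zero by roughly 0.19.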
Call:
zeroinfl(formula = count ~ child + camper | persons, data = zinb, dist = "negbin")

Pearson residuals:
    Min      1Q  Median      3Q     Max
-0.5861 -0.4617 -0.3886 -0.1974 18.0135

Count model coefficients (negbin with log link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.3710     0.2561   5.353 8.64e-08 ***
child        -1.5153     0.1956  -7.747 9.41e-15 ***
camper1       0.8791     0.2693   3.265   0.0011 **
Log(theta)   -0.9854     0.1760  -5.600 2.14e-08 ***

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.6031     0.8365   1.916   0.0553 .
persons      -1.6666     0.6793  -2.453   0.0142 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Theta = 0.3733
Number of iterations in BFGS optimization: 60
Log-likelihood: -432.9 on 6 Df
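The two sets of coefficients can be combined into predictions by hand. The following sketch assumes hypothetical groups (campers with no children, group sizes 1 to 4, chosen for illustration and not taken from the data) and uses the rounded estimates from the summary output.

```r
# Coefficients copied from the summary output above
b_count <- c(intercept = 1.3710, child = -1.5153, camper1 = 0.8791)
b_zero  <- c(intercept = 1.6031, persons = -1.6666)

# Hypothetical groups: campers with no children, group sizes 1 to 4
persons <- 1:4
child   <- 0
camper  <- 1

# Logit part: probability that an observation is an excess zero
p_excess_zero <- plogis(b_zero[["intercept"]] + b_zero[["persons"]] * persons)

# Count part: mean of the negative binomial process (log link)
mu <- exp(b_count[["intercept"]] + b_count[["child"]] * child +
          b_count[["camper1"]] * camper)

# Overall expected count combines both parts
expected_count <- (1 - p_excess_zero) * mu

data.frame(persons,
           p_excess_zero  = round(p_excess_zero, 3),
           expected_count = round(expected_count, 2))
```

The probability of an excess zero falls quickly as the group grows, so the expected count approaches the mean of the count process.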
Next, we can check whether this model fits the data significantly better than the null model, i.e., the intercept-only model. To do so, we compare the current model to a null model without predictors using a likelihood ratio test: twice the difference in log-likelihoods is referred to a chi-squared distribution with 3 degrees of freedom, one for each predictor dropped (child, camper, and persons).
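The test gives a p-value of about 1.28e-13, so the overall model is statistically significant. In the output below, the 'log Lik.' label and (df=6) appear to be attributes carried over from the log-likelihood object of the fitted model, not the degrees of freedom of the test.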
'log Lik.' 1.27988e-13 (df=6)
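The arithmetic behind this test can be traced backwards from the reported numbers. The sketch below assumes the test has 3 degrees of freedom (the three predictors dropped in an intercept-only null model) and derives the chi-squared statistic and the implied null log-likelihood from the reported p-value; these derived values are not reported in the output itself.

```r
p_value <- 1.27988e-13  # p-value reported above
df_test <- 3            # assumed: child, camper and persons dropped in the null model
ll_full <- -432.9       # log-likelihood of the fitted model (summary output)

# Chi-squared statistic that corresponds to the reported p-value
lr_stat <- qchisq(p_value, df = df_test, lower.tail = FALSE)

# Implied log-likelihood of the intercept-only model,
# from lr_stat = 2 * (ll_full - ll_null)
ll_null <- ll_full - lr_stat / 2

# Round trip: the statistic reproduces the p-value
pchisq(2 * (ll_full - ll_null), df = df_test, lower.tail = FALSE)
```

Because ll_full is rounded to one decimal place, the implied null log-likelihood is only approximate.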
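We can calculate confidence intervals for the parameters and the exponentiated parameters using bootstrapping. The bootstrapped confidence intervals are considerably wider than the normal-based approximation. Once exponentiated, the estimates for the negative binomial model are incidence rate ratios, and those for the zero-inflation model are odds ratios. In the tables below, Est is the point estimate, pLL and pUL are the lower and upper limits of the percentile interval, and bcaLL and bcaUL are the limits of the bias-corrected and accelerated (BCa) interval.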
                         Est         pLL       pUL       bcaLL      bcaUL
count_(Intercept)  1.3710504  0.56762367  2.061973  0.72263663  2.2923399
count_child       -1.5152609 -2.13817466 -1.088688 -2.01750663 -0.9592912
count_camper1      0.8790522  0.04314645  1.833129 -0.20155391  1.6669059
zero_(Intercept)   1.6031354  0.43444324  8.237988  0.02821844  3.5197423
zero_persons      -1.6665917 -8.54358496 -1.100199 -7.83291616 -1.0781500
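Now we can estimate the incidence rate ratio (IRR) for the negative binomial model and the odds ratio (OR) for the logistic (zero-inflation) model. Note that the fourth row of this output is theta, i.e. the exponentiated Log(theta) of 0.3733 from the model summary, and the fifth row is the zero-inflation intercept, exp(1.6031) ≈ 4.97. The row for zero_persons is not shown; its point estimate is exp(-1.6666) ≈ 0.19.
                        Est       pLL          pUL     bcaLL      bcaUL
count_(Intercept) 3.9394864 1.7640705    7.8614638 2.0598571  9.8980709
count_child       0.2197509 0.1178700    0.3366578 0.1329866  0.3831644
count_camper1     2.4086158 1.0440914    6.2534316 0.8174595  5.2957571
theta             0.3733061 0.2758342    0.6420442 0.2210920  0.5509417
zero_(Intercept)  4.9685866 1.5441034 3781.9642312 1.0286204 33.7757238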
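As a cross-check, the exponentiated estimates and limits can be reproduced by applying exp() to the log-scale bootstrap table shown earlier in this section. This is a sketch that copies the printed values by hand; with the actual bootstrap object the same transformation would be applied to its output.

```r
# Log-scale bootstrap results copied from the earlier table in this section
est <- rbind(
  "count_(Intercept)" = c( 1.3710504,  0.56762367,  2.061973,  0.72263663,  2.2923399),
  "count_child"       = c(-1.5152609, -2.13817466, -1.088688, -2.01750663, -0.9592912),
  "count_camper1"     = c( 0.8790522,  0.04314645,  1.833129, -0.20155391,  1.6669059),
  "zero_(Intercept)"  = c( 1.6031354,  0.43444324,  8.237988,  0.02821844,  3.5197423),
  "zero_persons"      = c(-1.6665917, -8.54358496, -1.100199, -7.83291616, -1.0781500)
)
colnames(est) <- c("Est", "pLL", "pUL", "bcaLL", "bcaUL")

# exp() is monotone, so interval limits transform along with the estimate
exp_est <- exp(est)
signif(exp_est, 4)
```

Because the transformation is applied to each limit separately, an interval that excludes 0 on the log scale excludes 1 on the ratio scale.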