Interpretation of coefficients (reference program: general):
. math: for each one-unit increase in math score, the expected log count of days absent decreases by 0.006, holding program constant.
. academic program: the expected log count for the academic program is 0.44 lower than the expected log count for the general program, holding math score constant.
. vocational program: the expected log count for the vocational program is 1.28 lower than the expected log count for the general program, holding math score constant.
Call:
glm.nb(formula = daysabs ~ prog + math, data = dat, init.theta = 1.032713156,
link = log)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.1547 -1.0192 -0.3694 0.2285 2.5273
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.615265 0.197460 13.245 < 2e-16 ***
progAcademic -0.440760 0.182610 -2.414 0.0158 *
progVocational -1.278651 0.200720 -6.370 1.89e-10 ***
math -0.005993 0.002505 -2.392 0.0167 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for Negative Binomial(1.0327) family taken to be 1)
Null deviance: 427.54 on 313 degrees of freedom
Residual deviance: 358.52 on 310 degrees of freedom
AIC: 1741.3
Number of Fisher Scoring iterations: 1
Theta: 1.033
Std. Err.: 0.106
2 x log-likelihood: -1731.258
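As a rough illustration of how these coefficients combine, the sketch below rebuilds the linear predictor by hand from the estimates printed above. The helper `log_count` and the math score of 50 are hypothetical choices made only for this example; with a fitted model object one would normally call `predict()` instead.

```r
# Coefficients copied from the summary output above
b <- c(intercept = 2.615265, academic = -0.440760,
       vocational = -1.278651, math = -0.005993)

# Hypothetical helper (not part of any package): expected log count
# for a given program and math score; "general" is the reference level
log_count <- function(prog = c("general", "academic", "vocational"), math) {
  prog <- match.arg(prog)
  shift <- switch(prog,
                  general = 0,
                  academic = b[["academic"]],
                  vocational = b[["vocational"]])
  b[["intercept"]] + shift + b[["math"]] * math
}

math_score <- 50  # illustrative value, not taken from the text

eta_general <- log_count("general", math_score)
eta_vocational <- log_count("vocational", math_score)

# Difference on the log scale equals the progVocational coefficient
eta_vocational - eta_general

# Back-transform to expected days absent for each program
exp(eta_general)
exp(eta_vocational)
```

The difference between the two linear predictors does not depend on the math score chosen, which is what "holding math score constant" means in the interpretation above.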
To determine whether program type as a whole is statistically significant, we can compare models with and without the prog variable. Previously, we had the choice of likelihood ratio tests or deviance tables, but for negative binomial models a deviance table does not re-estimate theta, so the overdispersion parameter is held constant. We therefore cannot use a deviance table; we have to fit separate models and compare them with a likelihood ratio test.
The 2-degree-of-freedom chi-square test indicates that program is a statistically significant predictor of daysabs, with a p-value < 0.001. The estimated theta overdispersion parameter for the model including prog is 1.033.
Likelihood ratio tests of Negative Binomial Models
Response: daysabs
        Model     theta Resid. df 2 x log-lik.   Test df LR stat.     Pr(Chi)
1        math 0.8558565       312    -1776.306
2 prog + math 1.0327132       310    -1731.258 1 vs 2  2 45.04798 1.65179e-10
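The test statistic and p-value in this table can be checked by hand from the two reported "2 x log-lik." values. The sketch below uses only numbers copied from the table; the variable names are illustrative. With fitted model objects, MASS provides an `anova()` method for `glm.nb` fits that produces this kind of table directly.

```r
# Values copied from the likelihood ratio table above
two_loglik_reduced <- -1776.306  # model 1: math
two_loglik_full    <- -1731.258  # model 2: prog + math

# LR statistic: difference in 2 x log-likelihood between the two models
lr_stat <- two_loglik_full - two_loglik_reduced

# Degrees of freedom: difference in residual df (prog adds two dummy variables)
lr_df <- 312 - 310

# Upper-tail chi-square probability
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

lr_stat
p_value
```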
The Poisson model is nested in the NB model, so we can use an LRT to compare them and check the NB model's assumption that the conditional variance differs from the conditional mean (the Poisson model assumes they are equal). The LRT is significant with a p-value < 0.001, indicating that the NB model is more appropriate than the Poisson model. Note that R labels the output below 'log Lik.', but the value shown is the p-value of the test.
'log Lik.' 2.157298e-203 (df=5)
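The absence data are not reproduced here, so the following is only a sketch of this kind of comparison on simulated data. The data frame `sim`, its variables, and the simulation settings are all assumptions made for illustration. The test uses 1 degree of freedom because the NB model has one extra parameter, theta.

```r
library(MASS)  # provides glm.nb

set.seed(1)
n <- 300
x <- rnorm(n)

# Simulated overdispersed counts; these are NOT the absence data from the text
y <- rnbinom(n, mu = exp(1.5 + 0.3 * x), size = 1)
sim <- data.frame(y = y, x = x)

# Fit the Poisson model and the negative binomial model to the same data
fit_pois <- glm(y ~ x, family = poisson, data = sim)
fit_nb   <- glm.nb(y ~ x, data = sim)

# LR statistic: twice the gain in log-likelihood from estimating theta
lr_stat <- as.numeric(2 * (logLik(fit_nb) - logLik(fit_pois)))
p_value <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

lr_stat
p_value
```

Wrapping the difference in `as.numeric()` drops the 'log Lik.' label and df attribute that otherwise carry over into the printed p-value.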
Coefficient estimates with 95% confidence intervals, on the log-count scale:
                   Estimate       2.5 %       97.5 %
(Intercept)     2.615265446  2.24205576  3.012935926
progAcademic   -0.440760012 -0.81006586 -0.092643481
progVocational -1.278650721 -1.68348970 -0.890077623
math           -0.005992988 -0.01090086 -0.001066615
Exponentiating the model coefficients gives incidence rate ratios (IRRs). For a categorical predictor, the IRR is the ratio of the expected incidence rate in one category to the expected rate in the reference category; for a continuous predictor, it is the multiplicative change in the rate per one-unit increase.
Interpretation of coefficients:
. The incidence rate of days absent decreases by about 0.6% for every one-unit increase in math test score (IRR = 0.994), holding program constant.
. The incidence rate for the academic program is 0.64 times (64% of) the rate for the general program, and the incidence rate for the vocational program is 0.28 times the rate for the general program, holding math score constant.
Estimate 2.5 % 97.5 %
(Intercept) 13.6708448 9.4126616 20.3470498
progAcademic 0.6435471 0.4448288 0.9115184
progVocational 0.2784127 0.1857247 0.4106239
math 0.9940249 0.9891583 0.9989340
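The IRR table above is the exponential of the log-scale estimates and confidence limits reported earlier. The sketch below redoes that step from the printed numbers and converts each IRR to a percent change; the object names are illustrative. With a fitted model, the log-scale values would typically come from `coef()` and `confint()`.

```r
# Log-scale estimates and 95% confidence limits copied from the output above
est <- matrix(
  c( 2.615265446,  2.24205576,  3.012935926,
    -0.440760012, -0.81006586, -0.092643481,
    -1.278650721, -1.68348970, -0.890077623,
    -0.005992988, -0.01090086, -0.001066615),
  ncol = 3, byrow = TRUE,
  dimnames = list(
    c("(Intercept)", "progAcademic", "progVocational", "math"),
    c("Estimate", "2.5 %", "97.5 %")
  )
)

# Incidence rate ratios: exponentiate estimates and interval endpoints
irr <- exp(est)

# Percent change in the incidence rate implied by each IRR
pct_change <- (irr[, "Estimate"] - 1) * 100

irr
pct_change
```

Because the exponential is monotone, exponentiating the interval endpoints gives valid confidence limits on the IRR scale.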