## Freq status year
## 1 98 normal 5.8
## 2 51 normal 15.0
## 3 34 normal 21.5
## 4 35 normal 27.5
## 5 32 normal 33.5
## 6 23 normal 39.5
## [1] 24 3
## [1] "Freq" "status" "year"
## [1] "mild" "normal" "severe"
Freq: Number of coal miners classified at each status, divided into the different year categories
Status: radiological classification of pneumoconiosis: normal, mild, severe (response variable)
Year: number of years spent working at the coal face, 8 categories total
To start off, I visualized the change in case incidence as a funciton of years working
## # weights: 9 (4 variable)
## initial value 407.585159
## iter 10 value 208.724810
## final value 208.724782
## converged
## Call:
## multinom(formula = status ~ year, data = pneumo, weights = Freq)
##
## Coefficients:
## (Intercept) year
## normal 4.2916723 -0.08356506
## severe -0.7681706 0.02572027
##
## Residual Deviance: 417.4496
## AIC: 425.4496
## [1] normal
## Levels: mild normal severe
## mild normal severe
## 0.09148821 0.82778696 0.08072483
The nominal model predicts 25 years of exposure will most likely have no case, with p = 0.828.
## Call:
## polr(formula = status ~ year, data = pneumo, weights = Freq)
##
## Coefficients:
## year
## 0.09590405
##
## Intercepts:
## normal|mild mild|severe
## 3.955843 4.869048
##
## Residual Deviance: 416.9188
## AIC: 422.9188
## [1] normal
## Levels: normal mild severe
## normal mild severe
## 0.82610096 0.09601474 0.07788430
The ordinal model also predicts 25 years of exposure will most likely have no case, with p = 0.826.
##
## Call:
## glm(formula = cbind(case, no_case) ~ year, family = "binomial",
## data = pneumo_binom_df[pneumo_binom_df$status != "mild",
## ])
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.5267 -0.7054 -0.5013 0.8103 1.4294
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.966352 0.241869 -16.40 <2e-16 ***
## year 0.096269 0.007138 13.49 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 292.692 on 23 degrees of freedom
## Residual deviance: 34.697 on 22 degrees of freedom
## AIC: 116.23
##
## Number of Fisher Scoring iterations: 5
## 1
## -1.559621
## 1
## 0.826299
The top layer predicts a normal case at year = 25, with p = 0.826.
## # weights: 3 (2 variable)
## initial value 56.838069
## final value 55.444195
## converged
## Call:
## multinom(formula = status ~ year, data = pneumo_case, weights = Freq)
##
## Coefficients:
## (Intercept) year
## -1.11342251 0.03547178
##
## Residual Deviance: 110.8884
## AIC: 114.8884
## [1] mild
## Levels: mild severe
## 1
## 0.5564158
Given that there is a case at year = 25, it is most likely to be a mild case, with p = 0.556.
The three analyses all appear to give similar results, with similar probabilities and all predicting the same normal status after 25 years of coal mining. However, this treatment of pneumo data has obvious limitations because of bias introduced from survivorship. In the most severe cases, miners may die and remain uncounted at the highest levels of exposure. Normal and mild cases may be overrepresented in the highest yearly quantiles for this reason.