Proportional odds model

References

Load packages

## for read.dta()
library(foreign)
## for analysis
library(VGAM)
## Loading required package: splines
## Loading required package: stats4
## 
## Attaching package: 'VGAM'
## 
## The following object is masked from 'package:stats4':
## 
##     coef
## 
## The following objects are masked from 'package:splines':
## 
##     bs, ns
## 
## The following objects are masked from 'package:stats':
## 
##     case.names, coef, coefficients, df.residual, dfbeta, fitted, fitted.values, formula, hatvalues,
##     poly, residuals, variable.names, weights
## 
## The following objects are masked from 'package:base':
## 
##     identity, scale.default

Data prepration

## Load data
arthritis <- read.dta("http://www.hsph.harvard.edu/fitzmaur/ala2e/arthritis.dta")
head(arthritis)
##   id trt age y1 y2 y3 y4
## 1  1   1  54  3  2  1  1
## 2  2   0  41  2  2  2  2
## 3  3   1  48  3  3  2  2
## 4  4   0  40  2  2  3  2
## 5  5   1  29  2  2  3  2
## 6  6   1  43  3  3  4  3
summary(arthritis)
##        id             trt             age             y1             y2             y3             y4      
##  Min.   :  1.0   Min.   :0.000   Min.   :21.0   Min.   :1.00   Min.   :1.00   Min.   :1.00   Min.   :1.00  
##  1st Qu.: 76.5   1st Qu.:0.000   1st Qu.:42.0   1st Qu.:3.00   1st Qu.:2.00   1st Qu.:2.00   1st Qu.:2.00  
##  Median :152.0   Median :1.000   Median :54.0   Median :3.00   Median :3.00   Median :3.00   Median :3.00  
##  Mean   :152.0   Mean   :0.508   Mean   :50.4   Mean   :3.17   Mean   :2.83   Mean   :2.83   Mean   :2.66  
##  3rd Qu.:227.5   3rd Qu.:1.000   3rd Qu.:59.8   3rd Qu.:4.00   3rd Qu.:3.00   3rd Qu.:3.00   3rd Qu.:3.00  
##  Max.   :303.0   Max.   :1.000   Max.   :66.0   Max.   :5.00   Max.   :5.00   Max.   :5.00   Max.   :5.00  
##                                  NA's   :1                     NA's   :3      NA's   :6      NA's   :9

Proportional odds model

The model assumes the same treatment response regardless of the comparison made.

The treatment effect is significant (Z = 2.8347). After adjusting for the baseline age, the treatment group has a higher log odds of being in the lower treatment response category by 0.607896. This corresponds to an odd ratio of \( e^{0.607896} = 1.836563 \) for being in the more favorable treatment response category.

## Parallel
modelParallel <- vglm(formula = y4 ~ age + trt,
                      data    = arthritis,
                      family  = cumulative(link = "logit", parallel = TRUE, reverse = FALSE))
summary(modelParallel)
## 
## Call:
## vglm(formula = y4 ~ age + trt, family = cumulative(link = "logit", 
##     parallel = TRUE, reverse = FALSE), data = arthritis)
## 
## Pearson residuals:
##                  Min    1Q Median    3Q  Max
## logit(P[Y<=1]) -0.78 -0.58  -0.26 -0.18 3.31
## logit(P[Y<=2]) -1.57 -0.95  -0.35  1.10 1.67
## logit(P[Y<=3]) -3.17  0.17   0.29  0.68 0.88
## logit(P[Y<=4]) -5.88  0.10   0.13  0.17 0.49
## 
## Coefficients:
##               Estimate Std. Error z value
## (Intercept):1    -1.21     0.5202    -2.3
## (Intercept):2     0.53     0.5123     1.0
## (Intercept):3     2.15     0.5271     4.1
## (Intercept):4     4.14     0.6029     6.9
## age              -0.02     0.0096    -2.1
## trt               0.61     0.2144     2.8
## 
## Number of linear predictors:  4 
## 
## Names of linear predictors: logit(P[Y<=1]), logit(P[Y<=2]), logit(P[Y<=3]), logit(P[Y<=4])
## 
## Dispersion Parameter for cumulative family:   1
## 
## Residual deviance: 814.1 on 1166 degrees of freedom
## 
## Log-likelihood: -407.1 on 1166 degrees of freedom
## 
## Number of iterations: 4

Check for proportional odds assumption

A non-parallel model that assumes different treatment effect for each comparison. The treatment effects ranges from 0.41 (Y<=2 vs Y>2 ) to 1.40 (Y<=4 vs Y > 4). By the likelihood ratio test (6 degrees of freedom), the difference between these models are not significant. Thus, we conclude the proportional odds model assuming an identical treatment effect is sufficient based on the data.

## Non-parallel
modelNonParallel <- vglm(formula = y4 ~ age + trt,
                         data    = arthritis,
                         family  = cumulative(link = "logit", parallel = FALSE, reverse = FALSE))
summary(modelNonParallel)
## 
## Call:
## vglm(formula = y4 ~ age + trt, family = cumulative(link = "logit", 
##     parallel = FALSE, reverse = FALSE), data = arthritis)
## 
## Pearson residuals:
##                 Min     1Q Median    3Q  Max
## logit(P[Y<=1]) -1.3 -0.392  -0.24 -0.16 4.54
## logit(P[Y<=2]) -1.4 -1.016  -0.38  1.22 1.44
## logit(P[Y<=3]) -2.8  0.177   0.31  0.65 0.84
## logit(P[Y<=4]) -7.9  0.087   0.12  0.19 0.58
## 
## Coefficients:
##               Estimate Std. Error z value
## (Intercept):1   -0.627      0.767   -0.82
## (Intercept):2    0.426      0.565    0.75
## (Intercept):3    1.696      0.727    2.33
## (Intercept):4    3.825      1.689    2.26
## age:1           -0.041      0.015   -2.73
## age:2           -0.017      0.011   -1.58
## age:3           -0.012      0.014   -0.87
## age:4           -0.019      0.031   -0.60
## trt:1            1.166      0.389    3.00
## trt:2            0.416      0.237    1.75
## trt:3            0.630      0.300    2.10
## trt:4            1.403      0.799    1.76
## 
## Number of linear predictors:  4 
## 
## Names of linear predictors: logit(P[Y<=1]), logit(P[Y<=2]), logit(P[Y<=3]), logit(P[Y<=4])
## 
## Dispersion Parameter for cumulative family:   1
## 
## Residual deviance: 805.2 on 1160 degrees of freedom
## 
## Log-likelihood: -402.6 on 1160 degrees of freedom
## 
## Number of iterations: 5

## Likelihood ratio test
lrtest(modelNonParallel, modelParallel)
## Likelihood ratio test
## 
## Model 1: y4 ~ age + trt
## Model 2: y4 ~ age + trt
##    #Df LogLik Df Chisq Pr(>Chisq)
## 1 1160   -403                    
## 2 1166   -407  6  8.93       0.18