library("MASS")
library("car")
library("VGAM")
library("effects")The data set housing in the MASS package gives a 3 × 3 × 4 × 2 table in frequency form relating (a) satisfaction (Sat) of residents with their housing (High, Medium, Low), (b) perceived degree of influence (Infl) they have on the management of the property (High, Medium, Low), (c) Type of rental (Tower, Atrium, Apartment, Terrace), and (d) contact (Cont) residents have with other residents (Low, High). Consider satisfaction as the ordinal response variable.
data("housing")Fit the proportional odds model with additive (main) effects of housing type, influence in management, and contact with neighbors to this data. (Hint: Using polr(), with the data in frequency form, you need to use the weights argument to supply the Freq variable.). Call model PO.mod.
PO.mod <- polr(Sat ~ Type + Infl + Cont, data = housing, weights = Freq, Hess = TRUE)
PO.mod## Call:
## polr(formula = Sat ~ Type + Infl + Cont, data = housing, weights = Freq,
## Hess = TRUE)
##
## Coefficients:
## TypeApartment TypeAtrium TypeTerrace InflMedium InflHigh
## -0.5723501 -0.3661866 -1.0910149 0.5663937 1.2888191
## ContHigh
## 0.3602841
##
## Intercepts:
## Low|Medium Medium|High
## -0.4961353 0.6907083
##
## Residual Deviance: 3479.149
## AIC: 3495.149
Run car::Anova test and explain what you see.
Anova(PO.mod)From the Anova test, all three variables, the perceived degree of influence they have on the management of the property, the type of rental, and the contact residents have with other residents are highly significant in determining the satisfaction that residents have with their housing.
Use VGAM to check if proportional odd condition is held (explain your response).
PO.mod.coeff.reversed <- vglm(Sat ~ Type + Infl + Cont, data = housing,
weights = Freq, family = cumulative(parallel = TRUE))
PO.mod.coeff.reversed##
## Call:
## vglm(formula = Sat ~ Type + Infl + Cont, family = cumulative(parallel = TRUE),
## data = housing, weights = Freq)
##
##
## Coefficients:
## (Intercept):1 (Intercept):2 TypeApartment TypeAtrium TypeTerrace
## -0.4961346 0.6907087 0.5723499 0.3661866 1.0910147
## InflMedium InflHigh ContHigh
## -0.5663937 -1.2888184 -0.3602845
##
## Degrees of Freedom: 144 Total; 136 Residual
## Residual deviance: 3479.149
## Log-likelihood: -1739.575
NPO.mod <- vglm(Sat ~ Type + Infl + Cont, data = housing,
weights = Freq, family = cumulative(parallel = FALSE))
NPO.mod##
## Call:
## vglm(formula = Sat ~ Type + Infl + Cont, family = cumulative(parallel = FALSE),
## data = housing, weights = Freq)
##
##
## Coefficients:
## (Intercept):1 (Intercept):2 TypeApartment:1 TypeApartment:2
## -0.4461683 0.6466011 0.6011532 0.5379260
## TypeAtrium:1 TypeAtrium:2 TypeTerrace:1 TypeTerrace:2
## 0.1911686 0.4830450 1.0790012 1.1189824
## InflMedium:1 InflMedium:2 InflHigh:1 InflHigh:2
## -0.5924647 -0.5497942 -1.2191597 -1.3077420
## ContHigh:1 ContHigh:2
## -0.4304931 -0.2956644
##
## Degrees of Freedom: 144 Total; 130 Residual
## Residual deviance: 3470.579
## Log-likelihood: -1735.289
coef(PO.mod.coeff.reversed, matrix = TRUE)## logit(P[Y<=1]) logit(P[Y<=2])
## (Intercept) -0.4961346 0.6907087
## TypeApartment 0.5723499 0.5723499
## TypeAtrium 0.3661866 0.3661866
## TypeTerrace 1.0910147 1.0910147
## InflMedium -0.5663937 -0.5663937
## InflHigh -1.2888184 -1.2888184
## ContHigh -0.3602845 -0.3602845
coef(NPO.mod, matrix = TRUE)## logit(P[Y<=1]) logit(P[Y<=2])
## (Intercept) -0.4461683 0.6466011
## TypeApartment 0.6011532 0.5379260
## TypeAtrium 0.1911686 0.4830450
## TypeTerrace 1.0790012 1.1189824
## InflMedium -0.5924647 -0.5497942
## InflHigh -1.2191597 -1.3077420
## ContHigh -0.4304931 -0.2956644
lrtest(PO.mod.coeff.reversed, NPO.mod)## Likelihood ratio test
##
## Model 1: Sat ~ Type + Infl + Cont
## Model 2: Sat ~ Type + Infl + Cont
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 136 -1739.6
## 2 130 -1735.3 -6 8.5706 0.1992
tab <- cbind(
Deviance = c(deviance(NPO.mod), deviance(PO.mod.coeff.reversed)),
df = c(df.residual(NPO.mod), df.residual(PO.mod.coeff.reversed))
)
tab <- rbind(tab, diff(tab))
rownames(tab) <- c("GenLogit", "PropOdds", "LR test")
tab <- cbind(tab, pvalue=1-pchisq(tab[,1], tab[,2]))
tab## Deviance df pvalue
## GenLogit 3470.578700 130 0.000000
## PropOdds 3479.149299 136 0.000000
## LR test 8.570599 6 0.199206
To test if the proportional odds condition is held, we made a contrast between the PO model and a generalized NPO model that allows different effects (slopes) of the predictors across the response categories. The Likelihood ratio test indicates that the PO model is significantly different from the generalized NPO model and hence the proportional odds assumption holds.
Plot the effect plot of influence and type on original model and explain what you see in effect plot.
plot(Effect(c("Infl", "Type"), PO.mod))The variation in satisfaction levels between residents by type of rental and influence on management is more pronounced for those who are enjoy high and low levels of satisfaction and less so in case of those who enjoy medium levels of satisfaction.