library(vcd)

## Warning: package 'vcd' was built under R version 3.5.1

## Loading required package: grid

library(vcdExtra)

## Warning: package 'vcdExtra' was built under R version 3.5.1

## Loading required package: gnm

library(ggplot2)

## Warning: package 'ggplot2' was built under R version 3.5.1

library(logmult)

## Warning: package 'logmult' was built under R version 3.5.1

## 
## Attaching package: 'logmult'

## The following object is masked from 'package:gnm':
## 
##     se

## The following object is masked from 'package:vcd':
## 
##     assoc

library(ca)

## Warning: package 'ca' was built under R version 3.5.1

library(MASS)

## Warning: package 'MASS' was built under R version 3.5.1

library(car)

## Warning: package 'car' was built under R version 3.5.1

## Loading required package: carData

## 
## Attaching package: 'carData'

## The following object is masked from 'package:vcdExtra':
## 
##     Burt

library(VGAM)

## Warning: package 'VGAM' was built under R version 3.5.1

## Loading required package: stats4

## Loading required package: splines

## 
## Attaching package: 'VGAM'

## The following object is masked from 'package:car':
## 
##     logit

library(rms)

## Warning: package 'rms' was built under R version 3.5.1

## Loading required package: Hmisc

## Loading required package: lattice

## 
## Attaching package: 'lattice'

## The following object is masked from 'package:gnm':
## 
##     barley

## Loading required package: survival

## Loading required package: Formula

## 
## Attaching package: 'Hmisc'

## The following objects are masked from 'package:base':
## 
##     format.pval, units

## Loading required package: SparseM

## 
## Attaching package: 'SparseM'

## The following object is masked from 'package:base':
## 
##     backsolve

## 
## Attaching package: 'rms'

## The following objects are masked from 'package:VGAM':
## 
##     calibrate, lrtest

## The following objects are masked from 'package:car':
## 
##     Predict, vif

library(reshape2)

## Warning: package 'reshape2' was built under R version 3.5.1

library(directlabels)

## Warning: package 'directlabels' was built under R version 3.5.1

library(effects)

## Warning: package 'effects' was built under R version 3.5.1

## Use the command
##     lattice::trellis.par.set(effectsTheme())
##   to customize lattice options for effects plots.
## See ?efffectsTheme for details.

library(nnet)

## Warning: package 'nnet' was built under R version 3.5.1

data("Vietnam", package = "vcdExtra")

Vietnam$yearsex<- paste(Vietnam$year, Vietnam$sex, sep = ':')
VN <- xtabs(Freq ~ response + Vietnam$yearsex, data = Vietnam)
viet <- margin.table(VN, 1:2)
viet.ca <- ca(viet)

plot(viet.ca)

Homework #7

Model 1:

head(Vietnam$response, 8)

## [1] A B C D A B C D
## Levels: A B C D

viet.polr <- polr(response ~ year + sex, data = Vietnam)
summary(viet.polr)

## 
## Re-fitting to get Hessian

## Call:
## polr(formula = response ~ year + sex, data = Vietnam)
## 
## Coefficients:
##              Value Std. Error    t value
## year    -4.786e-15     0.2000 -2.393e-14
## sexMale -1.244e-15     0.5657 -2.199e-15
## 
## Intercepts:
##     Value   Std. Error t value
## A|B -1.0986  0.7572    -1.4509
## B|C  0.0000  0.7348     0.0000
## C|D  1.0986  0.7572     1.4509
## 
## Residual Deviance: 110.9035 
## AIC: 120.9035

Model 2:

vn.polr <- polr(response ~ Vietnam$year + Vietnam$sex, data = Vietnam)
summary(vn.polr)

## 
## Re-fitting to get Hessian

## Call:
## polr(formula = response ~ Vietnam$year + Vietnam$sex, data = Vietnam)
## 
## Coefficients:
##                      Value Std. Error    t value
## Vietnam$year    -4.786e-15     0.2000 -2.393e-14
## Vietnam$sexMale -1.244e-15     0.5657 -2.199e-15
## 
## Intercepts:
##     Value   Std. Error t value
## A|B -1.0986  0.7572    -1.4509
## B|C  0.0000  0.7348     0.0000
## C|D  1.0986  0.7572     1.4509
## 
## Residual Deviance: 110.9035 
## AIC: 120.9035

Since models, 1 and 2, are the same, we will work through the exercise using only Model 1:

(a) Fit the proportional odds model to these data, allowing an interaction of year and sex:

Let’s do an analysis of the relationship between Year and Sex:

Anova(viet.polr)

## Analysis of Deviance Table (Type II tests)
## 
## Response: response
##      LR Chisq Df Pr(>Chisq)
## year        0  1          1
## sex         0  1          1

(b) Is there evidence that the proportional odds assumption does not hold for this data set? Use the motheods described in “Ordinal response: Proportional odds model” to assess this.

The proportional odds assumption assumes the slope of the line is fixed and does not change for each new value. Hence, we will have to test the proportional odds assumption. If the line’s slope changes for each new value, it is the “Nested Proportional odd” (NPO). If the line’s slope doesn’t change it would be PO.

Let us assume that our model isn’t proportional, and build it. Then, we will build a model assuming that the two are proportional – or vice versa

(1) Let’s, first, create the model assuming the odds are proportional:

vietpo <- vglm(response ~ year + sex, data = Vietnam, family = cumulative(parallel = TRUE))

## Warning in eval(slot(family, "initialize")): response should be ordinal---
## see ordered()

vietpo

## 
## Call:
## vglm(formula = response ~ year + sex, family = cumulative(parallel = TRUE), 
##     data = Vietnam)
## 
## 
## Coefficients:
## (Intercept):1 (Intercept):2 (Intercept):3          year       sexMale 
## -1.098612e+00 -6.879800e-16  1.098612e+00 -8.881784e-17  1.256074e-16 
## 
## Degrees of Freedom: 120 Total; 115 Residual
## Residual deviance: 110.9035 
## Log-likelihood: -55.45177

(2) Create a model assuming odds are not proportional:

vietnpo <- vglm(response ~ year + sex, data = Vietnam, family = cumulative(parallel = FALSE))

## Warning in eval(slot(family, "initialize")): response should be ordinal---
## see ordered()

vietnpo

## 
## Call:
## vglm(formula = response ~ year + sex, family = cumulative(parallel = FALSE), 
##     data = Vietnam)
## 
## 
## Coefficients:
## (Intercept):1 (Intercept):2 (Intercept):3        year:1        year:2 
## -1.098612e+00 -6.879800e-16  1.098612e+00  4.634791e-16  6.326856e-17 
##        year:3     sexMale:1     sexMale:2     sexMale:3 
##  8.599751e-17 -3.918388e-16 -1.133916e-15 -8.918715e-16 
## 
## Degrees of Freedom: 120 Total; 111 Residual
## Residual deviance: 110.9035 
## Log-likelihood: -55.45177

It seems the slope is changing here for each new value. So, we have an NPO and the models appear to be different.

Let’s look at an easier way to compare the values:

coef(vietpo, matrix = TRUE)

##             logit(P[Y<=1]) logit(P[Y<=2]) logit(P[Y<=3])
## (Intercept)  -1.098612e+00  -6.879800e-16   1.098612e+00
## year         -8.881784e-17  -8.881784e-17  -8.881784e-17
## sexMale       1.256074e-16   1.256074e-16   1.256074e-16

coef(vietnpo, matrix = TRUE)

##             logit(P[Y<=1]) logit(P[Y<=2]) logit(P[Y<=3])
## (Intercept)  -1.098612e+00  -6.879800e-16   1.098612e+00
## year          4.634791e-16   6.326856e-17   8.599751e-17
## sexMale      -3.918388e-16  -1.133916e-15  -8.918715e-16

It seems the slope is changing for each ew value. Hence, the models are not the same.

Graphical assessment:

op <- par(mfrow = c(1,3))
plot.xmean.ordinaly(response ~ year + sex, data = Vietnam, lwd = 2, pch = 16, subn = FALSE)

(c) Fit the multinomial logistic model (generalized logit model), also allowing an interaction. Use car::Anova() to assess the model terms. Here, we append the fitted response probabilities (or logits) to the data frame and plotting these in relation to the predictors

Visualization: Full-model plot

viet.fitp <- cbind(Vietnam, predict(viet.polr, type = "probs"))
head(viet.fitp)

##      sex year response Freq  yearsex    A    B    C    D
## 1 Female    1        A   13 1:Female 0.25 0.25 0.25 0.25
## 2 Female    1        B   19 1:Female 0.25 0.25 0.25 0.25
## 3 Female    1        C   40 1:Female 0.25 0.25 0.25 0.25
## 4 Female    1        D    5 1:Female 0.25 0.25 0.25 0.25
## 5 Female    2        A    5 2:Female 0.25 0.25 0.25 0.25
## 6 Female    2        B    9 2:Female 0.25 0.25 0.25 0.25

plotdat <- melt(viet.fitp, id.vars= c("year","sex", "response"), measure.vars= c("A", "B", "C", "D"), variable.name= "Level", value.name= "Probability")
head(plotdat)

##   year    sex response Level Probability
## 1    1 Female        A     A        0.25
## 2    1 Female        B     A        0.25
## 3    1 Female        C     A        0.25
## 4    1 Female        D     A        0.25
## 5    2 Female        A     A        0.25
## 6    2 Female        B     A        0.25

Let’s use car::Anova() to assess the model terms. Here, we measure the association with these data.

car::Anova(viet.polr)

## Analysis of Deviance Table (Type II tests)
## 
## Response: response
##      LR Chisq Df Pr(>Chisq)
## year        0  1          1
## sex         0  1          1

We see above that since chisq equals zero, there seems to be an association between the models.

gg <- ggplot(plotdat, aes (year, y= Probability, color = Level))+ ylim(0.0, 1.0) + geom_line(size= 5.0) + theme_bw()+ xlim(10, 100)+ geom_point(color= "black", size= 3.0)+ facet_grid(year ~ sex)
direct.label(gg)

## Warning: Removed 160 rows containing missing values (geom_path).

## Warning: Removed 160 rows containing missing values (geom_point).

(d) Produce an effect plot for this model and describe the nature of the ingteraction

Visualization: Effects Plot

plot(Effect("year", viet.polr))

## 
## Re-fitting to get Hessian

plot(Effect("sex", viet.polr), style = "stacked", key.args = list(x = .45, y = .8))

## 
## Re-fitting to get Hessian

Note: it is very difficult to determine the nature of the interaction here. It seems ther is no interaction.

plot(Effect("year", viet.polr), style = "stacked", key.args = list(x = .45, y = .8))

## 
## Re-fitting to get Hessian

Note: Again,it is very difficult to determine the nature of the interaction here.

(e) Fit the simpler multinomial model in which there is no effect of year for females and the effect of year is linear for males (on the logit scale). Test whether this model is significantly worse than the general multinomial model with interaction.

(I am not so sure about this one)

(1) Let us look at the generalized logit model:

viet.multinom <- multinom(response ~ year + sex, data = Vietnam, Hess = TRUE)

## # weights:  16 (9 variable)
## initial  value 55.451774 
## final  value 55.451774 
## converged

summary(viet.multinom)

## Call:
## multinom(formula = response ~ year + sex, data = Vietnam, Hess = TRUE)
## 
## Coefficients:
##   (Intercept) year sexMale
## B           0    0       0
## C           0    0       0
## D           0    0       0
## 
## Std. Errors:
##   (Intercept)      year   sexMale
## B    1.140175 0.3162278 0.8944272
## C    1.140175 0.3162278 0.8944272
## D    1.140175 0.3162278 0.8944272
## 
## Residual Deviance: 110.9035 
## AIC: 128.9035

Above, there is no effect on of year for females

(2) Let’s use the effect of interaction instead of additive:

viet.multinom2 <- multinom(response ~ year * sex, data = Vietnam, Hess = TRUE)

## # weights:  20 (12 variable)
## initial  value 55.451774 
## final  value 55.451774 
## converged

summary(viet.multinom2)

## Call:
## multinom(formula = response ~ year * sex, data = Vietnam, Hess = TRUE)
## 
## Coefficients:
##   (Intercept) year sexMale year:sexMale
## B           0    0       0            0
## C           0    0       0            0
## D           0    0       0            0
## 
## Std. Errors:
##   (Intercept)      year  sexMale year:sexMale
## B     1.48324 0.4472136 2.097618    0.6324555
## C     1.48324 0.4472136 2.097618    0.6324555
## D     1.48324 0.4472136 2.097618    0.6324555
## 
## Residual Deviance: 110.9035 
## AIC: 134.9035

At this point, it seems model 1 is not significantly worse than model 2 (with interaction)

Let’s use Anova:

Anova(viet.multinom)

## Analysis of Deviance Table (Type II tests)
## 
## Response: response
##      LR Chisq Df Pr(>Chisq)
## year        0  3          1
## sex         0  3          1

Anova(viet.multinom2)

## Analysis of Deviance Table (Type II tests)
## 
## Response: response
##          LR Chisq Df Pr(>Chisq)
## year            0  3          1
## sex             0  3          1
## year:sex        0  3          1

ANLY 545-Homework 7

Charles Pierre

August 3, 2018

Homework #7

Model 1:

Model 2:

Since models, 1 and 2, are the same, we will work through the exercise using only Model 1:

(a) Fit the proportional odds model to these data, allowing an interaction of year and sex:

Let’s do an analysis of the relationship between Year and Sex:

(b) Is there evidence that the proportional odds assumption does not hold for this data set? Use the motheods described in “Ordinal response: Proportional odds model” to assess this.

Let us assume that our model isn’t proportional, and build it. Then, we will build a model assuming that the two are proportional – or vice versa

(1) Let’s, first, create the model assuming the odds are proportional:

(2) Create a model assuming odds are not proportional:

It seems the slope is changing here for each new value. So, we have an NPO and the models appear to be different.

Let’s look at an easier way to compare the values:

It seems the slope is changing for each ew value. Hence, the models are not the same.

Graphical assessment:

(c) Fit the multinomial logistic model (generalized logit model), also allowing an interaction. Use car::Anova() to assess the model terms. Here, we append the fitted response probabilities (or logits) to the data frame and plotting these in relation to the predictors

Visualization: Full-model plot

Let’s use car::Anova() to assess the model terms. Here, we measure the association with these data.

We see above that since chisq equals zero, there seems to be an association between the models.

(d) Produce an effect plot for this model and describe the nature of the ingteraction

Visualization: Effects Plot

Note: it is very difficult to determine the nature of the interaction here. It seems ther is no interaction.

Note: Again,it is very difficult to determine the nature of the interaction here.

(e) Fit the simpler multinomial model in which there is no effect of year for females and the effect of year is linear for males (on the logit scale). Test whether this model is significantly worse than the general multinomial model with interaction.

(I am not so sure about this one)

(1) Let us look at the generalized logit model:

Above, there is no effect on of year for females

(2) Let’s use the effect of interaction instead of additive:

At this point, it seems model 1 is not significantly worse than model 2 (with interaction)

Let’s use Anova:

ANLY 545-Homework 7

Charles Pierre

August 3, 2018

Homework #7

Model 1:

Model 2:

Since models, 1 and 2, are the same, we will work through the exercise using only Model 1:

(a) Fit the proportional odds model to these data, allowing an interaction of year and sex:

Let’s do an analysis of the relationship between Year and Sex:

Since chisq is 0, we know year and sex are related.

(b) Is there evidence that the proportional odds assumption does not hold for this data set? Use the motheods described in “Ordinal response: Proportional odds model” to assess this.

Let us assume that our model isn’t proportional, and build it. Then, we will build a model assuming that the two are proportional – or vice versa

(1) Let’s, first, create the model assuming the odds are proportional:

Here, the log-likelihood ratio is not close the 0. This implies that the year and sex are not related. We will have to search further.

(2) Create a model assuming odds are not proportional:

It seems the slope is changing here for each new value. So, we have an NPO and the models appear to be different.

Let’s look at an easier way to compare the values:

It seems the slope is changing for each ew value. Hence, the models are not the same.

Graphical assessment:

(c) Fit the multinomial logistic model (generalized logit model), also allowing an interaction. Use car::Anova() to assess the model terms. Here, we append the fitted response probabilities (or logits) to the data frame and plotting these in relation to the predictors

Visualization: Full-model plot

Let’s use car::Anova() to assess the model terms. Here, we measure the association with these data.

We see above that since chisq equals zero, there seems to be an association between the models.

(d) Produce an effect plot for this model and describe the nature of the ingteraction

Visualization: Effects Plot

Note: it is very difficult to determine the nature of the interaction here. It seems ther is no interaction.

Note: Again,it is very difficult to determine the nature of the interaction here.

(e) Fit the simpler multinomial model in which there is no effect of year for females and the effect of year is linear for males (on the logit scale). Test whether this model is significantly worse than the general multinomial model with interaction.

(I am not so sure about this one)

(1) Let us look at the generalized logit model:

Above, there is no effect on of year for females

(2) Let’s use the effect of interaction instead of additive:

At this point, it seems model 1 is not significantly worse than model 2 (with interaction)

Let’s use Anova: