In his book Regression Models for Categorical and Limited Dependent Variables, Scott Long makes the following claim:
If a dependent variable is ordinal and a model for nominal variables [e.g., multinomial logit] is used, there is a loss of efficiency since information is being ignored. On the other hand, when a method for ordinal variables [e.g., ordered logit] is applied to a nominal dependent variable, the resulting estimates are biased or even nonsensical (Long, 1997, 149).
The goal is to test this claim with a simulation in R. If this is done once, there needs to be an evaluation of Long’s claim about bias in the estimates. However, for an optional challenge, use a for loop to do this a large number of times (e.g., 1000) and also evaluate his claims about efficiency.
The simulation should create two different DGPs with one independent variable: one DGP in which the dependent variable is ordered and one in which it is unordered. Each dependent variable should have the same number of categories and this number must be at least 3. Also, the true coefficient values in the ordered DGP must be different from the ones in the unordered DGP.
## Load Data and Libraries ##
library(Zelig)
## Loading required package: MASS
## Loading required package: boot
## ##
## ## Zelig (Version 3.5.4, built: 2010-01-20)
## ## Please refer to http://gking.harvard.edu/zelig for full
## ## documentation or help.zelig() for help with commands and
## ## models supported by Zelig.
## ##
##
## ## Zelig project citations:
## ## Kosuke Imai, Gary King, and Olivia Lau. (2009).
## ## ``Zelig: Everyone's Statistical Software,''
## ## http://gking.harvard.edu/zelig.
## ## and
## ## Kosuke Imai, Gary King, and Olivia Lau. (2008).
## ## ``Toward A Common Framework for Statistical Analysis
## ## and Development,'' Journal of Computational and
## ## Graphical Statistics, Vol. 17, No. 4 (December)
## ## pp. 892-913.
##
## ## To cite individual Zelig models, please use the citation format printed with
## ## each model run and in the documentation.
## ##
## DGP for Ordered, Logit Dependent Variable ##
set.seed(23422)
# Set true parameter values
n <- 1000 # Sample size
X.o <- runif(n, -1, 1) # Create a sample of n obs on the independent variable X with uniform distribution between specified interval
plot(density(X.o))
b0.o <- .2 # True value for the intercept for ordered dependent variable
b1.o <- .5 # True value for the slope
# Create Y*
Y.star <- rlogis(n, b0.o + b1.o*X.o, 1) # Logistic distribution of error terms; scale parameter s=1
plot(density(Y.star))
# Create Y by dividing Y* into 3 categories
Y.o <- ifelse(Y.star <= -2.5, 1, # 1st cut
ifelse(Y.star >= 2.5, 3, # 3rd cut
2)) # All else, 2nd cut
head(cbind(Y.star, Y.o), 250) # Checks first 250 result; Y.star is continuous; Y.ob is ordinal (variance get shrunken down)
## Y.star Y.o
## [1,] -1.511984 2
## [2,] -5.284871 1
## [3,] -0.056717 2
## [4,] -2.099140 2
## [5,] 0.326421 2
## [6,] -1.636550 2
## [7,] -0.897556 2
## [8,] -0.418558 2
## [9,] -0.343260 2
## [10,] 1.734199 2
## [11,] -0.064805 2
## [12,] -2.204358 2
## [13,] -0.941092 2
## [14,] 2.252287 2
## [15,] 0.208329 2
## [16,] 0.883638 2
## [17,] -0.008993 2
## [18,] -2.714569 1
## [19,] 4.222231 3
## [20,] -1.935934 2
## [21,] 2.589334 3
## [22,] 0.335775 2
## [23,] 1.129000 2
## [24,] -1.050773 2
## [25,] 0.141628 2
## [26,] -0.557595 2
## [27,] -1.230805 2
## [28,] 0.869733 2
## [29,] -2.304491 2
## [30,] -0.212447 2
## [31,] -3.333118 1
## [32,] -0.557264 2
## [33,] 0.980504 2
## [34,] -0.399030 2
## [35,] 4.955106 3
## [36,] 0.523508 2
## [37,] -4.778634 1
## [38,] -2.100411 2
## [39,] 0.416096 2
## [40,] 0.059566 2
## [41,] -0.444913 2
## [42,] 0.584858 2
## [43,] -4.676331 1
## [44,] -1.374652 2
## [45,] 0.121879 2
## [46,] 0.048121 2
## [47,] 1.722326 2
## [48,] -1.344685 2
## [49,] -0.194796 2
## [50,] 3.275845 3
## [51,] -2.065860 2
## [52,] 0.966119 2
## [53,] -0.487452 2
## [54,] -0.579934 2
## [55,] -3.198596 1
## [56,] -0.961823 2
## [57,] 2.292384 2
## [58,] -1.434992 2
## [59,] -0.053520 2
## [60,] -1.112219 2
## [61,] 1.267044 2
## [62,] -3.247795 1
## [63,] -0.234542 2
## [64,] -0.187019 2
## [65,] -0.100106 2
## [66,] 3.739503 3
## [67,] 2.122628 2
## [68,] -1.171996 2
## [69,] -0.740025 2
## [70,] 0.863944 2
## [71,] -0.863733 2
## [72,] -0.701822 2
## [73,] -0.207780 2
## [74,] 2.601035 3
## [75,] 0.251543 2
## [76,] 0.177061 2
## [77,] 1.396327 2
## [78,] 0.770132 2
## [79,] 0.747883 2
## [80,] -0.509901 2
## [81,] 1.184897 2
## [82,] -0.349338 2
## [83,] 1.759933 2
## [84,] -2.211516 2
## [85,] -0.596651 2
## [86,] 0.811866 2
## [87,] -1.126235 2
## [88,] 1.852565 2
## [89,] 1.885633 2
## [90,] 1.213903 2
## [91,] -1.059834 2
## [92,] -0.857286 2
## [93,] -1.699331 2
## [94,] -4.077606 1
## [95,] -0.111860 2
## [96,] 1.846289 2
## [97,] 0.841853 2
## [98,] 2.150591 2
## [99,] 2.225647 2
## [100,] -1.371374 2
## [101,] 0.396039 2
## [102,] 3.401768 3
## [103,] 1.005966 2
## [104,] 0.827561 2
## [105,] 0.873644 2
## [106,] 1.487383 2
## [107,] -1.201109 2
## [108,] -0.237238 2
## [109,] -2.173350 2
## [110,] 2.836856 3
## [111,] 1.237889 2
## [112,] 1.473135 2
## [113,] 5.351732 3
## [114,] 0.306288 2
## [115,] 6.589411 3
## [116,] 2.616273 3
## [117,] -0.646459 2
## [118,] -1.408354 2
## [119,] -1.431128 2
## [120,] -0.706303 2
## [121,] -2.157688 2
## [122,] -0.071634 2
## [123,] -1.825754 2
## [124,] 1.629766 2
## [125,] -0.688362 2
## [126,] -0.579080 2
## [127,] 2.223910 2
## [128,] 0.026639 2
## [129,] -2.683403 1
## [130,] -0.805134 2
## [131,] 0.909330 2
## [132,] -0.151755 2
## [133,] -1.531881 2
## [134,] -1.663449 2
## [135,] -2.353260 2
## [136,] -0.178621 2
## [137,] 0.793839 2
## [138,] 0.408151 2
## [139,] 3.294074 3
## [140,] -0.812694 2
## [141,] -0.616865 2
## [142,] -2.193244 2
## [143,] -0.939088 2
## [144,] 1.056917 2
## [145,] 2.834019 3
## [146,] 0.585249 2
## [147,] 2.052776 2
## [148,] -0.141039 2
## [149,] 2.027255 2
## [150,] -2.746535 1
## [151,] 0.907352 2
## [152,] -2.041120 2
## [153,] -0.557353 2
## [154,] 0.578550 2
## [155,] -1.549144 2
## [156,] -0.574416 2
## [157,] 0.770532 2
## [158,] 2.450125 2
## [159,] 0.715198 2
## [160,] -1.208544 2
## [161,] -0.759780 2
## [162,] -0.945278 2
## [163,] 2.459016 2
## [164,] 1.491859 2
## [165,] -0.350231 2
## [166,] -0.189101 2
## [167,] 2.928746 3
## [168,] 0.517333 2
## [169,] -0.022117 2
## [170,] 1.394512 2
## [171,] 0.834754 2
## [172,] -0.139593 2
## [173,] -0.673933 2
## [174,] -1.358812 2
## [175,] 1.707708 2
## [176,] -1.078690 2
## [177,] -1.747918 2
## [178,] 2.006654 2
## [179,] -2.157192 2
## [180,] 2.637382 3
## [181,] 4.510659 3
## [182,] 0.924308 2
## [183,] -0.587748 2
## [184,] -0.865399 2
## [185,] -1.162715 2
## [186,] 0.791954 2
## [187,] -1.662049 2
## [188,] 0.229790 2
## [189,] 1.035079 2
## [190,] 4.755325 3
## [191,] -1.012092 2
## [192,] 0.432297 2
## [193,] -1.762988 2
## [194,] -0.902338 2
## [195,] -0.640069 2
## [196,] 0.718847 2
## [197,] -2.167180 2
## [198,] 2.302362 2
## [199,] -1.102557 2
## [200,] 0.391176 2
## [201,] 1.597851 2
## [202,] -0.633696 2
## [203,] -0.185131 2
## [204,] 4.512309 3
## [205,] 0.322684 2
## [206,] 2.132022 2
## [207,] 0.633030 2
## [208,] -1.464315 2
## [209,] 0.616901 2
## [210,] 1.914395 2
## [211,] 0.597460 2
## [212,] -0.644290 2
## [213,] -1.164916 2
## [214,] -1.223953 2
## [215,] -0.281828 2
## [216,] 0.041363 2
## [217,] -1.548751 2
## [218,] -3.422550 1
## [219,] -0.380394 2
## [220,] -1.034879 2
## [221,] -0.028972 2
## [222,] -1.996684 2
## [223,] 2.286061 2
## [224,] 2.245844 2
## [225,] 0.265578 2
## [226,] 1.358640 2
## [227,] -0.324427 2
## [228,] 1.573070 2
## [229,] 0.642918 2
## [230,] -0.951449 2
## [231,] -0.870188 2
## [232,] -1.645440 2
## [233,] 1.051795 2
## [234,] 1.975866 2
## [235,] 1.577824 2
## [236,] 3.290574 3
## [237,] -1.087379 2
## [238,] 1.201224 2
## [239,] 1.725039 2
## [240,] 0.473711 2
## [241,] 0.295034 2
## [242,] 1.004205 2
## [243,] -0.088265 2
## [244,] -1.013600 2
## [245,] 1.747149 2
## [246,] -1.090342 2
## [247,] -1.322206 2
## [248,] -0.378300 2
## [249,] 4.185265 3
## [250,] 0.682057 2
## DGP for Unordered, Logit Dependent Variable ##
set.seed(09948)
# Set true parameter values
X.u <- rnorm(1000) # Create a sample of 1000 obs on the on the independent variable X with normal distribution
plot(density(X.u))
b0.u1 <- 0.25 # Intercept for category 1
b1.u1 <- 0.5 # B1 for category 1
b0.u2 <- 1 # Intercept for category 2
b1.u2 <- 2 # B1 for category 2
# Calculate probabilities
p1 <- (exp(b0.u1 + b1.u1*X.u))/(1 + exp(b0.u1 + b1.u1*X.u) + exp(b0.u2 + b1.u2*X.u)) # Probability for category 1
p2 <- (exp(b0.u2 + b1.u2*X.u))/(1 + exp(b0.u1 + b1.u1*X.u) + exp(b0.u2 + b1.u2*X.u)) # Probability for category 2
p3 <- 1 - (p1 + p2) # Probability for category 3
head((p1 + p2 + p3), 250) # Checks out
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [36] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [71] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [106] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [141] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [176] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [211] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [246] 1 1 1 1 1
# Generate unordered dependent variable
Y.u <- rep(NA, n) # Create vector for saving dependent variable
# Run for loop sampling from each possibly category based on calculated probabilities
set.seed (1025)
for (i in 1:n){
Y.u[i] <- sample(1:3, 1, prob=c(p1[i], p2[i], p3[i])) # Draw one sample from the 3 categories given the probabilities
}
After creating the DGPs, the next step is to estimate an ordered logit and a multinomial logit on each of the dependent variables (four models in total). Then, for each of the four models, compute the change in predicted probability for each category corresponding with moving from the minimum to the maximum value of the independent variable. If you are doing the optional challenge, your code should store these computations, then repeat the process again.
# Create data frames for ordered and ordered dependent variable
df.o <- data.frame(Y.o, X.o)
df.u <- data.frame(Y.u, X.u)
# Estimate ordered and multinomial logit on ordered and unordered dependent
# variables
mo.o <- zelig(as.ordered(Y.o) ~ X.o, model = "ologit", data = df.o, cite = F) # Ordered model on ordered dependent variable
mu.o <- zelig(as.factor(Y.o) ~ X.o, model = "mlogit", data = df.o, cite = F) # Multinomial model on ordered dependent variable
##
## Attaching package: 'VGAM'
##
## The following object is masked from 'package:stats4':
##
## coef
##
## The following objects are masked from 'package:splines':
##
## bs, ns
##
## The following objects are masked from 'package:boot':
##
## logit, simplex
##
## The following objects are masked from 'package:stats':
##
## case.names, coef, coefficients, df.residual, dfbeta, fitted,
## fitted.values, formula, hatvalues, poly, residuals,
## variable.names, weights
##
## The following objects are masked from 'package:base':
##
## identity, scale.default
mo.u <- zelig(as.ordered(Y.u) ~ X.u, model = "ologit", data = df.u, cite = F) # Ordered model on unordered dependent variable
mu.u <- zelig(as.factor(Y.u) ~ X.u, model = "mlogit", data = df.u, cite = F) # Multinomial model on unordered dependent variable
summary(mo.o)
## Call:
## zelig(formula = as.ordered(Y.o) ~ X.o, model = "ologit", data = df.o,
## cite = F)
##
## Coefficients:
## Value Std. Error t value
## X.o 0.47 0.159 2.96
##
## Intercepts:
## Value Std. Error t value
## 1|2 -2.865 0.140 -20.520
## 2|3 2.387 0.114 20.868
##
## Residual Deviance: 998.72
## AIC: 1004.72
summary(mu.o)
##
## Call:
## zelig(formula = as.factor(Y.o) ~ X.o, model = "mlogit", data = df.o,
## cite = F)
##
## Pearson residuals:
## Min 1Q Median 3Q Max
## log(mu[,1]/mu[,3]) -2.2 -0.13 -0.10 -0.083 4.71
## log(mu[,2]/mu[,3]) -3.7 0.36 0.38 0.407 0.45
##
## Coefficients:
## Estimate Std. Error z value
## (Intercept):1 -0.42 0.18 -2.4
## (Intercept):2 2.33 0.12 20.0
## X.o:1 -0.85 0.30 -2.8
## X.o:2 -0.51 0.20 -2.5
##
## Number of linear predictors: 2
##
## Names of linear predictors: log(mu[,1]/mu[,3]), log(mu[,2]/mu[,3])
##
## Dispersion Parameter for multinomial family: 1
##
## Residual deviance: 998.5 on 1996 degrees of freedom
##
## Log-likelihood: -499.2 on 1996 degrees of freedom
##
## Number of iterations: 5
summary(mo.u)
## Call:
## zelig(formula = as.ordered(Y.u) ~ X.u, model = "ologit", data = df.u,
## cite = F)
##
## Coefficients:
## Value Std. Error t value
## X.u -0.372 0.0625 -5.95
##
## Intercepts:
## Value Std. Error t value
## 1|2 -1.397 0.080 -17.487
## 2|3 1.029 0.073 14.084
##
## Residual Deviance: 1997.00
## AIC: 2003.00
summary(mu.u)
##
## Call:
## zelig(formula = as.factor(Y.u) ~ X.u, model = "mlogit", data = df.u,
## cite = F)
##
## Pearson residuals:
## Min 1Q Median 3Q Max
## log(mu[,1]/mu[,3]) -7.1 -0.54 -0.23 -0.11 3.9
## log(mu[,2]/mu[,3]) -10.5 -0.58 0.18 0.59 3.7
##
## Coefficients:
## Estimate Std. Error z value
## (Intercept):1 0.15 0.12 1.3
## (Intercept):2 0.98 0.10 9.7
## X.u:1 0.60 0.12 5.0
## X.u:2 1.96 0.13 14.7
##
## Number of linear predictors: 2
##
## Names of linear predictors: log(mu[,1]/mu[,3]), log(mu[,2]/mu[,3])
##
## Dispersion Parameter for multinomial family: 1
##
## Residual deviance: 1645 on 1996 degrees of freedom
##
## Log-likelihood: -822.4 on 1996 degrees of freedom
##
## Number of iterations: 5
# Compute change in predicted probability for each category corresponding
# with moving from min to max of the independent variable
set.seed(3423)
# Create matrices for storing simulations of predicted probabilities
pe.moo <- matrix(NA, nrow = 1000, ncol = 3)
colnames(pe.moo) <- c("1", "2", "3")
pe.muo <- matrix(NA, nrow = 1000, ncol = 3)
colnames(pe.muo) <- c("1", "2", "3")
pe.mou <- matrix(NA, nrow = 1000, ncol = 3)
colnames(pe.mou) <- c("1", "2", "3")
pe.muu <- matrix(NA, nrow = 1000, ncol = 3)
colnames(pe.muu) <- c("1", "2", "3")
results <- matrix(NA, nrow = 4, ncol = 6) # Report table
colnames(results) <- c("Ordrd DV - 1", "Ordrd DV - 2", "Ordrd DV - 3", "Unordrd DV - 1",
"Unordrd DV - 2", "Unordrd DV - 3")
rownames(results) <- c("Ordrd Model - Pred Prob", "Ordrd Model - SD", "Multinom Model - Pred Prob",
"Multinom Model - SD")
# Predicted probability for mo.o
mooX.min <- setx(mo.o, X.o = min(X.o))
mooX.max <- setx(mo.o, X.o = max(X.o))
for (i in 1:n) {
sim.moo <- sim(mo.o, x = mooX.min, x1 = mooX.max)
pe.moo[i, 1] <- apply(sim.moo$qi$fd, 2, mean)[1]
pe.moo[i, 2] <- apply(sim.moo$qi$fd, 2, mean)[2]
pe.moo[i, 3] <- apply(sim.moo$qi$fd, 2, mean)[3]
}
results[1, 1] <- apply(pe.moo, 2, mean)[1]
results[1, 2] <- apply(pe.moo, 2, mean)[2]
results[1, 3] <- apply(pe.moo, 2, mean)[3]
results[2, 1] <- apply(pe.moo, 2, sd)[1]
results[2, 2] <- apply(pe.moo, 2, sd)[2]
results[2, 3] <- apply(pe.moo, 2, sd)[3]
# Predicted probability for mu.o
muoX.min <- setx(mu.o, X.o = min(X.o))
muoX.max <- setx(mu.o, X.o = max(X.o))
for (i in 1:n) {
sim.muo <- sim(mu.o, x = muoX.min, x1 = muoX.max)
pe.muo[i, 1] <- apply(sim.muo$qi$fd, 2, mean)[1]
pe.muo[i, 2] <- apply(sim.muo$qi$fd, 2, mean)[2]
pe.muo[i, 3] <- apply(sim.muo$qi$fd, 2, mean)[3]
}
results[3, 1] <- apply(pe.muo, 2, mean)[1]
results[3, 2] <- apply(pe.muo, 2, mean)[2]
results[3, 3] <- apply(pe.muo, 2, mean)[3]
results[4, 1] <- apply(pe.muo, 2, sd)[1]
results[4, 2] <- apply(pe.muo, 2, sd)[2]
results[4, 3] <- apply(pe.moo, 2, sd)[3]
# Predicted probability for mo.u
mouX.min <- setx(mo.u, X.u = min(X.u))
mouX.max <- setx(mo.u, X.u = max(X.u))
for (i in 1:n) {
sim.mou <- sim(mo.o, x = mouX.min, x1 = mouX.max)
pe.mou[i, 1] <- apply(sim.mou$qi$fd, 2, mean)[1]
pe.mou[i, 2] <- apply(sim.mou$qi$fd, 2, mean)[2]
pe.mou[i, 3] <- apply(sim.mou$qi$fd, 2, mean)[3]
}
results[1, 4] <- apply(pe.mou, 2, mean)[1]
results[1, 5] <- apply(pe.mou, 2, mean)[2]
results[1, 6] <- apply(pe.mou, 2, mean)[3]
results[2, 4] <- apply(pe.mou, 2, sd)[1]
results[2, 5] <- apply(pe.mou, 2, sd)[2]
results[2, 6] <- apply(pe.mou, 2, sd)[3]
# Predicted probability for mu.u
muuX.min <- setx(mu.u, X.u = min(X.u))
muuX.max <- setx(mu.u, X.u = max(X.u))
for (i in 1:n) {
sim.muu <- sim(mo.o, x = mouX.min, x1 = mouX.max)
pe.muu[i, 1] <- apply(sim.muu$qi$fd, 2, mean)[1]
pe.muu[i, 2] <- apply(sim.muu$qi$fd, 2, mean)[2]
pe.muu[i, 3] <- apply(sim.muu$qi$fd, 2, mean)[3]
}
results[3, 4] <- apply(pe.muu, 2, mean)[1]
results[3, 5] <- apply(pe.muu, 2, mean)[2]
results[3, 6] <- apply(pe.muu, 2, mean)[3]
results[4, 4] <- apply(pe.muu, 2, sd)[1]
results[4, 5] <- apply(pe.muu, 2, sd)[2]
results[4, 6] <- apply(pe.muu, 2, sd)[3]
Next, report the changes in the predicted probabilities computed. If doing the optional challenge, report the means and the standard deviations of each change across the simulations. Make sure it is easy to compare within each dependent variable.
Finally, explain in 2-3 sentences what the results say about Scott Long’s claim with regard to bias. If doing the optional challenge, add another 1-2 sentences about what the results say about his claim with regard to efficiency.
results
## Ordrd DV - 1 Ordrd DV - 2 Ordrd DV - 3
## Ordrd Model - Pred Prob -0.0494949 -0.024689 0.074184
## Ordrd Model - SD 0.0017408 0.001314 0.002537
## Multinom Model - Pred Prob -0.0406361 -0.042661 0.083297
## Multinom Model - SD 0.0008614 0.001277 0.002537
## Unordrd DV - 1 Unordrd DV - 2 Unordrd DV - 3
## Ordrd Model - Pred Prob -0.27012 -0.018431 0.28855
## Ordrd Model - SD 0.01279 0.003665 0.01141
## Multinom Model - Pred Prob -0.27052 -0.018475 0.28899
## Multinom Model - SD 0.01294 0.003715 0.01159
Looking at the results of the regression models, the ordered logit model predicted the coefficient for the ordinal dependent variable fairly well (0.47 versus 0.5 for the true coefficient). The coefficients for the multinomial logit are in relation to the 3rd category. Comparing the predicted probabilities, both models are similar to 2 decimals, though the multinomial logit shows some inefficiency in predicting the 1st and 2nd category, showing some evidence supporting Long's claim.
The ordered logit model makes one coefficient prediction for the nominal dependent variable (-0.372), which doesn't make sense since there are three sets of coefficients for the three categories. The nominal logit model predicts the coefficient sets close enough. Looking at the results matrix, both models are the same up to 3 decimals for both the predicted probability and SD of each category.