A simulated dataset

To show how to constraint parameters using gmnl package, I create an artificial mixed logit dataset:

rm(list = ls(all = TRUE))          # Clean objects
library("gmnl")
library("mlogit")

The true latent utility is given by: \[ \begin{equation*} U_{ijt}^* = \beta_1x_{1,ijt} + \beta_{2i}x_{4,ijt} + \beta_{3i}x_{3,ijt}+\epsilon_{ijt} \end{equation*} \] where \(\beta_1 = 1\) is a fixed parameter; \(\beta_{2i}\sim N(0, 2)\), \(\beta_{3i}\sim N(2, 1)\) and \(\epsilon_{ijt}\) is distributed as Extreme Value Type I. For this sample we set \(N = 300\), \(J = 4\) and 10 choice situations per indidivudal.

I first set the globals (seed, number of individuals, number of alternatives, etc) for the simulation:

# Globals
set.seed(250586)             # Set seed
N <- 300                     # Number of individuals
J <- 4                       # Number of alternatives
t <- 10                      # Number of choice situations
R <- 50                      # Numer of draws for simulation
id <- rep(1:N, each = J * t) # Id for each individual

Now the two normally distributed parameters and the variables are constructed

# Create random parameters and variables
x1 <- rnorm(N * J * t)                   # continuous variable
x2 <- as.numeric(runif(N * J * t) < 0.5) # dummy variable
x3 <- rnorm(N * J * t)                   # continuous variable
b2 <- 0 + 2 * rnorm(N) # Normal
b3 <- 2 + 1 * rnorm(N) # Normal

Note that x2 is a dummy variable, whereas x1 and x3 are continuous. The latent utility for each individual is created as follows:

# The true data generating process
ex.value <- -log(-log(runif(N * J * T)))
U <-  1 * x1 + b2[id] * x2 + b3[id] * x3 + ex.value

The chosen alternative is based on the alternative with the highest utility. That is:

\[ \begin{equation*} y_{it} = j \quad iff\quad U_{ijt}^* > U_{ikt}^*\quad \forall k\neq j \end{equation*} \]

The code is the following:

# Chosen alternative
chosen <- rep(0, N * J * t)
for (i in 1:(N * t)) {
  U_j <- U[((i - 1) * J + 1):(i * J)]
  U_max <- max(U_j)
  pos <- which(U_j == U_max)
  chosen[((i - 1) * J + 1):(i * J)][pos] <- TRUE
}

Putting the data in the correct format using mlogit.data function from mlogit package:

data <- as.data.frame(cbind(id, chosen, x1, x2))
data <- mlogit.data(data, 
                    choice = "chosen",
                    shape = "long", 
                    alt.levels = c("1", "2", "3", "4"), 
                    id.var = "id")

To check if the DGP was created correctly, we estimate a MIXL model:

mix <- gmnl(chosen ~ x1 + x2 + x3 | 0, 
            data = data,
            model = "mixl",
            ranp = c(x2 = "n", x3 = "n"),
            print.init = TRUE,
            R = 50,
            haltons = list("primes" = c(2, 3), 
                           "drop" = c(101, 101)),
            panel = TRUE,
            print.level = 2,
            iterlim = 500)
## 
## Starting Values:
##          x1          x2          x3       sd.x2       sd.x3 
##  0.73949876 -0.07700946  1.32380655  0.10000000  0.10000000 
## Estimating MIXL model 
## initial  value 2738.820501 
## iter   2 value 2555.118313
## iter   3 value 2508.578660
## iter   4 value 2501.102612
## iter   5 value 2498.606580
## iter   6 value 2496.552864
## iter   7 value 2494.548071
## iter   8 value 2450.235272
## iter   9 value 2448.259448
## iter  10 value 2447.974879
## iter  11 value 2447.956573
## iter  12 value 2447.955547
## iter  13 value 2447.955382
## iter  14 value 2447.955300
## iter  14 value 2447.955268
## iter  14 value 2447.955258
## final  value 2447.955258 
## converged
summary(mix)
## 
## Model estimated on: Sun Oct 08 11:57:08 2017 
## 
## Call:
## gmnl(formula = chosen ~ x1 + x2 + x3 | 0, data = data, model = "mixl", 
##     ranp = c(x2 = "n", x3 = "n"), R = 50, haltons = list(primes = c(2, 
##         3), drop = c(101, 101)), panel = TRUE, print.init = TRUE, 
##     print.level = 2, iterlim = 500, method = "bfgs")
## 
## Frequencies of categories:
## 
##       1       2       3       4 
## 0.25033 0.25133 0.25900 0.23933 
## 
## The estimation took: 0h:0m:46s 
## 
## Coefficients:
##        Estimate Std. Error z-value Pr(>|z|)    
## x1     0.967428   0.038132  25.371   <2e-16 ***
## x2    -0.168133   0.125476  -1.340   0.1803    
## x3     1.878064   0.079775  23.542   <2e-16 ***
## sd.x2  1.929194   0.127069  15.182   <2e-16 ***
## sd.x3  0.929376   0.071220  13.049   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Optimization of log-likelihood by BFGS maximization
## Log Likelihood: -2448
## Number of observations: 3000
## Number of iterations: 47
## Exit of MLE: successful convergence 
## Simulation based on 50 draws

The point estimates are very close to the true population parameters. So, it seems that the dataset was created correctly.

Constraining using arguments from maxLik

gmnl package uses maxLik function to maximize the log-likelihood functions. maxLik is basically a wrapper for different optimizers returning an object of class ``maxLik’’.

Among the different features of maxLik, it allows constraint optimization by using the argument constraints. This argument can be either NULL for unconstrained optimization (the default in gmnl) or a list with two components. The components may be either eqA and eqB for equality-constrained optimization \(A\theta + B = 0\); or ineqA and ineqB for inequality constraints \(A\theta + B > 0\). More than one row in ineqA and ineqB corresponds to more than one linear constraint, in that case all these must be zero (equality) or positive (inequality constraints).

Suppose we would like to constraint the standard deviation parameters such that sd.x2 = sd.x3. To do so, we use constraints argument passed to maxLik optimizer. The matrix of constraints are:

# With constrained Parameters: sd.x2 == sd.x3
A <- matrix(c(0, 0, 0 , 1, -1), nrow = 1 , ncol = 5)  # Matrix of restrictions. See help(maxLik)
B <- 0 

Note that setting the constraint sd.x2 = sd.x3 is equal to sd.x2 - sd.x3 = 0. Now we estimate a MIXL model including this constraint:

mix_c <- gmnl(chosen ~ x1 + x2 + x3| 0, 
            data = data,
            model = "mixl",
            ranp = c(x2 = "n", x3 = "n"),
            print.init = TRUE,
            R = 50,
            haltons = list("primes" = c(2, 3), "drop" = c(101, 101)),
            panel = TRUE,
            constraints = list(eqA = A, eqB = B),
            print.level = 2,
            iterlim = 500)
## 
## Starting Values:
##          x1          x2          x3       sd.x2       sd.x3 
##  0.73949876 -0.07700946  1.32380655  0.10000000  0.10000000 
## Estimating MIXL model
summary(mix_c)
## 
## Model estimated on: Sun Oct 08 12:02:33 2017 
## 
## Call:
## gmnl(formula = chosen ~ x1 + x2 + x3 | 0, data = data, model = "mixl", 
##     ranp = c(x2 = "n", x3 = "n"), R = 50, haltons = list(primes = c(2, 
##         3), drop = c(101, 101)), panel = TRUE, print.init = TRUE, 
##     constraints = list(eqA = A, eqB = B), print.level = 2, iterlim = 500, 
##     method = "bfgs")
## 
## Frequencies of categories:
## 
##       1       2       3       4 
## 0.25033 0.25133 0.25900 0.23933 
## 
## The estimation took: 0h:5m:25s 
## 
## Coefficients:
##        Estimate Std. Error z-value Pr(>|z|)    
## x1     0.962760   0.038120 25.2562   <2e-16 ***
## x2    -0.154045   0.099288 -1.5515   0.1208    
## x3     2.015902   0.107110 18.8208   <2e-16 ***
## sd.x2  1.335914   0.081110 16.4704   <2e-16 ***
## sd.x3  1.335880   0.111285 12.0041   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Optimization of log-likelihood by BFGS maximization
## Log Likelihood: -2480.3
## Number of observations: 3000
## Number of iterations: 44
## Exit of MLE: successful convergence 
## Simulation based on 50 draws