Template for Multinomial Logit Model

Introduction

The multinomial logit (MNL) model uses the same code as the binary logit model. We will see two additional functions in apollo_probabilities() in step 5.

If there are multiple observations per individual, then we need to call the function apollo_panelProd(), which multiplies the probabilities of the choice observations for the same individual. We have been commenting out this command (using #) because we only have one observation per individual.

### Take product across observation for same individual
P = apollo_panelProd(P, apollo_inputs, functionality)
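
To make the idea concrete, here is a minimal base R sketch of what taking the product across observations means. It only illustrates the arithmetic and is not Apollo's internal implementation; the vectors id and p_obs are made-up toy values.

```r
# Toy data (hypothetical): individual 1 has two choice observations,
# individual 2 has three. p_obs is the probability of the chosen alternative.
id    <- c(1, 1, 2, 2, 2)
p_obs <- c(0.6, 0.5, 0.8, 0.7, 0.9)

# Multiply the observation-level probabilities within each individual
p_indiv <- tapply(p_obs, id, prod)
p_indiv
```

The result has one value per individual rather than one per observation, which is the form the likelihood needs when choices from the same person are grouped together.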

If we wish to include weights, then we need to call the function apollo_weighting() prior to apollo_prepareProb(). The command is shown below. We also need to identify the weights column in apollo_control, which is discussed in step 1.

### accounts for the weights
P = apollo_weighting(P, apollo_inputs, functionality)
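
As a rough illustration of what weighting does to the likelihood, the base R sketch below compares an unweighted and a weighted log-likelihood. The values of p and w are invented, and the sketch shows the general idea only, not Apollo's own code.

```r
# Toy values (hypothetical): probability of the chosen alternative and
# the sampling weight of each observation
p <- c(0.6, 0.5, 0.8)
w <- c(1.2, 0.8, 1.0)

# Unweighted and weighted log-likelihood
ll_unweighted <- sum(log(p))
ll_weighted   <- sum(w * log(p))

# Weighting the log-probability is the same as raising the probability
# to the power of the weight
all.equal(ll_weighted, sum(log(p^w)))
```

Observations with a weight above 1 count for more in the weighted log-likelihood, and those with a weight below 1 count for less.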

The apollo_probabilities() function always ends with the same two commands. The first is apollo_prepareProb(), which prepares the output of the function, followed by the command return(P).

### Prepare and return outputs of function
P = apollo_prepareProb(P, apollo_inputs, functionality)

return(P)

Note - Many examples, along with a few datasets, can be downloaded from the Apollo choice modelling website. Chapter 4 of the Apollo manual explains example no. 3 in detail. However, I feel that the details in chapter 4 can be overwhelming for those who are new to discrete choice modelling.

The code is broken down into a series of steps. It deviates very slightly from the manual in step 2, because we will use the tidyverse package to read, analyze and preprocess the data. It is more flexible and a very useful tool for the future.

Step 1 - Initialisation
- set working directory
- set apollo_control
  - identify weights column along with identifier column
Step 2 - Data
- read file (csv or spss)
- clean data (optional)
- if we read spss file, remove labels/label/attributes
- convert tibble to dataframe
Step 3 - Define model parameters
- initialise parameters
Step 4 - Validate Inputs
Step 5 - Define apollo probabilities
- define utility equation
- set mnl_settings
Step 6 - Estimate model
Step 7 - Print and save output

Create a folder named Apollo MNL Model. Start a new script by clicking on File %>% New File %>% R Script (read %>% as "then"). The script pane will be at the top left. Let’s save the script as MNL_Template.R (by clicking File %>% Save). We don’t need to type the extension .R, just as we don’t type the extension .png when saving an image. Save the script in the same folder (Apollo MNL Model).

Step 1 - Initialise the code

The first step is to initialise the code. We can break it into two sub-steps. You can read more details on page no. 19 of the Apollo manual.

Step 1-1 - Load libraries

We first clear the workspace, set the working directory, load the relevant libraries and then call the function apollo_initialise().

package - tidyverse is used to read, clean, transform and visualize the data
package - apollo is used for choice modelling
package - haven is used to read SPSS files

# --------------------------------------------------------------------------------
### Step 1 - Initialise the code 

#### Step 1-1 - Load libraries

# clear workspace
rm(list = ls())

# set working directory
setwd("ENTER PATH")  # use / instead of \ in the path
# e.g. setwd("C:/Users/gulhare.s/Desktop/Discrete choice analysis/Apollo Binary Logit")

# load relevant libraries
library(tidyverse) # for data analysis
library(apollo) # for mode choice modeling
library(haven) # to read SPSS files

# mandatory step
apollo_initialise()
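
If any of the three packages is not installed, library() stops with an error. A small sketch for checking this beforehand is shown below; the names pkgs, installed and not_installed are chosen here for illustration.

```r
pkgs <- c("tidyverse", "apollo", "haven")

# requireNamespace() returns FALSE (instead of failing) when a package is missing
installed     <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
not_installed <- pkgs[!installed]

# Print a ready-to-copy install command for anything that is missing
if (length(not_installed) > 0) {
  message("Install first: install.packages(c(",
          paste0('"', not_installed, '"', collapse = ", "), "))")
}
```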

Step 1-2 - Set Apollo controls

We have to set the core controls. In this case, we give the name and description of the model. We also identify the column which identifies the individual decision makers and the column which contains the weights (if required). The details can be read on page no. 21 of the manual.

# --------------------------------------------------------------------------------
#### Step 1-2 - Set Apollo controls

apollo_control = list(
  modelName  ="ENTER MODEL_NAME",
  modelDescr =" ",
  indivID    ="ENTER INDIVIDUAL IDENTIFIER",
  weights    ="ENTER WEIGHTS IDENTIFIER" 
)
# e.g.
# apollo_control = list(
#   modelName  ="Model_MNL",
#   modelDescr ="MNL sample code",
#   indivID    ="PersonID",
#   weights    ="WEIGHT" 
# )
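
If the data has no weights column, a plausible variant is to leave the weights entry out altogether. The sketch below assumes that case; the model name and description are made up. The apollo_weighting() call in step 5 relies on the weights setting, so it should then be commented out as well.

```r
# Hypothetical unweighted variant of the example above
apollo_control = list(
  modelName  = "Model_MNL_unweighted",
  modelDescr = "MNL sample code without weights",
  indivID    = "PersonID"
)
```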

Step 2 - Data

In this step we read the data from a file (.csv for CSV, .sav for SPSS) and, if required, clean and transform the data.

Step 2-1 - Read data

You can read in detail about reading/importing csv files in chapter 11 of the book R for Data Science.

# --------------------------------------------------------------------------------
### Step 2 - Data

#### Step 2-1 - Read data

# TO READ CSV FILE (OPTIONAL)
tbl <- read_csv("ENTER FILE NAME")
# e.g. tbl <- read_csv("Data.csv")

# READ SPSS FILE (OPTIONAL)
tbl_spss <- read_sav("ENTER FILE NAME")
# e.g. tbl_spss <- read_sav("Data.sav")

Step 2-2 - Clean and transform data

This step involves cleaning and transforming the data. If we read the data from an SPSS file, we need to remove the labels and other attributes from the tibble using the command below. Note that we are working with the data type tibble, but Apollo requires the data in the form of a dataframe. So we convert the tibble into a dataframe using the function as.data.frame() and assign it to a variable called database.

# ----------------------------------------------------------------------------------
#### Step 2-2 - Clean and transform data

# If we read data from SPSS file, we need to remove labels/label/attributes from tibble (OPTIONAL)
tbl_spss[] <- lapply(tbl_spss, function(x) {attributes(x) <- NULL;x})

# convert tibble into dataframe (use tbl instead of tbl_spss if you read a csv file)
database <- as.data.frame(tbl_spss)
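
To see what the attribute-stripping command does without needing an SPSS file, here is a small base R sketch. The data frame toy and its label are invented; the label imitates the kind of metadata that read_sav() attaches to SPSS variables.

```r
# Toy data frame (hypothetical) with a label attribute on one column
toy <- data.frame(choice = c(1, 2, 3), TTDA = c(10.5, 20, 15))
attr(toy$choice, "label") <- "Chosen mode"
attributes(toy$choice)          # the column carries a label

# Same command as in the template: drop all attributes of every column
toy[] <- lapply(toy, function(x) {attributes(x) <- NULL; x})

attributes(toy$choice)          # NULL
class(as.data.frame(toy))       # "data.frame"
```

Note that this command removes every attribute, so a factor column would be reduced to its integer codes and a date column to plain numbers. Convert such columns beforehand if their values matter.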

Step 3 - Define model parameters

We need to define a vector apollo_beta which contains the parameters and their starting values. We initialise them with zeros. In the commented example, we need to estimate alternative-specific constants and beta parameters for travel time and travel cost. We also need to define another vector apollo_fixed which contains the names of the parameters whose values are kept fixed.

# --------------------------------------------------------------------------------
### Step 3 - Define model parameters

### Vector of parameters, including any that are kept fixed in estimation
apollo_beta = c("INITIALIZE PARAMETERS")
# e.g.
# apollo_beta = c(asc_DA = 0,
#                 asc_SR = 0,
#                 asc_TR = 0,
#                   
#                 b_tt    = 0,
#                 b_tc    = 0)

### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
apollo_fixed = c("ENTER PARAMETER THAT HAS TO BE FIXED")
# e.g.
# apollo_fixed = c("asc_TR")
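
A quick way to see which parameters will actually be estimated is to compare the two vectors. The sketch below reuses the values from the commented example above. Fixing asc_TR at zero makes transit the reference alternative, which is the usual approach because only differences in utility affect the choice probabilities.

```r
# Values from the commented example above
apollo_beta  <- c(asc_DA = 0, asc_SR = 0, asc_TR = 0, b_tt = 0, b_tc = 0)
apollo_fixed <- c("asc_TR")

# Every fixed name must exist in apollo_beta
stopifnot(all(apollo_fixed %in% names(apollo_beta)))

# Parameters that will actually be estimated
setdiff(names(apollo_beta), apollo_fixed)
```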

Step 4 - Validate Inputs

This function runs a number of checks and produces a consolidated list of model inputs. It looks for the following inputs in the global environment -

apollo_control
apollo_beta
apollo_fixed
database
It also searches for the identifier indivID (declared in step 1-2) in the database.

If any of these is missing from the global environment, then apollo_validateInputs() fails.

# --------------------------------------------------------------------------------
### Step 4 - Validate inputs

apollo_inputs = apollo_validateInputs()
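
If the validation fails, it can help to check by hand which of the required objects are missing. The helper below is a hypothetical convenience function written for this post, not part of Apollo. It is demonstrated on a scratch environment so that it does not touch your own objects.

```r
# Hypothetical helper: report which required objects are missing
# from an environment (the global environment by default)
missing_inputs <- function(env = globalenv()) {
  required <- c("apollo_control", "apollo_beta", "apollo_fixed", "database")
  found <- vapply(required, exists, logical(1), envir = env, inherits = FALSE)
  required[!found]
}

# Demonstration in a scratch environment that holds only two of the four objects
scratch <- new.env()
assign("apollo_beta", c(b_tt = 0), envir = scratch)
assign("database", data.frame(PersonID = 1:3), envir = scratch)
missing_inputs(scratch)
```

In your own script, calling missing_inputs() without arguments checks the global environment; an empty result means all four objects are present.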

Step 5 - Define apollo probabilities

Unlike the other functions, which are predefined, apollo_probabilities() needs to be defined by the user. It is used by another function, apollo_estimate(), in step 6. apollo_probabilities() takes three inputs -

apollo_beta
apollo_inputs
functionality, which takes a default value of “estimate”

The first line of the function, apollo_attach(apollo_beta, apollo_inputs), enables us to refer to individual elements of the database directly, e.g. using tt instead of database$tt. The command on.exit(apollo_detach(apollo_beta, apollo_inputs)) reverses the first command as soon as the code exits apollo_probabilities(). The details can be read in section 4.5.1 of the manual.

We then define the actual model, i.e. a list of utility equations V, followed by the settings for the MNL component (mnl_settings).

# --------------------------------------------------------------------------------
### Step 5 - Define apollo probabilities

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){

  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### DEFINE UTILITY EQUATIONS HERE *************************************
  
  ### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
  V = list()
  V[["ENTER_MODE_1"]]  = ENTER UTILITY EQUATION FOR MODE 1
  V[["ENTER_MODE_2"]]  = ENTER UTILITY EQUATION FOR MODE 2
  V[["ENTER_MODE_3"]]  = ENTER UTILITY EQUATION FOR MODE 3
  # e.g.
  # V[["DA"]]  = asc_DA + b_tt * TTDA + b_tc * TCDA
  # V[["SR"]]  = asc_SR + b_tt * TTSR + b_tc * TCSR
  # V[["TR"]]  = asc_TR + b_tt * TTTR + b_tc * TCTR
  
  ### Define settings for MNL model component
  mnl_settings = list(
    alternatives  = c(ENTER_MODE_1 = IDENTIFIER_MODE_1, 
                      ENTER_MODE_2 = IDENTIFIER_MODE_2,
                      ENTER_MODE_3 = IDENTIFIER_MODE_3), 
    avail         = 1, 
    choiceVar     = ENTER_CHOICE_COLUMN_NAME,
    V             = V)
  # e.g.
  # mnl_settings = list(
  #   alternatives  = c(DA = 1, SR = 2, TR = 3), 
  #   avail         = 1, 
  #   choiceVar     = choice,
  #   V             = V)
  
  # *************************************************************************************
  
  
  ### Create list of probabilities P
  P = list()
  
  ### Compute probabilities using MNL model
  P[['model']] = apollo_mnl(mnl_settings, functionality)

  # ### Take product across observation for same individual
  # P = apollo_panelProd(P, apollo_inputs, functionality)

  ### accounts for the weights
  P = apollo_weighting(P, apollo_inputs, functionality)

  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  
  return(P)
}
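
To show how the utilities turn into choice probabilities in an MNL model, here is a small base R sketch that applies the logit formula by hand. It mirrors the commented utility example in the template, but the parameter values and the two rows of data are invented, and it is an illustration rather than what apollo_mnl() does internally.

```r
# Toy values (hypothetical): parameters and attributes for two individuals
beta <- c(asc_DA = 0.5, asc_SR = -0.2, asc_TR = 0, b_tt = -0.05, b_tc = -0.1)
toy  <- data.frame(TTDA = c(20, 35), TCDA = c(4, 6),
                   TTSR = c(25, 40), TCSR = c(2, 3),
                   TTTR = c(40, 45), TCTR = c(1, 1.5),
                   choice = c(1, 3))   # 1 = DA, 2 = SR, 3 = TR

# Utilities, one column per alternative
V <- with(toy, cbind(
  DA = beta[["asc_DA"]] + beta[["b_tt"]] * TTDA + beta[["b_tc"]] * TCDA,
  SR = beta[["asc_SR"]] + beta[["b_tt"]] * TTSR + beta[["b_tc"]] * TCSR,
  TR = beta[["asc_TR"]] + beta[["b_tt"]] * TTTR + beta[["b_tc"]] * TCTR
))

# MNL probabilities: exp(V) divided by the row sum of exp(V)
P_all <- exp(V) / rowSums(exp(V))

# Probability of the alternative each individual actually chose
P_chosen <- P_all[cbind(seq_len(nrow(toy)), toy$choice)]
P_all
P_chosen
```

Each row of P_all sums to one, and P_chosen holds the quantity whose logarithm enters the log-likelihood during estimation.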

Step 6 - Estimate model

# --------------------------------------------------------------------------------
### Step 6 - Estimate model

model = apollo_estimate(apollo_beta, 
                        apollo_fixed, 
                        apollo_probabilities, 
                        apollo_inputs)
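
Estimation means searching for the parameter values that maximise the log-likelihood of the observed choices. The base R sketch below illustrates that idea on simulated data using optim(). It is not how Apollo is implemented, and every name and value in it (n, tt, true_beta and so on) is made up for illustration.

```r
set.seed(1)  # reproducible toy data

# Simulate 500 choices among 3 alternatives (hypothetical data)
n  <- 500
tt <- matrix(runif(n * 3, 0.2, 1), ncol = 3)       # travel time in hours
true_beta <- c(asc_1 = 0.5, asc_2 = -0.3, b_tt = -2)

# Utilities for parameters b = (asc_1, asc_2, b_tt); asc of alternative 3 is fixed at 0
utilities <- function(b) {
  asc <- matrix(c(b[1], b[2], 0), nrow = n, ncol = 3, byrow = TRUE)
  asc + b[3] * tt
}

# Choices: alternative with the highest utility after adding Gumbel errors
U      <- utilities(true_beta) + matrix(-log(-log(runif(n * 3))), ncol = 3)
choice <- max.col(U)

# Negative log-likelihood of the MNL model
neg_loglik <- function(b) {
  V <- utilities(b)
  V <- V - apply(V, 1, max)                        # guard against overflow in exp()
  P <- exp(V) / rowSums(exp(V))
  -sum(log(P[cbind(seq_len(n), choice)]))
}

# Start from zeros, as in apollo_beta, and minimise the negative log-likelihood
start <- c(asc_1 = 0, asc_2 = 0, b_tt = 0)
fit   <- optim(start, neg_loglik, method = "BFGS")

rbind(true = true_beta, estimated = fit$par)
```

With all parameters at zero every alternative has probability 1/3, so the starting log-likelihood is -n * log(3); the optimiser improves on that value as it moves the parameters towards the ones used to simulate the data.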

Step 7 - Model output

# --------------------------------------------------------------------------------
### Step 7 - Model output

apollo_modelOutput(model)

# save output 
apollo_saveOutput(model)