Example of Binary Logit Model

Introduction

This document provides sample code for assignment question 3a. Many examples, along with a few datasets, can be downloaded from the Apollo choice modelling website. Chapter 4 of the Apollo manual explains example no. 3 in detail. However, I feel that the level of detail in chapter 4 can be overwhelming for those who are new to discrete choice modelling.

The code below is broken down into a series of steps. It deviates slightly from the manual in step 2, because we use the tidyverse package to read, analyze, and preprocess the data. The tidyverse is more flexible and will be a useful tool in future work.

Step 1 - Initialisation
- set working directory
- set apollo_control
Step 2 - Data
- read file (csv or spss)
- clean data (optional)
- if we read spss file, remove labels/label/attributes
- convert tibble to dataframe
Step 3 - Define model parameters
- initialise parameters
Step 4 - Validate Inputs
Step 5 - Define apollo probabilities
- define utility equation
- set mnl_settings
Step 6 - Estimate model
Step 7 - Print and save output
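
Step 2 lists CSV as an alternative input format, while the example code reads an SPSS file. The following is a minimal sketch of the CSV route, assuming a hypothetical two-row dataset that is written to a temporary file only so that the snippet runs on its own. The column values are made up and are not from the assignment data.

```r
# Hypothetical two-row dataset written to a temporary CSV (illustration only)
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(PersonID = 1:2, Mode = c(1, 0)),
          csv_path, row.names = FALSE)

# Base R read.csv() returns a plain data frame, so no label removal is needed
database <- read.csv(csv_path)

# If readr::read_csv() from the tidyverse is used instead, the result is a
# tibble and should still be converted with as.data.frame()
database <- as.data.frame(database)

str(database)
```

With a CSV file there are no SPSS labels to remove, so only the conversion to a data frame remains from Step 2.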

Example code - Assignment 3a

The sample code below estimates a binary logit model for assignment question 3a.
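
For reference, the model specified in Step 5 of the code can be written out as follows. The symbols $V_{DA}$ and $V_{Tr}$ are shorthand introduced here for the two utilities. The variable and parameter names are the ones used in the code.

```latex
\begin{aligned}
V_{DA} &= \text{asc\_DA} + \beta_{IVTT}\,\text{IVTT\_C} + \beta_{OVTT}\,\text{OVTT\_C} + \beta_{Cost}\,\text{Cost\_C} \\
V_{Tr} &= \text{asc\_Tr} + \beta_{IVTT}\,\text{IVTT\_Tr} + \beta_{OVTT}\,\text{OVTT\_Tr} + \beta_{Cost}\,\text{Cost\_Tr} \\
P(DA)  &= \frac{e^{V_{DA}}}{e^{V_{DA}} + e^{V_{Tr}}} = \frac{1}{1 + e^{-(V_{DA} - V_{Tr})}} \\
P(Tr)  &= 1 - P(DA)
\end{aligned}
```

Only the difference $V_{DA} - V_{Tr}$ enters the probabilities, which is why one of the two alternative-specific constants (here asc_Tr) is kept fixed at zero.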

# --------------------------------------------------------------------------------
### Step 1 - Initialise the code 

# clean workspace
rm(list = ls())

# (OPTIONAL) set working directory
setwd("C:/Users/gulhare.s/Dropbox (UFL)/PhD/Discrete choice modeling/Article 4 - Binary Logit Model")

# load libraries
library(tidyverse) # for data science
library(haven) # to read SPSS files
library(apollo) # for mode choice analysis

# mandatory step
apollo_initialise()

#### Step 1-2 - Set Apollo controls
apollo_control = list(
  modelName  = "Model_3a",
  modelDescr ="Model 3a",
  indivID    ="PersonID"
)

# --------------------------------------------------------------------------------
### Step 2 - Data

# read SPSS file
tbl_spss <- read_sav("Dataset_assign1.sav")

# read labels and assign it to tbl_labelled (OPTIONAL)
tbl_labelled <- as_factor(tbl_spss)

# ----------------------------------------------------------------------------------
#### Step 2-2 - Clean and transform data

# remove labels/attributes from tibble
tbl <- zap_labels(zap_formats(zap_label(tbl_spss)))

# convert tibble to data frame; always name it `database`, since Apollo looks for an object with that name
database <- as.data.frame(tbl)


# --------------------------------------------------------------------------------
### Step 3 - Define model parameters

### Vector of parameters, including any that are kept fixed in estimation
apollo_beta = c(asc_DA = 0,
                asc_Tr = 0,
                
                beta_IVTT  = 0,
                beta_OVTT  = 0,
                beta_Cost  = 0)

### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
apollo_fixed = c("asc_Tr")

# --------------------------------------------------------------------------------
### Step 4 - Validate inputs
apollo_inputs = apollo_validateInputs()

# --------------------------------------------------------------------------------
### Step 5 - Define apollo probabilities

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
  
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
  V = list()
  V[['DA']]  = asc_DA + beta_IVTT * IVTT_C + beta_OVTT * OVTT_C + beta_Cost * Cost_C
  V[['Tr']]  = asc_Tr + beta_IVTT * IVTT_Tr + beta_OVTT * OVTT_Tr + beta_Cost * Cost_Tr
  
  ### Create list of probabilities P
  P = list()
  
  ### Define settings for MNL model component
  mnl_settings = list(
    alternatives  = c(DA = 1, Tr = 0), 
    avail         = 1, 
    choiceVar     = Mode,
    V             = V)
  
  ### Compute probabilities using MNL model
  P[['model']] = apollo_mnl(mnl_settings, functionality)
  
  # ### Take product across observation for same individual
  # P = apollo_panelProd(P, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  
  return(P)
}

# --------------------------------------------------------------------------------
### Step 6 - Estimate model

model = apollo_estimate(apollo_beta, 
                        apollo_fixed, 
                        apollo_probabilities, 
                        apollo_inputs)

# --------------------------------------------------------------------------------
### Step 7 - Model output

# print model output
apollo_modelOutput(model)

# save output
apollo_saveOutput(model)
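
As a sanity check on what apollo_mnl computes in Step 5, the choice probabilities for a single observation can be worked out by hand in base R. This is only a sketch: the coefficient values and the attribute values below are hypothetical and are not estimates from the assignment data.

```r
# Hypothetical coefficients (illustration only, not estimated values)
beta <- c(asc_DA = 0.5, asc_Tr = 0,
          beta_IVTT = -0.03, beta_OVTT = -0.06, beta_Cost = -0.2)

# Hypothetical attribute values for one trip
trip <- data.frame(IVTT_C = 20, OVTT_C = 5,  Cost_C = 3,
                   IVTT_Tr = 35, OVTT_Tr = 15, Cost_Tr = 1.5)

# Utilities, written exactly as in Step 5
V_DA <- beta[["asc_DA"]] + beta[["beta_IVTT"]] * trip$IVTT_C +
  beta[["beta_OVTT"]] * trip$OVTT_C + beta[["beta_Cost"]] * trip$Cost_C
V_Tr <- beta[["asc_Tr"]] + beta[["beta_IVTT"]] * trip$IVTT_Tr +
  beta[["beta_OVTT"]] * trip$OVTT_Tr + beta[["beta_Cost"]] * trip$Cost_Tr

# Logit probabilities
P_DA <- exp(V_DA) / (exp(V_DA) + exp(V_Tr))
P_Tr <- 1 - P_DA

# Equivalent form: logistic function of the utility difference
P_DA_check <- plogis(V_DA - V_Tr)

print(c(P_DA = P_DA, P_Tr = P_Tr))
```

The two expressions for P_DA agree because a binary logit depends only on the difference between the two utilities.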

Output files generated by Apollo

Apollo generates many output files. A screenshot of the output files is shown below.

Note that the files are saved under the modelName set in apollo_control in step 1-2.

Two files contain the relevant output:

Model_3a_estimates.csv

It contains the estimates and the corresponding statistics: standard errors (se), t-statistics (t-stat), and robust t-statistics (rob.t-stat). A screenshot of the output is shown below.

Model_3a_output.txt

It contains other statistics, such as the log-likelihood values, rho-squared measures, AIC, and BIC. A screenshot is shown below.
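
To show how these fit statistics relate to the log-likelihood, the sketch below recomputes them by hand using the standard textbook definitions. The number of observations and the final log-likelihood are hypothetical values chosen for illustration. They are not results from Model_3a, and the adjusted rho-squared shown is one common definition.

```r
# Hypothetical inputs (illustration only, not taken from the model output)
n_obs    <- 1000     # number of observations
k        <- 4        # estimated parameters: asc_DA, beta_IVTT, beta_OVTT, beta_Cost
ll_final <- -520     # log-likelihood at the estimated parameters

# With all parameters at zero, both alternatives have probability 0.5
ll_zero <- n_obs * log(0.5)

# Goodness-of-fit measures
rho2     <- 1 - ll_final / ll_zero          # rho-squared relative to zero
adj_rho2 <- 1 - (ll_final - k) / ll_zero    # penalised for number of parameters
aic      <- -2 * ll_final + 2 * k
bic      <- -2 * ll_final + k * log(n_obs)

print(round(c(LL0 = ll_zero, rho2 = rho2, adj_rho2 = adj_rho2,
              AIC = aic, BIC = bic), 4))
```

Here k is 4 rather than 5 because asc_Tr is kept fixed and is therefore not estimated.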