Template for Ordered Response Model

Introduction

The code is broken down into a series of steps. It deviates slightly from the Apollo manual in Step 2, because we use the tidyverse package to read, analyse and preprocess the data; it is more flexible and a useful tool for future work.

Step 1 - Initialisation
- set working directory
- set apollo_control
Step 2 - Data
- read file (CSV or SPSS)
- clean data (optional)
- if we read an SPSS file, remove labels and other attributes
- convert tibble to dataframe
Step 3 - Define model parameters
- initialise parameters
Step 4 - Validate Inputs
Step 5 - Define apollo probabilities
- define utility equation
- set ol_settings
Step 6 - Estimate model
Step 7 - Print and save output

Step 1 - Initialise the code

The first step is to initialise the code. We can break it into two sub-steps. More detail is given on page 19 of the Apollo manual.

Step 1-1 - Load libraries

We first clear the workspace, set the working directory and load the relevant libraries, and then we call the function apollo_initialise().

package - tidyverse is used to read, clean, transform and visualize the data
package - apollo is used for choice modelling
package - haven is used to read SPSS files

# --------------------------------------------------------------------------------
### Step 1 - Initialise the code 

#### Step 1-1 - Load libraries

# clear workspace
rm(list = ls())

# set working directory
setwd("ENTER PATH")  # use / as the path separator, not \
# e.g. setwd("C:/Users/gulhare.s/Desktop/Discrete choice analysis/Apollo Binary Logit")

# load relevant libraries
library(tidyverse) # for data analysis
library(apollo) # for choice modelling
library(haven) # to read SPSS files

# mandatory step
apollo_initialise()

Step 1-2 - Set Apollo controls

Next we set the core controls. In this case, we give the name and description of the model and identify the column that contains the identifier of the individual decision makers. Details are on page 21 of the manual.

# --------------------------------------------------------------------------------
#### Step 1-2 - Set Apollo controls

apollo_control = list(
  modelName  ="ENTER MODEL_NAME",
  modelDescr =" ",
  indivID    ="ENTER INDIVIDUAL IDENTIFIER"
)
# e.g.
# apollo_control = list(
#   modelName  ="Model_ORL",
#   modelDescr ="ORL sample code",
#   indivID    ="PersonID"
# )

Step 2 - Data

In this step we read the data from a file (CSV, or SPSS .sav) and, if required, clean and transform it.

Step 2-1 - Read data

Importing CSV files is covered in detail in chapter 11 of the book R for Data Science.

# --------------------------------------------------------------------------------
### Step 2 - Data

#### Step 2-1 - Read data

# TO READ CSV FILE (OPTIONAL)
tbl <- read_csv("ENTER FILE NAME")
# e.g. tbl <- read_csv("Data.csv")

# READ SPSS FILE (OPTIONAL)
tbl_spss <- read_sav("ENTER FILE NAME")
# e.g. tbl_spss <- read_sav("Data.sav")

Step 2-2 - Clean and transform data

This step involves cleaning and transforming the data. If we read the data from an SPSS file, we need to remove labels and other attributes from the tibble using the command below. Note that so far we have been working with a tibble, but Apollo requires the data as a plain data frame. So we convert the tibble into a data frame using the function as.data.frame() and assign it to a variable called database.

# ----------------------------------------------------------------------------------
#### Step 2-2 - Clean and transform data

# If we read data from SPSS file, we need to remove labels/label/attributes from tibble (OPTIONAL)
tbl_spss[] <- lapply(tbl_spss, function(x) {attributes(x) <- NULL;x})

# convert tibble into dataframe (use tbl instead of tbl_spss if you read a CSV file)
database <- as.data.frame(tbl_spss)
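
To see what the attribute-stripping line does, here is a minimal sketch on a toy data frame, using base R only. The column name, label text and format string are made up for illustration; a real SPSS import would carry its own attributes.

```r
# Toy data with SPSS-style attributes attached by hand (hypothetical values)
toy <- data.frame(MALE = c(0, 1, 1))
attr(toy$MALE, "label") <- "Respondent is male"
attr(toy$MALE, "format.spss") <- "F1.0"

# Same idiom as in Step 2-2: drop every attribute from every column
toy[] <- lapply(toy, function(x) { attributes(x) <- NULL; x })

# The column is now a plain numeric vector inside an ordinary data frame
str(toy)
```

Because attributes(x) <- NULL also drops the class attribute, columns such as factors or dates are reduced to their underlying numbers, so it is worth checking the result with str() before moving on.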

Step 3 - Define model parameters

We need to define a vector apollo_beta which contains the parameters and thresholds together with their starting values. There is no constant. In the commented example, the beta parameters start at zero, while the thresholds are given distinct, increasing starting values because they must stay ordered. We also need to define another vector, apollo_fixed, which is kept empty here.

# --------------------------------------------------------------------------------
### Step 3 - Define model parameters

### Vector of parameters, including any that are kept fixed in estimation; the number of thresholds is one less than the number of ordered levels
apollo_beta = c("INITIALIZE PARAMETERS AND THRESHOLDS")
# e.g.
# apollo_beta = c(b_dist  = 0,
#                 b_male  = 0,
#                 b_age   = 0,
#                 
#                 tau_1   = 1, 
#                 tau_2   = 2, 
#                 tau_3   = 3, 
#                 tau_4   = 4)

### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
apollo_fixed = c()
# e.g.
# apollo_fixed = c()
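
A quick sanity check on the starting values can catch a miscounted or misordered threshold before estimation. The sketch below is base R only and reuses the numbers from the commented example; the object names coding, tau_start, n_ok and increasing are introduced here for illustration and are not part of Apollo.

```r
# Levels of the ordered outcome, as in the commented example of Step 5
coding <- c(0, 1, 2, 3, 4)

# Threshold starting values, as in the commented example above
tau_start <- c(tau_1 = 1, tau_2 = 2, tau_3 = 3, tau_4 = 4)

# One threshold fewer than the number of levels
n_ok <- length(tau_start) == length(coding) - 1

# Thresholds must be strictly increasing
increasing <- all(diff(tau_start) > 0)

stopifnot(n_ok, increasing)
```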

Step 4 - Validate Inputs

The function apollo_validateInputs() runs a number of checks and produces a consolidated list of model inputs. It looks for the following objects in the global environment:

apollo_control
apollo_beta
apollo_fixed
database
It also looks in the database for the identifier column named in indivID (declared in Step 1-2).

If any of these is missing from the global environment, apollo_validateInputs() fails.
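
If the call fails, it can help to check by hand which object is absent. The following is a rough base R sketch, not Apollo's own check: the helper missing_inputs() and the environment demo_env are hypothetical names used only for this illustration.

```r
# Objects that the text above says must exist
required <- c("apollo_control", "apollo_beta", "apollo_fixed", "database")

# Hypothetical helper: names from 'required' that are not defined in 'env'
missing_inputs <- function(env = globalenv()) {
  found <- vapply(required, exists, logical(1), envir = env, inherits = FALSE)
  required[!found]
}

# Demonstration in a scratch environment where 'database' was forgotten
demo_env <- new.env()
assign("apollo_control", list(indivID = "PersonID"), envir = demo_env)
assign("apollo_beta", c(b_dist = 0), envir = demo_env)
assign("apollo_fixed", c(), envir = demo_env)
still_missing <- missing_inputs(demo_env)
print(still_missing)
```

In a real session, calling missing_inputs() with no argument checks the global environment, which is where the text says Apollo looks.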

# --------------------------------------------------------------------------------
### Step 4 - Validate inputs

apollo_inputs = apollo_validateInputs()

Step 5 - Define apollo probabilities

Unlike the other Apollo functions, which are predefined, apollo_probabilities() must be defined by the user. It is called by apollo_estimate() in Step 6 and takes three inputs:

apollo_beta
apollo_inputs
functionality, which takes the default value "estimate"

The first line of the function, apollo_attach(apollo_beta, apollo_inputs), lets us refer to individual elements of the database directly, e.g. tt instead of database$tt. The command on.exit(apollo_detach(apollo_beta, apollo_inputs)) reverses this as soon as apollo_probabilities() exits. Details are in section 4.5.1 of the manual.
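
The attach-then-detach-on-exit pattern can be tried out with base R's own attach() and detach(). This is only an analogy and makes no claim about how apollo_attach() works internally; toy_db and mean_tt() are hypothetical names, and tt is the example column mentioned above.

```r
# Toy stand-in for the database (made-up values)
toy_db <- data.frame(tt = c(10, 20, 30))

mean_tt <- function(db) {
  attach(db)            # columns of db become visible by name
  on.exit(detach(db))   # undone automatically when the function exits
  mean(tt)              # 'tt' instead of db$tt
}

result <- mean_tt(toy_db)

# After the call, nothing from the function is left on the search path
"db" %in% search()
```

Registering the cleanup with on.exit() means it runs even if the body stops with an error, which is why the detach call sits directly after the attach call.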

We then define the model itself, i.e. the utility/propensity equation V, and pass it to apollo_ol() through ol_settings. The details of ol_settings are in section 5.3.2 of the Apollo manual.

# --------------------------------------------------------------------------------
### Step 5 - Define apollo probabilities

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){

  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### DEFINE UTILITY EQUATION HERE *************************************
  
  ### Utility/propensity equation without constant
  V  = ENTER UTILITY/PROPENSITY EQUATION
  # e.g.
  # V  = b_dist * DISTANCE + b_male * MALE + b_age * AGE
  
  ### Define settings for ORL model component
  ol_settings = list(
    outcomeOrdered = IDENTIFIER COLUMN FOR LEVELS,
    V              = V,
    tau            = VECTOR OF TAU VALUES,
    coding         = NUMERIC OR CHARACTER VECTOR FOR CODING)

  # e.g.
  ### Define settings for ORL model component
  # ol_settings = list(
  #   outcomeOrdered = WHCSTOPS,
  #   V              = V,
  #   tau            = c(tau_1, tau_2, tau_3, tau_4),
  #   coding         = c(0, 1, 2, 3, 4))

  
  # -----------------------------------------------------------------------------
  
  ### Create list of probabilities P
  P = list()
  
  ### Compute probabilities using the ordered logit model
  P[['model']] = apollo_ol(ol_settings, functionality)

  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  
  return(P)
}
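
To make the role of V and the thresholds concrete, the sketch below evaluates the standard ordered logit formula for a single observation in base R: the probability of level j is the logistic CDF at tau_j - V minus the logistic CDF at tau_(j-1) - V. This is assumed to match what apollo_ol() evaluates, but it is an independent illustration; ol_probs() is a hypothetical helper and the value of V is made up.

```r
# Ordered logit probabilities for one observation (base R only)
ol_probs <- function(V, tau) {
  # Cumulative probabilities at each threshold, padded with 0 and 1
  cum <- c(0, plogis(tau - V), 1)
  # Probability of each level is the difference of adjacent cumulatives
  diff(cum)
}

# Thresholds from the commented example; V = 0.5 is an arbitrary value
tau <- c(1, 2, 3, 4)
p <- ol_probs(V = 0.5, tau = tau)

length(p)  # one probability per level, i.e. length(tau) + 1
sum(p)     # the probabilities of all levels add up to one
```

The formula also shows why the thresholds have to be increasing: if tau_j were below tau_(j-1), the difference of the two cumulative probabilities would be negative.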

Step 6 - Estimate model

# --------------------------------------------------------------------------------
### Step 6 - Estimate model

model = apollo_estimate(apollo_beta, 
                        apollo_fixed, 
                        apollo_probabilities, 
                        apollo_inputs)

Step 7 - Model output

# --------------------------------------------------------------------------------
### Step 7 - Model output

apollo_modelOutput(model)

# save output 
apollo_saveOutput(model)