MSM

Abstract

This is a replication of Berry (1992) on estimating airline entrance. An MSM (Method of Simulation Moments) technique is practiced in this code. Data can be found in https://www.econometricsociety.org/content/supplement-market-structure-and-multiple-equilibria-airline-markets-0.

library(tidyverse)
library(readxl)
library(haven)
library(stats)
library(doParallel)
library(foreach)

Method of Simulated Moments (MSM)

We consider the firm specific characteristic (\(Z_{ik}\)). The profit equation is now \[\begin{equation} \label{eq:pi0} \pi_{ik}(N) = X_i \beta + Z_{ik}\alpha - \delta \ln(N) + \sigma u_{ik} + \rho u_{i0} \end{equation}\] Note that \(u_{ik}\) is the error term for each firm in each market, and \(u_{i0}\) is the market error term that is observed by all firms.

As is with standard discrete choice models, we have to identify the unit of profit by normalizing one coefficient. Here we choose to normalize the total variance \(\epsilon = \sigma u_{i0} + \rho u_{ik}\) to 1. Therefore we set \(\sigma = \sqrt{1-\rho^2}\). Equation \(\ref{eq:pi0}\) now becomes

\[\begin{equation} \pi_{ik}(N) = X_i \beta + Z_{ik}\alpha - \delta \ln(N) + \sqrt{1 - \rho^2} u_{ik} + \rho u_{i0} \end{equation}\]

Our goal is to construct a moment condition, defined by \[\begin{equation} \label{eq:MC} E[ \nu_{i0}(N_i^*, W_i, \theta) \mid W_i, \theta ] = 0 \end{equation}\] where \[\begin{equation} \label{eq:nu} \nu_{i0}(N_i^*, W_i, \theta) = N_i^* - E(N_i^* \mid W_i, \theta) \end{equation}\]

That is, we estimate \(\theta\) such that the expected value of the difference between observed number of entrance \(N_i^*\) and the expected entrance number under the exogenous market data \(W_i\) and parameters \(\theta = (\beta, \delta, \alpha)\) equals to zero, which is equation (\(\ref{eq:MC}\)).

The moment conditions can be set as \[\begin{equation} \label{eq:mc} g(\theta) = \large[ E(\nu_{0} \cdot X ) \large, E(\nu_{k}\cdot Z_{k})_{\forall k }\large] \end{equation}\] There are in total 21 moment conditions.

Since the expectation will be hard to integrate, we try a simulation based method:

For each market, draw a vector of simulated vectors \(\hat{u}_i = (\hat{u}_{i0},\hat{u}_{i1},\hat{u}_{i2}, ...,\hat{u}_{iK})\) from \(\mathcal{N}(0,I_{K+1})\).
Construct \(\pi_{ik}(\hat{n} = 0)\) for each firm.
Increase \(\hat{n}\) from 0 to k until the total number of firms with \(\pi_{ik}(\hat{n})>0\) is less than \(\hat{n}\), implying a nonfeasible circumstance. The terminal \(\hat{n}\) is then the unbiased estimate of the expected number of firms enter.

Do the above process for \(T\) times, and take the average. This is the Monte-Carlo approach of calculating the expected value of \(N^*_i\)

Configurations

We first tidy up the data and create helper variables

flight.df<- read_stata("DATA/CilibertoTamerEconometrica.dta")
dat2 = (flight.df %>% 
          mutate(N = rowSums(across(airlineAA:airlineWN)), const = 1, .before = marketdistance ))

firm.Codes = flight.df %>% select(starts_with("airline")) %>% colnames() %>% substring(8)

market.regressors = c('const',"marketdistance","marketsize" )
firm.regressors = c("marketpresence","mindistancefromhub")

market.regressors.i = c(9,10, 15)
firm.regressors.i = c(18, 24)

market.num = nrow(dat2)
firm.n = length(firm.Codes)

length.of.each.param=c(length(firm.regressors),length(market.regressors),1,1 )

Parallel Computation

Note that each market and each simulation is independent, and it is better to perform the process under parallel computation. We use the doParallel along with foreach package for parallel computation.

core.num = detectCores()-2
cl = makeCluster(core.num)
registerDoParallel(cl)

clusterExport(cl, c("dat2", "firm.Codes", 
                    "market.regressors", "market.regressors.i", 
                    "firm.regressors", "firm.regressors.i",
                    "market.num", "firm.n",
                    "length.of.each.param", "sim.num"))

sim.num = 10

Main process

Simulation for each Market

This code section conducts the simulation, given the parameters.
Note that Since we want \(\sigma = \sqrt{1-\rho^2}\) to be real, we transform \(\rho = \frac{1}{1+\exp(\rho^*)}\) make sure \(\rho\) stays with \([0,1]\).

sim.process = function(A, B, delta, rho) {
  # foreach market
  sim = foreach(i = 1:market.num, .combine = 'rbind') %:%
    #foreach draw
    foreach(k = 1:sim.num, .combine = '+' ) %dopar% {
      prediction = rep.int(0 , firm.n)    # prediction for each draw
      prediction_prev = rep.int(0 , firm.n)
      U0 = rnorm(1)
      U.firm = rnorm(firm.n)
      
      n.try = 0
      n.pred = 0
      
      # keep adding up n.try until it exceed the predicted n.
      while (n.try <= firm.n) {
        n.pred = 0      # the num of firm under this n.try
        prediction = rep.int(0 , firm.n)
        
        # for each firm
        for (i.firm in 0:(firm.n-1) ) {
          
          X = as.matrix(dat2[i, market.regressors.i])
          Z = as.matrix(dat2[i, (firm.regressors.i + i.firm)])
          U = as.numeric(U.firm[1 + i.firm])
          
          rho = 1/(1 + exp(rho))
          
          profit =  X %*% B +
            Z %*% A +
            U0 * rho +
            U * sqrt(1 - rho ^ 2) -
            delta * log(n.try)
          
          is.enter = as.integer(profit > 0)
          
          prediction[i.firm] = is.enter
          n.pred = n.pred + is.enter
        } # end for each firm
        
        if (n.pred < n.try) {
          break
        }
        n.try = n.try + 1
        prediction_prev = prediction
      } # end while
      c(prediction_prev, n.try - 1)
    }
  
  sim = sim / sim.num
  
  return(sim)
}

Prediction Error

This section does the simulation process, given the vector of parameters. Note that due to some restrictions on the optim function, we will have to flatten the parameters into one single vector. We transform the parameters back to a list of four(\(\alpha, \beta, \delta, \rho\)) with a helper function split.param.

split.param = function(params, length.of.each.param){
  param.list = list()
  i = 1
  for (l in length.of.each.param){
    i.start = i
    i.end = i + l - 1
    
    new.append = params[i.start : i.end] %>% matrix(ncol = 1)
    param.list = append(param.list, list(new.append))
    i = i + l
  }
  return(param.list)
}

prediction.errors = function(params){
  
  sim.result = do.call(sim.process, split.param(params, length.of.each.param))
  real.result = dat2[, 2 : (2 + firm.n) ]     # N, firm1, firm2, ...frimn
  
  v = sim.result - real.result
  
  return(v)
}

Moment condition

Next we construct the moments, according to equation \(\ref{eq:mc}\)

moment.conditions = function(params){
  v = as.matrix(prediction.errors(params))  # 7 * N
  
  M = as.matrix(dat2[,-1]) 
  M = cbind( rep(0, nrow(M)) ,M)
  ##  E(V0X) ,  3*1 
  
  E_v0X = v[,7] * M[,market.regressors.i]              # 1* N * N * 3
  E_v1Z  = v[,1] * M[, c(9, firm.regressors.i + 0) ]   # 1* N * N * 3
  E_v2Z  = v[,2] * M[, c(9, firm.regressors.i + 1) ]   # 1* N * N * 3
  E_v3Z  = v[,3] * M[, c(9, firm.regressors.i + 2) ]   # 1* N * N * 3
  E_v4Z  = v[,4] * M[, c(9, firm.regressors.i + 3) ]   # 1* N * N * 3
  E_v5Z  = v[,5] * M[, c(9, firm.regressors.i + 4) ]   # 1* N * N * 3
  E_v6Z  = v[,6] * M[, c(9, firm.regressors.i + 5) ]   # 1* N * N * 3
  
  E = cbind(E_v0X, E_v1Z, E_v2Z, E_v3Z, E_v4Z, E_v5Z, E_v6Z)
  
  return(E)
}

GMM Estimator

Finally, we construct a distance for GMM(Generalized Method of Moments). Typically, an GMM estimation is obtained by \[ \hat{\theta} = \text{argmin} \quad g(\theta)'Wg(\theta) \] Where \(g(\theta)\) is the vector of expected moments, which is calculated above, and \(W\) is a weight matrix. A robust procedure is to get \(\hat{\theta}^{(1)}\) by setting \(W = \mathbb{I}\), and then get \(\hat{\theta}^{(2)}\) by setting \(\hat{W} = \frac{1}{n} g(\hat{\theta}^{(1)})g(\hat{\theta}^{(1)})'\). However, due to computational limit, we here only provide the result for \(\hat{\theta}^{(1)}\), which is still consistant but not the most efficient result.

moment.distance = function(params, W = diag(21)){
  g = params %>% moment.conditions %>% colMeans %>% as.matrix()  # 21 * 1
  dist =t(g) %*% W %*% g
  return(dist)
}

Result

We set the initial parameter close to results estimated by ordered probit method. With other parameters being 0.5 for no reason.

              # A1  A2   B1   B2    B3 delta  rho
init.param = c(0.5, 0.5, 0.9, 0.1, 0.5, 1.9, 0.6)
opt_10 = optim(init.param, moment.distance, gr=NULL, diag(21), method = "BFGS")

sms.coef = matrix(opt_10$par, ncol = 1)
sms.coef[7] = exp(sms.coef[7] ) / (1+ exp(sms.coef[7] ) )
rownames(sms.coef) = c(firm.regressors, market.regressors, "delta", "rho")

knitr::kable(sms.coef)

marketpresence	0.0214195
mindistancefromhub	0.2632054
const	-0.3796320
marketdistance	0.3305828
marketsize	0.4223052
delta	1.7649692
rho	0.6331226