Introduction

This is my attempt to replicate the results of Table IV in Berry, Levinsohn, and Pakes (1995) using Python. The following sections provide a brief overview of the BLP model and the corresponding Python code used to estimate it. The main code I rely on for this replication exercise comes from http://www.ivan-li.com/code/blp_1995. I also try to follow the procedure of Chris Conlon and Jeff Gortmaker's pyblp package, described in their working paper: https://jeffgortmaker.com/files/pyblp.pdf.

Import the necessary Python packages

Below are the packages you need to import in Python before running the code.
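A minimal set, inferred from the packages actually used in the code below:

import numpy as np
import pandas as pd
import scipy.linalg
import pyblp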

1. BLP model description

The BLP model provides a framework that enables one to obtain estimates of demand and cost parameters for a class of oligopolistic, differentiated-products markets. These estimates can be obtained using only widely available product-level and aggregate consumer-level data, and they are consistent with a structural model of equilibrium in an oligopolistic industry. The framework is based on:

  • A joint distribution of consumer characteristics and product attributes that determines preferences over the products marketed;
  • Price taking assumptions on the part of consumers;
  • Nash equilibrium assumptions on the part of producers.

The main contributions of the BLP model are:

  • Using aggregate data to incorporate consumer heterogeneity, such as consumers with different incomes having different tastes;
  • Proposing the BLP instruments to address the endogeneity of price with respect to unobserved product attributes.

1.1 Utility and demand

Below is the common notation used in the BLP model.

  • There are \(t = 1, 2, \cdots, T\) markets
  • Each market has consumers \(i = 1, 2, \cdots, I_t\)
  • Each market has \(j = 1, 2, \cdots, J\) products
  • Each product has \(k = 1, 2, \cdots, K\) attributes

Given this notation, the utility derived by consumer \(i\) from consuming product \(j\) is: \[U(\zeta_i, p_j, x_{j}, \xi_{j}; \theta)\] where \(\theta\) is a \(k\)-vector of parameters to be estimated, \(\zeta\) represents a vector of consumer characteristics, \(p\) represents the price of the product, and \(x\) and \(\xi\) are, respectively, observed and unobserved (by the econometrician) product attributes. Consumer \(i\) chooses good \(j\) if and only if \[U(\zeta_i, p_j, x_{j}, \xi_{j}; \theta) \geq U(\zeta_i, p_r, x_{r}, \xi_{r}; \theta), \quad \text{for}\quad r = 0, 1, \cdots, J, \] where alternatives \(r = 0, 1, \cdots, J\) represent purchases of the competing differentiated products. Alternative zero, or the outside alternative, represents the option of not purchasing any of those products. Let \[A_{j} = \{\zeta : U(\zeta, p_j, x_{j}, \xi_{j}; \theta) \geq U(\zeta, p_r, x_{r}, \xi_{r}; \theta), \quad \text{for}\quad r = 0, 1, \cdots, J\}. \] That is, \(A_{j}\) is the set of values of \(\zeta\) that induces the choice of good \(j\). Then, the market share of good \(j\) as a function of the characteristics of all the goods competing in the market is given by

\[s_{j}(p, x, \xi ; \theta)=\int_{\zeta \in A_{j}} P_{0}(d \zeta),\] where \(P_{0}(d \zeta)\) provides the density of \(\zeta\) in the population. BLP adopt a Cobb-Douglas form for consumer \(i\)'s utility from consuming product \(j\) in market \(t\), in which the function \(G(\cdot)\) is linear and allows for interactions between individual and product characteristics:

\[U\left(\zeta_{i}, p_{j}, x_{j}, \xi_{j} ; \theta\right)=\left(y_{i}-p_{j}\right)^{\alpha} G\left(x_{j}, \xi_{j}, \nu_{i}\right) e^{\epsilon(i, j)},\] where \(y\) is income, and \(\epsilon\) provides the effect of the interactions of unobserved product and individual characteristics.

Random Coefficients

The BLP paper assumes that \(G(\cdot)\) is linear and has the random-coefficients specification discussed above. Let \(u_{ij} = \log[U_{ij}]\); then

\[u_{ij}= \alpha \log(y_{i} - p_{j}) + x_{j} \bar{\beta}+\xi_{j}+\sum_{k} \sigma_{k} x_{j k} \nu_{i k}+\epsilon_{i j}\] where \((\zeta_{i}, \epsilon_{i} ) = (\nu_{i1}, \cdots, \nu_{iK}, \epsilon_{i0}, \cdots, \epsilon_{iJ})\) is a mean-zero vector of random variables with a known distribution function. A first-order Taylor expansion of \(\log(y_{i} - p_{j})\) gives \(\log(y_{i}) - p_{j}/y_{i}\), so that price enters approximately linearly and the utility obtained from consuming good \(j\) can be expressed as

\[u_{ij}= \alpha \log(y_{i}) + \alpha_{new} p_{j} + x_{j} \bar{\beta}+\xi_{j}+\sum_{k} \sigma_{k} x_{j k} \nu_{i k}+\epsilon_{i j}\]

Define

\[\delta_{j} = \alpha_{new} p_{j} + x_{j} \bar{\beta} +\xi_{j} \] and a deviation from that mean

\[\mu_{ij} = \alpha \log(y_{i}) + \sum_{k} \sigma_{k} x_{j k} \nu_{i k},\] which depends on the interaction between consumer preferences and product characteristics. Given the assumption of Type I extreme value errors, we can analytically integrate out \(\epsilon_{ij}\) and arrive at the probability that consumer \(i\) chooses product \(j\):

\[\hat{s}_{i j}\left(\nu_{i}\right)=\frac{\exp \left\{\delta_{j}+\mu_{i j}\left(\nu_{i}\right)\right\}}{\sum_{l} \exp \left\{\delta_{l}+\mu_{i l}\left(\nu_{i}\right)\right\}}\] BLP consider two models. In the first, \(\mu_{ij} = 0\), which reduces to the standard logit model. In the second, \(\mu_{ij} \neq 0\), which leads to the random-coefficients model. Here we focus on the second case.

The main estimation procedure:

  • Treat \(\delta_j\) as a single component
  • Once \(\delta_j\) is estimated, we can use OLS regression to estimate the parameters in \(\delta_{j} =\alpha_{new} p_{j} + x_{j} \bar{\beta} +\xi_{j}\)
  • Iterate the above two steps

Price Endogeneity and Instrumental Variables

Based on the expression for the market share in the logit case (\(\mu_{ij} = 0\)), we have the regression model:

\[\ln(s_{j}) - \ln(s_{0}) = -\alpha p_{j} + x_{j} \bar{\beta} + \xi_{j}.\] We can then estimate \(\alpha\) and \(\bar{\beta}\), and recover \(\xi\), using ordinary OLS. Here, the structural error that we are trying to minimize is \(\xi_{jt}\), i.e. the unobserved product quality. This will also be the error term that we minimize in the random-coefficients setup. For the OLS estimation, we have an endogeneity problem, since cars with high unobserved quality will tend to have higher prices as well: \[\text{Cov} (p_{j}, \xi_{j}) > 0.\] This endogeneity biases the OLS estimates. A simple remedy is to instrument for price in the logit model. BLP propose using three sets of instruments:

  • The observed characteristics of product \(j\) itself (which are assumed orthogonal to the unobserved characteristics);
  • The sums of the characteristics of all other models marketed by the same firm in a given market;
  • The sums of the characteristics of all other models in a given market, across competing firms.

A mistake was later found by Shapiro and Gentzkow in how BLP calculated these instruments: BLP multiplied each product characteristic by the number of models the firm sells in each market, rather than summing the characteristics across models. I follow this mistaken calculation so as to match BLP's original results.

2. Computation steps and corresponding Python codes

First we need to import the data in Python and define some variables for later use.

# Read the dataframe
df = pd.read_csv("C:/Users/zgong/OneDrive/Desktop/IO II/blp_replication/blp_1995_data.csv")
df = df.drop(df.columns[0], axis = 1)

# create logged versions of the continuous variables
df[["ln_hpwt", "ln_space", "ln_mpg", "ln_mpd", "ln_price"]] = \
    df[["hpwt", "space", "mpg", "mpd", "price"]].apply(lambda x: np.log(x))

# time trend (market index plus 70); use with non-pyblp instruments
df["trend"] = df.market.map(lambda x: x + 70)
# df["trend"] = df.market

df["cons"] = 1

df["s_0"] = np.log(1 - df.share.groupby(df["model_year"]).transform("sum"))

df["s_i"] = np.log(df.share)
df["dif"] = df.s_i - df.s_0
df["dif_2"] = np.log(df.share) - np.log(df.share_out)
df["ln_price"] = np.log(df.price)

df.head()

# load pyblp's product data, in case we want to use their instruments

product_data = pd.read_csv(pyblp.data.BLP_PRODUCTS_LOCATION)

# demand variables
X = df[["cons", "hpwt", "air", "mpd", "space"]].values

# supply variables
W = df[["cons", "ln_hpwt", "air", "ln_mpg", "ln_space", "trend"]].values

# price
p = df.price.values

# initial delta_0 estimate: log(share) - log(share outside good)
delta_0 = df.dif_2.values

# number of goods per market
J = df.groupby("year")["cons"].sum().values

# number of draws per market
N = 500

# number of markets
T = len(J)

# Estimated log income means for years 1971 - 1990
incomeMeans = [2.01156, 2.06526, 2.07843, 2.05775, 2.02915, 2.05346, 2.06745,
               2.09805, 2.10404, 2.07208, 2.06019, 2.06561, 2.07672, 2.10437, 2.12608, 2.16426,
               2.18071, 2.18856, 2.21250, 2.18377]

# standard deviation of log income, taken as given from BLP (1995)
sigma_v = 1.72

# number of characteristics with a random coefficient:
#  according to Table IV these are the constant, hp/wt, air, mp$, and size
#  (the price coefficient enters separately as the sixth element of theta_2)
k = 5

# market index for each observation (length J x T)
markets = df.market.values

# unique markets
marks = np.unique(df.market)

# firms
firms = np.reshape(df.firmid.values, (-1,1))

Step 1:

  • For a given \(\theta_2\) (and initial draws of \(\nu_i\) and \(D_{i}\)), we define a function to compute household deviations from mean utility, \(\mu_{ijt}(x_{jt}, p_{jt}, \nu_{i}, D_{i}; \theta_{2})\).
  • Define a function to compute indirect utility given parameters
    • \(x\): matrix of demand characteristics
    • \(v\): monte carlo draws of N simulations
    • \(p\): price vector
    • \(y\): income of individuals
    • delta: guess for the mean utility
    • theta_2: non-linear params
    • \(J\): vector of number of goods per market
    • \(T\): number of markets
    • \(N\): number of simulations
  • For given mean utility \(\delta_{jt}\) and \(\theta_2\), we define a function to compute predicted market shares; a sketch of such a function follows the formula below. The market share can be approximated by simulation with

\[s_{j t}\left(\delta_{t}, \theta_{2}\right)=\frac{1}{NS} \sum_{i=1}^{NS} \frac{\exp \left\{\delta_{j t}+\mu_{i j t}\right\}}{\sum_{k=0}^{J} \exp \left\{\delta_{k t}+\mu_{i k t}\right\}},\] where \(\delta_{0t} = \mu_{i0t} = 0\) for the outside good.
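Below is a minimal sketch of such a share simulator, consistent with the formula above. The interface matches the compute_share(X, V, p, y, delta, theta_2, J, T, N) calls that appear later, but the internal details, in particular the ordering of theta_2 (the five sigmas followed by the price/income term alpha) and the shapes of the draws V and incomes y, are my assumptions rather than the original code.

def compute_share(X, V, p, y, delta, theta_2, J, T, N):
    # V[t]: (5, N) Monte Carlo draws nu_i for market t
    # y[t]: (N,) simulated incomes for market t
    # p is unused here: in the Taylor-expanded specification price enters through delta
    sigma = np.asarray(theta_2[:5])
    alpha = theta_2[5]
    total = int(np.sum(J))
    q = np.zeros((total, N))   # individual choice probabilities
    s = np.zeros(total)        # simulated market shares
    start = 0
    for t in range(T):
        rows = slice(start, start + int(J[t]))
        # mu_ijt = alpha*log(y_i) + sum_k sigma_k * x_jk * nu_ik
        mu = alpha * np.log(y[t])[None, :] + X[rows] @ (sigma[:, None] * V[t])
        num = np.exp(delta[rows].reshape(-1, 1) + mu)
        # outside good: delta_0t = mu_i0t = 0, so its exp term is 1
        q[rows] = num / (1 + num.sum(axis=0, keepdims=True))
        s[rows] = q[rows].mean(axis=1)
        start += int(J[t])
    return [q, s]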

Step 2:

  • Contraction Mapping: given \(\theta_2\), search for \(\delta_t\) such that the market shares computed in Step 1 equal the observed market shares, \(s_{jt} = s_{jt}(\delta_t, \theta_2)\). This is a non-linear system of equations that is solved numerically using the contraction mapping proposed by BLP (1995):

\[\delta_{t}^{h+1}=\delta_{t}^{h}+\ln s_{j t}-\ln s_{j t}\left(\delta_{t}^{h}, \theta_{2}\right)\]

  • Iteration continues until \(||\delta_{t}^{h+1}- \delta_{t}^{h}||\) is below a specified tolerance level; a sketch of this fixed-point iteration is given below.
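A minimal sketch of the contraction, matching the solve_delta(s, X, V, p, y, delta, theta_2, J, T, N, tol) call used inside the objective function below (the sup-norm stopping rule is my choice):

def solve_delta(s, X, V, p, y, delta, theta_2, J, T, N, tol):
    # iterate delta <- delta + log(observed shares) - log(simulated shares)
    error = 1.0
    while error > tol:
        s_hat = compute_share(X, V, p, y, delta, theta_2, J, T, N)[1]
        delta_new = delta + np.log(s) - np.log(s_hat)
        error = np.max(np.abs(delta_new - delta))
        delta = delta_new
    return delta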

Step 3:

  • Calculate the marginal costs given probabilities and shares (a sketch of this calc_mc routine appears just before the objective function below)
    • q_s: output of compute_share, a list containing the probabilities matrix (q) and the shares vector (s)
    • firms: vector of the firms operating in each market (length \(J\times T\))
    • marks: vector of unique markets (length \(T\))
    • markets: vector indicating which market each observation belongs to (length \(J\times T\))
  • From \(\delta_t\), estimate the linear parameters \(\theta_1\) using the fact that \(\delta_{jt}(s_{jt}, \theta_{2}) - (x_{jt}\bar{\beta} + \alpha_{new} p_{jt}) = \xi_{jt}\). The IV moment conditions are \(E[Z'\xi] = 0\).

\[\hat{\theta}_{1}=\left(X_{1}^{\prime} Z W^{-1} Z^{\prime} X_{1}\right)^{-1} X_{1}^{\prime} Z W^{-1} Z^{\prime} \delta\left(\theta_{2}\right)\] Note: here \(X_1\) denotes the product characteristics that enter the linear part of the estimation, \(Z\) the instruments for the endogenous variables, and \(W\) a consistent estimate of \(E[Z'\xi \xi' Z]\) (not to be confused with the supply-side characteristics matrix W in the code). The instruments for the BLP paper can be generated as follows.
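Below is a sketch of one way to build the demand- and supply-side instrument matrices Z and Z_s used in the code that follows. build_instruments is my own helper, and it reproduces the counts-based calculation noted in Section 1 (each characteristic multiplied by the number of models, rather than summed across models); whether the own product is excluded from the counts is also my assumption.

def build_instruments(data, cols):
    # own product characteristics
    own = data[cols].values
    # number of models per firm-market and per market
    firm_count = data.groupby(["market", "firmid"])["cons"].transform("sum").values
    mkt_count = data.groupby("market")["cons"].transform("sum").values
    # counts-based "sums": characteristic times the number of other
    #  own-firm models, and times the number of rival-firm models
    same_firm = own * (firm_count - 1)[:, None]
    rivals = own * (mkt_count - firm_count)[:, None]
    return np.hstack((own, same_firm, rivals))

# demand- and supply-side instruments
Z = build_instruments(df, ["cons", "hpwt", "air", "mpd", "space"])
Z_s = build_instruments(df, ["cons", "ln_hpwt", "air", "ln_mpg", "ln_space", "trend"])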

  • Compute GMM objective function

\[Q\left(\theta_{2}\right)=\hat{\xi}\left(\theta_{2}\right)^{\prime} Z W^{-1} Z^{\prime} \hat{\xi}\left(\theta_{2}\right)\] where \(\hat{\xi}\left(\theta_{2}\right)\) are the GMM residuals and \(W^{-1}\) is the GMM weighting matrix.

# Construct the GMM estimator
#  (baseData holds the linear demand-side regressors X1: characteristics plus price)
zxw1 = Z.T @ baseData

# first-stage estimate with an identity weighting matrix
bx1 = np.linalg.inv(zxw1.T @ zxw1) @ zxw1.T @ Z.T @ delta_0

# estimated error 
e = delta_0 - baseData @ bx1

g_ind = e.reshape((-1,1)) * Z

demean = g_ind - g_ind.mean(axis=0).reshape((1,-1))

vg = demean.T @ demean / demean.shape[0]

# weighting matrix
w0 = np.linalg.inv(vg)

# second-stage estimate using the optimal weighting matrix w0
t3c2 = np.linalg.inv(zxw1.T @ w0 @ zxw1) @ zxw1.T @ w0 @ Z.T @ delta_0

# obtain block-diag matrix of supply and demand instruments
z = scipy.linalg.block_diag(Z,Z_s)

# Recommended initial weighting matrix from Aviv's appendix
w1 = np.linalg.inv(z.T @ z)
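The objective function below relies on compute_share and solve_delta sketched above, plus a calc_mc routine that recovers marginal costs from the Nash pricing first-order conditions, \(mc = p - \Delta^{-1} s\). The following is a heavily simplified sketch of calc_mc with the inputs listed in Step 3; the individual price sensitivity \(\alpha/y_i\) follows the Taylor expansion in Section 1.1, and the exact derivative formulas in the original code may differ.

def calc_mc(q_s, firms, p, y, alpha, J, T, N, marks, markets):
    q, s = q_s
    mc = np.zeros(len(p))
    for idx, m in enumerate(marks):
        sub = markets == m
        q_t = q[sub]                     # (J_t, N) choice probabilities
        a_i = alpha / y[idx]             # (N,) individual price sensitivities
        # share/price gradient: d s_r / d p_j = E[a_i q_ij q_ir] - 1{j=r} E[a_i q_ij]
        grad = (a_i * q_t) @ q_t.T / N
        np.fill_diagonal(grad, np.diag(grad) - (a_i * q_t).mean(axis=1))
        # ownership mask: only own-firm cross derivatives enter Delta
        same_firm = firms[sub] == firms[sub].T
        Delta = -grad * same_firm
        # Nash pricing FOC: s - Delta (p - mc) = 0  =>  mc = p - Delta^{-1} s
        mc[sub] = p[sub] - np.linalg.solve(Delta, s[sub])
    return mc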

def objective(theta_2, s, X, V, p, y, J, T, N, marks, markets, tol, 
              Z, Z_s, W, weigh, firms):
    
    obs = np.sum(J)
    
    # d is a container object holding the current guess of delta, so each
    #  call warm-starts the contraction at the previously converged value
    d.delta = solve_delta(s, X, V, p, y, d.delta, theta_2, J, T, N, tol)
    
    # obtain the actual implied quantities and shares from converged delta
    q_s = compute_share(X, V, p, y, d.delta, theta_2, J, T, N)
    
    # calculate marginal costs
    mc = calc_mc(q_s, firms, p, y, theta_2[5], J, T, N, marks, markets).reshape((-1,1))
    
    # since we are using both demand and supply side variables,
    #  we want to stack the estimated delta and estimated mc
    y2 = np.vstack((d.delta.reshape((-1,1)), np.log(mc)))
    
    # create a characteristics matrix that includes both supply and demand sides,
    #  with demand characteristics in the top-left block and supply in the bottom-right
    x = scipy.linalg.block_diag(X,W)
    
    # create the matrix of supply and demand instruments, again with
    #  demand instruments in the top-left block and supply in the bottom-right
    z = scipy.linalg.block_diag(Z,Z_s)
    
    # get linear parameters (this FOC is from Aviv's appendix)
    b = np.linalg.inv(x.T @ z @ weigh @ z.T @ x) @ (x.T @ z @ weigh @ z.T @ y2)
    
    # Step 3: get the stacked error terms xi (demand) and omega (supply)
    xi_w = y2 - x @ b
    
    # compute g_bar for the GMM objective
    g = z.T @ xi_w / obs
    
    obj = float(obs**2 * g.T @ weigh @ g)
   
    print([theta_2, obj])
    
    return obj

Step 4:

  • Minimize \(Q(\theta_2)\) over \(\theta_2\), with Steps 1-3 nested inside every trial of \(\theta_2\). The initial computation uses the initial weighting matrix w1 defined above; a sketch of the call is given below.
  • Computation time: around 5 minutes.
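A sketch of the outer search, assuming the Monte Carlo draws V and simulated incomes y have been constructed as described in Step 1; the warm-start container d, the all-ones starting values, the tolerance, and the choice of Nelder-Mead are all illustrative, not necessarily what the original code uses.

from types import SimpleNamespace
from scipy.optimize import minimize

s = df.share.values                        # observed market shares
d = SimpleNamespace(delta=delta_0.copy())  # warm-start container used inside objective

theta_2_init = np.ones(6)                  # illustrative starting values

res = minimize(
    objective, theta_2_init,
    args=(s, X, V, p, y, J, T, N, marks, markets, 1e-12, Z, Z_s, W, w1, firms),
    method="Nelder-Mead",
)
# the second stage repeats this call with the optimal weighting matrix,
#  recomputed from the first-stage residuals, in place of w1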

  • We next search for the optimal parameters with the optimal weighting matrix, computed from the first-stage residuals.
  • Computation time: around 9 minutes.