Please try reading the paper below: https://www.jasandford.com/porter.pdf

Porter, Robert H. “A Study of Cartel Stability: The Joint Executive Committee, 1880-1886.” The Bell Journal of Economics, vol. 14, no. 2, 1983, pp. 301–14. JSTOR, https://doi.org/10.2307/3003634. Accessed 29 Apr. 2023.

Essentials of Econometrics: Empirical Exercises SW 12

During the 1880s, a cartel known as the Joint Executive Committee (JEC) controlled the rail transport of grain from the Midwest to eastern cities in the United States.

The cartel preceded the Sherman Antitrust Act of 1890, and it legally operated to increase the price of grain above what would have been the competitive price.

From time to time, cheating by members of the cartel brought about a temporary collapse of the collusive price setting agreement.

In this exercise, you will use variations in supply associated with the cartel's collapses to estimate the elasticity of demand for rail transport of grain.

The data file JEC.xls contains weekly observations on the rail shipping price and other factors from 1880 to 1886.

A detailed description of the data is contained in JEC Description.pdf.

Suppose that the demand curve for rail transport of grain is specified as:

\(\ln(Q_i) = \beta_0 + \beta_1 \ln(P_i) + \beta_2 Ice_i + \sum_{j=1}^{12} \beta_{2+j} Seas_{j,i} + u_{i}\)

where \(Q_i\) is the total tonnage of grain shipped in week i, \(P_i\) is the price of shipping a ton of grain by rail, \(Ice_i\) is a binary variable that is equal to 1 if the Great Lakes are not navigable because of ice, and \(Seas_j\) is a binary variable that captures seasonal variation in demand. Ice is included because grain could also be transported by ship when the Great Lakes were navigable.

Estimate the demand equation by OLS using heteroskedasticity robust standard errors. What is the estimated value of the demand elasticity and its standard error?
Explain why the interaction of supply and demand could make the OLS estimator of the elasticity biased.
Consider using the variable cartel as instrumental variable for ln(P). Use economic reasoning to argue whether cartel plausibly satisfies the two conditions for a valid instrument.
Estimate the first-stage regression using heteroskedasticity robust standard errors. Is cartel a relevant instrument?
Estimate the demand equation by instrumental variable regression using heteroskedasticity robust standard errors. What is the estimated demand elasticity and its standard error?
Does the evidence suggest that the cartel was charging the profit-maximizing monopoly price? Explain. (Hint: What should a monopolist do if the price elasticity is less than 1?)

1 Introduction

Instrumental variables (IV) regression is a statistical method used to estimate causal relationships between variables when there may be endogeneity or omitted variable bias. In other words, it is used to address situations where a predictor variable may be correlated with the error term in a regression model, leading to biased estimates of the regression coefficients.

In an IV regression, an instrumental variable is used to isolate the part of the variation in the potentially endogenous predictor that is unrelated to the error term. An instrumental variable is a variable that is correlated with the predictor variable of interest, but is not correlated with the error term in the regression model. The instrumental variable is used to construct a new predictor variable that is uncorrelated with the error term, allowing for consistent estimation of the regression coefficients.

The IV regression model typically involves two stages. In the first stage, the potentially endogenous predictor variable is regressed on the instrumental variable (and any exogenous controls) to obtain the predicted values of the predictor variable. In the second stage, the outcome variable is regressed on these predicted values in place of the original predictor. Provided the instrument is valid, this two-stage process removes the endogeneity or omitted variable bias from the estimated coefficient.
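As a sketch in generic notation, the two stages can be written as below. The symbols \(Y\) (outcome), \(X\) (endogenous predictor), \(Z\) (instrument), \(W\) (exogenous control) and the coefficients \(\pi\) are introduced here for illustration only and do not come from the exercise.

```latex
\[
\text{First stage:} \quad X_i = \pi_0 + \pi_1 Z_i + \pi_2 W_i + v_i
\]
\[
\text{Second stage:} \quad Y_i = \beta_0 + \beta_1 \hat{X}_i + \beta_2 W_i + u_i
\]
```

Here \(\hat{X}_i\) is the fitted value from the first stage. In the simplest case with one instrument and no controls, the second-stage slope reduces to \(\hat{\beta}_1 = \widehat{cov}(Z, Y) / \widehat{cov}(Z, X)\).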

IV regression is commonly used in econometrics to estimate causal relationships between economic variables. It is also used in other fields, such as public health and the social sciences, to address endogeneity or omitted variable bias in regression models.

2 Import Data

Raw Data can be found at https://github.com/R-Avalos/JEC

Data Dictionary is available at https://wps.pearsoned.co.uk/wps/media/objects/12401/12699039/empirical/empex_tb/JEC_Description.pdf

JEC contains weekly observations on prices and other factors from 1880-1886, for a total of n = 328 weeks. These data were provided by Professor Rob Porter of Northwestern University and were used in his paper “A Study of Cartel Stability: The Joint Executive Committee, 1880-1886,” The Bell Journal of Economics, Vol. 14, No. 2, Autumn 1983, 301-314.

remove(list=ls())

library(readxl)
?read_excel
JEC_data <- read_excel("JEC_data.xlsx")

Variable Definitions

week: week of observation: = 1 if 1/1/1880-1/7/1880, = 2 if 1/8/1880-1/14/1880, …, = 328 for final week
price = weekly index of price of shipping a ton of grain by rail
ice = 1 if Great Lakes are impassable because of ice, = 0 otherwise
cartel = 1 if railroad cartel is operative, = 0 otherwise
quantity = total tonnage of grain shipped in the week
seas1 – seas13 = thirteen “month” binary variables. To match the weekly data, the calendar has been divided into 13 periods, each approximately 4 weeks long. Thus:
seas1 = 1 if date is January 1 through January 28, = 0 otherwise
seas2 = 1 if date is January 29 through February 25, = 0 otherwise
…
seas13 = 1 if date is December 4 through December 31, = 0 otherwise
Note that the data set summarized below contains only seas1 through seas12, so the thirteenth period serves as the omitted category.

library("psych")
describe(JEC_data)

##          vars   n     mean       sd   median  trimmed      mad     min     max
## week*       1 329   165.00    95.12   165.00   165.00   121.57    1.00   329.0
## price       2 328     0.25     0.07     0.25     0.24     0.07    0.12     0.4
## cartel*     3 328     1.38     0.49     1.00     1.35     0.00    1.00     2.0
## quantity    4 328 25384.39 11632.77 23100.50 24352.26 11298.15 4810.00 76407.0
## seas1       5 328     0.09     0.28     0.00     0.00     0.00    0.00     1.0
## seas2       6 328     0.09     0.28     0.00     0.00     0.00    0.00     1.0
## seas3       7 328     0.09     0.28     0.00     0.00     0.00    0.00     1.0
## seas4       8 328     0.09     0.28     0.00     0.00     0.00    0.00     1.0
## seas5       9 328     0.07     0.26     0.00     0.00     0.00    0.00     1.0
## seas6      10 328     0.07     0.26     0.00     0.00     0.00    0.00     1.0
## seas7      11 328     0.07     0.26     0.00     0.00     0.00    0.00     1.0
## seas8      12 328     0.07     0.26     0.00     0.00     0.00    0.00     1.0
## seas9      13 328     0.07     0.26     0.00     0.00     0.00    0.00     1.0
## seas10     14 328     0.07     0.26     0.00     0.00     0.00    0.00     1.0
## seas11     15 328     0.07     0.26     0.00     0.00     0.00    0.00     1.0
## seas12     16 328     0.07     0.26     0.00     0.00     0.00    0.00     1.0
## ice*       17 328     1.43     0.50     1.00     1.41     0.00    1.00     2.0
##             range skew kurtosis     se
## week*      328.00 0.00    -1.21   5.24
## price        0.28 0.04    -0.58   0.00
## cartel*      1.00 0.49    -1.77   0.03
## quantity 71597.00 0.97     1.35 642.31
## seas1        1.00 2.95     6.75   0.02
## seas2        1.00 2.95     6.75   0.02
## seas3        1.00 2.95     6.75   0.02
## seas4        1.00 2.95     6.75   0.02
## seas5        1.00 3.26     8.67   0.01
## seas6        1.00 3.26     8.67   0.01
## seas7        1.00 3.26     8.67   0.01
## seas8        1.00 3.26     8.67   0.01
## seas9        1.00 3.26     8.67   0.01
## seas10       1.00 3.26     8.67   0.01
## seas11       1.00 3.26     8.67   0.01
## seas12       1.00 3.26     8.67   0.01
## ice*         1.00 0.29    -1.92   0.03

?subset
df <- na.omit(JEC_data)

2.1 Plot key variables

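We will use the log of price and the log of quantity, as both variables are slightly positively skewed. In a log-log specification, the coefficient on log price can be interpreted directly as the price elasticity of demand.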
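The elasticity reading of the log-log coefficient follows from differentiating the demand equation given earlier, holding the other regressors fixed. This is a short derivation sketch using the symbols already defined for that equation:

```latex
\[
\beta_1 = \frac{\partial \ln(Q_i)}{\partial \ln(P_i)}
        = \frac{\partial Q_i / Q_i}{\partial P_i / P_i}
\]
```

So a 1 percent increase in \(P_i\) is associated with approximately a \(\beta_1\) percent change in \(Q_i\).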

par(mfrow = c(2,2))

?hist

# untransformed variables 
hist(df$price,    xlab = "Price",    main = "" )
hist(df$quantity, xlab = "Quantity", main = "" )

# log variables 
hist(log(df$price),    xlab = "Log of Price",       main = "")
hist(log(df$quantity), xlab = "Log of Quantity",    main = "")

par(mfrow = c(1,1))

# keep logged variables for future use
df$lnQ <- log(df$quantity) 
df$lnP <- log(df$price)

3 OLS

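The estimated elasticity is -.64, which means that a one percent increase in price is estimated to reduce the quantity demanded by .64 percent. Its standard error is given as .07. Note that the summary() output below reports the conventional (non-robust) standard error of .08, because lm() does not compute heteroskedasticity-robust standard errors on its own.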

ols <- lm(data = df, 
          formula = lnQ ~ lnP + ice + seas1 + seas1 + seas2 + seas3 + seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + seas11 + seas12 )

summary(ols)

## 
## Call:
## lm(formula = lnQ ~ lnP + ice + seas1 + seas1 + seas2 + seas3 + 
##     seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + 
##     seas11 + seas12, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.39102 -0.24296  0.06575  0.28284  1.05884 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  8.861233   0.171361  51.711  < 2e-16 ***
## lnP         -0.638885   0.082389  -7.755 1.26e-13 ***
## iceIce       0.447754   0.119604   3.744 0.000216 ***
## seas1       -0.132822   0.110959  -1.197 0.232197    
## seas2        0.066888   0.111298   0.601 0.548286    
## seas3        0.111436   0.111308   1.001 0.317527    
## seas4        0.155422   0.110743   1.403 0.161477    
## seas5        0.109658   0.129918   0.844 0.399282    
## seas6        0.046833   0.159596   0.293 0.769377    
## seas7        0.122552   0.160041   0.766 0.444397    
## seas8       -0.235008   0.159856  -1.470 0.142533    
## seas9        0.003561   0.160021   0.022 0.982262    
## seas10       0.169247   0.161295   1.049 0.294849    
## seas11       0.215184   0.160096   1.344 0.179890    
## seas12       0.219633   0.159136   1.380 0.168524    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3973 on 313 degrees of freedom
## Multiple R-squared:  0.3126, Adjusted R-squared:  0.2819 
## F-statistic: 10.17 on 14 and 313 DF,  p-value: < 2.2e-16
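
The exercise asks for heteroskedasticity-robust standard errors, whereas summary() reports conventional ones. A minimal base-R sketch of the HC1 robust formula is shown below. The helper name robust_se and the simulated data are assumptions made for illustration and are not part of the original analysis; the sandwich package offers the same calculation through vcovHC().

```r
# Hypothetical helper: HC1 heteroskedasticity-robust standard errors for an lm fit
robust_se <- function(model) {
  X <- model.matrix(model)          # regressor matrix, including the intercept
  e <- residuals(model)             # OLS residuals
  n <- nrow(X)
  k <- ncol(X)
  bread <- solve(crossprod(X))      # (X'X)^{-1}
  meat  <- crossprod(X * e)         # X' diag(e^2) X
  V <- bread %*% meat %*% bread * n / (n - k)   # HC1 small-sample scaling
  sqrt(diag(V))
}

# Illustration on simulated data whose error variance grows with x
set.seed(1)
x <- runif(200)
y <- 1 + 2 * x + rnorm(200, sd = x)
fit <- lm(y ~ x)

se_robust <- robust_se(fit)
se_conventional <- coef(summary(fit))[, "Std. Error"]
print(cbind(se_conventional, se_robust))
```

Applied to the model above, robust_se(ols)["lnP"] would give the robust standard error of the estimated elasticity.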

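# No perfect multicollinearity: each season covers about 4 weeks, and 13 seasons * 4 weeks = 52 weeks.
# Only 12 of the 13 season dummies are in the model, so their sum is 0 in the omitted season and they are not collinear with the intercept.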
df$seas1 + df$seas2 + df$seas3 + df$seas4 + df$seas5 + df$seas6 + df$seas7 + df$seas8 + df$seas9 + df$seas10 + df$seas11 + df$seas12

##   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##  [38] 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##  [75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1
## [112] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [149] 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [186] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [223] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0
## [260] 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [297] 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
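
The same point can be checked by looking at the rank of the regressor matrix. The sketch below uses a hypothetical 52-week year with 13 four-week seasons rather than the JEC data; the object names are made up for illustration.

```r
# Hypothetical year: 13 seasons of 4 weeks each
season <- factor(rep(1:13, each = 4))
D <- model.matrix(~ season - 1)        # 52 x 13 matrix of season dummies

# Intercept plus all 13 dummies: the dummies sum to the intercept column
X_all <- cbind(intercept = 1, D)
rank_all <- qr(X_all)$rank             # one less than the number of columns

# Intercept plus 12 dummies: the omitted season breaks the collinearity
X_12 <- cbind(intercept = 1, D[, 1:12])
rank_12 <- qr(X_12)$rank               # equal to the number of columns

print(c(columns_all = ncol(X_all), rank_all = rank_all,
        columns_12 = ncol(X_12), rank_12 = rank_12))
```

With all 13 dummies and an intercept, lm() would drop one coefficient and report it as NA. Omitting one season avoids that.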

3.1 BIAS

In a market such as this, quantity and price are simultaneously determined, and so the OLS estimate suffers from simultaneity bias.

Recall from Econ 1001 that the equilibrium price is simply the intersection of the demand curve and the supply curve.

You can think of simultaneity as a kind of omitted variable bias where the price is correlated with the error term.

For example, suppose cartel market power is an omitted variable that reduces the quantity sold and is positively correlated with price (profit-maximizing behavior by the firms). Then the coefficient estimate on price will be negatively biased (more negative than the true value).
However, if firms set prices at random, price would be uncorrelated with the error term, which rules out this sort of bias.

You can use an instrumental variable (IV) to get rid of the bias. People typically use cost shifters as instruments when estimating demand.
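
A small simulation can make the simultaneity problem concrete. The sketch below is not based on the JEC data: the demand and supply equations, their parameter values, and all variable names are assumptions chosen for illustration. In this particular setup an unobserved demand shock raises both price and quantity, so OLS is pulled toward zero; the direction of the bias depends on the setup.

```r
set.seed(123)
n <- 5000

z <- rbinom(n, 1, 0.5)   # supply shifter, used as the instrument
u <- rnorm(n)            # unobserved demand shock
v <- rnorm(n)            # unobserved supply shock

# Assumed structural equations
b0 <- 5; b1 <- -1              # demand:          q = b0 + b1 * p + u
c0 <- 1; c1 <- 0.5; c2 <- 2    # inverse supply:  p = c0 + c1 * q + c2 * z + v

# Equilibrium price and quantity solve both equations at once
p <- (c0 + c1 * b0 + c1 * u + c2 * z + v) / (1 - c1 * b1)
q <- b0 + b1 * p + u

# OLS of quantity on price: biased because p depends on u
ols_est <- unname(coef(lm(q ~ p))["p"])

# IV estimate with a single instrument: cov(z, q) / cov(z, p)
iv_est <- cov(z, q) / cov(z, p)

print(c(true_slope = b1, ols = ols_est, iv = iv_est))
```

The instrument works here because z moves the supply curve but does not enter the demand equation, so the variation in price that it induces traces out the demand curve.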

4 IV - by hand (Two Stage Least Squares)

An instrumental variable is a third variable introduced into regression analysis that is correlated with the predictor variable (relevance condition) but uncorrelated with the error term, so that it affects the response variable only through its impact on the predictor variable (exclusion condition).

By using this (instrument) variable, it becomes possible to estimate the true causal effect that some predictor variable has on a response variable.

The way that we actually use an instrumental variable is through instrumental variables regression, sometimes called two-stage least squares regression.

4.1 Conditions for Valid IV

Check whether the conditions hold for our economics example:

When the cartel is in effect, the Executive Committee is restricting the supply of freight services, and so is increasing the price beyond what it would have been. Think of the cartel formation as reducing the number of effective suppliers and thus shifting the entire supply curve upward to the left.

So the cartel affects price, which satisfies the relevance criterion.
- This can be quantified. If the F statistic for the instrument in the first-stage regression is greater than 10, you have a strong instrument by the usual rule of thumb. With a single instrument, this amounts to a large t statistic on the instrument in the first-stage regression of the endogenous variable.
  - Fstat=(tstat)^2 when you have only one instrument.
  - You want to look at Fstat and not tstat because, at the end of the day, you are trying to predict the endogenous variable: keep the “good/exogenous variation” and discard the “bad/endogenous variation”.
- Weak instruments have their own set of issues.
- Make sure the instrument has the expected sign in your first-stage regression so that others find your results credible.
It seems reasonable to conclude that the cartel affects the farmers’ shipping demand only through its effect on price, that is, cartel is not correlated with unobserved factors in the error term that affect demand, and so it meets the independence (exclusion) criterion.
1. Violation of this assumption: for example, you could argue that the cartel owners also control the retailers and could set or influence prices downstream in the shops directly. In this case the cartel would affect demand through a channel other than the rail price, and the criterion would fail.
2. The independence criterion can never be proven, only argued. We can, however, show evidence of a violation, especially if you have two instrumental variables.
  - If the estimates from the two instruments are very far from each other, then you can argue that at least one of them is invalid (maybe both are). But even if they are very close to each other, you cannot say for sure that they are “correct”/valid.

4.2 First Stage

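The cartel variable is not a weak instrument. The t-statistic on cartel is 14.395, and the F-statistic on the instrument (which is the square of the t-statistic when there is only one instrument, about 207) far exceeds the rule-of-thumb value of 10 for a strong instrument. This is not the overall F-statistic of 21.32 printed at the bottom of the regression summary, which tests all regressors jointly.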
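
The link between the first-stage F-statistic and the t-statistic can be verified on simulated data. In this sketch the instrument z, the control w, and the coefficients are hypothetical; the F-statistic comes from comparing the first stage with and without the instrument.

```r
set.seed(42)
n <- 300
z <- rbinom(n, 1, 0.6)                        # hypothetical binary instrument
w <- rnorm(n)                                 # hypothetical exogenous control
x <- 0.4 * z + 0.2 * w + rnorm(n, sd = 0.3)   # endogenous regressor

unrestricted <- lm(x ~ z + w)   # first stage with the instrument
restricted   <- lm(x ~ w)       # first stage without the instrument

# F-statistic for excluding the single instrument
f_stat <- anova(restricted, unrestricted)$F[2]

# t-statistic on the instrument in the unrestricted first stage
t_stat <- coef(summary(unrestricted))["z", "t value"]

print(c(F = f_stat, t_squared = t_stat^2))
```

The equality holds for conventional standard errors and a single instrument. With several instruments, only the joint F-statistic is meaningful.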

table(df$cartel)

## 
##      Cartel Competition 
##         203         125

table(df$ice)

## 
## Clear Shipping Lanes                  Ice 
##                  188                  140

first_stage_1 <- 
          lm(data = df, 
          formula = lnP  ~  cartel + ice + seas1 + seas1 + seas2 + seas3 + seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + seas11 + seas12 )

summary(first_stage_1)

## 
## Call:
## lm(formula = lnP ~ cartel + ice + seas1 + seas1 + seas2 + seas3 + 
##     seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + 
##     seas11 + seas12, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.49765 -0.13625  0.01362  0.13616  0.55689 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       -1.335843   0.072611 -18.397  < 2e-16 ***
## cartelCompetition -0.357898   0.024862 -14.395  < 2e-16 ***
## iceIce             0.035003   0.064252   0.545  0.58629    
## seas1              0.038725   0.059084   0.655  0.51268    
## seas2              0.136288   0.059084   2.307  0.02173 *  
## seas3              0.189049   0.059319   3.187  0.00158 ** 
## seas4              0.089523   0.059357   1.508  0.13251    
## seas5              0.017863   0.069869   0.256  0.79838    
## seas6             -0.025741   0.085529  -0.301  0.76364    
## seas7             -0.067126   0.085529  -0.785  0.43314    
## seas8             -0.035837   0.085709  -0.418  0.67614    
## seas9             -0.005776   0.086321  -0.067  0.94670    
## seas10            -0.100211   0.086321  -1.161  0.24656    
## seas11            -0.086751   0.085362  -1.016  0.31028    
## seas12             0.011693   0.085362   0.137  0.89113    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2114 on 313 degrees of freedom
## Multiple R-squared:  0.4881, Adjusted R-squared:  0.4652 
## F-statistic: 21.32 on 14 and 313 DF,  p-value: < 2.2e-16

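# Recode cartel as a factor with "Competition" as the reference level,
# so that the coefficient measures the effect of the cartel being in force
df$CARTEL <- factor(x = df$cartel, 
                    levels = c("Competition", "Cartel")
                    )

# "Competition" is already the first level, so relevel() changes nothing here
df$CARTEL <- relevel(df$CARTEL, ref="Competition")


# Recode ice the same way, with "Ice" as the reference level
# (the models below use the original ice variable, not ICE)
df$ICE <- factor(x = df$ice, 
                    levels = c("Ice", "Clear Shipping Lanes")
                    )

df$ICE <- relevel(df$ICE, ref="Ice")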

first_stage_2 <- 
          lm(data = df, 
          formula = lnP  ~  CARTEL + ice + seas1 + seas1 + seas2 + seas3 + seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + seas11 + seas12 )

summary(first_stage_2)

## 
## Call:
## lm(formula = lnP ~ CARTEL + ice + seas1 + seas1 + seas2 + seas3 + 
##     seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + 
##     seas11 + seas12, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.49765 -0.13625  0.01362  0.13616  0.55689 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -1.693741   0.078361 -21.615  < 2e-16 ***
## CARTELCartel  0.357898   0.024862  14.395  < 2e-16 ***
## iceIce        0.035003   0.064252   0.545  0.58629    
## seas1         0.038725   0.059084   0.655  0.51268    
## seas2         0.136288   0.059084   2.307  0.02173 *  
## seas3         0.189049   0.059319   3.187  0.00158 ** 
## seas4         0.089523   0.059357   1.508  0.13251    
## seas5         0.017863   0.069869   0.256  0.79838    
## seas6        -0.025741   0.085529  -0.301  0.76364    
## seas7        -0.067126   0.085529  -0.785  0.43314    
## seas8        -0.035837   0.085709  -0.418  0.67614    
## seas9        -0.005776   0.086321  -0.067  0.94670    
## seas10       -0.100211   0.086321  -1.161  0.24656    
## seas11       -0.086751   0.085362  -1.016  0.31028    
## seas12        0.011693   0.085362   0.137  0.89113    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2114 on 313 degrees of freedom
## Multiple R-squared:  0.4881, Adjusted R-squared:  0.4652 
## F-statistic: 21.32 on 14 and 313 DF,  p-value: < 2.2e-16

library(stargazer)

## 
## Please cite as:

##  Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.

##  R package version 5.2.3. https://CRAN.R-project.org/package=stargazer

stargazer(first_stage_1, first_stage_2,
          type = "text"
          )

## 
## ===========================================================
##                                    Dependent variable:     
##                                ----------------------------
##                                            lnP             
##                                     (1)            (2)     
## -----------------------------------------------------------
## cartelCompetition                -0.358***                 
##                                   (0.025)                  
##                                                            
## CARTELCartel                                    0.358***   
##                                                  (0.025)   
##                                                            
## iceIce                             0.035          0.035    
##                                   (0.064)        (0.064)   
##                                                            
## seas1                              0.039          0.039    
##                                   (0.059)        (0.059)   
##                                                            
## seas2                             0.136**        0.136**   
##                                   (0.059)        (0.059)   
##                                                            
## seas3                             0.189***      0.189***   
##                                   (0.059)        (0.059)   
##                                                            
## seas4                              0.090          0.090    
##                                   (0.059)        (0.059)   
##                                                            
## seas5                              0.018          0.018    
##                                   (0.070)        (0.070)   
##                                                            
## seas6                              -0.026        -0.026    
##                                   (0.086)        (0.086)   
##                                                            
## seas7                              -0.067        -0.067    
##                                   (0.086)        (0.086)   
##                                                            
## seas8                              -0.036        -0.036    
##                                   (0.086)        (0.086)   
##                                                            
## seas9                              -0.006        -0.006    
##                                   (0.086)        (0.086)   
##                                                            
## seas10                             -0.100        -0.100    
##                                   (0.086)        (0.086)   
##                                                            
## seas11                             -0.087        -0.087    
##                                   (0.085)        (0.085)   
##                                                            
## seas12                             0.012          0.012    
##                                   (0.085)        (0.085)   
##                                                            
## Constant                         -1.336***      -1.694***  
##                                   (0.073)        (0.078)   
##                                                            
## -----------------------------------------------------------
## Observations                        328            328     
## R2                                 0.488          0.488    
## Adjusted R2                        0.465          0.465    
## Residual Std. Error (df = 313)     0.211          0.211    
## F Statistic (df = 14; 313)       21.317***      21.317***  
## ===========================================================
## Note:                           *p<0.1; **p<0.05; ***p<0.01

4.3 Second Stage

# fitted values from the first stage regression
df$lnP_predicted <- predict(object = first_stage_2)

# use the fitted values as an independent variable in the second stage regression
second_stage <-
          lm(data = df, 
          formula = lnQ  ~  lnP_predicted + ice + seas1 + seas1 + seas2 + seas3 + seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + seas11 + seas12 )

summary(second_stage)

## 
## Call:
## lm(formula = lnQ ~ lnP_predicted + ice + seas1 + seas1 + seas2 + 
##     seas3 + seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + 
##     seas11 + seas12, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.37111 -0.22843 -0.01303  0.30380  0.79350 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    8.573535   0.219270  39.100  < 2e-16 ***
## lnP_predicted -0.866587   0.133848  -6.474 3.68e-10 ***
## iceIce         0.422934   0.123156   3.434 0.000675 ***
## seas1         -0.130973   0.113773  -1.151 0.250538    
## seas2          0.090952   0.114644   0.793 0.428179    
## seas3          0.135872   0.114671   1.185 0.236963    
## seas4          0.152511   0.113557   1.343 0.180236    
## seas5          0.073562   0.134223   0.548 0.584044    
## seas6         -0.006064   0.165408  -0.037 0.970778    
## seas7          0.060232   0.166538   0.362 0.717841    
## seas8         -0.293599   0.166070  -1.768 0.078047 .  
## seas9         -0.058372   0.166488  -0.351 0.726117    
## seas10         0.085811   0.169701   0.506 0.613452    
## seas11         0.151791   0.166678   0.911 0.363162    
## seas12         0.178656   0.164235   1.088 0.277518    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4073 on 313 degrees of freedom
## Multiple R-squared:  0.2774, Adjusted R-squared:  0.2451 
## F-statistic: 8.582 on 14 and 313 DF,  p-value: 9.777e-16
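
The coefficient from this manual second stage is the 2SLS estimate, but the standard errors printed by summary() are not the correct 2SLS standard errors: lm() computes the residual variance from the predicted price rather than the actual price. The sketch below illustrates the correction on simulated data. All names and parameter values are assumptions for illustration, and the formula shown is the conventional (homoskedastic) 2SLS variance.

```r
set.seed(7)
n <- 500
z <- rbinom(n, 1, 0.5)              # hypothetical instrument
u <- rnorm(n)                       # structural error
x <- 1 + z + 0.5 * u + rnorm(n)     # endogenous regressor (depends on u)
y <- 2 - 1 * x + u                  # outcome equation

# Manual two-stage least squares
x_hat  <- fitted(lm(x ~ z))
stage2 <- lm(y ~ x_hat)
b <- coef(stage2)

# Naive standard error taken straight from the second-stage lm()
naive_se <- coef(summary(stage2))["x_hat", "Std. Error"]

# Correct residuals use the actual x, not the fitted x_hat
resid_iv  <- y - b[1] - b[2] * x
sigma2_iv <- sum(resid_iv^2) / (n - 2)

# 2SLS variance: sigma^2 * (Xhat' Xhat)^{-1}
X_hat <- cbind(1, x_hat)
correct_se <- sqrt(sigma2_iv * solve(crossprod(X_hat))[2, 2])

print(c(slope = unname(b[2]), naive_se = naive_se, correct_se = correct_se))
```

The two standard errors differ only through the residual variance, which is why a dedicated IV routine reports the same coefficients as the manual approach but slightly different standard errors.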

5 IV Reg Directly in R

Conducting an instrumental variables (IV) analysis in R involves a few steps, which can vary depending on the specific model and data. However, here is a general overview of the steps involved:

1. Load the necessary packages: You will need to load the ivreg package, which is used for IV analysis in R. You may also need to load other packages depending on your specific analysis.
2. Load your data: Load your dataset into R and make sure it is in a format that can be used for IV analysis.
3. Identify your variables: Identify the variables you want to include in your analysis, including the endogenous variable (the regressor that is correlated with the error term) and the instrumental variable (the variable that affects the outcome only through its effect on the endogenous variable).
4. Find an instrumental variable: If you don’t already have an instrumental variable, you will need to find one. This may involve finding a natural experiment, using a policy change or an external event as an instrument, or using a variable that is strongly correlated with the endogenous variable but uncorrelated with the error term.
5. Run the IV regression: Use the ivreg() function to run the instrumental variables regression. The regressors go before the | and the instruments, together with all exogenous regressors, go after it. The basic syntax is:

ivreg(dependent_variable ~ endogenous_variable + exogenous_variables | instrument + exogenous_variables, data = your_data)
6. Interpret the results: Examine the output of your IV regression to assess the estimated effect of the endogenous variable on the outcome and whether a causal interpretation is plausible. Check the diagnostic tests, such as the weak-instruments test, to judge the strength of your instrument, and run additional tests to check the assumptions of your model.

It is important to note that IV analysis can be complex and requires a good understanding of econometric theory and statistical methods. You may want to consult with a statistician or an econometrician if you are unsure how to proceed.

# install.packages("ivreg")
library("ivreg")

?ivreg # Fit instrumental-variable regression by two-stage least squares (2SLS). This is equivalent to direct instrumental-variables estimation when the number of instruments is equal to the number of regressors.

ivreg <- 
ivreg(formula = lnQ ~ lnP + ice + seas1 + seas1 + seas2 + seas3 + seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + seas11 + seas12 | CARTEL + ice + seas1 + seas1 + seas2 + seas3 + seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + seas11 + seas12 , 
      data = df)
summary(ivreg)

## 
## Call:
## ivreg(formula = lnQ ~ lnP + ice + seas1 + seas1 + seas2 + seas3 + 
##     seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + 
##     seas11 + seas12 | CARTEL + ice + seas1 + seas1 + seas2 + 
##     seas3 + seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + 
##     seas11 + seas12, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.38295 -0.27275  0.07318  0.27703  1.09320 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  8.573535   0.216445  39.611  < 2e-16 ***
## lnP         -0.866587   0.132123  -6.559 2.24e-10 ***
## iceIce       0.422934   0.121569   3.479 0.000575 ***
## seas1       -0.130973   0.112307  -1.166 0.244420    
## seas2        0.090952   0.113167   0.804 0.422181    
## seas3        0.135872   0.113194   1.200 0.230912    
## seas4        0.152511   0.112094   1.361 0.174632    
## seas5        0.073562   0.132494   0.555 0.579148    
## seas6       -0.006064   0.163277  -0.037 0.970397    
## seas7        0.060232   0.164392   0.366 0.714319    
## seas8       -0.293599   0.163930  -1.791 0.074259 .  
## seas9       -0.058372   0.164343  -0.355 0.722689    
## seas10       0.085811   0.167514   0.512 0.608831    
## seas11       0.151791   0.164530   0.923 0.356941    
## seas12       0.178656   0.162119   1.102 0.271306    
## 
## Diagnostic tests:
##                  df1 df2 statistic p-value    
## Weak instruments   1 313   207.222  <2e-16 ***
## Wu-Hausman         1 312     5.124  0.0243 *  
## Sargan             0  NA        NA      NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4021 on 313 degrees of freedom
## Multiple R-Squared: 0.2959,  Adjusted R-squared: 0.2644 
## Wald test: 8.807 on 14 and 313 DF,  p-value: 3.5e-16

6 Present Results

6.1 Stargazer

stargazer(first_stage_2, second_stage, ivreg, type = "text", 
          covariate.labels = c("Cartel in Effect", "Instrumented Price")
          )

## 
## ===============================================================
##                                      Dependent variable:       
##                                --------------------------------
##                                   lnP             lnQ          
##                                   OLS       OLS    instrumental
##                                                      variable  
##                                   (1)       (2)        (3)     
## ---------------------------------------------------------------
## Cartel in Effect               0.358***                        
##                                 (0.025)                        
##                                                                
## Instrumented Price                       -0.867***             
##                                           (0.134)              
##                                                                
## lnP                                                 -0.867***  
##                                                      (0.132)   
##                                                                
## iceIce                           0.035   0.423***    0.423***  
##                                 (0.064)   (0.123)    (0.122)   
##                                                                
## seas1                            0.039    -0.131      -0.131   
##                                 (0.059)   (0.114)    (0.112)   
##                                                                
## seas2                           0.136**    0.091      0.091    
##                                 (0.059)   (0.115)    (0.113)   
##                                                                
## seas3                          0.189***    0.136      0.136    
##                                 (0.059)   (0.115)    (0.113)   
##                                                                
## seas4                            0.090     0.153      0.153    
##                                 (0.059)   (0.114)    (0.112)   
##                                                                
## seas5                            0.018     0.074      0.074    
##                                 (0.070)   (0.134)    (0.132)   
##                                                                
## seas6                           -0.026    -0.006      -0.006   
##                                 (0.086)   (0.165)    (0.163)   
##                                                                
## seas7                           -0.067     0.060      0.060    
##                                 (0.086)   (0.167)    (0.164)   
##                                                                
## seas8                           -0.036    -0.294*    -0.294*   
##                                 (0.086)   (0.166)    (0.164)   
##                                                                
## seas9                           -0.006    -0.058      -0.058   
##                                 (0.086)   (0.166)    (0.164)   
##                                                                
## seas10                          -0.100     0.086      0.086    
##                                 (0.086)   (0.170)    (0.168)   
##                                                                
## seas11                          -0.087     0.152      0.152    
##                                 (0.085)   (0.167)    (0.165)   
##                                                                
## seas12                           0.012     0.179      0.179    
##                                 (0.085)   (0.164)    (0.162)   
##                                                                
## Constant                       -1.694*** 8.574***    8.574***  
##                                 (0.078)   (0.219)    (0.216)   
##                                                                
## ---------------------------------------------------------------
## Observations                      328       328        328     
## R2                               0.488     0.277      0.296    
## Adjusted R2                      0.465     0.245      0.264    
## Residual Std. Error (df = 313)   0.211     0.407      0.402    
## F Statistic (df = 14; 313)     21.317*** 8.582***              
## ===============================================================
## Note:                               *p<0.1; **p<0.05; ***p<0.01

6.2 Model Summary

If you prefer an alternative to stargazer, you can try modelsummary.

# install.packages("modelsummary")
library("modelsummary")


m_list <- list(OLS = second_stage, IV = ivreg)
msummary(m_list)


                OLS         IV
(Intercept)     8.574       8.574
                (0.219)     (0.216)
lnP_predicted   -0.867
                (0.134)
iceIce          0.423       0.423
                (0.123)     (0.122)
seas1           -0.131      -0.131
                (0.114)     (0.112)
seas2           0.091       0.091
                (0.115)     (0.113)
seas3           0.136       0.136
                (0.115)     (0.113)
seas4           0.153       0.153
                (0.114)     (0.112)
seas5           0.074       0.074
                (0.134)     (0.132)
seas6           -0.006      -0.006
                (0.165)     (0.163)
seas7           0.060       0.060
                (0.167)     (0.164)
seas8           -0.294      -0.294
                (0.166)     (0.164)
seas9           -0.058      -0.058
                (0.166)     (0.164)
seas10          0.086       0.086
                (0.170)     (0.168)
seas11          0.152       0.152
                (0.167)     (0.165)
seas12          0.179       0.179
                (0.164)     (0.162)
lnP                         -0.867
                            (0.132)
Num.Obs.        328         328
R2              0.277       0.296
R2 Adj.         0.245       0.264
AIC             358.3       349.8
BIC             419.0       410.5
Log.Lik.        -163.154
F               8.582
RMSE            0.40        0.39
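In the table, the manual second stage (OLS column) and the IV column share the same coefficients, but their standard errors differ slightly (0.134 versus 0.132 on price). A likely reason is that a hand-rolled second stage computes its residuals from the fitted price rather than the actual price. The sketch below illustrates this on simulated data in base R. It is not the JEC data: the variable names, the data-generating process, and the single binary instrument `z` are all assumptions made for illustration only.

```r
# Simulated illustration (NOT the JEC data): manual 2SLS vs corrected 2SLS SEs
set.seed(1)
n <- 500
z   <- rbinom(n, 1, 0.5)                     # hypothetical binary instrument
v   <- rnorm(n)                              # shock that makes price endogenous
lnP <- -1.5 + 0.4 * z + 0.5 * v + rnorm(n, sd = 0.2)
lnQ <-  8.5 - 0.9 * lnP - 0.6 * v + rnorm(n, sd = 0.3)

# Stage 1: regress price on the instrument and keep the fitted values
first   <- lm(lnP ~ z)
lnP_hat <- fitted(first)

# Stage 2: regress quantity on fitted price (point estimates are 2SLS)
second <- lm(lnQ ~ lnP_hat)
b      <- coef(second)

# Naive SE: lm() builds residuals from lnP_hat, which is not the 2SLS residual
se_naive <- sqrt(diag(vcov(second)))["lnP_hat"]

# Corrected (homoskedastic) 2SLS SE: residuals must use the ACTUAL price
resid_iv  <- lnQ - b[1] - b[2] * lnP
sigma2_iv <- sum(resid_iv^2) / (n - 2)
Xhat      <- cbind(1, lnP_hat)
se_iv     <- sqrt(sigma2_iv * solve(crossprod(Xhat))[2, 2])

print(c(coef = unname(b[2]), naive_se = unname(se_naive), iv_se = se_iv))
```

The slope is the same either way; only the residual variance, and hence the standard error, changes. This is why a dedicated IV routine is usually preferable to running the two stages by hand when reporting inference.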

7 Conclusion

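The IV estimate of the demand elasticity is -0.867 (standard error 0.132), so demand was inelastic at the prices the cartel charged. This suggests that the cartel was not able to achieve the profit-maximizing monopoly price.

A monopolist facing an elasticity of this magnitude could increase revenue (and profit) by restricting supply further: for a 1 percent increase in price, the quantity shipped would fall by less than 1 percent, so revenue would rise while less grain had to be hauled.

A profit-maximizing monopolist would keep raising price at least until the elasticity reached -1. With positive marginal cost, the optimal price lies in the elastic region of demand, where the elasticity is below -1.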
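The revenue argument can be checked with a few lines of arithmetic. The sketch below uses the IV point estimate of -0.867 and the standard relationship between marginal revenue and elasticity, MR = P(1 + 1/elasticity). Treating the elasticity as locally constant is an assumption, and the variable names are illustrative.

```r
# Back-of-the-envelope check of the revenue argument
elasticity <- -0.867            # IV point estimate of the price elasticity

# Marginal revenue as a share of price: MR / P = 1 + 1 / elasticity
mr_share <- 1 + 1 / elasticity

# Approximate % change in revenue from a 1% price increase: 1 + elasticity
rev_change_pct <- 1 + elasticity

print(round(c(mr_share = mr_share, rev_change_pct = rev_change_pct), 3))
```

Marginal revenue comes out negative (about -0.15 of price), and a 1 percent price increase raises revenue by roughly 0.13 percent. Selling less would therefore raise revenue, which is not consistent with a monopolist at its profit-maximizing price.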

8 References

https://researchrepository.ucd.ie/entities/publication/1d319c76-de90-477f-a8e7-7c670eca5a1e/details

Instrumental Variables

Arvind Sharma

2024-11-14