library(knitr)
library(rsconnect)
library(readxl)
library(tinytex)
## Warning: package 'tinytex' was built under R version 4.2.3
library(stargazer)
##
## Please cite as:
## Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
data <- read.csv("C:/Users/schic/Econometrics Work/us_fred_coastal_us_states_avg_hpi_before_after_2005.csv")
\[ HPICHG_i = \beta_0 + \beta_1 TimePeriod_i + \beta_2 DisasterAffected_i + \beta_3 (TimePeriod_i \times DisasterAffected_i) + \epsilon_i \]
model1 <- lm(HPI_CHG ~ Time_Period + Disaster_Affected + Time_Period*Disaster_Affected,data = data)
summary(model1)
##
## Call:
## lm(formula = HPI_CHG ~ Time_Period + Disaster_Affected + Time_Period *
## Disaster_Affected, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.023081 -0.007610 -0.000171 0.004656 0.035981
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.037090 0.002819 13.157 < 2e-16 ***
## Time_Period -0.027847 0.003987 -6.985 1.2e-08 ***
## Disaster_Affected -0.013944 0.006176 -2.258 0.0290 *
## Time_Period:Disaster_Affected 0.019739 0.008734 2.260 0.0288 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.01229 on 44 degrees of freedom
## Multiple R-squared: 0.5356, Adjusted R-squared: 0.504
## F-statistic: 16.92 on 3 and 44 DF, p-value: 1.882e-07
The control group consists of states that were not affected by a natural disaster (\(DisasterAffected_i = 0\)), and the treatment group consists of states that were affected (\(DisasterAffected_i = 1\)). The dummy variable \(DisasterAffected_i\) records group membership, while \(TimePeriod\) separates the period before the treatment (\(TimePeriod = 0\)) from the period after it (\(TimePeriod = 1\)). The treatment itself, exposure to a natural disaster, applies only to the affected states in the later period, so its effect is captured by the interaction of the two dummies.
Taking the difference between the two groups' before-and-after changes in the outcome gives an estimate of the net effect of the treatment. This relies on the parallel trends assumption: absent the treatment, the outcome in both groups would have followed the same trend. If that holds, any divergence in trend after the treatment can be attributed to the treatment, so the diff-in-diff estimates the effect of the treatment on the outcome variable.
| | Time_Period = 0 | Time_Period = 1 |
|---|---|---|
| Disaster_Affected = 0 | \(\beta_0 = 0.0371\) | \(\beta_0 + \beta_1 = 0.009\) |
| Disaster_Affected = 1 | \(\beta_0 + \beta_2 = 0.0232\) | \(\beta_0 + \beta_1 + \beta_2 + \beta_3 = 0.0151\) |
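The interaction coefficient can be read off these four cell means directly. As a short derivation, assuming the error term has mean zero within each cell, differencing the cell means gives the following (the symbols \(\Delta_{treated}\), \(\Delta_{control}\), and \(DiD\) are notation introduced here for illustration):

\[
\begin{aligned}
\Delta_{treated} &= (\beta_0 + \beta_1 + \beta_2 + \beta_3) - (\beta_0 + \beta_2) = \beta_1 + \beta_3 \\
\Delta_{control} &= (\beta_0 + \beta_1) - \beta_0 = \beta_1 \\
DiD &= \Delta_{treated} - \Delta_{control} = \beta_3
\end{aligned}
\]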
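# Create a 2x2 matrix of predicted means (rows = group, columns = period)
values <- c(0.0371, 0.009, 0.0232, 0.0151)
rows <- c("Treated = 0", "Treated = 1")
cols <- c("Time = 0", "Time = 1")
# byrow = TRUE fills the matrix row by row so each value lands in the
# cell it belongs to; matrix() fills column by column by default
Matrix1 <- matrix(values,
nrow = 2,
ncol = 2,
byrow = TRUE,
dimnames = list(rows, cols)
)
stargazer(Matrix1, type = "text")
##
## =============================
## Time = 0 Time = 1
## -----------------------------
## Treated = 0 0.037 0.009
## Treated = 1 0.023 0.015
## -----------------------------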
Change in the treated group (after minus before): 0.015 - 0.023 = -0.008
Change in the control group (after minus before): 0.009 - 0.037 = -0.028
Diff-in-Diff: (-0.008) - (-0.028) = 0.020
The diff-in-diff matches \(\beta_3\) = 0.0197, the coefficient on the interaction term in the regression, up to rounding of the cell means.
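The same arithmetic can be reproduced from the unrounded coefficients. The sketch below is only illustrative: it hard-codes the estimates printed by `summary(model1)` so that it runs without the data file, and the names `b`, `cells`, `change_control`, `change_treated`, and `did` are introduced here. With the fitted model in memory, `coef(model1)` would presumably supply the same values.

```r
# Coefficient estimates copied from the summary(model1) output above.
# The vector `b` and its element names are illustrative, not from the original.
b <- c(b0 = 0.037090, b1 = -0.027847, b2 = -0.013944, b3 = 0.019739)

# Predicted mean HPI change for each group/period cell
cells <- matrix(
  c(b[["b0"]],             b[["b0"]] + b[["b1"]],
    b[["b0"]] + b[["b2"]], sum(b)),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Disaster_Affected = 0", "Disaster_Affected = 1"),
                  c("Time_Period = 0", "Time_Period = 1"))
)

# After-minus-before change within each group
change_control <- cells[1, 2] - cells[1, 1]
change_treated <- cells[2, 2] - cells[2, 1]

# Difference of the two changes
did <- change_treated - change_control

print(round(cells, 4))
print(round(did, 6))
```

Because no intermediate rounding is involved, `did` reproduces the interaction coefficient rather than the rounded 0.020 obtained by hand.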
The main threat to identification in a diff-in-diff model is a violation of the parallel trends assumption. For the method to work, the treatment and control groups must follow the same trend before the treatment, since this is what allows us to isolate the effect of the treatment on the treated group. The design also requires that both groups be observed over the same time periods, before and after the treatment takes place in the treatment group. Lastly, we must assume that a future treatment does not affect outcomes before it occurs (no anticipation), as such effects would distort the pre-treatment trends.
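One common way to probe the parallel trends assumption is to test whether the pre-treatment time trend differs between the groups. This requires several pre-treatment observations per state, which the before/after averages used above may not provide, so the sketch below runs on simulated data. Every name and number in it (`pre`, `state`, `year`, `t`, `treated`, `hpi_chg`, `pretrend`) is hypothetical and chosen only for illustration.

```r
# Hypothetical pre-treatment panel: 10 states observed yearly, 2000-2004.
# All values are simulated; nothing here comes from the original data.
set.seed(1)
pre <- expand.grid(state = 1:10, year = 2000:2004)
pre$t <- pre$year - 2000                     # years since the first observation
pre$treated <- as.integer(pre$state <= 5)    # first five states form the treated group

# Both groups share the same slope (0.001 per year); the treated group
# sits 0.01 lower in level, and small random noise is added.
pre$hpi_chg <- 0.03 + 0.001 * pre$t - 0.01 * pre$treated +
  rnorm(nrow(pre), sd = 0.002)

# Allow the pre-period time trend to differ by group
pretrend <- lm(hpi_chg ~ t * treated, data = pre)

# The t:treated coefficient estimates the gap in pre-treatment slopes;
# a value close to zero is consistent with parallel trends.
print(coef(summary(pretrend))["t:treated", ])
```

A small, statistically insignificant interaction does not prove that the trends would have stayed parallel after the treatment, but a large one is a warning sign for the design.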