df <- read.csv("/Users/pin.lyu/Desktop/BC_Class_Folder/Econometrics/DIS_&_ASSIGNMENT/DIS_6/Hurricane2005.csv")

# Create the interaction term (the diff-in-diff regressor)
df$did <- df$Time_Period * df$Disaster_Affected

Diff-in-diff Regression Model

\[ H = \beta_0 + \beta_1 TimePeriod + \beta_2 DisasterAffected + \beta_3 (TimePeriod \times DisasterAffected) + \epsilon \]

# Perform did model regression
did_model <- lm(HPI_CHG ~ 
     Time_Period + 
     Disaster_Affected + 
     did,
   data = df
)

summary(did_model)

## 
## Call:
## lm(formula = HPI_CHG ~ Time_Period + Disaster_Affected + did, 
##     data = df)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.023081 -0.007610 -0.000171  0.004656  0.035981 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.037090   0.002819  13.157  < 2e-16 ***
## Time_Period       -0.027847   0.003987  -6.985  1.2e-08 ***
## Disaster_Affected -0.013944   0.006176  -2.258   0.0290 *  
## did                0.019739   0.008734   2.260   0.0288 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.01229 on 44 degrees of freedom
## Multiple R-squared:  0.5356, Adjusted R-squared:  0.504 
## F-statistic: 16.92 on 3 and 44 DF,  p-value: 1.882e-07

All coefficients are statistically significant at the 5% level. It is worth pointing out that the multiple \(R^2\) is 0.5356 (adjusted \(R^2\) = 0.504), which means that the model explains more than 50% of the variance in the response variable HPI_CHG.
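As a side check on the specification, R's formula interface can build the interaction term itself: `Time_Period * Disaster_Affected` expands to both main effects plus their product. The sketch below uses simulated toy data (the data frame `toy` and its values are made up for illustration and are not the hurricane data) to show that the two ways of writing the model give the same interaction estimate.

```r
# Simulated toy data, NOT the Hurricane2005 data: 24 units observed in 2 periods
set.seed(1)
toy <- data.frame(
  Time_Period       = rep(c(0, 1), each = 24),
  Disaster_Affected = rep(rep(c(0, 1), times = c(19, 5)), times = 2)
)
toy$HPI_CHG <- 0.04 - 0.03 * toy$Time_Period - 0.01 * toy$Disaster_Affected +
  0.02 * toy$Time_Period * toy$Disaster_Affected + rnorm(nrow(toy), sd = 0.01)

# Manual interaction column, as in the model above
toy$did <- toy$Time_Period * toy$Disaster_Affected
m_manual <- lm(HPI_CHG ~ Time_Period + Disaster_Affected + did, data = toy)

# Same model written with the * shorthand
m_formula <- lm(HPI_CHG ~ Time_Period * Disaster_Affected, data = toy)

# Both interaction coefficients should agree up to floating-point error
est_manual  <- unname(coef(m_manual)["did"])
est_formula <- unname(coef(m_formula)["Time_Period:Disaster_Affected"])
all.equal(est_manual, est_formula)
```

The shorthand avoids carrying an extra column in the data frame; the manual `did` column makes the regressor explicit in the summary table. Either choice estimates the same model.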

What is the control and the control group, and what is the treatment and the treatment group?

Control: Time period (before 2005 versus after 2005).
Control group: Coastal states in which fewer than 14 counties received individual assistance in 2005
- 19 states: ME, NH, MA, CT, NY, NJ, DE, MD, DC, VA, NC, SC, GA, WA, OR, CA, AK, HI
Treatment: Exposure to the full brunt of the 2005 hurricane season
Treatment group: Coastal states in which 14 or more counties received individual assistance in 2005
- 5 states: TX, LA, MS, AL, FL

2x2 Regression Matrix

	Time_Period_0	Time_Period_1
Treated_0	\[ \beta_0 = 0.0371 \]	\[ \beta_0 + \beta_1 = 0.0092 \]
Treated_1	\[ \beta_0 + \beta_2 = 0.0231 \]	\[ \beta_0 + \beta_1 + \beta_2 + \beta_3 = 0.0150 \]
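
The diff-in-diff estimate can be read off the four cells: take the before-to-after change in the treated row and subtract the before-to-after change in the control row. Written with the coefficients from the regression output:

\[ \big[(\beta_0 + \beta_1 + \beta_2 + \beta_3) - (\beta_0 + \beta_2)\big] - \big[(\beta_0 + \beta_1) - \beta_0\big] = \beta_3 = 0.0197 \]

So the interaction coefficient is the diff-in-diff estimate: the change in HPI_CHG between the two periods was about 0.0197 higher in the treatment group than in the control group.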

### 2x2 matrix of did regression equations

# Fitted cell values implied by the coefficients in summary(did_model), row by row:
# b0, b0 + b1, b0 + b2, b0 + b1 + b2 + b3
values <- c(0.0371, 0.0092, 0.0231, 0.0150)

# Set up column and row names
rnames <- c("Treated = 0", "Treated = 1")
cnames <- c("Time = 0", "Time = 1")

# Set up the matrix
did_matrix <- matrix(values,
                     nrow = 2,
                     byrow = TRUE,
                     dimnames = list(rnames, cnames)
                     )

did_matrix

##             Time = 0 Time = 1
## Treated = 0   0.0371   0.0092
## Treated = 1   0.0231   0.0150
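
Typing the cell values by hand invites transcription slips, so one option is to derive them from the coefficient estimates instead. The sketch below copies the four estimates from the regression output into a vector (the names `b`, `cells`, and `did_estimate` are introduced here for illustration); with the fitted object available, `coef(did_model)` would supply the same numbers.

```r
# Estimates copied from the summary(did_model) output
b <- c(b0 = 0.037090, b1 = -0.027847, b2 = -0.013944, b3 = 0.019739)

# Fill the 2x2 grid row by row: control row first, then treated row
cells <- matrix(
  c(b[["b0"]],             b[["b0"]] + b[["b1"]],
    b[["b0"]] + b[["b2"]], sum(b)),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Treated = 0", "Treated = 1"),
                  c("Time = 0", "Time = 1"))
)
round(cells, 4)

# (treated after - treated before) - (control after - control before)
did_estimate <- (cells[2, 2] - cells[2, 1]) - (cells[1, 2] - cells[1, 1])
did_estimate
```

The last value should match the `did` coefficient, which gives a quick consistency check on the matrix.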

Assumptions in Diff-in-diff model

Consistency assumption:
- Future treatment does not affect the past outcomes
Counter-factual assumption (Parallel trends):
- It assumes that in the absence of treatment, the difference between the ‘treatment’ and ‘control’ group is constant over time.
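
In potential-outcomes notation, the parallel trends assumption can be written as below. The symbols are introduced here and do not come from the study: \(Y_t(0)\) is the outcome in period \(t\) without treatment (\(t = 0\) before, \(t = 1\) after), and \(D = 1\) marks the treatment group.

\[ E[Y_1(0) - Y_0(0) \mid D = 1] = E[Y_1(0) - Y_0(0) \mid D = 0] \]

The left-hand side is never observed, because the treated states were in fact hit by the hurricanes. With only one pre-treatment period, as in this regression, the assumption therefore cannot be checked directly from the data.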

Weakness of this study

In my opinion, this study does not adequately establish parallel trends. The authors have not explained how they account for differences in housing value changes across housing markets. Certain states plausibly experienced higher rates of appreciation or depreciation, depending on their economic and social contexts. It is therefore difficult to assert that all these states exhibited the same rate of change in their housing markets before the 2005 hurricane season. Until this issue is resolved, the credibility of the study's findings remains in doubt.

ENMT_DIS#6_Diff_in_diff

Pin Lyu

2023-10-08
