Quarto enables you to weave together content and executable code into a finished document. To learn more about Quarto, see https://quarto.org.
When you click the Render button, a document is generated that includes both the content and the output of the embedded code. You can embed code like this:
1 + 1
[1] 2
You can add options to executable code like this:
[1] 4
The echo: false option disables the printing of code (only output is displayed).
library(pacman)
library(readr)
library(stargazer)
Please cite as:
Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
library(ggplot2)
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
# Load the CSV file using read_csv
dataset <- read_csv("/Users/bettywang/Desktop/dataset.csv")
Rows: 48 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): STATE
dbl (5): HPI_CHG, Time_Period, Disaster_Affected, NUM_DISASTERS, NUM_IND_ASSIST
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Create the interaction term
dataset$Interaction <- dataset$Time_Period * dataset$Disaster_Affected
# Run the linear regression
model <- lm(HPI_CHG ~ Time_Period + Disaster_Affected + Interaction,
            data = dataset)
# Summary of the model
summary(model)
Call:
lm(formula = HPI_CHG ~ Time_Period + Disaster_Affected + Interaction,
data = dataset)
Residuals:
      Min        1Q    Median        3Q       Max
-0.023081 -0.007610 -0.000171  0.004656  0.035981
Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)        0.037090   0.002819  13.157  < 2e-16 ***
Time_Period       -0.027847   0.003987  -6.985  1.2e-08 ***
Disaster_Affected -0.013944   0.006176  -2.258   0.0290 *
Interaction        0.019739   0.008734   2.260   0.0288 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.01229 on 44 degrees of freedom
Multiple R-squared: 0.5356, Adjusted R-squared: 0.504
F-statistic: 16.92 on 3 and 44 DF, p-value: 1.882e-07
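The manually built `Interaction` column can also be expressed with R's formula shorthand `a * b`, which expands to `a + b + a:b`. The sketch below checks that both specifications give the same coefficients. It runs on a small simulated data frame (`toy`, with made-up values and a hypothetical `replicate_id` column), not on the document's dataset, so the numbers it produces are illustrative only.

```r
# Hypothetical toy data: 2 periods x 2 groups, 3 observations per cell.
# The effect sizes below are invented for illustration.
set.seed(1)
toy <- expand.grid(Time_Period = c(0, 1),
                   Disaster_Affected = c(0, 1),
                   replicate_id = 1:3)
toy$HPI_CHG <- 0.04 - 0.03 * toy$Time_Period - 0.01 * toy$Disaster_Affected +
  0.02 * toy$Time_Period * toy$Disaster_Affected +
  rnorm(nrow(toy), sd = 0.005)

# Manual interaction column, mirroring the approach used in this document
toy$Interaction <- toy$Time_Period * toy$Disaster_Affected
m_manual <- lm(HPI_CHG ~ Time_Period + Disaster_Affected + Interaction,
               data = toy)

# Formula shorthand: Time_Period * Disaster_Affected adds both main effects
# and the Time_Period:Disaster_Affected interaction
m_star <- lm(HPI_CHG ~ Time_Period * Disaster_Affected, data = toy)

# The coefficient names differ ("Interaction" vs "Time_Period:Disaster_Affected"),
# so compare the values only
same_fit <- all.equal(unname(coef(m_manual)), unname(coef(m_star)))
same_fit
```

One practical advantage of the shorthand is that `predict()` can build the interaction from `Time_Period` and `Disaster_Affected` alone, without a separate `Interaction` column in the new data.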
# Difference-in-Differences (DiD) setup:
# Control group: regions not affected by the disaster (Disaster_Affected = 0),
# observed before (Time_Period = 0) and after (Time_Period = 1) the disaster.
# Treatment group: regions affected by the disaster (Disaster_Affected = 1),
# observed in the same two periods.
# The reference cell of the regression is Time_Period = 0 and
# Disaster_Affected = 0 (non-affected regions, pre-disaster); the cell that
# receives the Interaction term is Time_Period = 1 and Disaster_Affected = 1
# (affected regions, post-disaster).
# DiD measures the effect of the disaster on one group by comparing it to
# another group that didn't experience the disaster.
# Before the disaster, check the outcome variable HPI_CHG for both groups
# (Disaster_Affected = 0 and Disaster_Affected = 1), then repeat this after
# the disaster.
# Next, look at how HPI_CHG changed from before to after the disaster within
# each group.
# The DiD method isolates the effect of the disaster by comparing the extra
# change in the group (Disaster_Affected = 1) against the change in
# the group (Disaster_Affected = 0). DiD helps control for other things that
# affect both groups equally.
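The procedure described in the comments above can be sketched directly with group means, without a regression. The data frame `toy` below is hypothetical (eight invented observations using the same column names as the dataset); it only illustrates the arithmetic.

```r
# Hypothetical toy data with two observations per period/group cell
toy <- data.frame(
  Time_Period       = c(0, 0, 1, 1, 0, 0, 1, 1),
  Disaster_Affected = c(0, 0, 0, 0, 1, 1, 1, 1),
  HPI_CHG           = c(0.040, 0.036, 0.010, 0.008,
                        0.025, 0.021, 0.016, 0.014)
)

# Mean outcome in each cell: rows are Time_Period (0, 1),
# columns are Disaster_Affected (0, 1)
cell_means <- tapply(toy$HPI_CHG,
                     list(toy$Time_Period, toy$Disaster_Affected),
                     mean)

# Change over time within each group
change_control <- cell_means["1", "0"] - cell_means["0", "0"]
change_treated <- cell_means["1", "1"] - cell_means["0", "1"]

# DiD: the extra change in the affected group relative to the control group
did <- change_treated - change_control
did  # about 0.021 for this toy data
```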
# Calculate the mean of HPI_CHG
mean_HPI_CHG <- mean(dataset$HPI_CHG, na.rm = TRUE)
# Display the mean
mean_HPI_CHG
[1] 0.02231762
# Extract the regression coefficients
coef_model <- coef(model)
# Calculate the predicted values for each group
# Group 1: Time_Period = 0, Disaster_Affected = 0
pred_00 <- coef_model["(Intercept)"]
# Group 2: Time_Period = 1, Disaster_Affected = 0
pred_10 <- coef_model["(Intercept)"] + coef_model["Time_Period"]
# Group 3: Time_Period = 0, Disaster_Affected = 1
pred_01 <- coef_model["(Intercept)"] + coef_model["Disaster_Affected"]
# Group 4: Time_Period = 1, Disaster_Affected = 1
pred_11 <- coef_model["(Intercept)"] + coef_model["Time_Period"] +
  coef_model["Disaster_Affected"] + coef_model["Interaction"]
# Create the 2x2 matrix
regression_matrix <- matrix(c(pred_00, pred_01, pred_10, pred_11),
                            nrow = 2, byrow = TRUE)
# Name the rows and columns
rownames(regression_matrix) <- c("Time_Period = 0", "Time_Period = 1")
colnames(regression_matrix) <- c("Disaster_Affected = 0", "Disaster_Affected = 1")
# Display the matrix
regression_matrix
                Disaster_Affected = 0 Disaster_Affected = 1
Time_Period = 0           0.037090020            0.02314612
Time_Period = 1           0.009242792            0.01503835
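The same kind of 2x2 table can be built with `predict()` instead of adding coefficients by hand. This is a minimal sketch on hypothetical toy data (`toy` and `toy_model` are invented names and values). It assumes the model is specified with the `Time_Period * Disaster_Affected` shorthand, so that the new data needs only the two indicator columns.

```r
# Hypothetical toy data with two observations per period/group cell
toy <- data.frame(
  Time_Period       = c(0, 0, 1, 1, 0, 0, 1, 1),
  Disaster_Affected = c(0, 0, 0, 0, 1, 1, 1, 1),
  HPI_CHG           = c(0.040, 0.036, 0.010, 0.008,
                        0.025, 0.021, 0.016, 0.014)
)
toy_model <- lm(HPI_CHG ~ Time_Period * Disaster_Affected, data = toy)

# All four combinations; expand.grid varies Time_Period fastest, giving
# the row order (0,0), (1,0), (0,1), (1,1)
grid <- expand.grid(Time_Period = c(0, 1), Disaster_Affected = c(0, 1))
grid$pred <- predict(toy_model, newdata = grid)

# matrix() fills column by column, so column 1 holds Disaster_Affected = 0
# and column 2 holds Disaster_Affected = 1
pred_matrix <- matrix(
  grid$pred, nrow = 2,
  dimnames = list(c("Time_Period = 0", "Time_Period = 1"),
                  c("Disaster_Affected = 0", "Disaster_Affected = 1"))
)
pred_matrix
```

Because the model has one parameter per cell, each predicted value equals the sample mean of `HPI_CHG` in that cell.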
# Calculate the difference for the non-affected and affected groups across time
difference_Time <- pred_10 - pred_00
difference_Time_Affected <- pred_11 - pred_01
# Display the result
difference_Time
(Intercept)
-0.02784723
difference_Time_Affected
(Intercept)
-0.008107772
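The `(Intercept)` label on these differences is not meaningful: arithmetic on named vectors keeps the names of the first operand, and every prediction starts from `coef_model["(Intercept)"]`. A small sketch of how the label could be dropped with `unname()` follows. The coefficient vector is typed in from the rounded values in the summary table rather than taken from the fitted model, and `difference_Time_clean` is a hypothetical name.

```r
# Coefficients copied (rounded) from the regression summary, for illustration
coef_model <- c("(Intercept)"     = 0.037090,
                Time_Period       = -0.027847,
                Disaster_Affected = -0.013944,
                Interaction       = 0.019739)

pred_00 <- coef_model["(Intercept)"]
pred_10 <- coef_model["(Intercept)"] + coef_model["Time_Period"]

# The result inherits the name of the first operand
difference_Time <- pred_10 - pred_00
names(difference_Time)

# unname() strips the misleading label and leaves the value unchanged
difference_Time_clean <- unname(difference_Time)
difference_Time_clean
```

Indexing with double brackets, as in `coef_model[["Time_Period"]]`, is another way to get unnamed values from the start.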
# Calculate the difference-in-differences estimate
difference_Interaction <- difference_Time_Affected - difference_Time
# Display the result
difference_Interaction
(Intercept)
0.01973946
# This result from the 2 x 2 matrix table matches the difference in difference
# coefficient "Interaction" in the linear regression model.
\[ HPI.CHG_i = \beta_0 + \beta_1 Time.Period_i + \beta_2 Disaster.Affected_i + \beta_3 Interaction_i + \epsilon_i \]
\[ y_{ihv} = \beta_0 + \beta_1 BH_{ihv} + \beta_2 F_{ihv} + \beta_3 T_{ihv} + \beta_4 F_{ihv} \times BH_{ihv} + \beta_5 T_{ihv} \times BH_{ihv} + \beta_6 T_{ihv} \times F_{ihv} + \beta_7 T_{ihv} \times F_{ihv} \times BH_{ihv} + \epsilon_{ihv} \]
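For the first (two-variable) model, the agreement between the 2x2 table and the Interaction coefficient can be worked out from the four cell predictions. The shorthand below is introduced only for this derivation and is an assumption of the sketch: T stands for Time_Period, D for Disaster_Affected, and \hat{y}_{TD} for the predicted HPI_CHG in the cell with those values.

```latex
\begin{aligned}
\hat{y}_{00} &= \beta_0 \\
\hat{y}_{10} &= \beta_0 + \beta_1 \\
\hat{y}_{01} &= \beta_0 + \beta_2 \\
\hat{y}_{11} &= \beta_0 + \beta_1 + \beta_2 + \beta_3 \\[4pt]
\text{DiD} &= (\hat{y}_{11} - \hat{y}_{01}) - (\hat{y}_{10} - \hat{y}_{00}) \\
           &= (\beta_1 + \beta_3) - \beta_1 \\
           &= \beta_3
\end{aligned}
```

With the values printed above, this is (-0.008107772) - (-0.02784723) = 0.01973946, the estimate reported for Interaction.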