rm(list = ls())
gc()
##          used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 542074 29.0    1205575 64.4   686414 36.7
## Vcells 988668  7.6    8388608 64.0  1875897 14.4
cat("\f")
graphics.off()
# Load Libraries
library(stargazer) # summary statistics
##
## Please cite as:
## Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
library(tidyverse) # data manipulation
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr) # data manipulation (already attached by tidyverse)
# Importing Data
df <- read.csv("E:/download/us_fred_coastal_us_states_avg_hpi_before_after_2005.csv")
stargazer(df, type = "text")
##
## ================================================
## Statistic         N  Mean  St. Dev. Min    Max
## ------------------------------------------------
## HPI_CHG           48 0.022 0.017    -0.006 0.061
## Time_Period       48 0.500 0.505    0      1
## Disaster_Affected 48 0.208 0.410    0      1
## NUM_DISASTERS     48 3.208 2.143    1      10
## NUM_IND_ASSIST    48 8.583 14.946   0      55
## ------------------------------------------------
# Create a two-way table with labels
raw_table <- table(
  Time_Period      = ifelse(test = df$Time_Period == 0, yes = "Pre", no = "Post"),
  Treatment_Status = ifelse(test = df$Disaster_Affected == 0, yes = "Control", no = "Treatment")
)
raw_table
##            Treatment_Status
## Time_Period Control Treatment
##        Post      19         5
##        Pre       19         5
$$
HPI\_CHG_{st} = \beta_0 + \beta_1 Time\_Period_t + \beta_2 Disaster\_Affected_s + \beta_3 (Time\_Period_t \times Disaster\_Affected_s) + \epsilon_{st}
$$
Reg_model <- lm(formula = HPI_CHG ~ Time_Period * Disaster_Affected, data = df)
summary(Reg_model)
##
## Call:
## lm(formula = HPI_CHG ~ Time_Period * Disaster_Affected, data = df)
##
## Residuals:
##       Min        1Q    Median        3Q       Max
## -0.023081 -0.007610 -0.000171  0.004656  0.035981
##
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)
## (Intercept)                    0.037090   0.002819  13.157  < 2e-16 ***
## Time_Period                   -0.027847   0.003987  -6.985  1.2e-08 ***
## Disaster_Affected             -0.013944   0.006176  -2.258   0.0290 *
## Time_Period:Disaster_Affected  0.019739   0.008734   2.260   0.0288 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.01229 on 44 degrees of freedom
## Multiple R-squared: 0.5356, Adjusted R-squared: 0.504
## F-statistic: 16.92 on 3 and 44 DF, p-value: 1.882e-07
The control group is the group that does not receive the treatment being tested in a study. It provides a baseline for comparison: the outcome we would expect without the intervention.
A treatment is any event that selectively affects only some of the individuals or things in a study. The treatment group is the group that receives the treatment being tested. Comparing the treatment group with the control group allows us to estimate the effect of the treatment.
We use this method to determine whether a treatment affects an outcome. In our case, we want to know whether being affected by a disaster influences house prices. If the only systematic difference between the two groups is whether they experienced a disaster, then the difference between the two groups' changes in HPI_CHG from the pre-treatment period to the post-treatment period estimates the impact of the disaster.
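To see why the interaction coefficient in the regression fitted above carries this estimate, it can help to write out the expected outcome in each of the four period-by-group cells. This is a sketch that assumes the error term has mean zero within each cell; $Y$ is shorthand introduced here for HPI_CHG, and $\beta_0, \dots, \beta_3$ denote the intercept, the Time_Period coefficient, the Disaster_Affected coefficient, and the interaction coefficient.

$$
\begin{aligned}
E[Y \mid \text{Pre, Control}] &= \beta_0 \\
E[Y \mid \text{Post, Control}] &= \beta_0 + \beta_1 \\
E[Y \mid \text{Pre, Treatment}] &= \beta_0 + \beta_2 \\
E[Y \mid \text{Post, Treatment}] &= \beta_0 + \beta_1 + \beta_2 + \beta_3
\end{aligned}
$$

Under these expressions, the change for the treatment group is $\beta_1 + \beta_3$ and the change for the control group is $\beta_1$, so the difference between the two changes is $\beta_3$.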
# Calculate the mean HPI_CHG in each period-by-group cell
mean_table <- tapply(
  X = df$HPI_CHG,
  INDEX = list(
    Time_Period      = ifelse(df$Time_Period == 0, "Pre", "Post"),
    Treatment_Status = ifelse(df$Disaster_Affected == 0, "Control", "Treatment")
  ),
  FUN = mean
)
print(mean_table)
##            Treatment_Status
## Time_Period     Control  Treatment
##        Post 0.009242792 0.01503835
##        Pre  0.037090020 0.02314612
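# Calculate the DiD effect: (treatment - control) in the post period
# minus (treatment - control) in the pre period
DiD_effect <- (mean_table["Post", "Treatment"] - mean_table["Post", "Control"]) -
  (mean_table["Pre", "Treatment"] - mean_table["Pre", "Control"])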
print(DiD_effect)
## [1] 0.01973946
The result matches the coefficient on the interaction term Time_Period:Disaster_Affected (0.019739) in the linear regression above.
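As a check by hand, the same number can be reproduced from the four cell means printed above. The arithmetic below uses the printed (rounded) values, so the last digits may differ slightly from what R stores internally.

$$
\begin{aligned}
\widehat{DiD} &= (0.01503835 - 0.009242792) - (0.02314612 - 0.037090020) \\
&= 0.005795558 - (-0.013943900) \\
&= 0.019739458 \approx 0.019739
\end{aligned}
$$

The second bracket, $-0.013944$, is the pre-period gap between the groups and agrees with the Disaster_Affected coefficient in the regression output.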
The “implicit assumptions” include:
Counterfactual assumption (parallel trends). In the absence of the intervention, the outcomes of the control group and the treatment group would have followed the same trend. This ensures that differences in outcomes after the treatment can be attributed to the treatment itself.
Consistency. The treatment is well defined, so each unit's potential outcome under treatment is the same regardless of which version of the treatment it receives. For instance, if I get a dog, I will become happier regardless of the differences between dogs; I won’t be less happy just because the dog is black.
Positivity assumption. Treatment is not deterministic for specific values of the covariates X; units with any given value of X can be either treated or untreated.
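The parallel trends assumption can also be stated formally. The following is a sketch in potential-outcomes notation that is introduced here for illustration and is not part of the original analysis: $Y_{s,t}(0)$ is the outcome unit $s$ would have in period $t$ without treatment, and $D_s = 1$ marks units in the treatment group.

$$
E[\,Y_{s,\text{post}}(0) - Y_{s,\text{pre}}(0) \mid D_s = 1\,] = E[\,Y_{s,\text{post}}(0) - Y_{s,\text{pre}}(0) \mid D_s = 0\,]
$$

The left-hand side is never observed, because treated units are actually treated in the post period. This is why the assumption cannot be tested directly with only one pre-treatment period and one post-treatment period.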
In a perfectly competitive labor market, a minimum wage set above the equilibrium wage is predicted to raise unemployment: the higher wage causes employers to reduce their hiring, so fewer workers are employed.
In this paper, the authors reached the opposite conclusion. By comparing employment in fast-food restaurants in New Jersey and Pennsylvania, they found that the increase in the minimum wage did not reduce employment in fast-food restaurants and, if anything, increased it.
The treatment is an increase in the minimum wage.
The treatment group is fast-food restaurants in New Jersey, and the control group is fast-food restaurants in Pennsylvania; the outcome compared is employment.
The identifying assumption is that, apart from New Jersey's minimum wage increase, employment in fast-food restaurants would have followed similar trends in New Jersey and Pennsylvania during this period. Under that assumption, the difference between the two states' changes in employment estimates the impact of the minimum wage increase.
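The comparison described here has the same difference-in-differences form as the disaster example. As a sketch, with symbols that are assumptions for illustration and not taken from the paper, let $\bar{E}$ denote average employment per fast-food restaurant in a given state and period:

$$
\hat{\delta} = (\bar{E}_{NJ,\text{after}} - \bar{E}_{NJ,\text{before}}) - (\bar{E}_{PA,\text{after}} - \bar{E}_{PA,\text{before}})
$$

A value of $\hat{\delta}$ near zero or above zero would be consistent with the finding that the minimum wage increase did not reduce employment.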
I buy the study’s result. Real markets are not perfectly competitive, and this market may not be in equilibrium. If the labor market is in a state where the quantity of labor demanded exceeds the quantity supplied, an increase in the minimum wage can raise employment and lower the unemployment rate.
Counterfactual assumption (parallel trends). In the absence of the intervention, the outcomes of the control group and the treatment group would have followed the same trend. This ensures that differences in outcomes after the treatment can be attributed to the treatment itself.
Consistency. The treatment is well defined, so each unit's potential outcome under treatment is the same regardless of which version of the treatment it receives. For instance, if I get a dog, I will become happier regardless of the differences between dogs; I won’t be less happy just because the dog is black.
Positivity assumption. Treatment is not deterministic for specific values of the covariates X; units with any given value of X can be either treated or untreated.