Figure 2.1: Preschool education
Figure 2.2: Number of restaurant employees and the minimum hourly wage
\(D_{i}\): Indicator of treatment for unit \(i\) \[ D_{i} = \begin{cases} 1, & \text{if unit } i \text{ receives treatment} \\ 0, & \text{otherwise} \end{cases} \]
\(Y_{i}\): Observed outcome of interest for unit \(i\)
\[ \tag{1} Y_{di}=\begin{cases} Y_{0i}, & \text{potential outcome for unit } i \text{ without treatment} \\ Y_{1i}, & \text{potential outcome for unit } i \text{ with treatment} \end{cases} \]
\(Y_{i}=D_{i}\cdot Y_{1i}+(1-D_{i})\cdot Y_{0i}\)
\[ \tag{2} Y_{i}=\begin{cases} Y_{0i}, & \text{if } D_{i}=0 \\ Y_{1i}, & \text{if } D_{i}=1 \end{cases} \]
\(\alpha_{i} = Y_{1i}-Y_{0i}\)
\[ Y_{i}(D_{1}, D_{2},\ldots,D_{n})=Y_{i}(D'_{1}, D'_{2},\ldots,D'_{n})\quad \text{if}\quad D_{i}=D'_{i} \]
\[ \tag{3} \begin{aligned} ATE & = E[Y_{1}-Y_{0}]\\ & =E[Y_{1}]-E[Y_{0}]\\ & =E[Y_{1}\mid D=1]-E[Y_{0}\mid D=0] \qquad \text{(under random assignment)}\\ & =E[Y\mid D=1]-E[Y\mid D=0] \end{aligned} \]
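The chain of equalities in (3) can be checked numerically. The following is a minimal simulation sketch, not part of the original analysis: the sample size, the distributions, and all variable names are assumptions chosen only for illustration. Because `D` is assigned at random, the observed difference in group means should land close to the true average effect.

```r
# Hypothetical simulated data; none of these values come from the STAR study.
set.seed(2024)
n  <- 100000

# Potential outcomes: Y0 without treatment, Y1 with treatment
Y0 <- rnorm(n, mean = 0, sd = 1)
Y1 <- Y0 + rnorm(n, mean = 2, sd = 1)   # unit-level effects average about 2

# Random assignment, independent of the potential outcomes
D <- rbinom(n, size = 1, prob = 0.5)

# Observed outcome: Y = D * Y1 + (1 - D) * Y0
Y <- D * Y1 + (1 - D) * Y0

# True ATE (only computable here because both potential outcomes are simulated)
ate_true <- mean(Y1 - Y0)

# Difference in observed means: E[Y | D = 1] - E[Y | D = 0]
diff_means <- mean(Y[D == 1]) - mean(Y[D == 0])

c(ate_true = ate_true, diff_means = diff_means)
```

With real data only `Y` and `D` are observed, so `ate_true` is unavailable; the sketch only illustrates why randomization makes `diff_means` a reasonable estimate of it.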
\(i\) | \(Y_{1i}\) | \(Y_{0i}\) | \(D_{i}\) | \(\alpha_{i}\) | \(Y_{i}\) |
---|---|---|---|---|---|
1 | 3 | 0 | 1 | 3 | 3 |
2 | 1 | 1 | 1 | 0 | 1 |
3 | 1 | 0 | 0 | 1 | 0 |
4 | 1 | 1 | 0 | 0 | 1 |
\(E[Y_{1}]\) | 1.5 | | | | |
\(E[Y_{0}]\) | | 0.5 | | | |
\(E[Y_{1}-Y_{0}]\) | | | | 1 | |
\[ E[Y_{1}] = \frac{1}{N}\sum_{i=1}^{N} Y_{1i}=1.5 \]
\[ E[Y_{0}] = \frac{1}{N}\sum_{i=1}^{N} Y_{0i}=0.5 \]
\[ E[Y_{1}]-E[Y_{0}] = 1 \]
\[ \alpha_{ATE}=E[Y_{1}-Y_{0}]=\frac{1}{4}\cdot(3+0+1+0)=1 \]
\[ \alpha_{ATT}=E[Y_{1}-Y_{0}\mid D=1]=\frac{1}{2}\cdot(3+0)=1.5 \]
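The averages above can be reproduced in a few lines of R. This is a sketch under one assumption: the four-unit potential outcomes are the ones implied by the worked arithmetic (unit effects 3, 0, 1, 0; \(E[Y_{1}]=1.5\); \(E[Y_{0}]=0.5\); units 1 and 2 treated). The object names are illustrative.

```r
# Four-unit example; values are those implied by the calculations above
Y1 <- c(3, 1, 1, 1)   # potential outcome with treatment
Y0 <- c(0, 1, 0, 1)   # potential outcome without treatment
D  <- c(1, 1, 0, 0)   # units 1 and 2 are treated

alpha <- Y1 - Y0                # unit-level effects: 3, 0, 1, 0
Y     <- D * Y1 + (1 - D) * Y0  # observed outcomes via the switching equation

ate <- mean(alpha)              # average treatment effect: 1
att <- mean(alpha[D == 1])      # average effect on the treated: 1.5

# Difference in observed means between treated and untreated units
naive <- mean(Y[D == 1]) - mean(Y[D == 0])

c(ATE = ate, ATT = att, naive = naive)
```

In this example the observed difference in means happens to equal the ATT (1.5) but not the ATE (1). The assignment in the table is fixed rather than random, so nothing guarantees that the comparison recovers the ATE.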
library(dplyr)  # provides %>%, mutate(), recode_factor(), group_by(), summarise()
star <- read.csv("~/Dropbox/EastAsia2024/data/DSS/STAR.csv")
head(star)
##   classtype reading math graduated
## 1     small     578  610         1
## 2   regular     612  612         1
## 3   regular     583  606         1
## 4     small     661  648         1
## 5     small     614  636         1
## 6   regular     610  603         0
# Treatment indicator D as a factor; label the graduation outcome
star <- star %>%
  mutate(D = as.factor(classtype)) %>%
  mutate(graduated = recode_factor(graduated,
                                   '1' = 'Yes', '0' = 'No'))
class(star$D)
## [1] "factor"
levels(star$D)
## [1] "regular" "small"
mean(star$reading[star$D=='small'])
## [1] 632.7
mean(star$reading[star$D=='regular'])
## [1] 625.5
library(gtsummary)
star %>%
  tbl_summary(
    by = D,
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      graduated ~ "{n} / {N} ({p}%)"
    ),
    digits = all_continuous() ~ 2) %>%
  add_overall() %>%
  add_n() %>%
  modify_header(label ~ "**Variable**")
Variable | N | Overall, N = 1,274 ¹ | regular, N = 689 ¹ | small, N = 585 ¹ |
---|---|---|---|---|
classtype | 1,274 | | | |
regular | | 689 (54%) | 689 (100%) | 0 (0%) |
small | | 585 (46%) | 0 (0%) | 585 (100%) |
reading | 1,274 | 628.80 (36.73) | 625.49 (35.88) | 632.70 (37.37) |
math | 1,274 | 631.59 (38.84) | 628.84 (37.94) | 634.83 (39.66) |
graduated | 1,274 | 1,108 / 1,274 (87%) | 597 / 689 (87%) | 511 / 585 (87%) |

¹ n (%); Mean (SD); n / N (%)
mean(star$reading[star$D=='small']) - mean(star$reading[star$D=='regular'])
## [1] 7.211
\[ \sqrt{var(\bar{Y}_{treated}-\bar{Y}_{control})} =\sqrt{var(\bar{Y}_{treated}) + var(\bar{Y}_{control})} =\sqrt{\frac{\sigma^2_{T}}{N_{T}} + \frac{\sigma^2_{C}}{N_{C}}} \]
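# Sample variance of reading scores in each group
v.small <- var(star$reading[star$D == 'small']); v.small
## [1] 1396
v.regular <- var(star$reading[star$D == 'regular']); v.regular
## [1] 1287
# Group sizes; rows follow the factor levels ("regular", then "small")
n.group <- star %>% group_by(D) %>%
  summarise(n = n())
n.small <- as.numeric(n.group[2,2]); n.regular <- as.numeric(n.group[1,2])
# SE of the difference in means; despite the name, the two variances are not pooled here
PoolSE <- sqrt((v.small/n.small) + (v.regular/n.regular)); PoolSE
## [1] 2.063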
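The estimate and its standard error can be combined into a test statistic and an approximate confidence interval. The sketch below is an illustration, not the author's procedure: it uses the rounded group means, standard deviations, and sample sizes reported in the summary table rather than the raw data, and it assumes a normal approximation for the 95% interval. All object names are hypothetical.

```r
# Rounded summary statistics for reading scores, taken from the table above
mean_small   <- 632.70; sd_small   <- 37.37; n_small   <- 585
mean_regular <- 625.49; sd_regular <- 35.88; n_regular <- 689

# Difference in means (small minus regular)
diff_means <- mean_small - mean_regular

# Standard error of the difference, allowing unequal variances
se_diff <- sqrt(sd_small^2 / n_small + sd_regular^2 / n_regular)

# t statistic for H0: no difference between class types
t_stat <- diff_means / se_diff

# Approximate 95% confidence interval (normal approximation)
z  <- qnorm(0.975)
ci <- c(lower = diff_means - z * se_diff,
        upper = diff_means + z * se_diff)

round(c(diff = diff_means, se = se_diff, t = t_stat, ci), 3)
```

Because the inputs are rounded, the results can differ slightly in the last digit from values computed on the raw data. With the raw data, `t.test(reading ~ D, data = star)` runs a comparable unequal-variance test, though it uses the t distribution rather than the normal approximation.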
library(gtsummary)
star %>%
  tbl_summary(
    by = D,
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} / {N} ({p}%)"
    ),
    digits = all_continuous() ~ 2) %>%
  add_p(pvalue_fun = ~ style_pvalue(.x, digits = 2)) %>%
  add_overall() %>%
  add_n() %>%
  modify_header(label ~ "**Variable**")
Variable | N | Overall, N = 1,274 ¹ | regular, N = 689 ¹ | small, N = 585 ¹ | p-value ² |
---|---|---|---|---|---|
classtype | 1,274 | | | | <0.001 |
regular | | 689 / 1,274 (54%) | 689 / 689 (100%) | 0 / 585 (0%) | |
small | | 585 / 1,274 (46%) | 0 / 689 (0%) | 585 / 585 (100%) | |
reading | 1,274 | 628.80 (36.73) | 625.49 (35.88) | 632.70 (37.37) | <0.001 |
math | 1,274 | 631.59 (38.84) | 628.84 (37.94) | 634.83 (39.66) | 0.013 |
graduated | 1,274 | 1,108 / 1,274 (87%) | 597 / 689 (87%) | 511 / 585 (87%) | 0.71 |
gender | 1,274 | 572 / 1,274 (45%) | 300 / 689 (44%) | 272 / 585 (46%) | 0.29 |
interest | 1,274 | | | | 0.42 |
1 | | 257 / 1,274 (20%) | 142 / 689 (21%) | 115 / 585 (20%) | |
2 | | 235 / 1,274 (18%) | 129 / 689 (19%) | 106 / 585 (18%) | |
3 | | 128 / 1,274 (10%) | 68 / 689 (9.9%) | 60 / 585 (10%) | |
4 | | 280 / 1,274 (22%) | 138 / 689 (20%) | 142 / 585 (24%) | |
5 | | 374 / 1,274 (29%) | 212 / 689 (31%) | 162 / 585 (28%) | |

¹ n / N (%); Mean (SD)
² Pearson’s Chi-squared test; Wilcoxon rank sum test
The calculation in R is as follows:
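# Sample variance of reading scores in each group
v.small <- var(star$reading[star$D == 'small']); v.small
## [1] 1396
v.regular <- var(star$reading[star$D == 'regular']); v.regular
## [1] 1287
# Group sizes, converted from 1x1 tibbles to plain numbers
n.group <- star %>% group_by(D) %>%
  summarise(n = n())
n.small <- as.numeric(n.group[2,2]); n.regular <- as.numeric(n.group[1,2])
# Pooled standard deviation (equal-variance assumption)
p1 <- (((n.small-1)*v.small) + ((n.regular-1)*v.regular))
PoolSD <- sqrt(p1/(n.small+n.regular-2))
# Pooled standard error of the difference in means
PoolSE <- PoolSD*sqrt((1/n.small)+(1/n.regular)); PoolSE
## [1] 2.056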
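For reference, the code above implements the equal-variance (pooled) standard error. In the notation below, which is an assumption introduced here for illustration, \(s^2_{T}\) and \(s^2_{C}\) are the sample variances of the small and regular classes (`v.small`, `v.regular`), and \(N_{T}\) and \(N_{C}\) are the group sizes:

\[ s_{p}^{2}=\frac{(N_{T}-1)\,s^2_{T}+(N_{C}-1)\,s^2_{C}}{N_{T}+N_{C}-2}, \qquad SE_{pooled}=s_{p}\sqrt{\frac{1}{N_{T}}+\frac{1}{N_{C}}} \]

Plugging in the rounded values printed above gives approximately:

\[ s_{p}^{2}\approx\frac{584\cdot 1396+688\cdot 1287}{1272}\approx 1337.0, \qquad SE_{pooled}\approx 36.57\cdot\sqrt{\frac{1}{585}+\frac{1}{689}}\approx 2.056 \]

The result is close to the unpooled value of 2.063 because the two group variances and sample sizes are similar.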
Last updated: 02/24/2024