Contents

1. Data structure
2. Introduction to the models
3. Model selection
4. Model interpretation

1. What is panel data?

library(plm)
mydata<- read.csv("panel_wage.csv")
attach(mydata) # use this data frame from here on, so variables can be referenced without the 'mydata$' prefix
formula = lwage ~ exp + exp2 + wks + ed

Variable  Description    Role
lwage     log(wage)      Dependent variable
exp       Experience     Time-varying regressor
exp2      Experience^2   Time-varying regressor
wks       Weeks worked   Time-varying regressor
ed        Education      Time-invariant regressor

pdata <- pdata.frame(mydata, index=c("id","t"))
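
Before estimating anything, it can be worth checking that the data really are in long panel format, with one row per individual and period. The sketch below uses base R on a toy data frame; the values are invented, and the column names `id`, `t`, `lwage` and `ed` only mirror the ones above. It is not the contents of `panel_wage.csv`.

```r
# Toy long-format panel: one row per (id, t) pair. All values are invented.
toy <- data.frame(
  id    = rep(1:3, each = 2),
  t     = rep(1:2, times = 3),
  lwage = c(5.6, 5.7, 6.2, 6.3, 6.0, 6.4),
  ed    = rep(c(9, 12, 16), each = 2)
)

obs_per_id    <- table(toy$id)                 # number of periods observed per individual
n_individuals <- length(obs_per_id)            # N
n_periods     <- length(unique(toy$t))         # T
is_balanced   <- all(obs_per_id == n_periods)  # balanced: every individual has all T periods

# Each (id, t) pair should appear at most once
has_duplicates <- any(duplicated(toy[c("id", "t")]))

# ed is time-invariant if it never changes within an individual
ed_time_invariant <- all(tapply(toy$ed, toy$id, function(v) length(unique(v)) == 1))
```

The same checks can be run on `mydata` by replacing `toy` with it.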

2. How to deal with correlation among errors: data transformation

I. Estimators

  • Pooled OLS
    • Transformation: none
    • No. obs: NT
    • Ignores unobserved heterogeneity across individuals (possible correlation within groups)
pooling <- plm(formula, data=pdata, model= "pooling")
summary(pooling)
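
To see what pooled OLS ignores, it may help to write the model with a composite error. The notation is an assumption added here: $\alpha_i$ is the individual-specific effect, $\varepsilon_{it}$ is an idiosyncratic error with no serial correlation, and the two are uncorrelated with each other.

```latex
\begin{align*}
y_{it} &= x_{it}'\beta + u_{it}, \qquad u_{it} = \alpha_i + \varepsilon_{it} \\
\operatorname{Cov}(u_{it}, u_{is}) &= \sigma_\alpha^2 \neq 0 \quad \text{for } t \neq s
\end{align*}
```

Under these assumptions the errors of one individual are correlated across periods, so treating the NT rows as independent tends to overstate precision. If $\alpha_i$ is also correlated with $x_{it}$, the pooled coefficients are inconsistent.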
  • Between (individual)
    • Transformation: time averages of all variables
    • No. obs: N
    • Loses information: all variation over time within an individual is discarded
between <- plm(formula, data=pdata, model= "between")
summary(between)
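
The between transformation can be written out as follows, assuming a balanced panel and the model $y_{it} = x_{it}'\beta + \alpha_i + \varepsilon_{it}$ (notation introduced here for illustration):

```latex
\begin{align*}
\bar{y}_i &= \bar{x}_i'\beta + \alpha_i + \bar{\varepsilon}_i, \qquad i = 1, \dots, N \\
\bar{y}_i &= \frac{1}{T}\sum_{t=1}^{T} y_{it}, \qquad
\bar{x}_i = \frac{1}{T}\sum_{t=1}^{T} x_{it}
\end{align*}
```

Each individual contributes a single averaged row, which is why only N observations remain. $\alpha_i$ stays in the error term, so the estimator relies on $\alpha_i$ being uncorrelated with $\bar{x}_i$.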
  • Within (individual, across time), Fixed Effects (FE)
    • Transformation: time-demeaning
    • No. obs: NT
    • Individual-specific effect (𝜶i) is cancelled
    • Time-invariant variables are dropped
fixed <- plm(formula, data=pdata, model= "within")
summary(fixed)
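
The time-demeaning step can be reproduced by hand. The following is a minimal base R sketch on simulated data, not on the wage data; the names `id`, `x`, `y` and `alpha` and the true slope of 2 are made up for illustration. It shows that regressing demeaned `y` on demeaned `x` gives the same slope as a regression with one dummy per individual, while pooled OLS is pulled away from the true value when the individual effect is correlated with the regressor.

```r
# Simulated balanced panel; every name and value here is hypothetical
set.seed(1)
N <- 50
T_periods <- 5
id    <- rep(seq_len(N), each = T_periods)
alpha <- rnorm(N)[id]                      # individual-specific effect, constant over time
x     <- alpha + rnorm(N * T_periods)      # regressor correlated with alpha
y     <- 1 + 2 * x + alpha + rnorm(N * T_periods)

# Within transformation: subtract each individual's time average
x_dm <- x - ave(x, id)
y_dm <- y - ave(y, id)
beta_within <- coef(lm(y_dm ~ x_dm - 1))[["x_dm"]]

# Dummy-variable regression (one intercept per individual) yields the same slope
beta_lsdv <- coef(lm(y ~ x + factor(id)))[["x"]]

# Pooled OLS leaves alpha in the error term
beta_pooled <- coef(lm(y ~ x))[["x"]]

round(c(within = beta_within, lsdv = beta_lsdv, pooled = beta_pooled), 3)
```

A variable that is constant within each individual would become all zeros after demeaning, which is why time-invariant regressors such as `ed` drop out of the within model.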
Figure: Illustration of the difference between the Between and Within estimators

  • First Difference (FD)
    • Transformation: one-period difference
    • No. obs: N(T-1)
    • Individual-specific effect (𝜶i) is cancelled
    • Time-invariant variables are dropped
firstdiff <- plm(formula, data=pdata, model= "fd")
summary(firstdiff) # there is no separate intercept; the coefficient of exp is reported under (Intercept)
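
Written out, the first-difference transformation looks like this, assuming the model $y_{it} = x_{it}'\beta + \alpha_i + \varepsilon_{it}$ (notation introduced here for illustration):

```latex
\begin{align*}
y_{it} - y_{i,t-1} &= (x_{it} - x_{i,t-1})'\beta + (\varepsilon_{it} - \varepsilon_{i,t-1}),
\qquad t = 2, \dots, T \\
\Delta \mathit{exp}_{it} &= \mathit{exp}_{it} - \mathit{exp}_{i,t-1} = 1
\quad\Longrightarrow\quad \beta_{\mathit{exp}}\,\Delta \mathit{exp}_{it} = \beta_{\mathit{exp}}
\end{align*}
```

The first line shows why $\alpha_i$ cancels and why each individual loses one period, leaving N(T-1) observations. The second line assumes experience rises by exactly one unit per period for every individual. If that holds in the data, differenced `exp` is a constant and its coefficient is indistinguishable from an intercept.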
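  • Random Effects (RE)
    • Transformation: weighted average of the between and within estimates
    • No. obs: NT
    • The closer lambda is to 0, the closer the estimate is to pooled OLS; the closer it is to 1, the closer it is to within (plm's summary output reports this weight as theta)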
random <- plm(formula, data=pdata, model= "random")
summary(random)
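
The role of lambda is easiest to see in the quasi-demeaned form of the RE model. This is the standard textbook expression for a balanced panel; $\mu$, $\sigma_\alpha^2$ and $\sigma_\varepsilon^2$ (the intercept and the variances of the individual effect and the idiosyncratic error) are notation introduced here.

```latex
\begin{align*}
y_{it} - \lambda\,\bar{y}_i &= (1 - \lambda)\,\mu + (x_{it} - \lambda\,\bar{x}_i)'\beta + v_{it} \\
\lambda &= 1 - \sqrt{\frac{\sigma_\varepsilon^2}{T\,\sigma_\alpha^2 + \sigma_\varepsilon^2}}
\end{align*}
```

With $\sigma_\alpha^2 = 0$ the weight is $\lambda = 0$ and nothing is subtracted, which is pooled OLS. As $T\,\sigma_\alpha^2$ grows relative to $\sigma_\varepsilon^2$, $\lambda$ approaches 1 and the transformation approaches full time-demeaning, which is the within estimator.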

II. Comparison

Estimator \ True model    Pooled model   RE model     FE model
Pooled OLS estimator      Consistent     Consistent   Inconsistent
Between estimator         Consistent     Consistent   Inconsistent
Within (FE) estimator     Consistent     Consistent   Consistent
RE estimator              Consistent     Consistent   Inconsistent

3. Choose a Model

Figure: Flowchart for choosing a model

library(lmtest)
# Breusch-Pagan test for heteroskedasticity in the pooled model
bptest(pooling)
## 
##  studentized Breusch-Pagan test
## 
## data:  pooling
## BP = 40.252, df = 4, p-value = 3.838e-08
# LM test for random effects versus OLS
plmtest(pooling)
## 
##  Lagrange Multiplier Test - (Honda) for balanced panels
## 
## data:  formula
## normal = 72.056, p-value < 2.2e-16
## alternative hypothesis: significant effects
# F test for fixed effects versus OLS
pFtest(fixed, pooling)
## 
##  F test for individual effects
## 
## data:  formula
## F = 40.239, df1 = 593, df2 = 3567, p-value < 2.2e-16
## alternative hypothesis: significant effects
# Hausman test for fixed effects versus random effects
phtest(random, fixed)
## 
##  Hausman Test
## 
## data:  formula
## chisq = 6191.4, df = 3, p-value < 2.2e-16
## alternative hypothesis: one model is inconsistent
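
The three tests above can be combined into a simple decision rule. The helper below is a hypothetical sketch in base R: `choose_model()` and its arguments are not part of plm, the 5% level is an arbitrary default, and the rule is one common convention that may differ in detail from the flowchart.

```r
# Hypothetical helper: pick a model from the p-values of the three tests.
#   p_lm      : LM test, random effects versus pooled OLS (plmtest)
#   p_f       : F test, fixed effects versus pooled OLS (pFtest)
#   p_hausman : Hausman test, fixed versus random effects (phtest)
choose_model <- function(p_lm, p_f, p_hausman, level = 0.05) {
  effects_re <- p_lm < level   # evidence of random individual effects
  effects_fe <- p_f < level    # evidence of fixed individual effects
  if (!effects_re && !effects_fe) return("pooled OLS")
  # A rejected Hausman test suggests RE is inconsistent, so prefer FE
  if (effects_fe && p_hausman < level) return("fixed effects")
  if (effects_re) return("random effects")
  "fixed effects"
}

# The outputs above report all three p-values as below 2.2e-16;
# that bound is used here as a stand-in for the exact values.
choice <- choose_model(p_lm = 2.2e-16, p_f = 2.2e-16, p_hausman = 2.2e-16)
```

With the results shown above, all three tests reject, so this rule points to the fixed effects model.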

4. Explanation

Figure: Comparing estimators