Econometrics in R, Week 3: Panel_Data

Content

1. What is panel data?

Structure in data: Individual & Time

Prepare Data

library(plm)
mydata<- read.csv("panel_wage.csv")
attach(mydata) # attach the data frame so variables can be referenced without the 'mydata$' prefix
formula = lwage ~ exp + exp2 + wks + ed

Variable List

lwage	log(Wage)	Dependent Variable
exp	Experience	Varying Regressor
exp2	Experience^2	Varying Regressor
wks	Weeks worked	Varying Regressor
ed	Education	Time-invariant Regressor

Set as Panel Data

pdata <- pdata.frame(mydata, index=c("id","t"))
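Before declaring the index, it can help to confirm that the data really are in long format, with one row per individual-period pair. The following is a minimal base-R sketch on a made-up data frame (`toy`, with hypothetical columns `id`, `t`, `y`); it is not the `panel_wage.csv` data.

```r
# Toy long-format panel: 3 individuals (id) observed in 4 periods (t).
# All values are made up for illustration.
toy <- data.frame(
  id = rep(1:3, each = 4),
  t  = rep(1:4, times = 3),
  y  = c(5.1, 5.3, 5.4, 5.6, 6.0, 6.1, 6.3, 6.2, 4.8, 4.9, 5.0, 5.2)
)

# Each (id, t) pair should appear exactly once in a well-formed panel.
stopifnot(!any(duplicated(toy[, c("id", "t")])))

# Count observations per individual; a balanced panel has the same count for all.
obs_per_id  <- table(toy$id)
is_balanced <- length(unique(as.vector(obs_per_id))) == 1

n_individuals <- length(obs_per_id)      # N
n_periods     <- length(unique(toy$t))   # T
print(c(N = n_individuals, T = n_periods, balanced = is_balanced))
```

The same checks apply to the real data by replacing `toy` with `mydata`, assuming its index columns are named `id` and `t` as in the `pdata.frame()` call.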

2. How to deal with correlations among errors: Data Transform

I. Estimators

OLS Pooled
- Transformation: Nothing
- No. obs: NT
- Ignores the unobserved heterogeneity of individuals (possible correlation within groups)

pooling <- plm(formula, data=pdata, model= "pooling")
summary(pooling)

Between (individual)
- Transformation: Time averages of all variables
- No. obs: N
- Loses information: variation over time within each individual is discarded

between <- plm(formula, data=pdata, model= "between")
summary(between)
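The between transformation can be reproduced by hand to see why only N observations remain. This is a base-R sketch on simulated data (the data frame `toy` and its columns are hypothetical), not a description of how `plm` computes the estimator internally.

```r
set.seed(1)

# Simulated panel: 5 individuals, 4 periods each (made-up data).
toy <- data.frame(id = rep(1:5, each = 4))
toy$x <- rnorm(20) + toy$id
toy$y <- 1 + 0.5 * toy$x + rnorm(20)

# Between transformation: collapse to one row of time averages per individual.
means <- aggregate(cbind(y, x) ~ id, data = toy, FUN = mean)

# OLS on the individual means uses N = 5 observations instead of NT = 20.
between_fit <- lm(y ~ x, data = means)
print(nrow(means))
print(coef(between_fit))
```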

Within (individual) across time, Fixed Effect (FE)
- Transformation: Time-demean
- No. obs: NT
- Individual specific effect (𝜶i) cancelled
- Time-invariant variables are dropped

fixed <- plm(formula, data=pdata, model= "within")
summary(fixed)
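To illustrate the time-demeaning step, the sketch below applies it by hand in base R on simulated data and compares the slope with a regression that includes one dummy per individual. The data frame `toy` and all column names are hypothetical; the point is only that demeaning removes the individual effect and wipes out any time-invariant regressor.

```r
set.seed(1)

# Simulated panel: 5 individuals, 4 periods, with an individual effect 2 * id.
toy <- data.frame(id = rep(1:5, each = 4))
toy$x  <- rnorm(20) + toy$id
toy$ed <- toy$id + 10                      # time-invariant regressor (made up)
toy$y  <- 2 * toy$id + 0.5 * toy$x + rnorm(20)

# Time-demean: subtract each individual's own mean (ave() defaults to mean).
toy$y_dm  <- toy$y  - ave(toy$y,  toy$id)
toy$x_dm  <- toy$x  - ave(toy$x,  toy$id)
toy$ed_dm <- toy$ed - ave(toy$ed, toy$id)  # identically zero, so ed cannot be estimated

# Within estimator by hand versus least squares with individual dummies.
within_fit <- lm(y_dm ~ x_dm - 1, data = toy)
lsdv_fit   <- lm(y ~ x + factor(id), data = toy)

print(coef(within_fit)["x_dm"])
print(coef(lsdv_fit)["x"])
print(range(toy$ed_dm))
```

The two slope estimates coincide, although the standard errors from the hand-demeaned `lm()` are not adjusted for the N estimated means.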

[Figure: illustration of the difference between Between and Within]

First-Diff (FD)
- Transformation: One period difference
- No. obs: N(T-1)
- Individual specific effect (𝜶i) cancelled
- Time-invariant variables are dropped

firstdiff <- plm(formula, data=pdata, model= "fd")
summary(firstdiff) # no separate intercept: the coefficient on exp is reported under (Intercept)
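The first-difference transformation can be written out as follows. The notation ($y_{it}$, $x_{it}$, $\beta$, $\varepsilon_{it}$) is an assumption introduced here for illustration; only $\alpha_i$ appears in the notes. The last line assumes that exp rises by exactly one each period, which would explain why its effect is reported under (Intercept).

```latex
\begin{align*}
y_{it}     &= \alpha_i + x_{it}'\beta + \varepsilon_{it} \\
y_{i,t-1}  &= \alpha_i + x_{i,t-1}'\beta + \varepsilon_{i,t-1} \\
\Delta y_{it} &= \Delta x_{it}'\beta + \Delta\varepsilon_{it},
  \qquad t = 2,\dots,T \\
\Delta \mathit{exp}_{it} &= 1
  \;\Rightarrow\; \beta_{\mathit{exp}}\,\Delta \mathit{exp}_{it} = \beta_{\mathit{exp}}
  \quad\text{(a constant, indistinguishable from an intercept)}
\end{align*}
```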

Random Effect (RE)
- Transformation: Weighted average of between & within estimates
- No. obs: NT
- The closer Lambda is to 0, the closer RE is to pooled OLS; the closer to 1, the closer it is to within

random <- plm(formula, data=pdata, model= "random")
summary(random)
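The role of Lambda is easiest to see in the quasi-demeaned form of the RE regression. The expression below is the standard textbook form for a balanced panel, written in notation assumed here ($\mu$ for the intercept, $\sigma_\alpha^2$ and $\sigma_\varepsilon^2$ for the variances of the individual effect and the idiosyncratic error); it is a sketch, not output from the code above.

```latex
\begin{align*}
y_{it} - \lambda\,\bar{y}_i
  &= (1-\lambda)\,\mu + (x_{it} - \lambda\,\bar{x}_i)'\beta + v_{it}, \\
\lambda
  &= 1 - \sqrt{\frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T\,\sigma_\alpha^2}} .
\end{align*}
```

With $\lambda = 0$ no demeaning takes place (pooled OLS); with $\lambda = 1$ the full individual mean is subtracted (within).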

II. Comparison

Estimator \ True model	Pooled model	RE model	FE model
Pooled OLS estimator	Consistent	Consistent	Inconsistent
Between estimator	Consistent	Consistent	Inconsistent
Within or FE estimator	Consistent	Consistent	Consistent
RE estimator	Consistent	Consistent	Inconsistent

3. Choose a Model

Flowchart for Choosing a Model

Heteroscedasticity: BP test

library(lmtest)
bptest(pooling)

## 
##  studentized Breusch-Pagan test
## 
## data:  pooling
## BP = 40.252, df = 4, p-value = 3.838e-08

Other Statistical Tests

# LM test for random effects versus OLS
plmtest(pooling)

## 
##  Lagrange Multiplier Test - (Honda) for balanced panels
## 
## data:  formula
## normal = 72.056, p-value < 2.2e-16
## alternative hypothesis: significant effects

# F test for fixed effects versus OLS
pFtest(fixed, pooling)

## 
##  F test for individual effects
## 
## data:  formula
## F = 40.239, df1 = 593, df2 = 3567, p-value < 2.2e-16
## alternative hypothesis: significant effects

Hausman test: FE vs. RE
- Can be calculated only for the time-varying regressors.
- Significant: use the fixed effects.
- Insignificant: use the random effects.

phtest(random, fixed)

## 
##  Hausman Test
## 
## data:  formula
## chisq = 6191.4, df = 3, p-value < 2.2e-16
## alternative hypothesis: one model is inconsistent
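For reference, the Hausman statistic in its standard textbook form is shown below; the symbols ($\hat\beta_{FE}$, $\hat\beta_{RE}$, $\hat V$) are notation assumed here. It compares only the coefficients that both estimators identify, which is why the test above has df = 3, matching the three time-varying regressors exp, exp2 and wks.

```latex
\[
H = (\hat\beta_{FE} - \hat\beta_{RE})'
    \left[\hat V(\hat\beta_{FE}) - \hat V(\hat\beta_{RE})\right]^{-1}
    (\hat\beta_{FE} - \hat\beta_{RE})
    \;\xrightarrow{d}\; \chi^2(k),
\qquad k = \text{number of time-varying regressors}.
\]
```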

4. Explanation

Comparing Estimators

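Whichever estimator is used, more experience and more education are associated with higher wages
For each model:
- [Pooled OLS] Across individuals and time, an additional year of work experience is associated with about 4% higher wages
- [Between] Comparing individuals, a person with one more year of average experience has average wages about 3% higher
- [Within] For a given individual, each year of experience above that individual's own average is associated with about 11% higher wages
- [First differences] From one year to the next, each additional year of experience is associated with about 11% higher wages
- [Random] Each additional year of experience is associated with about 8% higher wages, based on a weighted mix of between and within variation
Because the Hausman test shows that the FE and RE coefficients differ significantly, we choose the FE model
Rho is the share of variance due to the individual-specific term; here a very high share (FE: 98%, RE: 81%) is explained by the individual-specific term, and the remainder is due to idiosyncratic error
Lambda is 82%, so the RE estimates are closer to the within estimates than to the pooled estimates
- FE removes all individual effects, so its R2 is larger
The R-squared values show that the between estimator explains 32% of the between variation, while the FE and RE estimators explain 66% and 63% of the within variation, respectively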
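The reported Lambda can be checked against the reported RE Rho with a few lines of arithmetic. This sketch assumes a balanced panel with T = 7, a value inferred from the F-test degrees of freedom above (df1 = 593 implies N = 595, and df2 = 3567 then implies NT = 4165); the variable names are hypothetical.

```r
rho       <- 0.81  # RE share of variance due to the individual effect (from the text)
n_periods <- 7     # assumption: T inferred from the F-test degrees of freedom

# Normalise total variance to 1, so the two components are rho and 1 - rho.
sigma2_alpha <- rho
sigma2_eps   <- 1 - rho

# Standard balanced-panel formula for the quasi-demeaning weight.
lambda <- 1 - sqrt(sigma2_eps / (sigma2_eps + n_periods * sigma2_alpha))
print(round(lambda, 2))  # about 0.82
```

Under these assumptions the result is consistent with the Lambda of 82% quoted above.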

Econometrics in R, Week 3: Panel_Data_Model

唐思琪

2020-02-26
