      firm            year          sector           emp
 Min.   :  1.0   Min.   :1976   Min.   :1.000   Min.   :  0.104
 1st Qu.: 37.0   1st Qu.:1978   1st Qu.:3.000   1st Qu.:  1.181
 Median : 74.0   Median :1980   Median :5.000   Median :  2.287
 Mean   : 73.2   Mean   :1980   Mean   :5.123   Mean   :  7.892
 3rd Qu.:110.0   3rd Qu.:1981   3rd Qu.:8.000   3rd Qu.:  7.020
 Max.   :140.0   Max.   :1984   Max.   :9.000   Max.   :108.562
      wage           capital            output
 Min.   : 8.017   Min.   : 0.0119   Min.   : 86.9
 1st Qu.:20.636   1st Qu.: 0.2210   1st Qu.: 97.1
 Median :24.006   Median : 0.5180   Median :100.6
 Mean   :23.919   Mean   : 2.5074   Mean   :103.8
 3rd Qu.:27.494   3rd Qu.: 1.5010   3rd Qu.:110.6
 Max.   :45.232   Max.   :47.1079   Max.   :128.4
Here, we converted the data set to a panel data frame indexed by firm and year, and checked whether it is balanced. It is not balanced: not every firm is observed in every year.
We also checked for missing values; there were none.
empl_data <- pdata.frame(EmplUK, index = c("firm", "year"))
is.pbalanced(empl_data)
[1] FALSE
sapply(empl_data, function(x) sum(is.na(x)))
firm year sector emp wage capital output
0 0 0 0 0 0 0
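is.pbalanced() only answers yes or no. The following is a minimal sketch for seeing how unbalanced the panel is, assuming EmplUK is loaded from the plm package as in the rest of this report; obs_per_firm is an illustrative name, not part of the original analysis.

```r
library(plm)

data("EmplUK", package = "plm")
empl_data <- pdata.frame(EmplUK, index = c("firm", "year"))

# Panel dimensions: number of firms (n), periods per firm (T), total rows (N)
pdim(empl_data)

# Number of yearly observations for each firm
obs_per_firm <- table(EmplUK$firm)

# How many firms are observed for each number of years
table(obs_per_firm)
```

The second table shows how many firms contribute each number of yearly observations, which is what the "T = 7-9" in the model summaries refers to.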
Pooled OLS Model
pooled_model <- plm(emp ~ wage + capital + output, data = empl_data, model = "pooling")
summary(pooled_model)
Pooling Model
Call:
plm(formula = emp ~ wage + capital + output, data = empl_data,
model = "pooling")
Unbalanced Panel: n = 140, T = 7-9, N = 1031
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-79.77766 -2.54599 -0.99936 1.19158 47.18957
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
(Intercept) 8.252019 3.108688 2.6545 0.008065 **
wage -0.324252 0.048761 -6.6498 4.761e-11 ***
capital 2.105611 0.044085 47.7623 < 2.2e-16 ***
output 0.020382 0.027720 0.7353 0.462337
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 261540
Residual Sum of Squares: 80229
R-Squared: 0.69324
Adj. R-Squared: 0.69235
F-statistic: 773.642 on 3 and 1027 DF, p-value: < 2.22e-16
Here, we ran the pooled OLS model. As discussed in the lecture, this model does not account for firm-specific or time-specific differences: it ignores the panel structure of the data and treats every observation as independent.
The intercept has no meaningful interpretation here, because it makes no sense for a firm to have zero wages, zero capital and zero output.
Variable   Coefficient   p-value
Wage       -0.32         <0.001
Capital    +2.11         <0.001
Output     +0.02         0.46
Wage has a significant negative effect on employment: a 1-unit increase in wage is associated with a 0.32-unit decrease in employment on average (across all firms and years).
Capital has a strong positive effect on employment: each 1-unit increase in capital is associated with a 2.11-unit increase in employment.
Output is not statistically significant (p = 0.46).
R² = 0.69: the model explains about 69% of the variation in employment.
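As a quick check, the R² and adjusted R² can be recomputed by hand from the sums of squares printed in the pooled summary. This is a sketch using the rounded values from the output above; the variable names are illustrative.

```r
# Values copied from the pooled OLS summary
tss   <- 261540   # Total Sum of Squares
rss   <- 80229    # Residual Sum of Squares
n_obs <- 1031     # total observations (N)
k     <- 3        # regressors: wage, capital, output

# Share of the variation in employment explained by the model
r2 <- 1 - rss / tss

# Adjusted R-squared penalises the number of regressors
adj_r2 <- 1 - (1 - r2) * (n_obs - 1) / (n_obs - k - 1)

round(c(r2 = r2, adj_r2 = adj_r2), 5)
```

Both values agree with the reported 0.69324 and 0.69235 up to rounding of the printed sums of squares.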
However, pooled OLS does not separate between-firm variation from within-firm variation over time, and it cannot control for time-invariant factors (e.g. sector, firm culture) that might affect employment. If such factors are correlated with the regressors, the estimates are biased, so these results should be treated as "naive".
FE1: Least Squares Dummy Variable - LSDV
To control for unobserved, time-invariant characteristics of each firm, we used LSDV.
Because we did not specify which firm should be the reference (base) category, R automatically used Firm 1. This means that the coefficient on each firm dummy shows how that firm's average employment differs from Firm 1, after accounting for wage, capital and output.
The firm dummies (one for every firm except the reference, Firm 1) "soak up" all between-firm variation, so that the slope coefficients reflect only within-firm changes. This is how unobserved, time-invariant factors are controlled for.
lsdv_model <- lm(emp ~ wage + capital + output + factor(firm), data = empl_data)
summary(lsdv_model)
These coefficients now need to be interpreted within the same firm over time. So,
Wage: a 1-unit increase in wage is associated with a 0.10-unit decrease in employment, within the same firm over time.
Capital: a 1-unit increase in capital is associated with a 0.75-unit increase in employment, within the same firm over time.
R² = 0.985: almost all of the variation in employment is explained by the model.
This comes with caveats. The high R² largely reflects the firm dummies rather than wage, capital and output, so it overstates the explanatory power of the regressors. Time-invariant variables are absorbed by the dummies and cannot be estimated separately. LSDV also becomes inefficient when the number of units (firms) is large, because one dummy is estimated per firm.
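One way to see why firm-specific intercepts isolate within-firm variation is to write the model out. The notation below is a sketch and does not come from the report's output: $\alpha_i$ denotes the firm-specific intercept, bars denote a firm's average over its observed years, and $\varepsilon_{it}$ is the idiosyncratic error.

```latex
\begin{align}
\text{emp}_{it} &= \alpha_i + \beta_1\,\text{wage}_{it} + \beta_2\,\text{capital}_{it}
                  + \beta_3\,\text{output}_{it} + \varepsilon_{it} \\
\overline{\text{emp}}_{i} &= \alpha_i + \beta_1\,\overline{\text{wage}}_{i}
                  + \beta_2\,\overline{\text{capital}}_{i}
                  + \beta_3\,\overline{\text{output}}_{i} + \bar{\varepsilon}_{i} \\
\text{emp}_{it} - \overline{\text{emp}}_{i} &=
      \beta_1\,(\text{wage}_{it} - \overline{\text{wage}}_{i})
    + \beta_2\,(\text{capital}_{it} - \overline{\text{capital}}_{i})
    + \beta_3\,(\text{output}_{it} - \overline{\text{output}}_{i})
    + (\varepsilon_{it} - \bar{\varepsilon}_{i})
\end{align}
```

Subtracting the second line from the first removes $\alpha_i$, and with it anything that is constant within a firm. LSDV estimates each $\alpha_i$ explicitly through the dummies, while the demeaned estimator removes it by subtraction; both yield the same slope estimates.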
FE2: Demeaned Fixed Effects
fixed_model <- plm(emp ~ wage + capital + output, data = empl_data, model = "within")
summary(fixed_model)
Oneway (individual) effect Within Model
Call:
plm(formula = emp ~ wage + capital + output, data = empl_data,
model = "within")
Unbalanced Panel: n = 140, T = 7-9, N = 1031
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-16.6007954 -0.3803183 -0.0028994 0.4071307 26.4876927
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
wage -0.1016412 0.0321637 -3.1601 0.00163 **
capital 0.7511302 0.0623233 12.0522 < 2.2e-16 ***
output 0.0588070 0.0074657 7.8770 9.772e-15 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 5030.6
Residual Sum of Squares: 3933
R-Squared: 0.21819
Adj. R-Squared: 0.093166
F-statistic: 82.6066 on 3 and 888 DF, p-value: < 2.22e-16
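The wage and capital slopes here match the LSDV values reported earlier (-0.10 and 0.75). The sketch below checks this equivalence directly and also reproduces the within estimator by demeaning each variable by firm. It assumes EmplUK from the plm package; demean(), dm and manual_model are illustrative names that are not part of the original analysis.

```r
library(plm)

data("EmplUK", package = "plm")
empl_data <- pdata.frame(EmplUK, index = c("firm", "year"))

# LSDV: explicit firm dummies via lm()
lsdv_model <- lm(emp ~ wage + capital + output + factor(firm), data = EmplUK)

# Within (demeaned) estimator from plm
fixed_model <- plm(emp ~ wage + capital + output, data = empl_data, model = "within")

# Manual demeaning: subtract each firm's own mean from every variable
demean <- function(x, g) x - ave(x, g)
dm <- data.frame(
  emp     = demean(EmplUK$emp,     EmplUK$firm),
  wage    = demean(EmplUK$wage,    EmplUK$firm),
  capital = demean(EmplUK$capital, EmplUK$firm),
  output  = demean(EmplUK$output,  EmplUK$firm)
)

# No intercept: demeaned variables have mean zero within every firm
manual_model <- lm(emp ~ wage + capital + output - 1, data = dm)

# Compare the three sets of slope estimates side by side
slopes <- c("wage", "capital", "output")
cbind(
  lsdv   = coef(lsdv_model)[slopes],
  within = coef(fixed_model)[slopes],
  manual = coef(manual_model)[slopes]
)
```

The three columns should agree up to numerical precision. Note that the standard errors from manual_model are not directly usable, because lm() does not know that 140 firm means were estimated and so uses the wrong degrees of freedom.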
RE: Random Effects Model
random_model <- plm(emp ~ wage + capital + output, data = empl_data, model = "random")
summary(random_model)
Oneway (individual) effect Random Effect Model
(Swamy-Arora's transformation)
Call:
plm(formula = emp ~ wage + capital + output, data = empl_data,
model = "random")
Unbalanced Panel: n = 140, T = 7-9, N = 1031
Effects:
var std.dev share
idiosyncratic 4.429 2.105 0.057
individual 73.212 8.556 0.943
theta:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.9074 0.9074 0.9074 0.9098 0.9134 0.9183
Residuals:
Min. 1st Qu. Median Mean 3rd Qu. Max.
-14.0927 -0.6671 -0.2820 -0.0106 0.2028 29.0515
Coefficients:
Estimate Std. Error z-value Pr(>|z|)
(Intercept) 2.7412853 1.4432778 1.8993 0.0575189 .
wage -0.1207583 0.0330594 -3.6528 0.0002594 ***
capital 1.0783753 0.0582626 18.5089 < 2.2e-16 ***
output 0.0537293 0.0078498 6.8447 7.666e-12 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 7228.7
Residual Sum of Squares: 5061.7
R-Squared: 0.2998
Adj. R-Squared: 0.29775
Chisq: 435.933 on 3 DF, p-value: < 2.22e-16
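The "Effects" and "theta" parts of this output can be recomputed from the two variance components. The sketch below uses the standard random-effects quasi-demeaning weight, theta_i = 1 - sqrt(sigma2_idios / (T_i * sigma2_indiv + sigma2_idios)), with the rounded variances printed above; the variable names are illustrative.

```r
# Variance components copied from the "Effects" block of the summary
sigma2_idios <- 4.429    # idiosyncratic variance
sigma2_indiv <- 73.212   # individual (firm) variance

# Firms are observed for 7, 8 or 9 years, so theta varies by firm
T_i <- c(7, 8, 9)

# Quasi-demeaning weight: 0 would be pooled OLS, 1 would be the within estimator
theta <- 1 - sqrt(sigma2_idios / (T_i * sigma2_indiv + sigma2_idios))
round(theta, 4)

# Share of the error variance that comes from firm-level differences
share_indiv <- sigma2_indiv / (sigma2_indiv + sigma2_idios)
round(share_indiv, 3)
```

The computed thetas line up with the Min., 3rd Qu. and Max. of the reported theta (0.9074, 0.9134, 0.9183). Because theta is close to 1, the random effects estimates sit much nearer to the fixed effects estimates than to the pooled OLS ones, which is consistent with the coefficients shown in the three summaries.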
Hausman Test
phtest(fixed_model, random_model)
Hausman Test
data: emp ~ wage + capital + output
chisq = 217.52, df = 3, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent
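The Hausman test compares the fixed and random effects estimates. Under its null hypothesis both are consistent and random effects is preferred for efficiency; a small p-value suggests that random effects is inconsistent. The sketch below turns the test result into a decision, assuming the conventional 5% significance level (an assumption, not stated in the report); hausman, alpha and preferred are illustrative names.

```r
library(plm)

data("EmplUK", package = "plm")
empl_data <- pdata.frame(EmplUK, index = c("firm", "year"))

fixed_model  <- plm(emp ~ wage + capital + output, data = empl_data, model = "within")
random_model <- plm(emp ~ wage + capital + output, data = empl_data, model = "random")

# phtest() returns an "htest" object with $statistic, $parameter and $p.value
hausman <- phtest(fixed_model, random_model)

# Assumed significance level (conventional choice, not from the report)
alpha <- 0.05

# Rejecting the null points to fixed effects; otherwise random effects is kept
preferred <- if (hausman$p.value < alpha) "fixed effects" else "random effects"
preferred
```

With the p-value reported above (< 2.2e-16), the null is rejected and the fixed effects model is the one to rely on.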
Summary Table
library(stargazer)
library(lmtest)