1 + 1[1] 2
Quarto enables you to weave together content and executable code into a finished document. To learn more about Quarto see https://quarto.org.
When you click the Render button a document will be generated that includes both content and the output of embedded code. You can embed code like this:
1 + 1[1] 2
You can add options to executable code like this
[1] 4
The echo: false option disables the printing of code (only output is displayed).
# Load the AER package
library(AER)Loading required package: car
Loading required package: carData
Loading required package: lmtest
Loading required package: zoo
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
Loading required package: sandwich
Loading required package: survival
# Load the Grunfeld dataset
data("Grunfeld")
# View the structure of the dataset
str(Grunfeld)'data.frame': 220 obs. of 5 variables:
$ invest : num 318 392 411 258 331 ...
$ value : num 3078 4662 5387 2792 4313 ...
$ capital: num 2.8 52.6 156.9 209.2 203.4 ...
$ firm : Factor w/ 11 levels "General Motors",..: 1 1 1 1 1 1 1 1 1 1 ...
$ year : int 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 ...
# Check if the data is balanced
table(Grunfeld$firm, Grunfeld$year)
1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946
General Motors 1 1 1 1 1 1 1 1 1 1 1 1
US Steel 1 1 1 1 1 1 1 1 1 1 1 1
General Electric 1 1 1 1 1 1 1 1 1 1 1 1
Chrysler 1 1 1 1 1 1 1 1 1 1 1 1
Atlantic Refining 1 1 1 1 1 1 1 1 1 1 1 1
IBM 1 1 1 1 1 1 1 1 1 1 1 1
Union Oil 1 1 1 1 1 1 1 1 1 1 1 1
Westinghouse 1 1 1 1 1 1 1 1 1 1 1 1
Goodyear 1 1 1 1 1 1 1 1 1 1 1 1
Diamond Match 1 1 1 1 1 1 1 1 1 1 1 1
American Steel 1 1 1 1 1 1 1 1 1 1 1 1
1947 1948 1949 1950 1951 1952 1953 1954
General Motors 1 1 1 1 1 1 1 1
US Steel 1 1 1 1 1 1 1 1
General Electric 1 1 1 1 1 1 1 1
Chrysler 1 1 1 1 1 1 1 1
Atlantic Refining 1 1 1 1 1 1 1 1
IBM 1 1 1 1 1 1 1 1
Union Oil 1 1 1 1 1 1 1 1
Westinghouse 1 1 1 1 1 1 1 1
Goodyear 1 1 1 1 1 1 1 1
Diamond Match 1 1 1 1 1 1 1 1
American Steel 1 1 1 1 1 1 1 1
# Summary of the number of observations for each firm
obs_per_firm <- table(Grunfeld$firm)
# Check if all firms have the same number of observations
obs_per_firm
General Motors US Steel General Electric Chrysler
20 20 20 20
Atlantic Refining IBM Union Oil Westinghouse
20 20 20 20
Goodyear Diamond Match American Steel
20 20 20
If each firm has the same number of observations over the entire period, then all entities (firms) are observed for all time periods meaning the panel dataset is balanced. Since all firms have exactly 20 observations (one for each year from 1935 to 1954), the dataset is balanced. Time Component: represented by the year variable which ranges from 1935 to 1954. Entity Component: firm variable, which represents 10 U.S. firms (labeled as 1, 2, 3,…, 10). \[ invest_{it} = \beta_0 + \beta_1 value_{it} + \beta_2 capital_{it} + \epsilon_{it} \]
# Estimate the OLS regression model
ols_model <- lm(invest ~ value + capital, data = Grunfeld)
# Summary of the model
summary(ols_model)
Call:
lm(formula = invest ~ value + capital, data = Grunfeld)
Residuals:
Min 1Q Median 3Q Max
-290.33 -25.76 11.06 29.74 377.94
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -38.410054 8.413371 -4.565 8.35e-06 ***
value 0.114534 0.005519 20.753 < 2e-16 ***
capital 0.227514 0.024228 9.390 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 90.28 on 217 degrees of freedom
Multiple R-squared: 0.8179, Adjusted R-squared: 0.8162
F-statistic: 487.3 on 2 and 217 DF, p-value: < 2.2e-16
Both the coefficients for the value variable and capital variable are statistically significant at the alpha level, indicating that both factors are related to the outcome variable investment. The estimated coefficients, 0.114534 for value and 0.227514 for capital are not big so the economic magnitude of these two factors are relatively small.
There could be omitted variable bias (OVB). If there are relevant factors that affect investment but are not included in the model, estimates of the coefficients will be biased and inconsistent. It is possible that firm-specific characteristics or time-specific effects (e.g., macroeconomic conditions) that influence investment are omitted. \[ invest_{it} = \beta_0 + \beta_1 value_{it} + \beta_2 capital_{it} + \alpha_i + \epsilon_{it} \] Three Different Ways to Estimate the Fixed Effects Model in R: 1. Using the plm package (within estimator): This method explicitly removes the firm-specific effects by transforming the data using the “within” transformation (demeaning). 2. Using the lm() function with dummy variables: This method creates dummy variables for each firm and includes them in the regression to control for firm-specific fixed effects. 3. Manually demean the data (within transformation): This method manually removes the firm-specific means from the data to estimate the fixed effects.
# 1. plm (within estimator)
# Load the 'plm' package for panel data analysis
library(plm)
# Convert the Grunfeld dataset to a panel data frame
grunfeld_panel <- pdata.frame(Grunfeld, index = c("firm", "year"))
# Estimate the fixed effects model using the 'within' method
fe_model_plm <- plm(invest ~ value + capital, data = grunfeld_panel, model = "within")
# Summary of the fixed effects model
summary(fe_model_plm)Oneway (individual) effect Within Model
Call:
plm(formula = invest ~ value + capital, data = grunfeld_panel,
model = "within")
Balanced Panel: n = 11, T = 20, N = 220
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-184.00792 -15.66024 0.27161 16.41421 250.75337
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
value 0.11013 0.01130 9.7461 < 2.2e-16 ***
capital 0.31003 0.01654 18.7439 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 2244500
Residual Sum of Squares: 523720
R-Squared: 0.76667
Adj. R-Squared: 0.75314
F-statistic: 340.079 on 2 and 207 DF, p-value: < 2.22e-16
# 2. lm() with Dummy Variables for Firms
# Use the 'lm()' function and add dummy variables for firms
fe_model_lm <- lm(invest ~ value + capital + factor(firm), data = Grunfeld)
# Summary of the fixed effects model
summary(fe_model_lm)
Call:
lm(formula = invest ~ value + capital + factor(firm), data = Grunfeld)
Residuals:
Min 1Q Median 3Q Max
-184.008 -15.660 0.272 16.414 250.753
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -70.29907 47.37535 -1.484 0.139
value 0.11013 0.01130 9.746 < 2e-16 ***
capital 0.31003 0.01654 18.744 < 2e-16 ***
factor(firm)US Steel 172.20381 29.70010 5.798 2.48e-08 ***
factor(firm)General Electric -165.27033 30.28530 -5.457 1.38e-07 ***
factor(firm)Chrysler 42.48996 41.84991 1.015 0.311
factor(firm)Atlantic Refining -44.30345 48.12221 -0.921 0.358
factor(firm)IBM 47.13887 44.61445 1.057 0.292
factor(firm)Union Oil 3.75484 48.19182 0.078 0.938
factor(firm)Westinghouse 12.75258 41.98604 0.304 0.762
factor(firm)Goodyear -16.91548 46.17941 -0.366 0.715
factor(firm)Diamond Match 63.73104 47.96886 1.329 0.185
factor(firm)American Steel 49.72087 48.28006 1.030 0.304
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 50.3 on 207 degrees of freedom
Multiple R-squared: 0.9461, Adjusted R-squared: 0.9429
F-statistic: 302.6 on 12 and 207 DF, p-value: < 2.2e-16
# 3. Manually Demean the Data (Within Transformation)
# First, calculate the firm means for each variable
firm_means <- aggregate(cbind(invest, value, capital) ~ firm, data = Grunfeld, mean)
# Merge the firm means with the original dataset
Grunfeld <- merge(Grunfeld, firm_means, by = "firm", suffixes = c("", "_mean"))
# Manually demean the variables by subtracting firm means
Grunfeld$invest_demeaned <- Grunfeld$invest - Grunfeld$invest_mean
Grunfeld$value_demeaned <- Grunfeld$value - Grunfeld$value_mean
Grunfeld$capital_demeaned <- Grunfeld$capital - Grunfeld$capital_mean
# Estimate the fixed effects model using the demeaned data
fe_model_manual <- lm(invest_demeaned ~ value_demeaned + capital_demeaned - 1, data = Grunfeld)
# Summary of the fixed effects model
summary(fe_model_manual)
Call:
lm(formula = invest_demeaned ~ value_demeaned + capital_demeaned -
1, data = Grunfeld)
Residuals:
Min 1Q Median 3Q Max
-184.008 -15.660 0.272 16.414 250.753
Coefficients:
Estimate Std. Error t value Pr(>|t|)
value_demeaned 0.11013 0.01101 10.00 <2e-16 ***
capital_demeaned 0.31003 0.01612 19.24 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 49.01 on 218 degrees of freedom
Multiple R-squared: 0.7667, Adjusted R-squared: 0.7645
F-statistic: 358.2 on 2 and 218 DF, p-value: < 2.2e-16
The estimated coefficients, 0.11013 for value and 0.31003 for capital, changed for the fixed effect models, and all of the FE models give the same coefficient estimates for the variables value and capital. 1. The plm() function automatically transforms the data by removing the firm-specific means and uses the within estimator to control for fixed effects. 2. The lm() function creates a dummy variable for each firm using the factor(firm) term and estimates fixed effects by including firm dummies. 3. The de-mean method manually removes the firm-specific means from the variables (i.e., it demeans the data) and then runs an OLS regression on the demeaned data. The -1 in the formula removes the intercept since the data has already been demeaned. The fixed effect model run above using the three methods only control for time-invariant characteristics of the entity (such as firm-specific factors). These are characteristics that do not change over time but differ across entities (firms), such as: industry type, geographical location, and other structural or permanent differences. \[ invest_{it} = \beta_0 + \beta_1 value_{it} + \beta_2 capital_{it} + \alpha_i + \lambda_t + \epsilon_{it} \] Now, run another fixed effect models using the three methods again to include both entity fixed effects and time fixed effects. By doing so, we account for both time-invariant characteristics of the entities and time-varying characteristics that affect all entities.
# 1. plm
# Load the 'plm' package if not already loaded
library(plm)
# Convert the Grunfeld dataset to a panel data frame
grunfeld_panel2 <- pdata.frame(Grunfeld, index = c("firm", "year"))
# Estimate the fixed effects model with both entity and time fixed effects
fe_model_plm_time <- plm(invest ~ value + capital + factor(year), data = grunfeld_panel2, model = "within")
# Summary of the fixed effects model
summary(fe_model_plm_time)Oneway (individual) effect Within Model
Call:
plm(formula = invest ~ value + capital + factor(year), data = grunfeld_panel2,
model = "within")
Balanced Panel: n = 11, T = 20, N = 220
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-163.4113 -17.6747 -1.8345 17.9490 217.1070
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
value 0.116681 0.012933 9.0219 < 2.2e-16 ***
capital 0.351436 0.021049 16.6964 < 2.2e-16 ***
factor(year)1936 -16.959225 21.517883 -0.7881 0.4316042
factor(year)1937 -36.375640 22.363809 -1.6265 0.1055097
factor(year)1938 -35.623722 21.162286 -1.6834 0.0939654 .
factor(year)1939 -63.099393 21.505219 -2.9341 0.0037613 **
factor(year)1940 -39.824767 21.626070 -1.8415 0.0671215 .
factor(year)1941 -16.487765 21.528944 -0.7658 0.4447309
factor(year)1942 -17.999327 21.274589 -0.8460 0.3986014
factor(year)1943 -37.772443 21.414676 -1.7639 0.0793802 .
factor(year)1944 -38.320061 21.459336 -1.7857 0.0757588 .
factor(year)1945 -49.539482 21.687469 -2.2842 0.0234746 *
factor(year)1946 -27.754389 21.865751 -1.2693 0.2059001
factor(year)1947 -34.877538 21.588537 -1.6156 0.1078668
factor(year)1948 -38.330726 21.733580 -1.7637 0.0794130 .
factor(year)1949 -65.200751 21.900872 -2.9771 0.0032931 **
factor(year)1950 -67.387721 22.027641 -3.0592 0.0025432 **
factor(year)1951 -54.834631 22.436806 -2.4440 0.0154498 *
factor(year)1952 -56.489038 22.819259 -2.4755 0.0141899 *
factor(year)1953 -58.512579 23.819421 -2.4565 0.0149371 *
factor(year)1954 -81.793910 24.204108 -3.3793 0.0008834 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 2244500
Residual Sum of Squares: 459400
R-Squared: 0.79533
Adj. R-Squared: 0.76158
F-statistic: 34.7874 on 21 and 188 DF, p-value: < 2.22e-16
# 2. lm
# Estimate the fixed effects model using lm() with dummies for firms and years
fe_model_lm_time <- lm(invest ~ value + capital + factor(firm) + factor(year), data = Grunfeld)
# Summary of the fixed effects model
summary(fe_model_lm_time)
Call:
lm(formula = invest ~ value + capital + factor(firm) + factor(year),
data = Grunfeld)
Residuals:
Min 1Q Median 3Q Max
-163.411 -17.675 -1.835 17.949 217.107
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -83.68199 52.83502 -1.584 0.114912
value 0.11668 0.01293 9.022 < 2e-16 ***
capital 0.35144 0.02105 16.696 < 2e-16 ***
factor(firm)US Steel 202.31880 33.10138 6.112 5.55e-09 ***
factor(firm)General Electric -139.31536 33.63785 -4.142 5.21e-05 ***
factor(firm)Chrysler 88.17030 47.63465 1.851 0.065743 .
factor(firm)Atlantic Refining -10.73113 54.42347 -0.197 0.843901
factor(firm)IBM 95.31235 50.87165 1.874 0.062539 .
factor(firm)Union Oil 44.97606 54.70390 0.822 0.412020
factor(firm)Westinghouse 60.05315 47.85534 1.255 0.211077
factor(firm)Goodyear 23.80679 52.39144 0.454 0.650063
factor(firm)Diamond Match 118.26245 54.92529 2.153 0.032581 *
factor(firm)American Steel 101.76963 55.17670 1.844 0.066694 .
factor(year)1936 -16.95922 21.51788 -0.788 0.431604
factor(year)1937 -36.37564 22.36381 -1.627 0.105510
factor(year)1938 -35.62372 21.16229 -1.683 0.093965 .
factor(year)1939 -63.09939 21.50522 -2.934 0.003761 **
factor(year)1940 -39.82477 21.62607 -1.842 0.067121 .
factor(year)1941 -16.48777 21.52894 -0.766 0.444731
factor(year)1942 -17.99933 21.27459 -0.846 0.398601
factor(year)1943 -37.77244 21.41468 -1.764 0.079380 .
factor(year)1944 -38.32006 21.45934 -1.786 0.075759 .
factor(year)1945 -49.53948 21.68747 -2.284 0.023475 *
factor(year)1946 -27.75439 21.86575 -1.269 0.205900
factor(year)1947 -34.87754 21.58854 -1.616 0.107867
factor(year)1948 -38.33073 21.73358 -1.764 0.079413 .
factor(year)1949 -65.20075 21.90087 -2.977 0.003293 **
factor(year)1950 -67.38772 22.02764 -3.059 0.002543 **
factor(year)1951 -54.83463 22.43681 -2.444 0.015450 *
factor(year)1952 -56.48904 22.81926 -2.475 0.014190 *
factor(year)1953 -58.51258 23.81942 -2.457 0.014937 *
factor(year)1954 -81.79391 24.20411 -3.379 0.000883 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 49.43 on 188 degrees of freedom
Multiple R-squared: 0.9527, Adjusted R-squared: 0.9449
F-statistic: 122.1 on 31 and 188 DF, p-value: < 2.2e-16
# 3. Manually demean
# First, calculate the firm means for each variable
firm_means <- aggregate(cbind(invest, value, capital) ~ firm, data = Grunfeld, mean)
# Merge the firm means with the original dataset, use different suffixes to avoid duplication
Grunfeld <- merge(Grunfeld, firm_means, by = "firm", suffixes = c("", "_firm_mean"))
# Manually demean the variables by subtracting firm means
Grunfeld$invest_demeaned2 <- Grunfeld$invest - Grunfeld$invest_firm_mean
Grunfeld$value_demeaned2 <- Grunfeld$value - Grunfeld$value_firm_mean
Grunfeld$capital_demeaned2 <- Grunfeld$capital - Grunfeld$capital_firm_mean
# Estimate the fixed effects model using the demeaned data and include time dummies
fe_model_manual_time <- lm(invest_demeaned2 ~ value_demeaned2 + capital_demeaned2 + factor(year) - 1, data = Grunfeld)
# Summary of the fixed effects model
summary(fe_model_manual_time)
Call:
lm(formula = invest_demeaned2 ~ value_demeaned2 + capital_demeaned2 +
factor(year) - 1, data = Grunfeld)
Residuals:
Min 1Q Median 3Q Max
-163.411 -17.675 -1.835 17.949 217.107
Coefficients:
Estimate Std. Error t value Pr(>|t|)
value_demeaned2 0.11668 0.01260 9.259 < 2e-16 ***
capital_demeaned2 0.35144 0.02051 17.135 < 2e-16 ***
factor(year)1935 41.85916 15.33714 2.729 0.00692 **
factor(year)1936 24.89993 15.00703 1.659 0.09866 .
factor(year)1937 5.48352 15.41684 0.356 0.72246
factor(year)1938 6.23543 14.84962 0.420 0.67501
factor(year)1939 -21.24024 14.69853 -1.445 0.15002
factor(year)1940 2.03439 14.72881 0.138 0.89028
factor(year)1941 25.37139 14.64477 1.732 0.08475 .
factor(year)1942 23.85983 14.70399 1.623 0.10625
factor(year)1943 4.08671 14.57883 0.280 0.77953
factor(year)1944 3.53909 14.56985 0.243 0.80833
factor(year)1945 -7.68033 14.59185 -0.526 0.59924
factor(year)1946 14.10477 14.63597 0.964 0.33637
factor(year)1947 6.98162 14.66620 0.476 0.63457
factor(year)1948 3.52843 14.80270 0.238 0.81185
factor(year)1949 -23.34160 14.86583 -1.570 0.11798
factor(year)1950 -25.52857 14.83278 -1.721 0.08680 .
factor(year)1951 -12.97548 14.78247 -0.878 0.38114
factor(year)1952 -14.62988 15.01966 -0.974 0.33122
factor(year)1953 -16.65342 15.68305 -1.062 0.28959
factor(year)1954 -39.93475 16.07066 -2.485 0.01379 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 48.17 on 198 degrees of freedom
Multiple R-squared: 0.7953, Adjusted R-squared: 0.7726
F-statistic: 34.97 on 22 and 198 DF, p-value: < 2.2e-16
The estimated coefficients, 0.11668 for value and 0.35144 for capital, changed for the two-way fixed effect models. The previous FE model controlled for entity fixed effects only by removing the effect of time-invariant firm-specific characteristics. In comparison, the two-way FE model also controls for time fixed effects and captures time-varying shocks or changes that affect all firms similarly. The fixed effects model, now including both entity and time fixed effects, controls for both time-invariant characteristics of the entities (e.g., firm-specific factors) and time-varying characteristics affecting all entities (e.g., macroeconomic conditions or industry trends). Including both effects helps mitigate omitted variable bias from unobserved factors that are constant within firms over time and those that vary over time across all firms.