Primeramente, cargamos las principales bibliotecas que vamos a emplear (en las más relevantes concretamos cual es su función):
library(reshape2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(car) # regression analysis
## Loading required package: carData
##
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
##
## recode
library(plotly)
## Loading required package: ggplot2
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
library(ggplot2)
library(lmtest) # lm Breusch-Godfrey test for serial correlation
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
library(ecm) # to calculate Durbin's h-statistic
library(orcutt) # cochrane-orcutt procedure in Case of First Order Autocorrelation
library(dynlm) # dynamic models
library(AER) # ivreg
## Loading required package: sandwich
## Loading required package: survival
library(stargazer) # regression table
##
## Please cite as:
## Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
knitr::opts_chunk$set(echo = TRUE, warning = FALSE)
Ahora tenemos que importar a RStudio los datos que previamente hemos descargado de las Penn World Table con los que vamos a trabajar, en nuestro caso, son datos para poder formar la función de producción Cobb-Douglas de Bélgica. Una vez descargados, vemos las primeras 10 observaciones:
data = read.csv("~/Desktop/Scripts de R/Datos/FebPwtExport2232023 19.41.35.csv")
head(data, 10)
Los remodelamos al formato ancho para poder trabajar cómodamente:
names(data)
## [1] "VariableCode" "RegionCode" "YearCode" "AggValue"
data_wide <- dcast(data, RegionCode + YearCode ~ VariableCode, value.var="AggValue")
head(data_wide, 10)
amos a considerar las siguientes variables de manera que podamos especificar la función de producción de la economía belga:
\(Y_t\) = Nivel de producción (variable GDP = rgdpna).
\(X_{2t}\) = Factor de producción trabajo (variable L = avh*emp).
\(X_{3t}\) = Factor de producción capital físico (variable K = rnna).
names(data_wide)
## [1] "RegionCode" "YearCode" "avh" "emp" "rgdpna"
## [6] "rnna"
data_wide <- data_wide %>%
dplyr::mutate(l=avh*emp) %>%
dplyr::rename(k = rnna,
pib = rgdpna,
Year = YearCode)
Tomamos una submuestra del conjunto de datos (del año 1960 en adelante):
data_sample <- data_wide %>%
dplyr::filter(Year >= 1960)
data_ts = ts(data_sample, frequency=1, start=c(1960,1))
plot.ts(data_ts)
Mantenemos las gráficas de las variables que vamos a emplear a la hora de estimar los modelos, de manera que podamos observar de manera visual el comportamiento de estas variables de la economía belga a lo largo del tiempo:
plot.ts(data_ts[,-c(1,2,3,4)])
Podemos representar también una única variable, en este caso, el PIB:
plot.ts(data_ts[,c("pib")])
Primero vamos a escribir las ecuaciones de los modelos que posteriormente estimaremos: \[ln(Y_t) = \beta_1 + \beta_2 \ln(X_{2t}) + \beta_3 \ln(X_{3t})u_t\] \[ln(Y_t) = \beta_1 + \beta_2 \ln(X_{2t}) + \beta_3 \ln(X_{2t-1}) + \beta_4 \ln(X_{3t}) + \beta_5 \ln(X_{3t-1}) + \beta_6 \ln(Y_{t-1}) + u_t\] \[ln(Y_t) = \beta_1 + \beta_2 \ln(X_{2t}) + \beta_3 \ln(X_{3t}) + \beta_4 \ln(Y_{t-1}) + u_t\]
model_1 = lm( log(pib) ~ log(l) + log(k), data = data_ts)
summary(model_1)
##
## Call:
## lm(formula = log(pib) ~ log(l) + log(k), data = data_ts)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.08205 -0.01986 -0.00249 0.01849 0.07267
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.484115 0.501768 4.951 6.92e-06 ***
## log(l) -0.305371 0.056905 -5.366 1.53e-06 ***
## log(k) 0.900725 0.008164 110.323 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02985 on 57 degrees of freedom
## Multiple R-squared: 0.9954, Adjusted R-squared: 0.9952
## F-statistic: 6111 on 2 and 57 DF, p-value: < 2.2e-16
cat("AIC:", AIC(model_1),"\n")
## AIC: -246.2069
model_2 = lm(log(pib) ~ log(l) + lag(log(l),1) + log(k) + lag(log(k),1) + lag(log(pib),1), data = data_ts)
summary(model_2)
##
## Call:
## lm(formula = log(pib) ~ log(l) + lag(log(l), 1) + log(k) + lag(log(k),
## 1) + lag(log(pib), 1), data = data_ts)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.032034 -0.004551 -0.000032 0.004536 0.023390
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.64227 0.26205 2.451 0.017582 *
## log(l) 0.46766 0.13582 3.443 0.001131 **
## lag(log(l), 1) -0.52845 0.12660 -4.174 0.000112 ***
## log(k) 1.31477 0.32596 4.034 0.000177 ***
## lag(log(k), 1) -1.21004 0.29508 -4.101 0.000142 ***
## lag(log(pib), 1) 0.87238 0.06458 13.508 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.01146 on 53 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.9993, Adjusted R-squared: 0.9992
## F-statistic: 1.53e+04 on 5 and 53 DF, p-value: < 2.2e-16
model_3 = lm( log(pib) ~ log(l) + log(k)+ lag(log(pib),1), data = data_ts)
summary(model_3)
##
## Call:
## lm(formula = log(pib) ~ log(l) + log(k) + lag(log(pib), 1), data = data_ts)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.039348 -0.008048 -0.000502 0.012191 0.027420
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.13169 0.32539 0.405 0.687
## log(l) 0.01958 0.03969 0.493 0.624
## log(k) 0.09136 0.07021 1.301 0.199
## lag(log(pib), 1) 0.87458 0.07634 11.457 3.42e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.01526 on 55 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.9987, Adjusted R-squared: 0.9987
## F-statistic: 1.437e+04 on 3 and 55 DF, p-value: < 2.2e-16
cat("AIC:", AIC(model_3),"\n")
## AIC: -320.2503
Para ello, realice:
fig <- plot_ly(x = ~ lag(resid(model_1)), y = ~ resid(model_1))
fig
## No trace type specified:
## Based on info supplied, a 'scatter' trace seems appropriate.
## Read more about this trace type -> https://plotly.com/r/reference/#scatter
## No scatter mode specifed:
## Setting the mode to markers
## Read more about this attribute -> https://plotly.com/r/reference/#scatter-mode