Introduction

First, we load the main libraries we will use (for the most relevant ones we note their purpose in a comment):

library(reshape2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(car) # regression analysis
## Loading required package: carData
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode
library(plotly)
## Loading required package: ggplot2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(ggplot2)
library(lmtest) # lm Breusch-Godfrey test for serial correlation
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
library(ecm) # to calculate Durbin's h-statistic
library(orcutt) # Cochrane-Orcutt procedure in case of first-order autocorrelation
library(dynlm) # dynamic models
library(AER) # ivreg
## Loading required package: sandwich
## Loading required package: survival
library(stargazer) # regression table
## 
## Please cite as:
##  Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
##  R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
knitr::opts_chunk$set(echo = TRUE, warning = FALSE)

Importing the data

Now we import into RStudio the data we previously downloaded from the Penn World Table. In our case, these are the data needed to build the Cobb-Douglas production function for Belgium. Once imported, we look at the first 10 observations:

data = read.csv("~/Desktop/Scripts de R/Datos/FebPwtExport2232023 19.41.35.csv") 
head(data, 10)

We reshape the data to wide format so we can work with it more comfortably:

names(data)
## [1] "VariableCode" "RegionCode"   "YearCode"     "AggValue"
data_wide <- dcast(data, RegionCode + YearCode ~ VariableCode, value.var="AggValue")
head(data_wide, 10)
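
The same long-to-wide reshape could also be done with tidyr::pivot_wider. This is only an optional alternative sketch: tidyr is not loaded above, so it assumes the package is installed.

library(tidyr)
# Hedged alternative to dcast(): one row per RegionCode/YearCode,
# one column per VariableCode, filled with AggValue
data_wide_alt <- data %>%
  tidyr::pivot_wider(id_cols = c(RegionCode, YearCode),
                     names_from = VariableCode,
                     values_from = AggValue)
head(data_wide_alt, 10)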

We will consider the following variables so that we can specify the production function of the Belgian economy:

\(Y_t\) = Output level (variable GDP = rgdpna).

\(X_{2t}\) = Labour input (variable L = avh*emp).

\(X_{3t}\) = Physical capital input (variable K = rnna).

names(data_wide)
## [1] "RegionCode" "YearCode"   "avh"        "emp"        "rgdpna"    
## [6] "rnna"
data_wide <- data_wide %>%
dplyr::mutate(l=avh*emp) %>%
dplyr::rename(k = rnna,
              pib = rgdpna,
              Year = YearCode)
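
As a quick sanity check (not part of the original script), we can confirm that the new column l = avh*emp was created and that the renames took effect:

# Optional check of the constructed and renamed columns
head(data_wide[, c("Year", "pib", "l", "k")], 5)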

We take a subsample of the data set (from 1960 onwards):

data_sample <- data_wide %>%
dplyr::filter(Year >= 1960)
data_ts = ts(data_sample, frequency=1, start=c(1960,1))
plot.ts(data_ts)

We keep the plots of the variables we will use when estimating the models, so that we can visually inspect how these variables of the Belgian economy behave over time:

plot.ts(data_ts[,-c(1,2,3,4)])

We can also plot a single variable, in this case GDP:

plot.ts(data_ts[,c("pib")])
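
Since ggplot2 is already loaded, the GDP series could equally be drawn from the data frame; this is just an optional alternative to plot.ts:

# Optional ggplot2 alternative: GDP (rgdpna) over time from the data frame
ggplot(data_sample, aes(x = Year, y = pib)) +
  geom_line() +
  labs(x = "Year", y = "pib (rgdpna)")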

Start of the exercise.

First we write down the equations of the models we will later estimate: \[\ln(Y_t) = \beta_1 + \beta_2 \ln(X_{2t}) + \beta_3 \ln(X_{3t}) + u_t\] \[\ln(Y_t) = \beta_1 + \beta_2 \ln(X_{2t}) + \beta_3 \ln(X_{2t-1}) + \beta_4 \ln(X_{3t}) + \beta_5 \ln(X_{3t-1}) + \beta_6 \ln(Y_{t-1}) + u_t\] \[\ln(Y_t) = \beta_1 + \beta_2 \ln(X_{2t}) + \beta_3 \ln(X_{3t}) + \beta_4 \ln(Y_{t-1}) + u_t\]
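
As a reminder of where specification (1) comes from: assuming a Cobb-Douglas technology with a multiplicative error term, taking logarithms yields the linear-in-parameters model estimated below:

\[Y_t = A\, X_{2t}^{\beta_2}\, X_{3t}^{\beta_3}\, e^{u_t} \;\;\Longrightarrow\;\; \ln(Y_t) = \underbrace{\ln(A)}_{\beta_1} + \beta_2 \ln(X_{2t}) + \beta_3 \ln(X_{3t}) + u_t\]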

1. Estimate the following econometric model specifications by OLS.

model_1 = lm( log(pib) ~ log(l) + log(k), data = data_ts)
summary(model_1)
## 
## Call:
## lm(formula = log(pib) ~ log(l) + log(k), data = data_ts)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.08205 -0.01986 -0.00249  0.01849  0.07267 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.484115   0.501768   4.951 6.92e-06 ***
## log(l)      -0.305371   0.056905  -5.366 1.53e-06 ***
## log(k)       0.900725   0.008164 110.323  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02985 on 57 degrees of freedom
## Multiple R-squared:  0.9954, Adjusted R-squared:  0.9952 
## F-statistic:  6111 on 2 and 57 DF,  p-value: < 2.2e-16
cat("AIC:", AIC(model_1),"\n")
## AIC: -246.2069
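In a Cobb-Douglas specification the slope coefficients are output elasticities, so their sum gives a rough indication of returns to scale. This is an optional check, not part of the original script:

# Optional: sum of the estimated elasticities of labour and capital
sum(coef(model_1)[c("log(l)", "log(k)")])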
model_2 = lm(log(pib) ~ log(l) + lag(log(l),1) + log(k) + lag(log(k),1) + lag(log(pib),1), data = data_ts)
summary(model_2)
## 
## Call:
## lm(formula = log(pib) ~ log(l) + lag(log(l), 1) + log(k) + lag(log(k), 
##     1) + lag(log(pib), 1), data = data_ts)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.032034 -0.004551 -0.000032  0.004536  0.023390 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       0.64227    0.26205   2.451 0.017582 *  
## log(l)            0.46766    0.13582   3.443 0.001131 ** 
## lag(log(l), 1)   -0.52845    0.12660  -4.174 0.000112 ***
## log(k)            1.31477    0.32596   4.034 0.000177 ***
## lag(log(k), 1)   -1.21004    0.29508  -4.101 0.000142 ***
## lag(log(pib), 1)  0.87238    0.06458  13.508  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.01146 on 53 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.9993, Adjusted R-squared:  0.9992 
## F-statistic: 1.53e+04 on 5 and 53 DF,  p-value: < 2.2e-16
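For comparability with models (1) and (3), the AIC of model (2) can be obtained in the same way (its value is not reproduced here):

cat("AIC:", AIC(model_2),"\n")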
model_3 = lm( log(pib) ~ log(l) + log(k)+ lag(log(pib),1), data = data_ts)
summary(model_3)
## 
## Call:
## lm(formula = log(pib) ~ log(l) + log(k) + lag(log(pib), 1), data = data_ts)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.039348 -0.008048 -0.000502  0.012191  0.027420 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       0.13169    0.32539   0.405    0.687    
## log(l)            0.01958    0.03969   0.493    0.624    
## log(k)            0.09136    0.07021   1.301    0.199    
## lag(log(pib), 1)  0.87458    0.07634  11.457 3.42e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.01526 on 55 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.9987, Adjusted R-squared:  0.9987 
## F-statistic: 1.437e+04 on 3 and 55 DF,  p-value: < 2.2e-16
cat("AIC:", AIC(model_3),"\n")
## AIC: -320.2503
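
Since stargazer is loaded above precisely to build regression tables, a compact side-by-side comparison of the three OLS estimates could be produced as follows (a sketch; the table output is not shown here):

# Plain-text comparison table of the three estimated models
stargazer(model_1, model_2, model_3, type = "text",
          column.labels = c("Model 1", "Model 2", "Model 3"))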

2. Investigate the existence of autocorrelation in the error term of model (1).

To do so, carry out the following:

  1. Scatter plot of the residuals (\(\hat{u}_t\) and \(\hat{u}_{t-1}\)). We can see the two plots below (a static ggplot2 alternative is also sketched after the plotly output):
fig <- plot_ly(x = ~ lag(resid(model_1)), y = ~ resid(model_1))
fig
## No trace type specified:
##   Based on info supplied, a 'scatter' trace seems appropriate.
##   Read more about this trace type -> https://plotly.com/r/reference/#scatter
## No scatter mode specifed:
##   Setting the mode to markers
##   Read more about this attribute -> https://plotly.com/r/reference/#scatter-mode
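
The same scatter of \(\hat{u}_{t-1}\) against \(\hat{u}_t\) can also be drawn as a static ggplot2 figure; this is only an optional alternative to the interactive plotly version:

# Static alternative with ggplot2: lagged residuals vs. residuals of model (1)
res_df <- data.frame(u = resid(model_1), u_lag = dplyr::lag(resid(model_1)))
ggplot(res_df, aes(x = u_lag, y = u)) +
  geom_point() +
  labs(x = "u_hat (t-1)", y = "u_hat (t)")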