Suponha que os rendimentos anuais e o consumo de álcool sejam
determinados pelo Sistema de Equações Simultâneas
\(log(earnings) = β0 + β1alcohol + β2educ +
u1\)
\(alcohol = γ0 +
γ1log(earnings) + γ2educ + γ3log(price) + u2\)
Onde o price é um índice local de preços do alcohol, que inclui
impostos estaduais e locais. Suponha que educ e price sejam exógenos. Se
β1, β2, γ1, γ2, γ3 são todos diferentes de zero, qual equação é
identificada? Como você estimaria essa equação?
Equação Estrutural Identificada:
A
equação de log(earnings) é identificada, pois temos uma variável exógena
(log(price)) que não está na equação de log(earnings) mas está na
equação de alcohol
Estimação da Equação de log(earnings):
Para estimar a equação identificada de log(earnings) usando Mínimos
Quadrados de Dois Estágios (MQ2E):
Primeiro Estágio: Regressar alcohol nas variáveis exógenas (educ e
log(price)).
\(alcohol= δ0 + δ1educ + δ2
log(price) + ϵ\)
Para então Obter os valores ajustados de
alcohol a partir dessa regressão.
Segundo Estágio: Regressar log(earnings) nos valores ajustados de
alcohol e educ.
\(log(earnings) = β0 +
β1alcohol + β2educ + u1\)
A base de dados “CAFE” contém observações de 1990 a 2022 das
seguintes variáveis:
• pi: preço do café;
• pf: preço dos
fatores de produção;
• ps: preço do bem substituto (erva mate);
• di: renda disponível per capita dos indivíduos.
• qi: quantidade
de café produzida.
Importe para o RStudio a base de dados. Verifique
a existência de dados faltantes, analise a classe das variáveis e as
estatísticas descritivas.
As equações de Oferta e Demanda do modelo são:
Oferta:
\(Qi = β1 + β2Pi + β3P Fi + εsi\)
Demanda:
\(Qi = α1 + α2Pi + α3P Si + α4DIi
+ εdi\)
E as equações na forma reduzida são:
\(Qi = π11 + π12P Si + π13DIi + π14P Fi +
vi1\)
\(Pi = π11 + π22P Si +
π23DIi + π24P Fi + vi2\)
Estime “manualmente” as equações de oferta e demanda por MQO.
#encoding
options(encoding = "UTF-8") #codificação dos caracteres
options(scipen = 999) #desliga a notação científica
#install.packages("systemfit")
#packages
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(zoo)
##
## Attaching package: 'zoo'
##
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
library(lmtest)
library(sandwich)
library(stargazer)
##
## Please cite as:
##
## Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
library(hrbrthemes)
library(foreign)
library(AER)
## Carregando pacotes exigidos: car
## Carregando pacotes exigidos: carData
##
## Attaching package: 'car'
##
## The following object is masked from 'package:dplyr':
##
## recode
##
## The following object is masked from 'package:purrr':
##
## some
##
## Carregando pacotes exigidos: survival
library(ivmodel)
library(magrittr)
##
## Attaching package: 'magrittr'
##
## The following object is masked from 'package:purrr':
##
## set_names
##
## The following object is masked from 'package:tidyr':
##
## extract
library(systemfit)
## Carregando pacotes exigidos: Matrix
##
## Attaching package: 'Matrix'
##
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
##
##
## Please cite the 'systemfit' package as:
## Arne Henningsen and Jeff D. Hamann (2007). systemfit: A Package for Estimating Systems of Simultaneous Equations in R. Journal of Statistical Software 23(4), 1-40. http://www.jstatsoft.org/v23/i04/.
##
## If you have questions, suggestions, or comments regarding the 'systemfit' package, please use a forum or 'tracker' at systemfit's R-Forge site:
## https://r-forge.r-project.org/projects/systemfit/
#data---------------------------------------------------------------------------------
cafe <- read.csv("CAFE.csv", sep = ";")
colSums(is.na(cafe))
## X ano qi pi ps di pf
## 0 0 0 0 0 0 0
str(cafe)
## 'data.frame': 33 obs. of 7 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ ano: int 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 ...
## $ qi : int 2929711 3040763 2588745 2557518 2614578 1860269 2738391 2457025 3378952 3263704 ...
## $ pi : chr "45,6754" "249,5395" "2421,358" "68919,7322" ...
## $ ps : chr "859,1" "4398,1" "36949,9" "896847,6" ...
## $ di : chr "13,3360436363636" "17,4161818181818" "24,72644" "35,1244727272727" ...
## $ pf : chr "1504,25" "1924,58" "2250,65" "2100,75" ...
# Converter variáveis para numérico
cafe$pi <- as.numeric(gsub(",", ".", cafe$pi))
cafe$ps <- as.numeric(gsub(",", ".", cafe$ps))
cafe$di <- as.numeric(gsub(",", ".", cafe$di))
cafe$pf <- as.numeric(gsub(",", ".", cafe$pf))
str(cafe)
## 'data.frame': 33 obs. of 7 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ ano: int 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 ...
## $ qi : int 2929711 3040763 2588745 2557518 2614578 1860269 2738391 2457025 3378952 3263704 ...
## $ pi : num 45.7 249.5 2421.4 68919.7 2916684 ...
## $ ps : num 859 4398 36950 896848 37991000 ...
## $ di : num 13.3 17.4 24.7 35.1 440.5 ...
## $ pf : num 1504 1925 2251 2101 2557 ...
# Calcular estatísticas descritivas
summary(cafe)
## X ano qi pi
## Min. : 1 Min. :1990 Min. : 1860269 Min. : 46
## 1st Qu.: 9 1st Qu.:1998 1st Qu.: 2320343 1st Qu.: 4016365
## Median :17 Median :2006 Median : 2835162 Median : 14542289
## Mean :17 Mean :2006 Mean :11626745 Mean :2489806742
## 3rd Qu.:25 3rd Qu.:2014 3rd Qu.:23818000 3rd Qu.:5957358160
## Max. :33 Max. :2022 Max. :37950000 Max. :9492054000
## ps di pf
## Min. : 859 Min. : 13.34 Min. : 1504
## 1st Qu.: 70558000 1st Qu.: 757.68 1st Qu.: 4098
## Median :132413000 Median :1088.80 Median : 6544
## Mean :239828517 Mean :1421.98 Mean : 8337
## 3rd Qu.:426385000 3rd Qu.:2129.30 3rd Qu.:11012
## Max. :846551000 Max. :3146.00 Max. :32128
# Oferta: Qi = β1 + β2Pi + β3PFi + εsi
oferta_mqo <- lm(qi ~ pi + pf, data = cafe)
summary(oferta_mqo)
##
## Call:
## lm(formula = qi ~ pi + pf, data = cafe)
##
## Residuals:
## Min 1Q Median 3Q Max
## -934254 -248470 -48783 313779 1187884
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2750890.23997053 162340.07895196 16.945 <0.0000000000000002 ***
## pi 0.00367114 0.00002385 153.910 <0.0000000000000002 ***
## pf -31.73364501 13.96712148 -2.272 0.0304 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 488800 on 30 degrees of freedom
## Multiple R-squared: 0.9988, Adjusted R-squared: 0.9987
## F-statistic: 1.216e+04 on 2 and 30 DF, p-value: < 0.00000000000000022
# Demanda: Qi = α1 + α2Pi + α3PSi + α4DIi + εdi
demanda_mqo <- lm(qi ~ pi + ps + di, data = cafe)
summary(demanda_mqo)
##
## Call:
## lm(formula = qi ~ pi + ps + di, data = cafe)
##
## Residuals:
## Min 1Q Median 3Q Max
## -944607 -266646 -19918 229442 1127605
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2704981.5005554 167303.3970455 16.168 0.000000000000000478 ***
## pi 0.0036571 0.0000248 147.447 < 0.0000000000000002 ***
## ps -0.0014286 0.0008903 -1.605 0.119
## di 111.7784015 203.7106460 0.549 0.587
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 479500 on 29 degrees of freedom
## Multiple R-squared: 0.9989, Adjusted R-squared: 0.9987
## F-statistic: 8425 on 3 and 29 DF, p-value: < 0.00000000000000022
Estime as equações de oferta e demanda através do pacote systemfit
pelos métodos:
• MQO
• MQ2E
• MQ3E
A partir dos
resultados compare os modelos.
# Verificar a estrutura dos dados novamente
str(cafe)
## 'data.frame': 33 obs. of 7 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ ano: int 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 ...
## $ qi : int 2929711 3040763 2588745 2557518 2614578 1860269 2738391 2457025 3378952 3263704 ...
## $ pi : num 45.7 249.5 2421.4 68919.7 2916684 ...
## $ ps : num 859 4398 36950 896848 37991000 ...
## $ di : num 13.3 17.4 24.7 35.1 440.5 ...
## $ pf : num 1504 1925 2251 2101 2557 ...
summary(cafe)
## X ano qi pi
## Min. : 1 Min. :1990 Min. : 1860269 Min. : 46
## 1st Qu.: 9 1st Qu.:1998 1st Qu.: 2320343 1st Qu.: 4016365
## Median :17 Median :2006 Median : 2835162 Median : 14542289
## Mean :17 Mean :2006 Mean :11626745 Mean :2489806742
## 3rd Qu.:25 3rd Qu.:2014 3rd Qu.:23818000 3rd Qu.:5957358160
## Max. :33 Max. :2022 Max. :37950000 Max. :9492054000
## ps di pf
## Min. : 859 Min. : 13.34 Min. : 1504
## 1st Qu.: 70558000 1st Qu.: 757.68 1st Qu.: 4098
## Median :132413000 Median :1088.80 Median : 6544
## Mean :239828517 Mean :1421.98 Mean : 8337
## 3rd Qu.:426385000 3rd Qu.:2129.30 3rd Qu.:11012
## Max. :846551000 Max. :3146.00 Max. :32128
# Especificar as equações
eq1 <- qi ~ pi + pf # Oferta
eq2 <- qi ~ pi + di # Demanda
eq.sys = list(eq1, eq2)
instrum = ~ di + pf
twosls = systemfit(eq.sys, inst = instrum, method = "2SLS", data = cafe)
summary(twosls)
##
## systemfit results
## method: 2SLS
##
## N DF SSR detRCov OLS-R2 McElroy-R2
## system 66 60 518152318275882 4724993647189394148462226 0.955461 0.997553
##
## N DF SSR MSE RMSE R2 Adj R2
## eq1 33 30 11495174706850 383172490228 619009 0.998024 0.997892
## eq2 33 30 506657143569031 16888571452301 4109571 0.912898 0.907091
##
## The covariance matrix of the residuals
## eq1 eq2
## eq1 383172490228 -1321454627518
## eq2 -1321454627518 16888571452301
##
## The correlations of the residuals
## eq1 eq2
## eq1 1.000000 -0.519468
## eq2 -0.519468 1.000000
##
##
## 2SLS estimates for 'eq1' (equation 1)
## Model Formula: qi ~ pi + pf
## Instruments: ~di + pf
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2425397.935286877 934129.296503755 2.59643 0.014449
## pi 0.003772668 0.000285839 13.19860 0.000000000000049738
## pf -23.013601654 30.147036420 -0.76338 0.451200
##
## (Intercept) *
## pi ***
## pf
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 619009.281213 on 30 degrees of freedom
## Number of observations: 33 Degrees of Freedom: 30
## SSR: 11495174706850.4 MSE: 383172490228.346 Root MSE: 619009.281213
## Multiple R-Squared: 0.998024 Adjusted R-Squared: 0.997892
##
##
## 2SLS estimates for 'eq2' (equation 2)
## Model Formula: qi ~ pi + di
## Instruments: ~di + pf
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6478804.34167496 40866209.44946314 0.15854 0.87510
## pi 0.00257312 0.01202812 0.21393 0.83205
## di -885.11961447 7697.70557598 -0.11498 0.90922
##
## Residual standard error: 4109570.71387 on 30 degrees of freedom
## Number of observations: 33 Degrees of Freedom: 30
## SSR: 506657143569031 MSE: 16888571452301 Root MSE: 4109570.71387
## Multiple R-Squared: 0.912898 Adjusted R-Squared: 0.907091