Abstract
This is an undergrad student level exercise for class use. The example is drawn following Greene (2003).This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
License: CC BY-SA 4.0
Sugestão de citação: FIGUEIREDO, Adriano Marcos Rodrigues. Econometria: exercício Longley, Forças Armadas, de Greene (2003), Tabela F4_2. Campo Grande-MS, Brasil: RStudio/Rpubs, 2018 (atualizado em 2020). Disponível em https://rpubs.com/amrofi/example_greene_F4_2.
O pacote datasets
tem disponíveis distintos conjuntos de dados. Por exemplo, os dados de Longley referentes (o leitor pode investigar detalhes fazendo help("Greene2003", package = "AER")
ou disponíveis no pacote datasets
) são dos Estados Unidos - 1947-1962, anuais:
Y ao numero de pessoas empregadas (Employed),
X1 ao deflator implícito do Produto Nacional Bruto (GNP.deflator),
X2 ao Produto Nacional Bruto (GNP),
X3 ao numero de desempregados (Unemployed),
X4 ao numero de pessoas nas forças armadas (Armed.Forces),
X5 à população ativa (>16 anos) (Population), y
X6 uma tendencia linear (Year).
Obtendo os dados a partir do pacote, chamando diretamente ou pela opção do dput()
.
# attach(longley) # puxa direto do pacote
dados <- structure(list(obs = c(1947, 1948, 1949, 1950, 1951, 1952, 1953, 1954, 1955,
1956, 1957, 1958, 1959, 1960, 1961, 1962), Employed = c(60323, 61122, 60171,
61187, 63221, 63639, 64989, 63761, 66019, 67857, 68169, 66513, 68655, 69564,
69331, 70551), GNP.deflator = c(83, 88.5, 88.2, 89.5, 96.2, 98.1, 99, 100, 101.2,
104.6, 108.4, 110.8, 112.6, 114.2, 115.7, 116.9), GNP = c(234289, 259426, 258054,
284599, 328975, 346999, 365385, 363112, 397469, 419180, 442769, 444546, 482704,
502601, 518173, 554894), Unemployed = c(2356, 2325, 3682, 3351, 2099, 1932, 1870,
3578, 2904, 2822, 2936, 4681, 3813, 3931, 4806, 4007), Armed.Forces = c(1590,
1456, 1616, 1650, 3099, 3594, 3547, 3350, 3048, 2857, 2798, 2637, 2552, 2514,
2572, 2827), Population = c(107608, 108632, 109773, 110929, 112075, 113270, 115094,
116219, 117388, 118734, 120445, 121950, 123366, 125368, 127852, 130081), Year = c(1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)), row.names = c(NA, -16L),
class = c("tbl_df", "tbl", "data.frame"))
attach(dados)
Estima-se o modelo de regressão linear do número de personas empregadas em função do deflator do GNP, dos desempregos, do número de pessoas nas forças armadas, da população e da tendência anual.
dados <- longley
tsdata <- ts(dados, start = c(1947), frequency = 1)
reg1 = lm(Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + Population +
Year, data = tsdata)
summary(reg1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
Population + Year, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-0.41011 -0.15767 -0.02816 0.10155 0.45539
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.482e+03 8.904e+02 -3.911 0.003560 **
GNP.deflator 1.506e-02 8.492e-02 0.177 0.863141
GNP -3.582e-02 3.349e-02 -1.070 0.312681
Unemployed -2.020e-02 4.884e-03 -4.136 0.002535 **
Armed.Forces -1.033e-02 2.143e-03 -4.822 0.000944 ***
Population -5.110e-02 2.261e-01 -0.226 0.826212
Year 1.829e+00 4.555e-01 4.016 0.003037 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.3049 on 9 degrees of freedom
Multiple R-squared: 0.9955, Adjusted R-squared: 0.9925
F-statistic: 330.3 on 6 and 9 DF, p-value: 4.984e-10
reg1$AIC <- AIC(reg1)
reg1$BIC <- BIC(reg1)
Outra opção pela função dynlm
:
library(dynlm)
mod1 <- dynlm(Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + Population +
Year, data = tsdata)
Se a saída fosse apenas pelo comando summary, sairia da forma:
summary(mod1)
Time series regression with "ts" data:
Start = 1947, End = 1962
Call:
dynlm(formula = Employed ~ GNP.deflator + GNP + Unemployed +
Armed.Forces + Population + Year, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-0.41011 -0.15767 -0.02816 0.10155 0.45539
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.482e+03 8.904e+02 -3.911 0.003560 **
GNP.deflator 1.506e-02 8.492e-02 0.177 0.863141
GNP -3.582e-02 3.349e-02 -1.070 0.312681
Unemployed -2.020e-02 4.884e-03 -4.136 0.002535 **
Armed.Forces -1.033e-02 2.143e-03 -4.822 0.000944 ***
Population -5.110e-02 2.261e-01 -0.226 0.826212
Year 1.829e+00 4.555e-01 4.016 0.003037 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.3049 on 9 degrees of freedom
Multiple R-squared: 0.9955, Adjusted R-squared: 0.9925
F-statistic: 330.3 on 6 and 9 DF, p-value: 4.984e-10
Agora, criando uma tabela com a saída do modelo, com o pacote stargazer tem-se, com a geração de AIC e BIC:
library(stargazer)
mod1$AIC <- AIC(mod1)
mod1$BIC <- BIC(mod1)
library(stargazer)
star.1 <- stargazer(mod1, reg1, title = "Título: Resultado da Regressão", align = TRUE,
type = "text", style = "all", keep.stat = c("aic", "bic", "rsq", "adj.rsq", "n"))
Título: Resultado da Regressão
================================================
Dependent variable:
----------------------------
Employed
dynamic OLS
linear
(1) (2)
------------------------------------------------
GNP.deflator 0.015 0.015
(0.085) (0.085)
t = 0.177 t = 0.177
p = 0.864 p = 0.864
GNP -0.036 -0.036
(0.033) (0.033)
t = -1.070 t = -1.070
p = 0.313 p = 0.313
Unemployed -0.020*** -0.020***
(0.005) (0.005)
t = -4.136 t = -4.136
p = 0.003 p = 0.003
Armed.Forces -0.010*** -0.010***
(0.002) (0.002)
t = -4.822 t = -4.822
p = 0.001 p = 0.001
Population -0.051 -0.051
(0.226) (0.226)
t = -0.226 t = -0.226
p = 0.827 p = 0.827
Year 1.829*** 1.829***
(0.455) (0.455)
t = 4.016 t = 4.016
p = 0.004 p = 0.004
Constant -3,482.259*** -3,482.259***
(890.420) (890.420)
t = -3.911 t = -3.911
p = 0.004 p = 0.004
------------------------------------------------
Observations 16 16
R2 0.995 0.995
Adjusted R2 0.992 0.992
Akaike Inf. Crit. 14.187 14.187
Bayesian Inf. Crit. 20.367 20.367
================================================
Note: *p<0.1; **p<0.05; ***p<0.01
A análise de correlação permitirá ter uma ideia inicial de possível multicolinearidade:
# install.packages('corrplot')
library(corrplot)
corel <- cor(tsdata)
corrplot(corel, method = "number")
As correlações em módulo com valores acima de 0.8 podem indicar presença de relação linear entre as variáveis.
Ou pode-se fazer:
library(car)
library(lmtest)
library(sandwich)
# vif
reg1.vif <- vif(mod1)
reg1.vif
GNP.deflator GNP Unemployed Armed.Forces Population Year
135.53244 1788.51348 33.61889 3.58893 399.15102 758.98060
# regressoes auxiliares
reg1.GNPdef <- lm(GNP.deflator ~ GNP + Unemployed + Armed.Forces + Population + Year,
data = tsdata)
summary(reg1.GNPdef)
Call:
lm(formula = GNP.deflator ~ GNP + Unemployed + Armed.Forces +
Population + Year, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-2.0123 -0.4510 0.1170 0.4191 1.5339
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.149e+03 3.246e+03 0.662 0.5229
GNP 2.561e-01 9.484e-02 2.701 0.0223 *
Unemployed 3.192e-02 1.513e-02 2.110 0.0611 .
Armed.Forces 8.802e-03 7.478e-03 1.177 0.2665
Population -1.755e+00 6.331e-01 -2.772 0.0197 *
Year -9.992e-01 1.667e+00 -0.600 0.5621
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.135 on 10 degrees of freedom
Multiple R-squared: 0.9926, Adjusted R-squared: 0.9889
F-statistic: 269.1 on 5 and 10 DF, p-value: 2.541e-10
reg1.GNP <- lm(GNP ~ GNP.deflator + Unemployed + Armed.Forces + Population + Year,
data = tsdata)
summary(reg1.GNP)
Call:
lm(formula = GNP ~ GNP.deflator + Unemployed + Armed.Forces +
Population + Year, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-3.8685 -1.4809 -0.3626 1.5030 4.9323
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.170e+04 4.859e+03 -4.466 0.001205 **
GNP.deflator 1.647e+00 6.097e-01 2.701 0.022289 *
Unemployed -1.379e-01 1.500e-02 -9.192 3.42e-06 ***
Armed.Forces -2.998e-02 1.787e-02 -1.677 0.124388
Population 5.624e+00 1.180e+00 4.765 0.000763 ***
Year 1.090e+01 2.571e+00 4.241 0.001713 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.878 on 10 degrees of freedom
Multiple R-squared: 0.9994, Adjusted R-squared: 0.9992
F-statistic: 3575 on 5 and 10 DF, p-value: 6.405e-16
reg1.unemp <- lm(Unemployed ~ GNP + GNP.deflator + +Armed.Forces + Population + Year,
data = tsdata)
summary(reg1.unemp)
Call:
lm(formula = Unemployed ~ GNP + GNP.deflator + +Armed.Forces +
Population + Year, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-28.565 -8.126 2.513 10.095 32.775
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.524e+05 3.164e+04 -4.818 0.000705 ***
GNP -6.484e+00 7.054e-01 -9.192 3.42e-06 ***
GNP.deflator 9.649e+00 4.574e+00 2.110 0.061060 .
Armed.Forces -2.714e-01 1.090e-01 -2.489 0.032018 *
Population 3.510e+01 9.543e+00 3.678 0.004261 **
Year 7.686e+01 1.671e+01 4.601 0.000979 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 19.74 on 10 degrees of freedom
Multiple R-squared: 0.9703, Adjusted R-squared: 0.9554
F-statistic: 65.24 on 5 and 10 DF, p-value: 2.631e-07
reg1.armed <- lm(Armed.Forces ~ Unemployed + GNP + GNP.deflator + +Population + Year,
data = tsdata)
summary(reg1.armed)
Call:
lm(formula = Armed.Forces ~ Unemployed + GNP + GNP.deflator +
+Population + Year, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-60.493 -22.175 2.053 23.116 55.942
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.284e+05 1.098e+05 -2.081 0.0641 .
Unemployed -1.410e+00 5.663e-01 -2.489 0.0320 *
GNP -7.324e+00 4.366e+00 -1.677 0.1244
GNP.deflator 1.382e+01 1.174e+01 1.177 0.2665
Population 1.993e+01 3.276e+01 0.608 0.5565
Year 1.168e+02 5.617e+01 2.079 0.0643 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 44.99 on 10 degrees of freedom
Multiple R-squared: 0.7214, Adjusted R-squared: 0.582
F-statistic: 5.178 on 5 and 10 DF, p-value: 0.01327
reg1.pop <- lm(Population ~ Armed.Forces + Unemployed + GNP + GNP.deflator + +Year,
data = tsdata)
summary(reg1.pop)
Call:
lm(formula = Population ~ Armed.Forces + Unemployed + GNP + GNP.deflator +
+Year, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-0.57524 -0.18536 0.07539 0.24615 0.58666
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.618e+03 1.136e+03 1.424 0.184790
Armed.Forces 1.791e-03 2.943e-03 0.608 0.556517
Unemployed 1.638e-02 4.454e-03 3.678 0.004261 **
GNP 1.234e-01 2.591e-02 4.765 0.000763 ***
GNP.deflator -2.476e-01 8.932e-02 -2.772 0.019720 *
Year -7.820e-01 5.872e-01 -1.332 0.212452
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.4264 on 10 degrees of freedom
Multiple R-squared: 0.9975, Adjusted R-squared: 0.9962
F-statistic: 796.3 on 5 and 10 DF, p-value: 1.154e-12
reg1.year <- lm(Year ~ Population + Armed.Forces + Unemployed + GNP + GNP.deflator,
data = tsdata)
summary(reg1.year)
Call:
lm(formula = Year ~ Population + Armed.Forces + Unemployed +
GNP + GNP.deflator, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-0.41955 -0.11015 0.01308 0.07981 0.26143
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.954e+03 1.540e+01 126.873 < 2e-16 ***
Population -1.927e-01 1.447e-01 -1.332 0.212452
Armed.Forces 2.584e-03 1.243e-03 2.079 0.064295 .
Unemployed 8.837e-03 1.921e-03 4.601 0.000979 ***
GNP 5.895e-02 1.390e-02 4.241 0.001713 **
GNP.deflator -3.473e-02 5.792e-02 -0.600 0.562125
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.2117 on 10 degrees of freedom
Multiple R-squared: 0.9987, Adjusted R-squared: 0.998
F-statistic: 1516 on 5 and 10 DF, p-value: 4.65e-14
library(lmtest)
lmtest::resettest(mod1, power = 2:4, type = "fitted")
RESET test
data: mod1
RESET = 1.4335, df1 = 3, df2 = 6, p-value = 0.3229
# resíduos do modelo
u.hat <- resid(mod1)
# teste de média dos resíduos: H0: resíduos têm média zero
t.test(u.hat)
One Sample t-test
data: u.hat
t = 8.8154e-17, df = 15, p-value = 1
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-0.1258296 0.1258296
sample estimates:
mean of x
5.20417e-18
# histograma
hist.mod1 <- hist(u.hat, freq = FALSE)
curve(dnorm, add = TRUE, col = "red")
hist(u.hat, density = 20, breaks = 20, prob = TRUE, xlab = "x-variable", ylim = c(0,
2), main = "normal curve over histogram")
curve(dnorm(x, mean = mean(u.hat), sd = sqrt(var(u.hat))), col = "darkblue", lwd = 2,
add = TRUE, yaxt = "n")
car::qqPlot(mod1)
1950 1956
4 10
JB.mod1 <- tseries::jarque.bera.test(u.hat)
JB.mod1
Jarque Bera Test
data: u.hat
X-squared = 0.68414, df = 2, p-value = 0.7103
# mod1 <- dynlm(Employed~GNP.deflator + GNP + Unemployed +Armed.Forces +
# Population+Year, data=tsdata )
m <- mod1
data <- dados
# rotina do teste com base em m e data
u2 <- m$residuals^2 #não mexer
reg.auxiliar <- lm(u2 ~ GNP.deflator + GNP + Unemployed + Armed.Forces + Population +
Year + I(GNP.deflator^2) + I(GNP^2) + I(Unemployed^2) + I(Armed.Forces^2) + I(Population^2) +
I(Year^2))
summary(reg.auxiliar)
Call:
lm(formula = u2 ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
Population + Year + I(GNP.deflator^2) + I(GNP^2) + I(Unemployed^2) +
I(Armed.Forces^2) + I(Population^2) + I(Year^2))
Residuals:
1947 1948 1949 1950 1951 1952 1953 1954
-0.003382 -0.008377 -0.010057 0.020678 0.020341 -0.022230 0.019826 0.033812
1955 1956 1957 1958 1959 1960 1961 1962
-0.081556 0.071323 -0.045652 -0.046630 0.049771 -0.011960 0.032570 -0.018477
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.343e+02 1.553e+02 1.508 0.229
GNP.deflator -2.773e-01 3.663e-01 -0.757 0.504
GNP 5.242e-06 2.955e-05 0.177 0.870
Unemployed -4.481e-04 5.539e-04 -0.809 0.478
Armed.Forces 3.439e-04 7.304e-04 0.471 0.670
Population -3.584e-03 2.370e-03 -1.512 0.228
Year 9.661e-01 7.268e-01 1.329 0.276
I(GNP.deflator^2) 1.340e-03 1.915e-03 0.700 0.535
I(GNP^2) -2.667e-11 3.485e-11 -0.765 0.500
I(Unemployed^2) 2.715e-08 7.052e-08 0.385 0.726
I(Armed.Forces^2) -8.932e-08 1.415e-07 -0.631 0.573
I(Population^2) 1.425e-08 9.487e-09 1.502 0.230
I(Year^2) -1.643e-02 1.611e-02 -1.020 0.383
Residual standard error: 0.0877 on 3 degrees of freedom
Multiple R-squared: 0.6319, Adjusted R-squared: -0.8403
F-statistic: 0.4292 on 12 and 3 DF, p-value: 0.874
Ru2 <- summary(reg.auxiliar)$r.squared
LM <- nrow(data) * Ru2
# obtendo o numero de regressores menos o intercepto
k <- length(coefficients(reg.auxiliar)) - 1
k
[1] 12
p.value <- 1 - pchisq(LM, k) # O TESTE TEM k TERMOS REGRESSORES EM reg.auxiliar
# c('LM','p.value')
#'Resultado do teste de White sem termos cruzados
#'H0: residuos homocedásticos
c(LM, p.value)
[1] 10.1110191 0.6062215
ou pelo bptest
:
library(lmtest)
bptest(m, ~GNP.deflator + GNP + Unemployed + Armed.Forces + Population + Year + I(GNP.deflator^2) +
I(GNP^2) + I(Unemployed^2) + I(Armed.Forces^2) + I(Population^2) + I(Year^2))
A transformação z=GNP/GNP.deflator, equivalente ao PNB real e retirando as variáveis tendência (Year) e Unemployed
reg2 <- lm(Employed ~ I(GNP/GNP.deflator) + Armed.Forces + Population, data = tsdata)
summary(reg2)
Call:
lm(formula = Employed ~ I(GNP/GNP.deflator) + Armed.Forces +
Population, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-1.1318 -0.1395 0.0136 0.3063 0.6817
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 65.720366 10.624808 6.186 4.69e-05 ***
I(GNP/GNP.deflator) 9.736496 1.791552 5.435 0.000151 ***
Armed.Forces -0.006880 0.003222 -2.135 0.054074 .
Population -0.299537 0.141761 -2.113 0.056234 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.5354 on 12 degrees of freedom
Multiple R-squared: 0.9814, Adjusted R-squared: 0.9768
F-statistic: 211.1 on 3 and 12 DF, p-value: 1.203e-10
De outra forma - fazendo z=I(GNP/GNP.deflator)
z = I(GNP/GNP.deflator)
reg3 <- lm(Employed ~ z + Armed.Forces + Population, data = tsdata)
summary(reg3)
Call:
lm(formula = Employed ~ z + Armed.Forces + Population, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-1.1318 -0.1395 0.0136 0.3063 0.6817
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 65.720366 10.624808 6.186 4.69e-05 ***
z 0.009736 0.001792 5.435 0.000151 ***
Armed.Forces -0.006880 0.003222 -2.135 0.054074 .
Population -0.299537 0.141761 -2.113 0.056234 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.5354 on 12 degrees of freedom
Multiple R-squared: 0.9814, Adjusted R-squared: 0.9768
F-statistic: 211.1 on 3 and 12 DF, p-value: 1.203e-10
reg3.vif <- vif(reg3)
reg3.vif
z Armed.Forces Population
58.447162 2.631026 50.874690
Fazendo regressoes auxiliares para a reg3 e investigar pela regra de Klein
reg3.armed <- lm(Armed.Forces ~ z + Population, data = tsdata)
summary(reg3.armed)
Call:
lm(formula = Armed.Forces ~ z + Population, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-57.259 -33.966 -8.089 26.493 88.692
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2335.8063 645.4288 3.619 0.00312 **
z 0.4167 0.1021 4.082 0.00130 **
Population -30.9979 8.6580 -3.580 0.00336 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 46.09 on 13 degrees of freedom
Multiple R-squared: 0.6199, Adjusted R-squared: 0.5614
F-statistic: 10.6 on 2 and 13 DF, p-value: 0.001859
reg3.pop <- lm(Population ~ Armed.Forces + z, data = tsdata)
summary(reg3.pop)
Call:
lm(formula = Population ~ Armed.Forces + z, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-1.4823 -0.7877 -0.1459 0.7479 1.5361
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 74.6821323 1.7512864 42.64 2.35e-15 ***
Armed.Forces -0.0160166 0.0044736 -3.58 0.00336 **
z 0.0124938 0.0005277 23.68 4.48e-12 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.048 on 13 degrees of freedom
Multiple R-squared: 0.9803, Adjusted R-squared: 0.9773
F-statistic: 324.2 on 2 and 13 DF, p-value: 8.086e-12
reg3.z <- lm(z ~ Population + Armed.Forces, data = tsdata)
summary(reg3.z)
Call:
lm(formula = z ~ Population + Armed.Forces, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-130.045 -43.138 3.789 63.084 116.014
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -5781.7480 366.0895 -15.793 7.30e-10 ***
Population 78.2257 3.3040 23.676 4.48e-12 ***
Armed.Forces 1.3480 0.3303 4.082 0.0013 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 82.89 on 13 degrees of freedom
Multiple R-squared: 0.9829, Adjusted R-squared: 0.9803
F-statistic: 373.4 on 2 and 13 DF, p-value: 3.281e-12
library(car)
library(lmtest)
library(sandwich)
# Breusch-Godfrey LM Serial Correlation test:
Fiz uma rotina para rodar vários BGtest até ordem 6.
# padrao do teste de BG, com distribuição qui-quadrado
bgorder = 1:6 # definindo até a máxima ordem do bgtest
d = NULL
for (p in bgorder) {
bgtest.chi <- bgtest(reg3, order = p, type = c("Chisq"), data = tsdata)
print(bgtest.chi)
d = rbind(d, data.frame(bgtest.chi$statistic, bgtest.chi$p.value))
}
Breusch-Godfrey test for serial correlation of order up to 1
data: reg3
LM test = 0.33996, df = 1, p-value = 0.5598
Breusch-Godfrey test for serial correlation of order up to 2
data: reg3
LM test = 3.2826, df = 2, p-value = 0.1937
Breusch-Godfrey test for serial correlation of order up to 3
data: reg3
LM test = 4.4483, df = 3, p-value = 0.2169
Breusch-Godfrey test for serial correlation of order up to 4
data: reg3
LM test = 5.0432, df = 4, p-value = 0.2829
Breusch-Godfrey test for serial correlation of order up to 5
data: reg3
LM test = 5.8924, df = 5, p-value = 0.3168
Breusch-Godfrey test for serial correlation of order up to 6
data: reg3
LM test = 6.8386, df = 6, p-value = 0.336
d
Não existe necessidade de correção para autocorrelação serial.
Farei agora com PNB real per capita. Ou seja, a variável será z/Population:
pnbrpc = z/Population
reg4 <- lm(Employed ~ pnbrpc + Armed.Forces, data = tsdata)
summary(reg4)
Call:
lm(formula = Employed ~ pnbrpc + Armed.Forces, data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-1.32458 -0.11141 -0.01696 0.30583 0.63367
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.018e+01 1.425e+00 21.170 1.85e-11 ***
pnbrpc 1.183e+03 5.359e+01 22.078 1.09e-11 ***
Armed.Forces -9.576e-03 2.492e-03 -3.842 0.00204 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.5407 on 13 degrees of freedom
Multiple R-squared: 0.9795, Adjusted R-squared: 0.9763
F-statistic: 309.9 on 2 and 13 DF, p-value: 1.078e-11
reg4.vif <- vif(reg4)
reg4.vif
pnbrpc Armed.Forces
1.543535 1.543535
reg4$AIC <- AIC(reg4)
reg4$AIC
[1] 30.40844
reg3$AIC <- AIC(reg3)
reg3$AIC
[1] 30.81434
Olhando o resultado do AIC para reg3 e para reg4, é possível concluir por manter a reg4. O leitor pode tentar incluir a variável tendência novamente, mas o modelo não melhora.
library(lmtest)
lmtest::resettest(reg4, power = 2:4, type = "fitted")
RESET test
data: reg4
RESET = 3.0493, df1 = 3, df2 = 10, p-value = 0.07888
JB.reg4 <- tseries::jarque.bera.test(resid(reg4))
JB.reg4
Jarque Bera Test
data: resid(reg4)
X-squared = 5.0641, df = 2, p-value = 0.0795
Se observar o RESET de reg4 - o modelo acusa problema de especificação (talvez poderíamos tentar um log-log) - e o Jarque-Bera acusa erros não normais. Comparando para a reg3, estava melhor com z e Population pois o RESET de reg3 fica melhor e o JB de reg3 também.
library(lmtest)
lmtest::resettest(reg3, power = 2:4, type = "fitted")
RESET test
data: reg3
RESET = 2.3317, df1 = 3, df2 = 9, p-value = 0.1425
JB.reg3 <- tseries::jarque.bera.test(resid(reg3))
JB.reg3
Jarque Bera Test
data: resid(reg3)
X-squared = 1.3196, df = 2, p-value = 0.517
Farei por fim o teste de White para heteroscedasticidade em reg3.
# reg3<-lm(Employed~z+Armed.Forces+ Population,data=tsdata)
m <- reg3
data <- dados
# rotina do teste com base em m e data
u2 <- m$residuals^2 #não mexer
reg.auxiliar <- lm(u2 ~ z + Armed.Forces + Population + I(z^2) + I(Armed.Forces^2) +
I(Population^2) + I(z * Armed.Forces) + I(z * Population) + I(Armed.Forces *
Population))
summary(reg.auxiliar)
Call:
lm(formula = u2 ~ z + Armed.Forces + Population + I(z^2) + I(Armed.Forces^2) +
I(Population^2) + I(z * Armed.Forces) + I(z * Population) +
I(Armed.Forces * Population))
Residuals:
Min 1Q Median 3Q Max
-0.46011 -0.06992 -0.01890 0.08018 0.46426
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.041e+02 8.463e+02 0.714 0.502
z 1.698e-01 2.794e-01 0.608 0.566
Armed.Forces -3.807e-02 3.794e-02 -1.004 0.354
Population -1.473e-02 2.250e-02 -0.655 0.537
I(z^2) 1.114e-05 2.422e-05 0.460 0.662
I(Armed.Forces^2) 5.413e-07 7.662e-07 0.706 0.506
I(Population^2) 8.905e-08 1.503e-07 0.592 0.575
I(z * Armed.Forces) -5.051e-06 7.156e-06 -0.706 0.507
I(z * Population) -2.022e-06 3.768e-06 -0.537 0.611
I(Armed.Forces * Population) 4.580e-07 5.193e-07 0.882 0.412
Residual standard error: 0.317 on 6 degrees of freedom
Multiple R-squared: 0.6325, Adjusted R-squared: 0.08116
F-statistic: 1.147 on 9 and 6 DF, p-value: 0.4496
Ru2 <- summary(reg.auxiliar)$r.squared
LM <- nrow(data) * Ru2
# obtendo o numero de regressores menos o intercepto
k <- length(coefficients(reg.auxiliar)) - 1
k
[1] 9
p.value <- 1 - pchisq(LM, k) # O TESTE TEM k TERMOS REGRESSORES EM reg.auxiliar
# c('LM','p.value')
#'Resultado do teste de White sem termos cruzados
#'H0: residuos homocedásticos
c(LM, p.value)
[1] 10.1193980 0.3409061
Ou o mesmo teste pelo bptest
, concluiremos pela ausência de heteroscedasticidade dos resíduos.
library(lmtest)
bptest(reg3, ~z + Armed.Forces + Population + I(z^2) + I(Armed.Forces^2) + I(Population^2) +
I(z * Armed.Forces) + I(z * Population) + I(Armed.Forces * Population))
studentized Breusch-Pagan test
data: reg3
BP = 10.119, df = 9, p-value = 0.3409
Portanto, o modelo está finalizado, seu relatório não necessariamente conterá todos os detalhes aqui expostos, uma vez que vários passos são etapas intermediárias que um leitor não estará interessado, e você poderá optar por colocar em um apêndice de seu relatório. A solução portanto é a reg3, como abaixo.
library(stargazer)
star.3 <- stargazer(reg3, title = "Título: Resultado da Regressão", align = TRUE,
type = "text", style = "all", keep.stat = c("aic", "bic", "rsq", "adj.rsq", "n"))
Título: Resultado da Regressão
=============================================
Dependent variable:
---------------------------
Employed
---------------------------------------------
z 0.010***
(0.002)
t = 5.435
p = 0.0002
Armed.Forces -0.007*
(0.003)
t = -2.135
p = 0.055
Population -0.300*
(0.142)
t = -2.113
p = 0.057
Constant 65.720***
(10.625)
t = 6.186
p = 0.00005
---------------------------------------------
Observations 16
R2 0.981
Adjusted R2 0.977
Akaike Inf. Crit. 30.814
=============================================
Note: *p<0.1; **p<0.05; ***p<0.01
Farei o modelo log-log de reg4 somente para desencargo de consciência. O modelo não melhorou comparado ao reg3.
reg4log <- lm(log(Employed) ~ log(pnbrpc) + log(Armed.Forces), data = tsdata)
summary(reg4log)
Call:
lm(formula = log(Employed) ~ log(pnbrpc) + log(Armed.Forces),
data = tsdata)
Residuals:
Min 1Q Median 3Q Max
-0.0237326 -0.0018919 0.0007783 0.0042859 0.0123076
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.42715 0.15610 41.172 3.69e-15 ***
log(pnbrpc) 0.58402 0.03117 18.734 8.65e-11 ***
log(Armed.Forces) -0.04211 0.01097 -3.838 0.00205 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.009085 on 13 degrees of freedom
Multiple R-squared: 0.9755, Adjusted R-squared: 0.9717
F-statistic: 258.4 on 2 and 13 DF, p-value: 3.419e-11
reg4log.vif <- vif(reg4log)
reg4log.vif
log(pnbrpc) log(Armed.Forces)
1.947171 1.947171
reg4log$AIC <- AIC(reg4log)
reg4log$AIC
[1] -100.3526
library(lmtest)
lmtest::resettest(reg4log, power = 2:4, type = "fitted")
RESET test
data: reg4log
RESET = 4.2037, df1 = 3, df2 = 10, p-value = 0.03635
JB.reg4log <- tseries::jarque.bera.test(resid(reg4log))
JB.reg4log
Jarque Bera Test
data: resid(reg4log)
X-squared = 8.1529, df = 2, p-value = 0.01697
GREENE, William H. Econometric analysis. Pearson Education, 2006. 1026 p.
GUJARATI, Damodar N.; PORTER, Dawn C. Econometria básica. 5.ed. Porto Alegre: MGH/Bookman/McGraw-Hill do Brasil, 2011.
GUJARATI, Damodar N. Basic Econometrics. 4th edition. The McGraw−Hill Companies, 2004.