Abstract
This is an undergrad student level exercise for class use. We analyse data on 81 cars about MPG (average miles per gallons). The example is draw following Gujarati and Porter (2011, p.408-409), portuguese edition.This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
License: CC BY-SA 4.0
Sugestão de citação: FIGUEIREDO, Adriano Marcos Rodrigues. Econometria: exercício 11.15, automóveis, Gujarati e Porter (2011, p.408-409). Campo Grande-MS,Brasil: RStudio/Rpubs, 2018 (revisto em 2020). Disponível em https://rpubs.com/amrofi/exercicio_gujarati_11_15.
Os primeiros passos são criar ou abrir um diretório de trabalho. Se optar por criar um novo projeto, haverá a possibilidade de criar em uma pasta vazia. No presente caso, geramos um dput()
dos dados de modo a incorporar (embed) a informação no código.
11.15. A Tabela 11.7 apresenta dados de 81 carros sobre MPG (milhas por galão de combustível), HP (potência do motor), VOL (espaço interno em metros cúbicos), SP (velocidade máxima,milhas por hora), e WT (peso do veículo em 100 libras). (1) \[ {MPG_i} = \beta_1 + \beta_2 SP_{i} + \beta_3 HP_{i} + \beta_4 WT_{i} + u_i \]
Chamando os dados de table11-7.wf1.RDS
a partir do arquivo ou com a partir do resultado do dput()
.
> `table11-7.wf1` <- readRDS("~/disciplinas/econometria/material de aula/laboratorio_R/Gujarati_auto_11_15/table11-7.wf1.RDS")
> dados <- structure(list(HP = c(49, 55, 55, 70, 53, 70, 55, 62, 62, 80,
+ 73, 92, 92, 73, 66, 73, 78, 92, 78, 90, 92, 74, 95, 81, 95, 92,
+ 92, 92, 52, 103, 84, 84, 102, 102, 81, 90, 90, 102, 102, 130,
+ 95, 95, 102, 95, 93, 100, 100, 98, 130, 115, 115, 115, 115, 180,
+ 160, 130, 96, 115, 100, 100, 145, 120, 140, 140, 150, 165, 165,
+ 165, 165, 245, 280, 162, 162, 140, 140, 175, 322, 238, 263, 295,
+ 236), MPG = c(65.4, 56, 55.9, 49, 46.5, 46.2, 45.4, 59.2, 53.3,
+ 43.4, 41.1, 40.9, 40.9, 40.4, 39.6, 39.3, 38.9, 38.8, 38.2, 42.2,
+ 40.9, 40.7, 40, 39.3, 38.8, 38.4, 38.4, 38.4, 46.9, 36.3, 36.1,
+ 36.1, 35.4, 35.3, 35.1, 35.1, 35, 33.2, 32.9, 32.3, 32.2, 32.2,
+ 32.2, 32.2, 31.5, 31.5, 31.4, 31.4, 31.2, 33.7, 32.6, 31.3, 31.3,
+ 30.4, 28.9, 28, 28, 28, 28, 28, 27.7, 25.6, 25.3, 23.9, 23.6,
+ 23.6, 23.6, 23.6, 23.6, 23.5, 23.4, 23.4, 23.1, 22.9, 22.9, 19.5,
+ 18.1, 17.2, 17, 16.7, 13.2), SP = c(96, 97, 97, 105, 96, 105,
+ 97, 98, 98, 107, 103, 113, 113, 103, 100, 103, 106, 113, 106,
+ 109, 110, 101, 111, 105, 111, 110, 110, 110, 90, 112, 103, 103,
+ 111, 111, 102, 106, 106, 109, 109, 120, 106, 106, 109, 106, 105,
+ 108, 108, 107, 120, 109, 109, 109, 109, 133, 125, 115, 102, 109,
+ 104, 105, 120, 107, 114, 114, 117, 122, 122, 122, 122, 148, 160,
+ 121, 121, 110, 110, 121, 165, 140, 147, 157, 130), VOL = c(89,
+ 92, 92, 92, 92, 89, 92, 50, 50, 94, 89, 50, 99, 89, 89, 89, 91,
+ 50, 91, 103, 99, 107, 101, 96, 89, 50, 117, 99, 104, 107, 114,
+ 101, 97, 113, 101, 98, 88, 86, 86, 92, 113, 106, 92, 88, 102,
+ 99, 111, 103, 86, 101, 101, 101, 124, 113, 113, 124, 92, 101,
+ 94, 115, 111, 116, 131, 123, 121, 50, 114, 127, 123, 112, 50,
+ 135, 132, 160, 129, 129, 50, 115, 50, 119, 107), WT = c(17.5,
+ 20, 20, 20, 20, 20, 20, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5,
+ 22.5, 22.5, 22.5, 22.5, 22.5, 25, 25, 25, 25, 25, 25, 25, 25,
+ 25, 27.5, 27.5, 27.5, 27.5, 27.5, 27.5, 27.5, 27.5, 27.5, 30,
+ 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 35, 35, 35, 35, 35,
+ 35, 35, 35, 35, 35, 35, 35, 40, 40, 40, 40, 40, 40, 40, 40, 40,
+ 40, 40, 40, 45, 45, 45, 45, 45, 45, 45, 55), TABLE01 = c(8.34402699627665e-309,
+ 0, 0, 3.20429648707324e-306, 3.20445946173795e-306, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 3.23790861658519e-319, 2.4484543718768e-312, 0, 1.51327318667226e-312,
+ 0, 6.65811783208703e-315, 0, 6.65455613260878e-315, 0, 1.61929299987769e-312,
+ 0, 6.99097483787198e-315, 0, 5.92878775009496e-323, 0, 5.92878775009496e-323,
+ 0, 5.92878775009496e-323, 0, 5.92878775009496e-323, 0, 5.92878775009496e-323,
+ 0, 5.92878775009496e-323, 0, 1.587209602416e-317, 0, -2.35343793465486e-185,
+ 5.33100607018305e-315, 7.99801570164402e-317, 5.33348307323913e-315,
+ 7.99801570164402e-317, 5.32846431488342e-315, 7.99801570164402e-317,
+ 5.33291643923123e-315, 7.99801570164402e-317, 5.32101712506528e-315,
+ 1.61958868858186e-317, 0, 7.99801570164402e-317, 5.32959758289923e-315,
+ 7.99801570164402e-317, 5.33356402095455e-315, 7.99801570164402e-317,
+ 5.3294356874684e-315, 7.99801570164402e-317, 5.33315928237747e-315,
+ 7.99801570164402e-317, 5.32182660221942e-315)), class = "data.frame", row.names = c(NA,
+ -81L))
> attach(dados)
Vamos ver como está a tabela importada:
> library(knitr)
> kable(dados[,1:5],caption="Dados da tabela 11.7")
HP | MPG | SP | VOL | WT |
---|---|---|---|---|
49 | 65.4 | 96 | 89 | 17.5 |
55 | 56.0 | 97 | 92 | 20.0 |
55 | 55.9 | 97 | 92 | 20.0 |
70 | 49.0 | 105 | 92 | 20.0 |
53 | 46.5 | 96 | 92 | 20.0 |
70 | 46.2 | 105 | 89 | 20.0 |
55 | 45.4 | 97 | 92 | 20.0 |
62 | 59.2 | 98 | 50 | 22.5 |
62 | 53.3 | 98 | 50 | 22.5 |
80 | 43.4 | 107 | 94 | 22.5 |
73 | 41.1 | 103 | 89 | 22.5 |
92 | 40.9 | 113 | 50 | 22.5 |
92 | 40.9 | 113 | 99 | 22.5 |
73 | 40.4 | 103 | 89 | 22.5 |
66 | 39.6 | 100 | 89 | 22.5 |
73 | 39.3 | 103 | 89 | 22.5 |
78 | 38.9 | 106 | 91 | 22.5 |
92 | 38.8 | 113 | 50 | 22.5 |
78 | 38.2 | 106 | 91 | 22.5 |
90 | 42.2 | 109 | 103 | 25.0 |
92 | 40.9 | 110 | 99 | 25.0 |
74 | 40.7 | 101 | 107 | 25.0 |
95 | 40.0 | 111 | 101 | 25.0 |
81 | 39.3 | 105 | 96 | 25.0 |
95 | 38.8 | 111 | 89 | 25.0 |
92 | 38.4 | 110 | 50 | 25.0 |
92 | 38.4 | 110 | 117 | 25.0 |
92 | 38.4 | 110 | 99 | 25.0 |
52 | 46.9 | 90 | 104 | 27.5 |
103 | 36.3 | 112 | 107 | 27.5 |
84 | 36.1 | 103 | 114 | 27.5 |
84 | 36.1 | 103 | 101 | 27.5 |
102 | 35.4 | 111 | 97 | 27.5 |
102 | 35.3 | 111 | 113 | 27.5 |
81 | 35.1 | 102 | 101 | 27.5 |
90 | 35.1 | 106 | 98 | 27.5 |
90 | 35.0 | 106 | 88 | 27.5 |
102 | 33.2 | 109 | 86 | 30.0 |
102 | 32.9 | 109 | 86 | 30.0 |
130 | 32.3 | 120 | 92 | 30.0 |
95 | 32.2 | 106 | 113 | 30.0 |
95 | 32.2 | 106 | 106 | 30.0 |
102 | 32.2 | 109 | 92 | 30.0 |
95 | 32.2 | 106 | 88 | 30.0 |
93 | 31.5 | 105 | 102 | 30.0 |
100 | 31.5 | 108 | 99 | 30.0 |
100 | 31.4 | 108 | 111 | 30.0 |
98 | 31.4 | 107 | 103 | 30.0 |
130 | 31.2 | 120 | 86 | 30.0 |
115 | 33.7 | 109 | 101 | 35.0 |
115 | 32.6 | 109 | 101 | 35.0 |
115 | 31.3 | 109 | 101 | 35.0 |
115 | 31.3 | 109 | 124 | 35.0 |
180 | 30.4 | 133 | 113 | 35.0 |
160 | 28.9 | 125 | 113 | 35.0 |
130 | 28.0 | 115 | 124 | 35.0 |
96 | 28.0 | 102 | 92 | 35.0 |
115 | 28.0 | 109 | 101 | 35.0 |
100 | 28.0 | 104 | 94 | 35.0 |
100 | 28.0 | 105 | 115 | 35.0 |
145 | 27.7 | 120 | 111 | 35.0 |
120 | 25.6 | 107 | 116 | 40.0 |
140 | 25.3 | 114 | 131 | 40.0 |
140 | 23.9 | 114 | 123 | 40.0 |
150 | 23.6 | 117 | 121 | 40.0 |
165 | 23.6 | 122 | 50 | 40.0 |
165 | 23.6 | 122 | 114 | 40.0 |
165 | 23.6 | 122 | 127 | 40.0 |
165 | 23.6 | 122 | 123 | 40.0 |
245 | 23.5 | 148 | 112 | 40.0 |
280 | 23.4 | 160 | 50 | 40.0 |
162 | 23.4 | 121 | 135 | 40.0 |
162 | 23.1 | 121 | 132 | 40.0 |
140 | 22.9 | 110 | 160 | 45.0 |
140 | 22.9 | 110 | 129 | 45.0 |
175 | 19.5 | 121 | 129 | 45.0 |
322 | 18.1 | 165 | 50 | 45.0 |
238 | 17.2 | 140 | 115 | 45.0 |
263 | 17.0 | 147 | 50 | 45.0 |
295 | 16.7 | 157 | 119 | 45.0 |
236 | 13.2 | 130 | 107 | 55.0 |
Estimando o modelo linear de regressao multipla fazendo conforme a expressão do enunciado.
Fazendo as regressoes com uso de logaritmo e depois sem logaritmos.
> mod1 <- lm(MPG~SP+HP+WT)
> mod2<-lm(log(MPG)~log(SP)+log(HP)+log(WT))
Vamos utilizar o pacote stargazer para organizar as saídas de resultados. Se a saída fosse apenas pelo comando summary, sairia da forma:
> summary(mod1)
Call:
lm(formula = MPG ~ SP + HP + WT)
Residuals:
Min 1Q Median 3Q Max
-6.1349 -2.9527 -0.0052 1.6701 12.4834
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 189.95968 22.52879 8.432 1.50e-12 ***
SP -1.27170 0.23312 -5.455 5.72e-07 ***
HP 0.39043 0.07625 5.121 2.19e-06 ***
WT -1.90327 0.18552 -10.259 4.64e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.508 on 77 degrees of freedom
Multiple R-squared: 0.8829, Adjusted R-squared: 0.8783
F-statistic: 193.5 on 3 and 77 DF, p-value: < 2.2e-16
> summary(mod2)
Call:
lm(formula = log(MPG) ~ log(SP) + log(HP) + log(WT))
Residuals:
Min 1Q Median 3Q Max
-0.23855 -0.04908 0.00072 0.04239 0.24049
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.0681 2.5426 1.993 0.0498 *
log(SP) 0.5732 0.6750 0.849 0.3984
log(HP) -0.5052 0.2919 -1.731 0.0875 .
log(WT) -0.5687 0.2204 -2.580 0.0118 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.08305 on 77 degrees of freedom
Multiple R-squared: 0.9291, Adjusted R-squared: 0.9264
F-statistic: 336.6 on 3 and 77 DF, p-value: < 2.2e-16
Agora, criando uma tabela com as várias saídas de modelos, com o pacote stargazer tem-se, com a geração de AIC e BIC:
> library(stargazer)
> mod1$AIC <- AIC(mod1)
> mod2$AIC <- AIC(mod2)
> mod1$BIC <- BIC(mod1)
> mod2$BIC <- BIC(mod2)
> library(stargazer)
> star.1 <- stargazer(mod1, mod2,
+ title="Título: Resultados das Regressões",
+ align=TRUE,
+ type = "text", style = "all",
+ keep.stat=c("aic","bic","rsq", "adj.rsq","n")
+ )
Título: Resultados das Regressões
================================================
Dependent variable:
----------------------------
MPG log(MPG)
(1) (2)
------------------------------------------------
SP -1.272***
(0.233)
t = -5.455
p = 0.00000
HP 0.390***
(0.076)
t = 5.121
p = 0.00001
WT -1.903***
(0.186)
t = -10.259
p = 0.000
log(SP) 0.573
(0.675)
t = 0.849
p = 0.399
log(HP) -0.505*
(0.292)
t = -1.731
p = 0.088
log(WT) -0.569**
(0.220)
t = -2.580
p = 0.012
Constant 189.960*** 5.068**
(22.529) (2.543)
t = 8.432 t = 1.993
p = 0.000 p = 0.050
------------------------------------------------
Observations 81 81
R2 0.883 0.929
Adjusted R2 0.878 0.926
Akaike Inf. Crit. 439.078 -167.339
Bayesian Inf. Crit. 451.050 -155.367
================================================
Note: *p<0.1; **p<0.05; ***p<0.01
A análise de correlação permitirá ter uma ideia inicial de possível multicolinearidade:
> # install.packages('corrplot')
> library(corrplot)
corrplot 0.84 loaded
> corel <- cor(dados)
Warning in cor(dados): o desvio padrão é zero
> corrplot(corel, method = "number")
Esses testes fazem uma estatística para avaliar um modelo restrito contra o modelo irrestrito. No presente caso, faremos apenas para o modelo log-log (mod2
).
> library(car)
Loading required package: carData
> # F test: APENAS ESPECIFICO AS QUE TERAO TESTE DE OMISSAO
> myH0 <- c("log(SP)")
> linearHypothesis(mod2, myH0)
> library(lmtest)
Loading required package: zoo
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
> lmtest::resettest(mod2,power = 2:4,type = "fitted")
RESET test
data: mod2
RESET = 11.087, df1 = 3, df2 = 74, p-value = 4.307e-06
> mod3<-lm(log(MPG)~log(HP)+log(WT))
> lmtest::resettest(mod3,power = 2:4,type = "fitted")
RESET test
data: mod3
RESET = 6.8878, df1 = 3, df2 = 75, p-value = 0.000369
> # resíduos do modelo 2
> u.hat<-resid(mod2)
> # teste de média dos resíduos: H0: resíduos têm média zero
> t.test(u.hat)
One Sample t-test
data: u.hat
t = 5.5732e-16, df = 80, p-value = 1
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-0.01801646 0.01801646
sample estimates:
mean of x
5.045556e-18
Histograma
> # histograma
> hist.mod2<- hist(u.hat, freq = FALSE)
> curve(dnorm, add = TRUE,col="red")
Q-QPlot
> car::qqPlot(mod2)
[1] 8 81
Teste de normalidade Jarque-Bera
> JB.mod2<-tseries::jarque.bera.test(u.hat)
Registered S3 method overwritten by 'quantmod':
method from
as.zoo.data.frame zoo
> JB.mod2
Jarque Bera Test
data: u.hat
X-squared = 2.5379, df = 2, p-value = 0.2811
Outros testes de normalidade
> library(nortest)
> # Testes
> t1.2 <- ks.test(u.hat, "pnorm") # KS
Warning in ks.test(u.hat, "pnorm"): ties should not be present for the
Kolmogorov-Smirnov test
> t2.2 <- lillie.test(u.hat) # Lilliefors
> t3.2 <- cvm.test(u.hat) # Cramér-von Mises
> t4.2 <- shapiro.test(u.hat) # Shapiro-Wilk
> t5.2 <- sf.test(u.hat) # Shapiro-Francia
> t6.2 <- ad.test(u.hat) # Anderson-Darling
> # Tabela de resultados
> testes <- c(t1.2$method, t2.2$method, t3.2$method, t4.2$method, t5.2$method,
+ t6.2$method)
> estt <- as.numeric(c(t1.2$statistic, t2.2$statistic, t3.2$statistic,
+ t4.2$statistic, t5.2$statistic, t6.2$statistic))
> valorp <- c(t1.2$p.value, t2.2$p.value, t3.2$p.value, t4.2$p.value, t5.2$p.value,
+ t6.2$p.value)
> resultados <- cbind(estt, valorp)
> rownames(resultados) <- testes
> colnames(resultados) <- c("Estatística", "valor-prob")
> print(resultados, digits = 4)
Estatística valor-prob
One-sample Kolmogorov-Smirnov test 0.42940 2.132e-13
Lilliefors (Kolmogorov-Smirnov) normality test 0.09677 5.835e-02
Cramer-von Mises normality test 0.12013 5.839e-02
Shapiro-Wilk normality test 0.98151 2.927e-01
Shapiro-Francia normality test 0.97686 1.323e-01
Anderson-Darling normality test 0.63615 9.392e-02
> knitr::kable(resultados, digits = 4,caption = "Outros testes de normalidade dos resíduos")
Estatística | valor-prob | |
---|---|---|
One-sample Kolmogorov-Smirnov test | 0.4294 | 0.0000 |
Lilliefors (Kolmogorov-Smirnov) normality test | 0.0968 | 0.0583 |
Cramer-von Mises normality test | 0.1201 | 0.0584 |
Shapiro-Wilk normality test | 0.9815 | 0.2927 |
Shapiro-Francia normality test | 0.9769 | 0.1323 |
Anderson-Darling normality test | 0.6361 | 0.0939 |
> #teste de White
> # mod2<-lm(log(MPG)~log(SP)+log(HP)+log(WT))
> m <- mod2
> data <- dados
> #rotina do teste com base em m e data
> u2 <- m$residuals^2
> reg.auxiliar <- lm(u2 ~ log(SP)+log(HP)+log(WT)+ I(log(SP)^2)+I(log(HP)^2)+I(log(WT)^2)) #sem termos cruzados, no cross-terms
> summary(reg.auxiliar)
Call:
lm(formula = u2 ~ log(SP) + log(HP) + log(WT) + I(log(SP)^2) +
I(log(HP)^2) + I(log(WT)^2))
Residuals:
Min 1Q Median 3Q Max
-0.013737 -0.004442 -0.002139 0.000865 0.046592
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.70786 5.64805 -0.834 0.4072
log(SP) 2.51946 2.28752 1.101 0.2743
log(HP) -0.47058 0.22131 -2.126 0.0368 *
log(WT) 0.09334 0.24070 0.388 0.6993
I(log(SP)^2) -0.28350 0.22285 -1.272 0.2073
I(log(HP)^2) 0.05646 0.02693 2.096 0.0395 *
I(log(WT)^2) -0.01887 0.03839 -0.491 0.6246
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.01022 on 74 degrees of freedom
Multiple R-squared: 0.2082, Adjusted R-squared: 0.144
F-statistic: 3.243 on 6 and 74 DF, p-value: 0.006953
> Ru2<- summary(reg.auxiliar)$r.squared
> LM <- nrow(data)*Ru2
> #obtendo o numero de regressores menos o intercepto
> k <- length(coefficients(reg.auxiliar))-1
> k
[1] 6
> p.value <- 1-pchisq(LM, k) # O TESTE TEM k TERMOS REGRESSORES EM reg.auxiliar
> #c("LM","p.value")
> c(LM, p.value)
[1] 16.862401860 0.009802465
> vcov.white0<-hccm(mod2,type=c("hc1"))
> coeftest(mod2,vcov.white0)
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.06808 2.80526 1.8066 0.07473 .
log(SP) 0.57320 0.75101 0.7632 0.44766
log(HP) -0.50524 0.32114 -1.5733 0.11975
log(WT) -0.56866 0.23124 -2.4592 0.01617 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
GUJARATI, Damodar N.; PORTER, Dawn C. Econometria básica. 5.ed. Porto Alegre: AMGH/Bookman/McGraw-Hill do Brasil, 2011.