Abstract
We analyse soybean production at Sinop-MT with dummies to sepparate census sectors.This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
License: CC BY-SA 4.0
Sugestão de citação: FIGUEIREDO, Adriano Marcos Rodrigues. Econometria: exercício produção de soja em Sinop com dummies. Campo Grande-MS,Brasil: RStudio/Rpubs, 2019. Disponível em http://rpubs.com/amrofi/exercicio_sinop.
Os primeiros passos são criar ou abrir um diretório de trabalho. Se optar por criar um novo projeto, haverá a possibilidade de criar em uma pasta vazia. Em seguida, sugere-se que coloque os dados nesta pasta, se possível em um arquivo MS Excel e chame a planilha de ‘dados’.Neste caso, a planilha de dados será colocada dentro do código gerado pela função dput()
(desculpe pelo volume de números mas foi a forma que pensamos ser mais prática para reprodutibilidade do exercício).
O exercício traz dados dos 59 setores censitários de Sinop-MT, a partir do Censo Agropecuário de 2006. As variaveis foram logaritmizadas no excel. VP = valor total da produção; AREA = terra (medida através da área total colhida); QTMAQ = capital (medido através da quantidade total de maquinários), QMO = trabalho (medido através da mão de obra ocupada por pessoas com mais de 14 anos), PCALC = tecnologia (medida através da proporção do uso de calcário); PADUB = tecnologia (medida através da proporção do uso de adubo); PAGROT = tecnologia (medida através da proporção do uso de agrotóxico); PNIVEL = tecnologia (medida através da proporção do uso de curva de nível); PPRAGA = tecnologia (medida através da proporção do uso de controle geral de pragas); e, PROT = tecnologia (medida através da proporção do uso de rotação de culturas).
> dados <- structure(list(obs = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
+ 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
+ 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
+ 51, 52, 53, 54, 55, 56, 57, 58, 59), VP = c(2217238, 3769800, 23653812,
+ 2697579, 6152680, 105352999, 69309029, 912942, 2e+06, 5663000, 9133720,
+ 9946310, 9890900, 12273665, 7313059, 7578840, 2778200, 7309825, 27220, 10167225,
+ 9990067, 13664200, 7054860, 19683575, 14797746, 9478066, 3926460, 9004231,
+ 10225420, 10675076, 59500, 185476, 840000, 79804101, 110434855, 21876960,
+ 66886216, 52673412, 22431430, 15715844, 44087420, 16723000, 29595226, 58868284,
+ 73931129, 7400000, 861940, 133120, 746650, 13804400, 25598775, 519000, 5520000,
+ 21304937, 2736098, 21038425, 21509368, 25718000, 9205120), AREA = c(2909,
+ 4260, 25258, 4011, 3863, 118115, 80492, 1213, 1830, 5727, 7424, 10331, 13128,
+ 15045, 9203, 10184, 2801, 9276, 46, 11066, 10674, 11560, 6206, 21383, 19053,
+ 13021, 5917, 10031, 12917, 10582, 84, 330, 800, 72487, 124889, 19445, 72320,
+ 53864, 23002, 20695, 48025, 15037, 34892, 60234, 87207, 6530, 948, 165,
+ 605, 14533, 14127, 735, 3350, 3931, 2503, 17983, 24196, 29291, 12004), QTMAQ = c(8505,
+ 10867, 39368, 40715, 7909, 183, 138, 4, 7, 19, 18, 30, 37, 48, 22, 21, 12,
+ 43, 1, 19, 34, 6, 19, 17, 40, 41, 17, 33, 20, 36, 9, 16, 6, 131, 175, 36,
+ 121, 85, 43, 69, 77, 20, 57, 117, 154, 35, 19, 24, 10, 7, 32, 8, 15, 10,
+ 10, 45, 60, 46, 33), QMO = c(111, 109, 161, 213, 131, 625, 198, 106, 10,
+ 51, 65, 115, 183, 49, 25, 13, 8, 267, 66, 135, 368, 69, 126, 149, 43, 52,
+ 118, 144, 29, 231, 22, 803, 68, 89, 171, 149, 319, 59, 584, 515, 93, 19,
+ 61, 187, 151, 34, 533, 301, 41, 655, 580, 52, 25, 94, 59, 74, 463, 58, 225),
+ PCALC = c(0.547169811, 0.625, 0.273809524, 0.672897196, 0.46031746, 0.782258065,
+ 0.761363636, 0.217948718, 0.285714286, 0.378378378, 0.166666667, 0.105263158,
+ 0.516483516, 1, 0.777777778, 0.75, 0.666666667, 0.194915254, 0.466666667,
+ 0.289156627, 0.423076923, 0.044444444, 0.479166667, 0.133333333, 0.857142857,
+ 0.958333333, 0.680851064, 0.528735632, 0.714285714, 0.447552448, 0.266666667,
+ 0.102040816, 0.290322581, 0.823529412, 0.97260274, 0.484375, 0.836206897,
+ 0.939393939, 0.5, 0.617977528, 0.942857143, 0.833333333, 0.681818182,
+ 0.806451613, 0.516666667, 0.555555556, 0.184210526, 0.074324324, 0.136363636,
+ 0.016620499, 0.15015015, 0.095238095, 0.4, 0.288461538, 0.448275862,
+ 0.4375, 0.230769231, 0.619047619, 0.069306931), PADUB = c(0.555555556,
+ 0.704545455, 0.317073171, 0.662162162, 0.625, 0.758196721, 0.744186047,
+ 0.038961039, 0.285714286, 0.263157895, 0.117647059, 0.081081081, 0.544303797,
+ 0.96, 0.705882353, 0.666666667, 0.6, 0.371900826, 0.7, 0.353658537,
+ 0.387878788, 0.0487804879999999, 0.351351351, 0.258928571, 1, 0.84,
+ 0.76, 0.512820513, 0.75, 0.601449275, 0.272727273, 0.077639752, 0.470588235,
+ 0.88, 0.985507246, 0.666666667, 0.846153846, 0.909090909, 0.795180723,
+ 0.483443709, 0.96969697, 1, 0.954545455, 0.875, 0.918032787, 0.5, 0.124423963,
+ 0.073825503, 0.1, 0.016759777, 0.121019108, 0.238095238, 0.4, 0.421052632,
+ 0.566666667, 0.741935484, 0.296137339, 0.727272727, 0.106796117), PAGROT = c(0.339622642,
+ 0.3125, 0.285714286, 0.46728972, 0.396825397, 0.649193548, 0.727272727,
+ 0.038461538, 0.571428571, 0.216216216, 0.194444444, 0.131578947, 0.428571429,
+ 0.928571429, 0.722222222, 0.416666667, 0.833333333, 0.254237288, 0.5,
+ 0.34939759, 0.466346154, 0.088888889, 0.197916667, 0.161904762, 0.761904762,
+ 0.791666667, 0.574468085, 0.517241379, 0.80952381, 0.426573427, 0.266666667,
+ 0.110787172, 0.258064516, 0.882352941, 0.97260274, 0.375, 0.663793103,
+ 0.939393939, 0.245833333, 0.47752809, 0.942857143, 1, 0.954545455, 0.774193548,
+ 0.916666667, 0.666666667, 0.105263158, 0.128378378, 0.090909091, 0.033240997,
+ 0.144144144, 0.238095238, 0.4, 0.25, 0.517241379, 0.65625, 0.243589744,
+ 0.761904762, 0.069306931), PNIVEL = c(0.150943396, 0.041666667, 0.238095238,
+ 0.429906542, 0.333333333, 0.665322581, 0.659090909, 0.025641026, 0.142857143,
+ 0.081081081, 0.083333333, 0.105263158, 0.263736264, 0.535714286, 0.444444444,
+ 0.25, 0.666666667, 0.127118644, 1e-11, 0.180722892, 0.158653846, 0.088888889,
+ 0.479166667, 0.0380952379999999, 0.619047619, 0.541666667, 0.212765957,
+ 0.24137931, 0.142857143, 0.748251748, 0.066666667, 0.06122449, 1e-11,
+ 0.745098039, 0.945205479, 0.21875, 0.560344828, 0.696969697, 0.95, 0.061797753,
+ 0.828571429, 1, 0.818181818, 0.403225806, 0.733333333, 0.555555556,
+ 0.013157895, 1e-11, 0.136363636, 1e-11, 0.066066066, 0.142857143, 0.4,
+ 0.019230769, 0.206896552, 0.71875, 0.282051282, 0.476190476, 0.03960396),
+ PROT = c(0.018867925, 0.145833333, 0.19047619, 0.037383178, 1e-11, 0.302419355,
+ 0.625, 0.025641026, 1e-11, 0.054054054, 0.027777778, 0.013157895, 0.351648352,
+ 0.821428571, 0.444444444, 0.5, 0.333333333, 0.13559322, 0.1, 0.313253012,
+ 0.173076923, 0.044444444, 0.020833333, 0.028571429, 0.666666667, 0.666666667,
+ 0.319148936, 0.390804598, 0.714285714, 0.097902098, 0.066666667, 0.017492711,
+ 0.129032258, 0.745098039, 0.164383562, 0.109375, 0.49137931, 0.787878788,
+ 0.029166667, 0.123595506, 0.257142857, 0.333333333, 0.272727273, 0.435483871,
+ 0.416666667, 0.333333333, 0.171052632, 0.006756757, 0.045454545, 0.002770083,
+ 0.093093093, 0.095238095, 1e-11, 0.115384615, 0.137931034, 0.09375,
+ 0.08974359, 0.380952381, 0.03960396), PPRAGA = c(0.283018868, 0.083333333,
+ 0.047619048, 0.373831776, 0.126984127, 0.060483871, 0.125, 1e-11, 0.714285714,
+ 0.027027027, 0.027777778, 0.013157895, 0.208791209, 0.857142857, 0.222222222,
+ 0.5, 0.833333333, 0.440677966, 0.266666667, 0.168674699, 0.471153846,
+ 0.044444444, 1e-11, 0.114285714, 0.476190476, 0.291666667, 0.340425532,
+ 0.275862069, 0.476190476, 0.006993007, 0.2, 0.093294461, 0.129032258,
+ 0.274509804, 0.01369863, 0.25, 0.318965517, 0.151515152, 0.0125, 0.651685393,
+ 0.171428571, 0.166666667, 0.090909091, 0.370967742, 0.183333333, 0.111111111,
+ 0.039473684, 0.006756757, 1e-11, 0.041551247, 0.204204204, 0.142857143,
+ 1e-11, 0.365384615, 0.068965517, 0.15625, 0.141025641, 0.142857143,
+ 0.03960396), D51 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0), D54 = c(0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0), D51_54 = c(0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
+ 1, 0, 0, 0, 0, 0)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
+ -59L))
> attach(dados)
Vamos ver como está parte da tabela importada:
> library(knitr)
> kable(dados[, 1:11], digits = 2, caption = "Dados das colunas 1 a 11")
obs | VP | AREA | QTMAQ | QMO | PCALC | PADUB | PAGROT | PNIVEL | PROT | PPRAGA |
---|---|---|---|---|---|---|---|---|---|---|
1 | 2217238 | 2909 | 8505 | 111 | 0.55 | 0.56 | 0.34 | 0.15 | 0.02 | 0.28 |
2 | 3769800 | 4260 | 10867 | 109 | 0.62 | 0.70 | 0.31 | 0.04 | 0.15 | 0.08 |
3 | 23653812 | 25258 | 39368 | 161 | 0.27 | 0.32 | 0.29 | 0.24 | 0.19 | 0.05 |
4 | 2697579 | 4011 | 40715 | 213 | 0.67 | 0.66 | 0.47 | 0.43 | 0.04 | 0.37 |
5 | 6152680 | 3863 | 7909 | 131 | 0.46 | 0.62 | 0.40 | 0.33 | 0.00 | 0.13 |
6 | 105352999 | 118115 | 183 | 625 | 0.78 | 0.76 | 0.65 | 0.67 | 0.30 | 0.06 |
7 | 69309029 | 80492 | 138 | 198 | 0.76 | 0.74 | 0.73 | 0.66 | 0.62 | 0.12 |
8 | 912942 | 1213 | 4 | 106 | 0.22 | 0.04 | 0.04 | 0.03 | 0.03 | 0.00 |
9 | 2000000 | 1830 | 7 | 10 | 0.29 | 0.29 | 0.57 | 0.14 | 0.00 | 0.71 |
10 | 5663000 | 5727 | 19 | 51 | 0.38 | 0.26 | 0.22 | 0.08 | 0.05 | 0.03 |
11 | 9133720 | 7424 | 18 | 65 | 0.17 | 0.12 | 0.19 | 0.08 | 0.03 | 0.03 |
12 | 9946310 | 10331 | 30 | 115 | 0.11 | 0.08 | 0.13 | 0.11 | 0.01 | 0.01 |
13 | 9890900 | 13128 | 37 | 183 | 0.52 | 0.54 | 0.43 | 0.26 | 0.35 | 0.21 |
14 | 12273665 | 15045 | 48 | 49 | 1.00 | 0.96 | 0.93 | 0.54 | 0.82 | 0.86 |
15 | 7313059 | 9203 | 22 | 25 | 0.78 | 0.71 | 0.72 | 0.44 | 0.44 | 0.22 |
16 | 7578840 | 10184 | 21 | 13 | 0.75 | 0.67 | 0.42 | 0.25 | 0.50 | 0.50 |
17 | 2778200 | 2801 | 12 | 8 | 0.67 | 0.60 | 0.83 | 0.67 | 0.33 | 0.83 |
18 | 7309825 | 9276 | 43 | 267 | 0.19 | 0.37 | 0.25 | 0.13 | 0.14 | 0.44 |
19 | 27220 | 46 | 1 | 66 | 0.47 | 0.70 | 0.50 | 0.00 | 0.10 | 0.27 |
20 | 10167225 | 11066 | 19 | 135 | 0.29 | 0.35 | 0.35 | 0.18 | 0.31 | 0.17 |
21 | 9990067 | 10674 | 34 | 368 | 0.42 | 0.39 | 0.47 | 0.16 | 0.17 | 0.47 |
22 | 13664200 | 11560 | 6 | 69 | 0.04 | 0.05 | 0.09 | 0.09 | 0.04 | 0.04 |
23 | 7054860 | 6206 | 19 | 126 | 0.48 | 0.35 | 0.20 | 0.48 | 0.02 | 0.00 |
24 | 19683575 | 21383 | 17 | 149 | 0.13 | 0.26 | 0.16 | 0.04 | 0.03 | 0.11 |
25 | 14797746 | 19053 | 40 | 43 | 0.86 | 1.00 | 0.76 | 0.62 | 0.67 | 0.48 |
26 | 9478066 | 13021 | 41 | 52 | 0.96 | 0.84 | 0.79 | 0.54 | 0.67 | 0.29 |
27 | 3926460 | 5917 | 17 | 118 | 0.68 | 0.76 | 0.57 | 0.21 | 0.32 | 0.34 |
28 | 9004231 | 10031 | 33 | 144 | 0.53 | 0.51 | 0.52 | 0.24 | 0.39 | 0.28 |
29 | 10225420 | 12917 | 20 | 29 | 0.71 | 0.75 | 0.81 | 0.14 | 0.71 | 0.48 |
30 | 10675076 | 10582 | 36 | 231 | 0.45 | 0.60 | 0.43 | 0.75 | 0.10 | 0.01 |
31 | 59500 | 84 | 9 | 22 | 0.27 | 0.27 | 0.27 | 0.07 | 0.07 | 0.20 |
32 | 185476 | 330 | 16 | 803 | 0.10 | 0.08 | 0.11 | 0.06 | 0.02 | 0.09 |
33 | 840000 | 800 | 6 | 68 | 0.29 | 0.47 | 0.26 | 0.00 | 0.13 | 0.13 |
34 | 79804101 | 72487 | 131 | 89 | 0.82 | 0.88 | 0.88 | 0.75 | 0.75 | 0.27 |
35 | 110434855 | 124889 | 175 | 171 | 0.97 | 0.99 | 0.97 | 0.95 | 0.16 | 0.01 |
36 | 21876960 | 19445 | 36 | 149 | 0.48 | 0.67 | 0.38 | 0.22 | 0.11 | 0.25 |
37 | 66886216 | 72320 | 121 | 319 | 0.84 | 0.85 | 0.66 | 0.56 | 0.49 | 0.32 |
38 | 52673412 | 53864 | 85 | 59 | 0.94 | 0.91 | 0.94 | 0.70 | 0.79 | 0.15 |
39 | 22431430 | 23002 | 43 | 584 | 0.50 | 0.80 | 0.25 | 0.95 | 0.03 | 0.01 |
40 | 15715844 | 20695 | 69 | 515 | 0.62 | 0.48 | 0.48 | 0.06 | 0.12 | 0.65 |
41 | 44087420 | 48025 | 77 | 93 | 0.94 | 0.97 | 0.94 | 0.83 | 0.26 | 0.17 |
42 | 16723000 | 15037 | 20 | 19 | 0.83 | 1.00 | 1.00 | 1.00 | 0.33 | 0.17 |
43 | 29595226 | 34892 | 57 | 61 | 0.68 | 0.95 | 0.95 | 0.82 | 0.27 | 0.09 |
44 | 58868284 | 60234 | 117 | 187 | 0.81 | 0.88 | 0.77 | 0.40 | 0.44 | 0.37 |
45 | 73931129 | 87207 | 154 | 151 | 0.52 | 0.92 | 0.92 | 0.73 | 0.42 | 0.18 |
46 | 7400000 | 6530 | 35 | 34 | 0.56 | 0.50 | 0.67 | 0.56 | 0.33 | 0.11 |
47 | 861940 | 948 | 19 | 533 | 0.18 | 0.12 | 0.11 | 0.01 | 0.17 | 0.04 |
48 | 133120 | 165 | 24 | 301 | 0.07 | 0.07 | 0.13 | 0.00 | 0.01 | 0.01 |
49 | 746650 | 605 | 10 | 41 | 0.14 | 0.10 | 0.09 | 0.14 | 0.05 | 0.00 |
50 | 13804400 | 14533 | 7 | 655 | 0.02 | 0.02 | 0.03 | 0.00 | 0.00 | 0.04 |
51 | 25598775 | 14127 | 32 | 580 | 0.15 | 0.12 | 0.14 | 0.07 | 0.09 | 0.20 |
52 | 519000 | 735 | 8 | 52 | 0.10 | 0.24 | 0.24 | 0.14 | 0.10 | 0.14 |
53 | 5520000 | 3350 | 15 | 25 | 0.40 | 0.40 | 0.40 | 0.40 | 0.00 | 0.00 |
54 | 21304937 | 3931 | 10 | 94 | 0.29 | 0.42 | 0.25 | 0.02 | 0.12 | 0.37 |
55 | 2736098 | 2503 | 10 | 59 | 0.45 | 0.57 | 0.52 | 0.21 | 0.14 | 0.07 |
56 | 21038425 | 17983 | 45 | 74 | 0.44 | 0.74 | 0.66 | 0.72 | 0.09 | 0.16 |
57 | 21509368 | 24196 | 60 | 463 | 0.23 | 0.30 | 0.24 | 0.28 | 0.09 | 0.14 |
58 | 25718000 | 29291 | 46 | 58 | 0.62 | 0.73 | 0.76 | 0.48 | 0.38 | 0.14 |
59 | 9205120 | 12004 | 33 | 225 | 0.07 | 0.11 | 0.07 | 0.04 | 0.04 | 0.04 |
Vamos inicialmente buscar outliers na série VP. Um meio é usar um scatter. Outra forma é gerar um boxplot.
> # Buscando os outliers
> dispersao <- plot(PADUB ~ VP)
> bp1_vp <- boxplot(VP, names = c("VP"))
> (bp1_vp_out <- bp1_vp$out) # retorna os valores de VP outliers
[1] 105352999 69309029 79804101 110434855 66886216 52673412 58868284
[8] 73931129
> bp2_proportions <- boxplot(PCALC, PADUB, PAGROT, PNIVEL, PPRAGA, PROT, horizontal = TRUE,
+ names = c("PCALC", "PADUB", "PAGROT", "PNIVEL", "PPRAGA", "PROT"))
Um boxplot que julgo mais útil é por meio do pacote car
, que reporta qual a observação atípica.
> library(car)
> Boxplot(VP, label = dados$obs)
[1] 6 7 34 35 37 38 44 45
Vamos utilizar o pacote stargazer para organizar as saídas de resultados. Se a saída fosse apenas pelo comando summary, sairia da forma:
> regressao1 <- lm(log(VP) ~ log(AREA) + log(QMO) + log(PPRAGA) + log(PROT))
> regressao1$AIC <- AIC(regressao1)
> regressao1$BIC <- BIC(regressao1)
> suppressMessages(library(stargazer))
> stargazer(regressao1, title = "Título: Resultado da Regressão OLS", align = TRUE,
+ type = "text", style = "all", keep.stat = c("AIC", "BIC", "rsq", "adj.rsq",
+ "n"))
Título: Resultado da Regressão OLS
===============================================
Dependent variable:
---------------------------
log(VP)
-----------------------------------------------
log(AREA) 1.041***
(0.025)
t = 41.000
p = 0.000
log(QMO) -0.020
(0.039)
t = -0.514
p = 0.610
log(PPRAGA) -0.006
(0.007)
t = -0.904
p = 0.370
log(PROT) -0.018**
(0.008)
t = -2.197
p = 0.033
Constant 6.496***
(0.279)
t = 23.279
p = 0.000
-----------------------------------------------
Observations 59
R2 0.971
Adjusted R2 0.969
Akaike Inf. Crit. 36.063
Bayesian Inf. Crit. 48.528
===============================================
Note: *p<0.1; **p<0.05; ***p<0.01
> summary(regressao1)
Call:
lm(formula = log(VP) ~ log(AREA) + log(QMO) + log(PPRAGA) + log(PROT))
Residuals:
Min 1Q Median 3Q Max
-0.35851 -0.15577 -0.03277 0.08441 1.80533
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.495877 0.279044 23.279 <2e-16 ***
log(AREA) 1.041318 0.025398 41.000 <2e-16 ***
log(QMO) -0.020008 0.038909 -0.514 0.6092
log(PPRAGA) -0.006484 0.007170 -0.904 0.3698
log(PROT) -0.018056 0.008217 -2.197 0.0323 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.3101 on 54 degrees of freedom
Multiple R-squared: 0.9713, Adjusted R-squared: 0.9691
F-statistic: 456.2 on 4 and 54 DF, p-value: < 2.2e-16
> # stargazer(list(regressao1),type='text',style='all', omit.stat = c('f') ) #
> # opcao
> split.screen(c(2, 2))
[1] 1 2 3 4
> screen(1)
> reg1.plot1 <- plot(regressao1, which = 1)
>
> screen(2)
> reg1.plot2 <- plot(regressao1, which = 2)
>
> screen(3)
> reg1.plot3 <- plot(regressao1, which = 3)
>
> screen(4)
> reg1.plot4 <- plot(regressao1, which = 4)
> close.screen(all = TRUE) # exit split-screen mode
> library(car)
> outlierTest(regressao1)
rstudent unadjusted p-value Bonferroni p
54 9.789123 1.7633e-13 1.0404e-11
H0: média dos resíduos =0; Hipótese alternativa: média dos resíduos é diferente de zero.
> # obtendo residuos do modelo
> u.hat_reg1 <- resid(regressao1)
> t.test(u.hat_reg1)
One Sample t-test
data: u.hat_reg1
t = -2.2888e-17, df = 58, p-value = 1
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-0.07798565 0.07798565
sample estimates:
mean of x
-8.917103e-19
Ao verificar um p-value>0.10, não rejeito H0.
> mod51_54 <- lm(log(VP) ~ log(AREA) + log(QMO) + log(PPRAGA) + log(PROT) + D51_54)
> summary(mod51_54)
Call:
lm(formula = log(VP) ~ log(AREA) + log(QMO) + log(PPRAGA) + log(PROT) +
D51_54)
Residuals:
Min 1Q Median 3Q Max
-0.53609 -0.12275 0.02064 0.11632 0.53609
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.521749 0.170251 38.307 < 2e-16 ***
log(AREA) 1.049127 0.015515 67.619 < 2e-16 ***
log(QMO) -0.052553 0.023977 -2.192 0.032811 *
log(PPRAGA) -0.009423 0.004385 -2.149 0.036211 *
log(PROT) -0.018117 0.005013 -3.614 0.000671 ***
D51_54 1.323510 0.137910 9.597 3.48e-13 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1892 on 53 degrees of freedom
Multiple R-squared: 0.9895, Adjusted R-squared: 0.9885
F-statistic: 999.2 on 5 and 53 DF, p-value: < 2.2e-16
> mod51_54$AIC <- AIC(mod51_54)
> mod51_54$BIC <- BIC(mod51_54)
> # suppressMessages(library(stargazer))
> stargazer(list(regressao1, mod51_54), title = "Título: Resultado da Regressão OLS",
+ align = TRUE, type = "text", style = "all", keep.stat = c("AIC", "BIC",
+ "rsq", "adj.rsq", "n"))
Título: Resultado da Regressão OLS
================================================
Dependent variable:
----------------------------
log(VP)
(1) (2)
------------------------------------------------
log(AREA) 1.041*** 1.049***
(0.025) (0.016)
t = 41.000 t = 67.619
p = 0.000 p = 0.000
log(QMO) -0.020 -0.053**
(0.039) (0.024)
t = -0.514 t = -2.192
p = 0.610 p = 0.033
log(PPRAGA) -0.006 -0.009**
(0.007) (0.004)
t = -0.904 t = -2.149
p = 0.370 p = 0.037
log(PROT) -0.018** -0.018***
(0.008) (0.005)
t = -2.197 t = -3.614
p = 0.033 p = 0.001
D51_54 1.324***
(0.138)
t = 9.597
p = 0.000
Constant 6.496*** 6.522***
(0.279) (0.170)
t = 23.279 t = 38.307
p = 0.000 p = 0.000
------------------------------------------------
Observations 59 59
R2 0.971 0.990
Adjusted R2 0.969 0.989
Akaike Inf. Crit. 36.063 -21.358
Bayesian Inf. Crit. 48.528 -6.815
================================================
Note: *p<0.1; **p<0.05; ***p<0.01
> mod51_54incl <- lm(log(VP) ~ log(AREA) + log(QMO) + log(PPRAGA) + log(PROT) +
+ D51_54 + I(log(AREA) * D51_54))
> summary(mod51_54incl)
Call:
lm(formula = log(VP) ~ log(AREA) + log(QMO) + log(PPRAGA) + log(PROT) +
D51_54 + I(log(AREA) * D51_54))
Residuals:
Min 1Q Median 3Q Max
-0.30924 -0.11933 0.00000 0.09946 0.28607
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.422928 0.144040 44.591 < 2e-16 ***
log(AREA) 1.052109 0.013009 80.874 < 2e-16 ***
log(QMO) -0.037739 0.020313 -1.858 0.0688 .
log(PPRAGA) -0.009752 0.003673 -2.655 0.0105 *
log(PROT) -0.018799 0.004201 -4.475 4.19e-05 ***
D51_54 9.003078 1.586565 5.675 6.26e-07 ***
I(log(AREA) * D51_54) -0.862481 0.177712 -4.853 1.15e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1585 on 52 degrees of freedom
Multiple R-squared: 0.9928, Adjusted R-squared: 0.9919
F-statistic: 1191 on 6 and 52 DF, p-value: < 2.2e-16
> mod51_54incl$AIC <- AIC(mod51_54incl)
> mod51_54incl$BIC <- BIC(mod51_54incl)
> # suppressMessages(library(stargazer))
> stargazer(list(mod51_54incl, mod51_54), title = "Título: Resultado da Regressão OLS",
+ align = TRUE, type = "text", style = "all", keep.stat = c("AIC", "BIC",
+ "rsq", "adj.rsq", "n"))
Título: Resultado da Regressão OLS
==================================================
Dependent variable:
----------------------------
log(VP)
(1) (2)
--------------------------------------------------
log(AREA) 1.052*** 1.049***
(0.013) (0.016)
t = 80.874 t = 67.619
p = 0.000 p = 0.000
log(QMO) -0.038* -0.053**
(0.020) (0.024)
t = -1.858 t = -2.192
p = 0.069 p = 0.033
log(PPRAGA) -0.010** -0.009**
(0.004) (0.004)
t = -2.655 t = -2.149
p = 0.011 p = 0.037
log(PROT) -0.019*** -0.018***
(0.004) (0.005)
t = -4.475 t = -3.614
p = 0.00005 p = 0.001
D51_54 9.003*** 1.324***
(1.587) (0.138)
t = 5.675 t = 9.597
p = 0.00000 p = 0.000
I(log(AREA) * D51_54) -0.862***
(0.178)
t = -4.853
p = 0.00002
Constant 6.423*** 6.522***
(0.144) (0.170)
t = 44.591 t = 38.307
p = 0.000 p = 0.000
--------------------------------------------------
Observations 59 59
R2 0.993 0.990
Adjusted R2 0.992 0.989
Akaike Inf. Crit. -41.400 -21.358
Bayesian Inf. Crit. -24.780 -6.815
==================================================
Note: *p<0.1; **p<0.05; ***p<0.01
> u.hat <- resid(regressao1)
> library(car)
> residualPlots(regressao1)
Test stat Pr(>|Test stat|)
log(AREA) -1.5989 0.1158
log(QMO) -0.7432 0.4606
log(PPRAGA) -0.3127 0.7558
log(PROT) -0.5945 0.5547
Tukey test -1.5889 0.1121
Olhando o normal QQPlot
> reg1.plot2 <- plot(regressao1, which = 2)
Outra opção de gráfico QQPlot pelo pacote car
:
> car::qqPlot(regressao1)
[1] 51 54
H0: resíduos sao normais
> library(tseries)
> JB.reg1 <- jarque.bera.test(u.hat)
> JB.reg1
Jarque Bera Test
data: u.hat
X-squared = 1224.2, df = 2, p-value < 2.2e-16
> # histograma
> hist.reg1 <- hist(u.hat, freq = FALSE)
> # adicionar curva normal teorica :::add normal curve
> curve(dnorm, add = TRUE, col = "red")
> # Pacote com alguns testes
> library(nortest)
> # Testes para a regressao 1
> t1.1 <- ks.test(u.hat, "pnorm", mean(u.hat), summary(regressao1)$sigma) # KS
> t2.1 <- lillie.test(u.hat) # Lilliefors
> t3.1 <- cvm.test(u.hat) # Cramér-von Mises
> t4.1 <- shapiro.test(u.hat) # Shapiro-Wilk
> t5.1 <- sf.test(u.hat) # Shapiro-Francia
> t6.1 <- ad.test(u.hat) # Anderson-Darling
> # Tabela de resultados
> testes <- c(t1.1$method, t2.1$method, t3.1$method, t4.1$method, t5.1$method,
+ t6.1$method)
> estt <- as.numeric(c(t1.1$statistic, t2.1$statistic, t3.1$statistic, t4.1$statistic,
+ t5.1$statistic, t6.1$statistic))
> valorp <- c(t1.1$p.value, t2.1$p.value, t3.1$p.value, t4.1$p.value, t5.1$p.value,
+ t6.1$p.value)
> resultados <- cbind(estt, valorp)
> rownames(resultados) <- testes
> colnames(resultados) <- c("Estatística", "prob.")
> kable(resultados, digits = 3)
Estatística | prob. | |
---|---|---|
One-sample Kolmogorov-Smirnov test | 0.190 | 0.025 |
Lilliefors (Kolmogorov-Smirnov) normality test | 0.182 | 0.000 |
Cramer-von Mises normality test | 0.595 | 0.000 |
Shapiro-Wilk normality test | 0.650 | 0.000 |
Shapiro-Francia normality test | 0.629 | 0.000 |
Anderson-Darling normality test | 3.880 | 0.000 |
Seja a regressão com a dummy alterando os interceptos das observações 51 e 54.
> reg.dum <- lm(log(VP) ~ log(AREA) + log(QMO) + log(PPRAGA) + log(PROT) + D51_54,
+ data = dados)
> summary(reg.dum)
Call:
lm(formula = log(VP) ~ log(AREA) + log(QMO) + log(PPRAGA) + log(PROT) +
D51_54, data = dados)
Residuals:
Min 1Q Median 3Q Max
-0.53609 -0.12275 0.02064 0.11632 0.53609
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.521749 0.170251 38.307 < 2e-16 ***
log(AREA) 1.049127 0.015515 67.619 < 2e-16 ***
log(QMO) -0.052553 0.023977 -2.192 0.032811 *
log(PPRAGA) -0.009423 0.004385 -2.149 0.036211 *
log(PROT) -0.018117 0.005013 -3.614 0.000671 ***
D51_54 1.323510 0.137910 9.597 3.48e-13 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1892 on 53 degrees of freedom
Multiple R-squared: 0.9895, Adjusted R-squared: 0.9885
F-statistic: 999.2 on 5 and 53 DF, p-value: < 2.2e-16
Faremos os testes de normalidade para esta última regressão com dummies:
> require(nortest)
> # teste de normalidade para regressao com a dummy no intercepto agora para a
> # regressao
> u.hat <- resid(reg.dum)
> # Testes
> t1.1 <- ks.test(u.hat, "pnorm", mean(u.hat), summary(reg.dum)$sigma) # KS
> t2.1 <- lillie.test(u.hat) # Lilliefors
> t3.1 <- cvm.test(u.hat) # Cramér-von Mises
> t4.1 <- shapiro.test(u.hat) # Shapiro-Wilk
> t5.1 <- sf.test(u.hat) # Shapiro-Francia
> t6.1 <- ad.test(u.hat) # Anderson-Darling
> # Tabela de resultados
> testes <- c(t1.1$method, t2.1$method, t3.1$method, t4.1$method, t5.1$method,
+ t6.1$method)
> estt <- as.numeric(c(t1.1$statistic, t2.1$statistic, t3.1$statistic, t4.1$statistic,
+ t5.1$statistic, t6.1$statistic))
> valorp <- c(t1.1$p.value, t2.1$p.value, t3.1$p.value, t4.1$p.value, t5.1$p.value,
+ t6.1$p.value)
> resultados <- cbind(estt, valorp)
> rownames(resultados) <- testes
> colnames(resultados) <- c("Estatística", "prob.")
> kable(resultados, digits = 3)
Estatística | prob. | |
---|---|---|
One-sample Kolmogorov-Smirnov test | 0.062 | 0.967 |
Lilliefors (Kolmogorov-Smirnov) normality test | 0.059 | 0.873 |
Cramer-von Mises normality test | 0.040 | 0.679 |
Shapiro-Wilk normality test | 0.979 | 0.391 |
Shapiro-Francia normality test | 0.971 | 0.146 |
Anderson-Darling normality test | 0.343 | 0.478 |