Link to rpubs: https://rpubs.com/theodorapo/take-home_ex-04
Brazil is the world’s fifth-largest country by area and the sixth most populous. The World Bank classifies it as an upper-middle-income economy, and it is considered an advanced emerging economy. As a developing country, Brazil has the largest share of global wealth in Latin America. It has the ninth-largest GDP in the world in nominal terms and the eighth-largest by PPP. Behind these impressive figures, however, the spatial development of Brazil is highly unequal. The GDP per capita of the poorest municipality is R$3190.6, whereas that of the richest municipality is R$314638. Half of the municipalities have a GDP per capita below R$16000, and the top 25% of municipalities have R$26155 or more.
In this take-home exercise, we will determine the factors affecting the unequal development of Brazil at the municipality level using the data provided. The specific tasks of the analysis are as follows:
1. Prepare a choropleth map showing the distribution of GDP per capita in 2016 at the municipality level.
2. Calibrate an explanatory model of the factors affecting GDP per capita at the municipality level using multiple linear regression.
3. Prepare a choropleth map showing the distribution of the residuals of the GDP per capita model.
4. Calibrate an explanatory model of the factors affecting GDP per capita at the municipality level using geographically weighted regression.
5. Prepare a series of choropleth maps showing the outputs of the geographically weighted regression model.
We are provided with the first two data sets; the last data set is retrieved from the geobr package in R.
1. BRAZIL_CITIES.csv. This data file consists of 81 columns and 5573 rows. Each row represents one municipality.
2. Data_Dictionary.csv. This file provides metadata for each column in BRAZIL_CITIES.csv.
3. 2016 municipality boundary file.
# Packages used in this exercise
packages = c('olsrr', 'corrplot', 'ggpubr', 'sf', 'spdep', 'GWmodel', 'tmap', 'tidyverse', 'geobr', 'readr', 'anchors', 'DT', 'fitdistrplus', 'Orcs')
for (p in packages){
  # Install the package only if it cannot be loaded
  if(!require(p, character.only = TRUE)){
    install.packages(p)
  }
  # Attach the package to the session
  library(p, character.only = TRUE)
}
## Loading required package: olsrr
##
## Attaching package: 'olsrr'
## The following object is masked from 'package:datasets':
##
## rivers
## Loading required package: corrplot
## corrplot 0.84 loaded
## Loading required package: ggpubr
## Loading required package: ggplot2
## Loading required package: sf
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
## Loading required package: spdep
## Loading required package: sp
## Loading required package: spData
## To access larger datasets in this package, install the spDataLarge
## package with: `install.packages('spDataLarge',
## repos='https://nowosad.github.io/drat/', type='source')`
## Loading required package: GWmodel
## Loading required package: maptools
## Checking rgeos availability: FALSE
## Note: when rgeos is not available, polygon geometry computations in maptools depend on gpclib,
## which has a restricted licence. It is disabled by default;
## to enable gpclib, type gpclibPermit()
## Loading required package: robustbase
## Loading required package: Rcpp
## Loading required package: spatialreg
## Loading required package: Matrix
## Registered S3 methods overwritten by 'spatialreg':
## method from
## residuals.stsls spdep
## deviance.stsls spdep
## coef.stsls spdep
## print.stsls spdep
## summary.stsls spdep
## print.summary.stsls spdep
## residuals.gmsar spdep
## deviance.gmsar spdep
## coef.gmsar spdep
## fitted.gmsar spdep
## print.gmsar spdep
## summary.gmsar spdep
## print.summary.gmsar spdep
## print.lagmess spdep
## summary.lagmess spdep
## print.summary.lagmess spdep
## residuals.lagmess spdep
## deviance.lagmess spdep
## coef.lagmess spdep
## fitted.lagmess spdep
## logLik.lagmess spdep
## fitted.SFResult spdep
## print.SFResult spdep
## fitted.ME_res spdep
## print.ME_res spdep
## print.lagImpact spdep
## plot.lagImpact spdep
## summary.lagImpact spdep
## HPDinterval.lagImpact spdep
## print.summary.lagImpact spdep
## print.sarlm spdep
## summary.sarlm spdep
## residuals.sarlm spdep
## deviance.sarlm spdep
## coef.sarlm spdep
## vcov.sarlm spdep
## fitted.sarlm spdep
## logLik.sarlm spdep
## anova.sarlm spdep
## predict.sarlm spdep
## print.summary.sarlm spdep
## print.sarlm.pred spdep
## as.data.frame.sarlm.pred spdep
## residuals.spautolm spdep
## deviance.spautolm spdep
## coef.spautolm spdep
## fitted.spautolm spdep
## print.spautolm spdep
## summary.spautolm spdep
## logLik.spautolm spdep
## print.summary.spautolm spdep
## print.WXImpact spdep
## summary.WXImpact spdep
## print.summary.WXImpact spdep
## predict.SLX spdep
##
## Attaching package: 'spatialreg'
## The following objects are masked from 'package:spdep':
##
## anova.sarlm, as.spam.listw, as_dgRMatrix_listw, as_dsCMatrix_I,
## as_dsCMatrix_IrW, as_dsTMatrix_listw, bptest.sarlm, can.be.simmed,
## cheb_setup, coef.gmsar, coef.sarlm, coef.spautolm, coef.stsls,
## create_WX, deviance.gmsar, deviance.sarlm, deviance.spautolm,
## deviance.stsls, do_ldet, eigen_pre_setup, eigen_setup, eigenw,
## errorsarlm, fitted.gmsar, fitted.ME_res, fitted.sarlm,
## fitted.SFResult, fitted.spautolm, get.ClusterOption,
## get.coresOption, get.mcOption, get.VerboseOption,
## get.ZeroPolicyOption, GMargminImage, GMerrorsar, griffith_sone,
## gstsls, Hausman.test, HPDinterval.lagImpact, impacts, intImpacts,
## Jacobian_W, jacobianSetup, l_max, lagmess, lagsarlm, lextrB,
## lextrS, lextrW, lmSLX, logLik.sarlm, logLik.spautolm, LR.sarlm,
## LR1.sarlm, LR1.spautolm, LU_prepermutate_setup, LU_setup,
## Matrix_J_setup, Matrix_setup, mcdet_setup, MCMCsamp, ME, mom_calc,
## mom_calc_int2, moments_setup, powerWeights, predict.sarlm,
## predict.SLX, print.gmsar, print.ME_res, print.sarlm,
## print.sarlm.pred, print.SFResult, print.spautolm, print.stsls,
## print.summary.gmsar, print.summary.sarlm, print.summary.spautolm,
## print.summary.stsls, residuals.gmsar, residuals.sarlm,
## residuals.spautolm, residuals.stsls, sacsarlm, SE_classic_setup,
## SE_interp_setup, SE_whichMin_setup, set.ClusterOption,
## set.coresOption, set.mcOption, set.VerboseOption,
## set.ZeroPolicyOption, similar.listw, spam_setup, spam_update_setup,
## SpatialFiltering, spautolm, spBreg_err, spBreg_lag, spBreg_sac,
## stsls, subgraph_eigenw, summary.gmsar, summary.sarlm,
## summary.spautolm, summary.stsls, trW, vcov.sarlm, Wald1.sarlm
## Welcome to GWmodel version 2.1-4.
## The new version of GWmodel 2.1-4 now is ready
## Loading required package: tmap
## Loading required package: tidyverse
## -- Attaching packages ---------------------------------------------------------------------- tidyverse 1.3.0 --
## v tibble 3.0.1 v dplyr 0.8.5
## v tidyr 1.1.0 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## v purrr 0.3.4
## -- Conflicts ------------------------------------------------------------------------- tidyverse_conflicts() --
## x tidyr::expand() masks Matrix::expand()
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## x tidyr::pack() masks Matrix::pack()
## x tidyr::unpack() masks Matrix::unpack()
## Loading required package: geobr
## Loading required package: anchors
## Loading required package: rgenoud
## ## rgenoud (Version 5.8-3.0, Build Date: 2019-01-22)
## ## See http://sekhon.berkeley.edu/rgenoud for additional documentation.
## ## Please cite software as:
## ## Walter Mebane, Jr. and Jasjeet S. Sekhon. 2011.
## ## ``Genetic Optimization Using Derivatives: The rgenoud package for R.''
## ## Journal of Statistical Software, 42(11): 1-26.
## ##
##
## Loading required package: MASS
##
## Attaching package: 'MASS'
##
## The following object is masked from 'package:dplyr':
##
## select
##
## The following object is masked from 'package:olsrr':
##
## cement
##
##
## ## anchors (Version 3.0-8, Build Date: 2014-02-24)
## ## See http://wand.stanford.edu/anchors for additional documentation and support.
##
##
## Loading required package: DT
## Loading required package: fitdistrplus
## Loading required package: survival
##
## Attaching package: 'survival'
##
## The following object is masked from 'package:robustbase':
##
## heart
##
## Loading required package: Orcs
## Loading required package: raster
##
## Attaching package: 'raster'
##
## The following objects are masked from 'package:MASS':
##
## area, select
##
## The following object is masked from 'package:dplyr':
##
## select
##
## The following object is masked from 'package:tidyr':
##
## extract
##
## The following object is masked from 'package:ggpubr':
##
## rotate
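The loading loop above prints a long stream of startup messages. As a minimal sketch of a quieter variant, the loop can check availability with requireNamespace() and wrap library() in suppressPackageStartupMessages(). The short package list here is a placeholder made of base packages so that the sketch runs anywhere; it is not the list used in this exercise.

```r
# Placeholder package list (assumption): base packages only, so nothing is installed.
packages <- c("stats", "utils")

for (p in packages) {
  # requireNamespace() checks availability without attaching the package.
  if (!requireNamespace(p, quietly = TRUE)) {
    install.packages(p)
  }
  # Attach the package while hiding its startup messages.
  suppressPackageStartupMessages(library(p, character.only = TRUE))
}

# TRUE for every package that is now on the search path.
attached <- paste0("package:", packages) %in% search()
```

Masking notes such as the ones for select() are still worth reading once, since they explain why dplyr::select() may need to be called with its namespace prefix.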
Brazil 2016 municipality boundary
mun <- read_municipality(code_muni= "all", year=2016)
## Using year 2016
## Loading data for the whole country. This might take a few minutes.
Plot the boundary of Brazil
no_axis <- theme(axis.title=element_blank(),
                 axis.text=element_blank(),
                 axis.ticks=element_blank())
ggplot() +
  geom_sf(data=mun, fill="#2D3E50", color="#FEBF57", size=.15, show.legend = FALSE) +
  labs(subtitle="Brazil", size=8) +
  theme_minimal() +
  no_axis
Check the CRS of mun
st_crs(mun)
## Coordinate Reference System:
## User input: SIRGAS 2000
## wkt:
## GEOGCRS["SIRGAS 2000",
## DATUM["Sistema de Referencia Geocentrico para las AmericaS 2000",
## ELLIPSOID["GRS 1980",6378137,298.257222101,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## USAGE[
## SCOPE["unknown"],
## AREA["Latin America - SIRGAS 2000 by country"],
## BBOX[-59.87,-122.19,32.72,-25.28]],
## ID["EPSG",4674]]
The CRS of mun is SIRGAS 2000 (EPSG:4674), a geographic coordinate system with units in degrees.
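The same two facts can be read programmatically instead of from the WKT printout. The sketch below builds a single hypothetical point (using the first LONG/LAT pair from BRAZIL_CITIES.csv) and asks sf for its EPSG code and whether the CRS is geographic; the object names are ours.

```r
library(sf)

# One point in EPSG:4674; coordinates are given as (longitude, latitude).
pt <- st_sfc(st_point(c(-49.44055, -16.758812)), crs = 4674)

epsg_code     <- st_crs(pt)$epsg      # numeric EPSG code of the CRS
is_geographic <- st_is_longlat(pt)    # TRUE when coordinates are in degrees
```

Applied to mun, the same calls would be st_crs(mun)$epsg and st_is_longlat(mun).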
We will check if the geometry in mun is valid
all(st_is_valid(mun))
## [1] FALSE
Since the result is FALSE, at least one geometry is invalid, so we repair the geometries with st_make_valid()
mun <- st_make_valid(mun)
Now, check again if geometry is valid
all(st_is_valid(mun))
## [1] TRUE
Next, we will check for any empty geometries
any(is.na(st_dimension(mun)))
## [1] FALSE
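To make the three checks above concrete, here is a minimal sketch on a toy layer rather than on mun. The "bow-tie" polygon is a standard example of a self-intersecting (invalid) ring; the layer, its ids and the absence of a CRS are assumptions made only for this illustration.

```r
library(sf)

# A "bow-tie" ring: the edges (0,0)-(1,1) and (1,0)-(0,1) cross each other,
# so the polygon is self-intersecting and therefore invalid.
bowtie <- st_polygon(list(rbind(
  c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0)
)))
# A plain unit square for comparison.
square <- st_polygon(list(rbind(
  c(2, 0), c(3, 0), c(3, 1), c(2, 1), c(2, 0)
)))

# Hypothetical toy layer (no CRS set, so planar geometry rules apply).
toy <- st_sf(id = c("bowtie", "square"), geometry = st_sfc(bowtie, square))

valid_before <- st_is_valid(toy)       # one logical flag per feature
toy_fixed    <- st_make_valid(toy)     # repairs the invalid feature
valid_after  <- st_is_valid(toy_fixed)
no_empty     <- !any(is.na(st_dimension(toy_fixed)))  # no empty geometries
```

Looking at the per-feature flags, rather than only at all(), shows which municipalities were affected, for example with which(!st_is_valid(mun)).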
brazil <- read_delim("data/aspatial/BRAZIL_CITIES.csv", delim = ";")
## Parsed with column specification:
## cols(
## .default = col_double(),
## CITY = col_character(),
## STATE = col_character(),
## AREA = col_number(),
## REGIAO_TUR = col_character(),
## CATEGORIA_TUR = col_character(),
## RURAL_URBAN = col_character(),
## GVA_MAIN = col_character()
## )
## See spec(...) for full column specifications.
After importing the data, we need to check that it was imported correctly. We will use glimpse() to inspect the column names, types and first values.
glimpse(brazil)
## Rows: 5,573
## Columns: 81
## $ CITY <chr> "Abadia De Goiás", "Abadia Dos Dourados", ...
## $ STATE <chr> "GO", "MG", "GO", "MG", "PA", "CE", "BA", ...
## $ CAPITAL <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
## $ IBGE_RES_POP <dbl> 6876, 6704, 15757, 22690, 141100, 10496, 8...
## $ IBGE_RES_POP_BRAS <dbl> 6876, 6704, 15609, 22690, 141040, 10496, 8...
## $ IBGE_RES_POP_ESTR <dbl> 0, 0, 148, 0, 60, 0, 0, 0, 0, 0, 0, 16, 17...
## $ IBGE_DU <dbl> 2137, 2328, 4655, 7694, 31061, 2791, 2572,...
## $ IBGE_DU_URBAN <dbl> 1546, 1481, 3233, 6667, 19057, 1251, 1193,...
## $ IBGE_DU_RURAL <dbl> 591, 847, 1422, 1027, 12004, 1540, 1379, 1...
## $ IBGE_POP <dbl> 5300, 4154, 10656, 18464, 82956, 4538, 372...
## $ IBGE_1 <dbl> 69, 38, 139, 176, 1354, 98, 37, 167, 69, 1...
## $ `IBGE_1-4` <dbl> 318, 207, 650, 856, 5567, 323, 156, 733, 3...
## $ `IBGE_5-9` <dbl> 438, 260, 894, 1233, 7618, 421, 263, 978, ...
## $ `IBGE_10-14` <dbl> 517, 351, 1087, 1539, 8905, 483, 277, 927,...
## $ `IBGE_15-59` <dbl> 3542, 2709, 6896, 11979, 53516, 2631, 2319...
## $ `IBGE_60+` <dbl> 416, 589, 990, 2681, 5996, 582, 673, 803, ...
## $ IBGE_PLANTED_AREA <dbl> 319, 4479, 10307, 1862, 25200, 2598, 895, ...
## $ `IBGE_CROP_PRODUCTION_$` <dbl> 1843, 18017, 33085, 7502, 700872, 5234, 39...
## $ `IDHM Ranking 2010` <dbl> 1689, 2207, 2202, 1994, 3530, 3522, 4086, ...
## $ IDHM <dbl> 0.708, 0.690, 0.690, 0.698, 0.628, 0.628, ...
## $ IDHM_Renda <dbl> 0.687, 0.693, 0.671, 0.720, 0.579, 0.540, ...
## $ IDHM_Longevidade <dbl> 0.830, 0.839, 0.841, 0.848, 0.798, 0.748, ...
## $ IDHM_Educacao <dbl> 0.622, 0.563, 0.579, 0.556, 0.537, 0.612, ...
## $ LONG <dbl> -49.44055, -47.39683, -48.71881, -45.44619...
## $ LAT <dbl> -16.758812, -18.487565, -16.182672, -19.15...
## $ ALT <dbl> 893.60, 753.12, 1017.55, 644.74, 10.12, 40...
## $ PAY_TV <dbl> 360, 77, 227, 1230, 3389, 29, 952, 51, 55,...
## $ FIXED_PHONES <dbl> 842, 296, 720, 1716, 1218, 34, 335, 222, 3...
## $ AREA <dbl> 147.26, 881.06, 1045.13, 1817.07, 1610.65,...
## $ REGIAO_TUR <chr> NA, "Caminhos Do Cerrado", "Região Turísti...
## $ CATEGORIA_TUR <chr> NA, "D", "C", "D", "D", NA, "D", NA, NA, "...
## $ ESTIMATED_POP <dbl> 8583, 6972, 19614, 23223, 156292, 11663, 8...
## $ RURAL_URBAN <chr> "Urbano", "Rural Adjacente", "Rural Adjace...
## $ GVA_AGROPEC <dbl> 6.20, 50524.57, 42.84, 113824.60, 140463.7...
## $ GVA_INDUSTRY <dbl> 27991.25, 25917.70, 16728.30, 31002.62, 58...
## $ GVA_SERVICES <dbl> 74750.32, 62689.23, 138198.58, 172.33, 468...
## $ GVA_PUBLIC <dbl> 36915.04, 28083.79, 63396.20, 86081.41, 48...
## $ ` GVA_TOTAL ` <dbl> 145857.60, 167215.28, 261161.91, 403241.27...
## $ TAXES <dbl> 20554.20, 12873.50, 26822.58, 26994.09, 95...
## $ GDP <dbl> 166.41, 180.09, 287984.49, 430235.36, 1249...
## $ POP_GDP <dbl> 8053, 7037, 18427, 23574, 151934, 11483, 9...
## $ GDP_CAPITA <dbl> 20664.57, 25591.70, 15628.40, 18250.42, 82...
## $ GVA_MAIN <chr> "Demais serviços", "Demais serviços", "Dem...
## $ MUN_EXPENDIT <dbl> 28227691, 17909274, 37513019, NA, NA, NA, ...
## $ COMP_TOT <dbl> 284, 476, 288, 621, 931, 86, 191, 87, 285,...
## $ COMP_A <dbl> 5, 6, 5, 18, 4, 1, 6, 2, 5, 2, 0, 8, 3, 1,...
## $ COMP_B <dbl> 1, 6, 9, 1, 2, 0, 0, 0, 0, 0, 0, 2, 2, 0, ...
## $ COMP_C <dbl> 56, 30, 26, 40, 43, 4, 8, 3, 20, 4, 9, 40,...
## $ COMP_D <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, ...
## $ COMP_E <dbl> 2, 2, 2, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 2, ...
## $ COMP_F <dbl> 29, 34, 7, 20, 27, 6, 4, 0, 10, 2, 0, 25, ...
## $ COMP_G <dbl> 110, 190, 117, 303, 500, 48, 97, 71, 133, ...
## $ COMP_H <dbl> 26, 70, 12, 62, 16, 2, 5, 0, 18, 8, 1, 67,...
## $ COMP_I <dbl> 4, 28, 57, 30, 31, 10, 5, 1, 14, 3, 0, 25,...
## $ COMP_J <dbl> 5, 11, 2, 9, 6, 2, 3, 1, 8, 1, 1, 9, 5, 14...
## $ COMP_K <dbl> 0, 0, 1, 6, 1, 0, 1, 0, 0, 1, 0, 4, 3, 3, ...
## $ COMP_L <dbl> 2, 4, 0, 4, 1, 0, 0, 0, 4, 0, 0, 7, 4, 4, ...
## $ COMP_M <dbl> 10, 15, 7, 28, 22, 2, 5, 0, 11, 4, 2, 26, ...
## $ COMP_N <dbl> 12, 29, 15, 27, 16, 3, 5, 1, 26, 0, 1, 16,...
## $ COMP_O <dbl> 4, 2, 3, 2, 2, 2, 2, 2, 2, 2, 6, 2, 4, 2, ...
## $ COMP_P <dbl> 6, 9, 11, 15, 155, 0, 8, 0, 8, 1, 6, 14, 1...
## $ COMP_Q <dbl> 6, 14, 5, 19, 33, 2, 1, 2, 9, 3, 0, 13, 22...
## $ COMP_R <dbl> 1, 6, 1, 9, 15, 0, 2, 0, 4, 0, 0, 4, 6, 6,...
## $ COMP_S <dbl> 5, 19, 8, 27, 56, 4, 38, 4, 12, 3, 4, 23, ...
## $ COMP_T <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
## $ COMP_U <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
## $ HOTELS <dbl> NA, NA, 1, NA, NA, NA, 1, NA, NA, NA, NA, ...
## $ BEDS <dbl> NA, NA, 34, NA, NA, NA, 24, NA, NA, NA, NA...
## $ Pr_Agencies <dbl> NA, NA, 1, 2, 2, NA, NA, 1, 0, 0, 0, 1, 0,...
## $ Pu_Agencies <dbl> NA, NA, 1, 2, 4, NA, NA, 0, 1, 1, 1, 2, 1,...
## $ Pr_Bank <dbl> NA, NA, 1, 2, 2, NA, NA, 1, 0, 0, 0, 1, 0,...
## $ Pu_Bank <dbl> NA, NA, 1, 2, 4, NA, NA, 0, 1, 1, 1, 2, 1,...
## $ Pr_Assets <dbl> NA, NA, 33724584, 44974716, 76181384, NA, ...
## $ Pu_Assets <dbl> NA, NA, 67091904, 371922572, 800078483, NA...
## $ Cars <dbl> 2158, 2227, 2838, 6928, 5277, 553, 896, 61...
## $ Motorcycles <dbl> 1246, 1142, 1426, 2953, 25661, 1674, 696, ...
## $ Wheeled_tractor <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, ...
## $ UBER <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ MAC <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ `WAL-MART` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ POST_OFFICES <dbl> 1, 1, 3, 4, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, ...
head(brazil$LONG)
## [1] -49.44055 -47.39683 -48.71881 -45.44619 -48.88440 -39.04755
head(brazil$LAT)
## [1] -16.758812 -18.487565 -16.182672 -19.155848 -1.723470 -7.356977
Convert columns to factor types
brazil$STATE <- factor(brazil$STATE)
brazil$CAPITAL <- factor(brazil$CAPITAL)
brazil$REGIAO_TUR <- factor(brazil$REGIAO_TUR)
brazil$CATEGORIA_TUR <- factor(brazil$CATEGORIA_TUR)
brazil$UBER <- factor(brazil$UBER)
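The five assignments above can also be written as one step over a vector of column names. This is a base R sketch on a hypothetical miniature of the brazil table; the toy values are made up.

```r
# Hypothetical miniature of the brazil table.
toy <- data.frame(
  CITY    = c("A", "B", "C"),
  STATE   = c("GO", "MG", "GO"),
  CAPITAL = c(0, 1, 0),
  UBER    = c(NA, 1, NA),
  stringsAsFactors = FALSE
)

# Convert the listed columns to factors in one pass.
factor_cols <- c("STATE", "CAPITAL", "UBER")
toy[factor_cols] <- lapply(toy[factor_cols], factor)

# First class of every column, to confirm the conversion.
col_classes <- vapply(toy, function(x) class(x)[1], character(1))
# NA is not turned into a level by default.
uber_levels <- levels(toy$UBER)
```

Because NA is not a level, a column such as UBER ends up with the single level "1", which matches the `1 : 125 / NA's: 5448` pattern in the summary output.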
Check the summary of brazil
summary(brazil)
## CITY STATE CAPITAL IBGE_RES_POP
## Length:5573 MG : 853 0:5546 Min. : 805
## Class :character SP : 645 1: 27 1st Qu.: 5235
## Mode :character RS : 498 Median : 10934
## BA : 418 Mean : 34278
## PR : 399 3rd Qu.: 23424
## SC : 295 Max. :11253503
## (Other):2465 NA's :8
## IBGE_RES_POP_BRAS IBGE_RES_POP_ESTR IBGE_DU IBGE_DU_URBAN
## Min. : 805 Min. : 0.0 Min. : 239 Min. : 60
## 1st Qu.: 5230 1st Qu.: 0.0 1st Qu.: 1572 1st Qu.: 874
## Median : 10926 Median : 0.0 Median : 3174 Median : 1846
## Mean : 34200 Mean : 77.5 Mean : 10303 Mean : 8859
## 3rd Qu.: 23390 3rd Qu.: 10.0 3rd Qu.: 6726 3rd Qu.: 4624
## Max. :11133776 Max. :119727.0 Max. :3576148 Max. :3548433
## NA's :8 NA's :8 NA's :10 NA's :10
## IBGE_DU_RURAL IBGE_POP IBGE_1 IBGE_1-4
## Min. : 3 Min. : 174 Min. : 0.0 Min. : 5
## 1st Qu.: 487 1st Qu.: 2801 1st Qu.: 38.0 1st Qu.: 158
## Median : 931 Median : 6170 Median : 92.0 Median : 376
## Mean : 1463 Mean : 27595 Mean : 383.3 Mean : 1544
## 3rd Qu.: 1832 3rd Qu.: 15302 3rd Qu.: 232.0 3rd Qu.: 951
## Max. :33809 Max. :10463636 Max. :129464.0 Max. :514794
## NA's :81 NA's :8 NA's :8 NA's :8
## IBGE_5-9 IBGE_10-14 IBGE_15-59 IBGE_60+
## Min. : 7 Min. : 12 Min. : 94 Min. : 29
## 1st Qu.: 220 1st Qu.: 259 1st Qu.: 1734 1st Qu.: 341
## Median : 516 Median : 588 Median : 3841 Median : 722
## Mean : 2069 Mean : 2381 Mean : 18212 Mean : 3004
## 3rd Qu.: 1300 3rd Qu.: 1478 3rd Qu.: 9628 3rd Qu.: 1724
## Max. :684443 Max. :783702 Max. :7058221 Max. :1293012
## NA's :8 NA's :8 NA's :8 NA's :8
## IBGE_PLANTED_AREA IBGE_CROP_PRODUCTION_$ IDHM Ranking 2010 IDHM
## Min. : 0.0 Min. : 0 Min. : 1 Min. :0.4180
## 1st Qu.: 910.2 1st Qu.: 2326 1st Qu.:1392 1st Qu.:0.5990
## Median : 3471.5 Median : 13846 Median :2783 Median :0.6650
## Mean : 14179.9 Mean : 57384 Mean :2783 Mean :0.6592
## 3rd Qu.: 11194.2 3rd Qu.: 55619 3rd Qu.:4174 3rd Qu.:0.7180
## Max. :1205669.0 Max. :3274885 Max. :5565 Max. :0.8620
## NA's :3 NA's :3 NA's :8 NA's :8
## IDHM_Renda IDHM_Longevidade IDHM_Educacao LONG
## Min. :0.4000 Min. :0.6720 Min. :0.2070 Min. :-72.92
## 1st Qu.:0.5720 1st Qu.:0.7690 1st Qu.:0.4900 1st Qu.:-50.87
## Median :0.6540 Median :0.8080 Median :0.5600 Median :-46.52
## Mean :0.6429 Mean :0.8016 Mean :0.5591 Mean :-46.23
## 3rd Qu.:0.7070 3rd Qu.:0.8360 3rd Qu.:0.6310 3rd Qu.:-41.40
## Max. :0.8910 Max. :0.8940 Max. :0.8250 Max. :-32.44
## NA's :8 NA's :8 NA's :8 NA's :9
## LAT ALT PAY_TV FIXED_PHONES
## Min. :-33.688 Min. : 0.0 Min. : 1 Min. : 3
## 1st Qu.:-22.838 1st Qu.: 169.8 1st Qu.: 88 1st Qu.: 119
## Median :-18.089 Median : 406.5 Median : 247 Median : 327
## Mean :-16.444 Mean : 893.8 Mean : 3094 Mean : 6567
## 3rd Qu.: -8.489 3rd Qu.: 628.9 3rd Qu.: 815 3rd Qu.: 1151
## Max. : 4.585 Max. :874579.0 Max. :2047668 Max. :5543127
## NA's :9 NA's :9 NA's :3 NA's :3
## AREA REGIAO_TUR CATEGORIA_TUR
## Min. : 3.57 Corredores Das Águas: 59 A : 51
## 1st Qu.: 204.44 Vale Do Contestado : 45 B : 168
## Median : 416.59 Amazônia Atlântica : 40 C : 521
## Mean : 1517.44 Araguaia-Tocantins : 39 D :1892
## 3rd Qu.: 1026.57 Cariri : 37 E : 653
## Max. :159533.33 (Other) :3065 NA's:2288
## NA's :3 NA's :2288
## ESTIMATED_POP RURAL_URBAN GVA_AGROPEC GVA_INDUSTRY
## Min. : 786 Length:5573 Min. : 0 Min. : 1
## 1st Qu.: 5454 Class :character 1st Qu.: 4189 1st Qu.: 1726
## Median : 11590 Mode :character Median : 20426 Median : 7424
## Mean : 37432 Mean : 47271 Mean : 175928
## 3rd Qu.: 25296 3rd Qu.: 51227 3rd Qu.: 41022
## Max. :12176866 Max. :1402282 Max. :63306755
## NA's :3 NA's :3 NA's :3
## GVA_SERVICES GVA_PUBLIC GVA_TOTAL TAXES
## Min. : 2 Min. : 7 Min. : 17 Min. : -14159
## 1st Qu.: 10112 1st Qu.: 17267 1st Qu.: 42253 1st Qu.: 1305
## Median : 31211 Median : 35866 Median : 119492 Median : 5100
## Mean : 489451 Mean : 123768 Mean : 832987 Mean : 118864
## 3rd Qu.: 115406 3rd Qu.: 89245 3rd Qu.: 313963 3rd Qu.: 22197
## Max. :464656988 Max. :41902893 Max. :569910503 Max. :117125387
## NA's :3 NA's :3 NA's :3 NA's :3
## GDP POP_GDP GDP_CAPITA GVA_MAIN
## Min. : 15 Min. : 815 Min. : 3191 Length:5573
## 1st Qu.: 43709 1st Qu.: 5483 1st Qu.: 9058 Class :character
## Median : 125153 Median : 11578 Median : 15870 Mode :character
## Mean : 954584 Mean : 36998 Mean : 21126
## 3rd Qu.: 329539 3rd Qu.: 25085 3rd Qu.: 26155
## Max. :687035890 Max. :12038175 Max. :314638
## NA's :3 NA's :3 NA's :3
## MUN_EXPENDIT COMP_TOT COMP_A COMP_B
## Min. :1.421e+06 Min. : 6.0 Min. : 0.00 Min. : 0.000
## 1st Qu.:1.573e+07 1st Qu.: 68.0 1st Qu.: 1.00 1st Qu.: 0.000
## Median :2.746e+07 Median : 162.0 Median : 2.00 Median : 0.000
## Mean :1.043e+08 Mean : 906.8 Mean : 18.25 Mean : 1.852
## 3rd Qu.:5.666e+07 3rd Qu.: 448.0 3rd Qu.: 8.00 3rd Qu.: 2.000
## Max. :4.577e+10 Max. :530446.0 Max. :1948.00 Max. :274.000
## NA's :1492 NA's :3 NA's :3 NA's :3
## COMP_C COMP_D COMP_E COMP_F
## Min. : 0.00 Min. : 0.0000 Min. : 0.000 Min. : 0.00
## 1st Qu.: 3.00 1st Qu.: 0.0000 1st Qu.: 0.000 1st Qu.: 1.00
## Median : 11.00 Median : 0.0000 Median : 0.000 Median : 4.00
## Mean : 73.44 Mean : 0.4262 Mean : 2.029 Mean : 43.26
## 3rd Qu.: 39.00 3rd Qu.: 0.0000 3rd Qu.: 1.000 3rd Qu.: 15.00
## Max. :31566.00 Max. :332.0000 Max. :657.000 Max. :25222.00
## NA's :3 NA's :3 NA's :3 NA's :3
## COMP_G COMP_H COMP_I COMP_J
## Min. : 1.0 Min. : 0 Min. : 0.00 Min. : 0.00
## 1st Qu.: 32.0 1st Qu.: 1 1st Qu.: 2.00 1st Qu.: 0.00
## Median : 74.5 Median : 7 Median : 7.00 Median : 1.00
## Mean : 348.0 Mean : 41 Mean : 55.88 Mean : 24.74
## 3rd Qu.: 199.0 3rd Qu.: 25 3rd Qu.: 24.00 3rd Qu.: 5.00
## Max. :150633.0 Max. :19515 Max. :29290.00 Max. :38720.00
## NA's :3 NA's :3 NA's :3 NA's :3
## COMP_K COMP_L COMP_M COMP_N
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.0
## 1st Qu.: 0.00 1st Qu.: 0.00 1st Qu.: 1.00 1st Qu.: 1.0
## Median : 0.00 Median : 0.00 Median : 4.00 Median : 4.0
## Mean : 15.55 Mean : 15.14 Mean : 51.29 Mean : 83.7
## 3rd Qu.: 2.00 3rd Qu.: 3.00 3rd Qu.: 13.00 3rd Qu.: 14.0
## Max. :23738.00 Max. :14003.00 Max. :49181.00 Max. :76757.0
## NA's :3 NA's :3 NA's :3 NA's :3
## COMP_O COMP_P COMP_Q COMP_R
## Min. : 0.000 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 2.000 1st Qu.: 2.00 1st Qu.: 1.00 1st Qu.: 0.00
## Median : 2.000 Median : 6.00 Median : 3.00 Median : 2.00
## Mean : 3.269 Mean : 30.96 Mean : 34.15 Mean : 12.18
## 3rd Qu.: 3.000 3rd Qu.: 17.00 3rd Qu.: 12.00 3rd Qu.: 6.00
## Max. :204.000 Max. :16030.00 Max. :22248.00 Max. :6687.00
## NA's :3 NA's :3 NA's :3 NA's :3
## COMP_S COMP_T COMP_U HOTELS
## Min. : 0.00 Min. :0 Min. : 0.00000 Min. : 1.000
## 1st Qu.: 5.00 1st Qu.:0 1st Qu.: 0.00000 1st Qu.: 1.000
## Median : 12.00 Median :0 Median : 0.00000 Median : 1.000
## Mean : 51.61 Mean :0 Mean : 0.05027 Mean : 3.131
## 3rd Qu.: 31.00 3rd Qu.:0 3rd Qu.: 0.00000 3rd Qu.: 3.000
## Max. :24832.00 Max. :0 Max. :123.00000 Max. :97.000
## NA's :3 NA's :3 NA's :3 NA's :4686
## BEDS Pr_Agencies Pu_Agencies Pr_Bank
## Min. : 2.0 Min. : 0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.: 40.0 1st Qu.: 0.000 1st Qu.: 1.000 1st Qu.: 0.000
## Median : 82.0 Median : 1.000 Median : 2.000 Median : 1.000
## Mean : 257.5 Mean : 3.383 Mean : 2.829 Mean : 1.312
## 3rd Qu.: 200.0 3rd Qu.: 2.000 3rd Qu.: 2.000 3rd Qu.: 2.000
## Max. :13247.0 Max. :1693.000 Max. :626.000 Max. :83.000
## NA's :4686 NA's :2231 NA's :2231 NA's :2231
## Pu_Bank Pr_Assets Pu_Assets Cars
## Min. :0.00 Min. :0.000e+00 Min. :0.000e+00 Min. : 2
## 1st Qu.:1.00 1st Qu.:0.000e+00 1st Qu.:4.047e+07 1st Qu.: 602
## Median :2.00 Median :3.231e+07 Median :1.339e+08 Median : 1438
## Mean :1.58 Mean :9.180e+09 Mean :6.005e+09 Mean : 9859
## 3rd Qu.:2.00 3rd Qu.:1.148e+08 3rd Qu.:4.970e+08 3rd Qu.: 4086
## Max. :8.00 Max. :1.947e+13 Max. :8.016e+12 Max. :5740995
## NA's :2231 NA's :2231 NA's :2231 NA's :11
## Motorcycles Wheeled_tractor UBER MAC
## Min. : 4 Min. : 0.000 1 : 125 Min. : 1.000
## 1st Qu.: 591 1st Qu.: 0.000 NA's:5448 1st Qu.: 1.000
## Median : 1285 Median : 0.000 Median : 2.000
## Mean : 4879 Mean : 5.754 Mean : 4.277
## 3rd Qu.: 3294 3rd Qu.: 1.000 3rd Qu.: 3.000
## Max. :1134570 Max. :3236.000 Max. :130.000
## NA's :11 NA's :11 NA's :5407
## WAL-MART POST_OFFICES
## Min. : 1.000 Min. : 1.000
## 1st Qu.: 1.000 1st Qu.: 1.000
## Median : 1.000 Median : 1.000
## Mean : 2.059 Mean : 2.081
## 3rd Qu.: 1.750 3rd Qu.: 2.000
## Max. :26.000 Max. :225.000
## NA's :5471 NA's :120
### 4.2 Check for NAs
From the summary of brazil, we have identified that there are 9 NAs in each of the LAT and LONG columns.
We will now check which rows have NA values for LAT and LONG
brazil[!complete.cases(brazil$LAT),]
## # A tibble: 9 x 81
## CITY STATE CAPITAL IBGE_RES_POP IBGE_RES_POP_BR~ IBGE_RES_POP_ES~ IBGE_DU
## <chr> <fct> <fct> <dbl> <dbl> <dbl> <dbl>
## 1 Baln~ SC 0 NA NA NA NA
## 2 Lago~ RS 0 NA NA NA NA
## 3 Moju~ PA 0 NA NA NA NA
## 4 Para~ MS 0 NA NA NA NA
## 5 Pesc~ SC 0 NA NA NA NA
## 6 Pinh~ RS 0 2130 2130 0 745
## 7 Pint~ RS 0 NA NA NA NA
## 8 Sant~ BA 0 9648 9648 0 2891
## 9 São ~ PE 0 NA NA NA NA
## # ... with 74 more variables: IBGE_DU_URBAN <dbl>, IBGE_DU_RURAL <dbl>,
## # IBGE_POP <dbl>, IBGE_1 <dbl>, `IBGE_1-4` <dbl>, `IBGE_5-9` <dbl>,
## # `IBGE_10-14` <dbl>, `IBGE_15-59` <dbl>, `IBGE_60+` <dbl>,
## # IBGE_PLANTED_AREA <dbl>, `IBGE_CROP_PRODUCTION_$` <dbl>, `IDHM Ranking
## # 2010` <dbl>, IDHM <dbl>, IDHM_Renda <dbl>, IDHM_Longevidade <dbl>,
## # IDHM_Educacao <dbl>, LONG <dbl>, LAT <dbl>, ALT <dbl>, PAY_TV <dbl>,
## # FIXED_PHONES <dbl>, AREA <dbl>, REGIAO_TUR <fct>, CATEGORIA_TUR <fct>,
## # ESTIMATED_POP <dbl>, RURAL_URBAN <chr>, GVA_AGROPEC <dbl>,
## # GVA_INDUSTRY <dbl>, GVA_SERVICES <dbl>, GVA_PUBLIC <dbl>, ` GVA_TOTAL
## # ` <dbl>, TAXES <dbl>, GDP <dbl>, POP_GDP <dbl>, GDP_CAPITA <dbl>,
## # GVA_MAIN <chr>, MUN_EXPENDIT <dbl>, COMP_TOT <dbl>, COMP_A <dbl>,
## # COMP_B <dbl>, COMP_C <dbl>, COMP_D <dbl>, COMP_E <dbl>, COMP_F <dbl>,
## # COMP_G <dbl>, COMP_H <dbl>, COMP_I <dbl>, COMP_J <dbl>, COMP_K <dbl>,
## # COMP_L <dbl>, COMP_M <dbl>, COMP_N <dbl>, COMP_O <dbl>, COMP_P <dbl>,
## # COMP_Q <dbl>, COMP_R <dbl>, COMP_S <dbl>, COMP_T <dbl>, COMP_U <dbl>,
## # HOTELS <dbl>, BEDS <dbl>, Pr_Agencies <dbl>, Pu_Agencies <dbl>,
## # Pr_Bank <dbl>, Pu_Bank <dbl>, Pr_Assets <dbl>, Pu_Assets <dbl>, Cars <dbl>,
## # Motorcycles <dbl>, Wheeled_tractor <dbl>, UBER <fct>, MAC <dbl>,
## # `WAL-MART` <dbl>, POST_OFFICES <dbl>
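The check above filters on LAT only, which is enough here because LAT and LONG are missing in the same 9 rows. As a minimal sketch on made-up data, the NA count per column can be tabulated directly, and the row filter can test both coordinates so that a row missing only one of them is also caught.

```r
# Hypothetical toy table; city "D" is missing LONG only.
toy <- data.frame(
  CITY = c("A", "B", "C", "D"),
  LONG = c(-49.4, NA, -48.7, NA),
  LAT  = c(-16.8, NA, -16.2, -19.2),
  stringsAsFactors = FALSE
)

# Number of NAs in every column.
na_per_column <- colSums(is.na(toy))

# Rows where at least one coordinate is missing.
missing_coord <- toy[is.na(toy$LAT) | is.na(toy$LONG), ]
missing_names <- missing_coord$CITY
```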
The cities without LAT and LONG values are:
1. Balneário Rincão
2. Lagoa Dos Patos
3. Mojuí Dos Campos
4. Paraíso Das Águas
5. Pescaria Brava
6. Pinhal Da Serra
7. Pinto Bandeira
8. Santa Terezinha
9. São Caetano
We will now fill in the LAT and LONG values for the above cities, using https://pt.db-city.com/ to find the values.
brazil$LONG[brazil$CITY == "Balneário Rincão"] <- -49.2361
brazil$LAT[brazil$CITY == "Balneário Rincão"] <- -28.8344
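# West longitudes are negative. The summary output further down was produced with a
# doubled minus sign here (--51.4725 evaluates to +51.4725), hence its LONG maximum of 51.47.
brazil$LONG[brazil$CITY == "Lagoa Dos Patos"] <- -51.4725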
brazil$LAT[brazil$CITY == "Lagoa Dos Patos"] <- -31.0697
brazil$LONG[brazil$CITY == "Mojuí Dos Campos"] <- -54.6431
brazil$LAT[brazil$CITY == "Mojuí Dos Campos"] <- -2.68472
brazil$LONG[brazil$CITY == "Paraíso Das Águas"] <- -53.0102
brazil$LAT[brazil$CITY == "Paraíso Das Águas"] <- -19.0257
brazil$LONG[brazil$CITY == "Pescaria Brava"] <- -48.8956
brazil$LAT[brazil$CITY == "Pescaria Brava"] <- -28.4247
brazil$LONG[brazil$CITY == "Pinhal Da Serra"] <- -51.1733
brazil$LAT[brazil$CITY == "Pinhal Da Serra"] <- -27.8747
brazil$LONG[brazil$CITY == "Pinto Bandeira"] <- -51.4503
brazil$LAT[brazil$CITY == "Pinto Bandeira"] <- -29.0978
brazil$LONG[brazil$CITY == "Santa Terezinha"] <- -39.5184
brazil$LAT[brazil$CITY == "Santa Terezinha"] <- -12.7498
brazil$LONG[brazil$CITY == "São Caetano"] <- -36.1459
brazil$LAT[brazil$CITY == "São Caetano"] <- -8.33
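Assigning by CITY alone writes the same coordinates to every municipality that shares a name, and a sign typo is easy to miss. A more defensive pattern, sketched here on toy data, keeps the looked-up values in a small table keyed by CITY and STATE, fills only rows that are actually missing, and then range-checks the result. The state codes come from the NA listing above; the "XX" row and its coordinates are placeholders standing in for a same-named city elsewhere.

```r
toy <- data.frame(
  CITY  = c("Lagoa Dos Patos", "Santa Terezinha", "Santa Terezinha", "Abadia De Goiás"),
  STATE = c("RS", "BA", "XX", "GO"),          # "XX" is a placeholder state
  LONG  = c(NA, NA, -50.0, -49.44055),        # -50.0 / -26.8 are placeholder values
  LAT   = c(NA, NA, -26.8, -16.758812),
  stringsAsFactors = FALSE
)

# Looked-up coordinates, one row per (CITY, STATE) pair.
lookup <- data.frame(
  CITY  = c("Lagoa Dos Patos", "Santa Terezinha"),
  STATE = c("RS", "BA"),
  LONG  = c(-51.4725, -39.5184),
  LAT   = c(-31.0697, -12.7498),
  stringsAsFactors = FALSE
)

# Match on the combined key; hit is NA where the lookup has no entry.
hit  <- match(paste(toy$CITY, toy$STATE), paste(lookup$CITY, lookup$STATE))
fill <- !is.na(hit) & is.na(toy$LONG)   # only fill genuinely missing rows

toy$LONG[fill] <- lookup$LONG[hit[fill]]
toy$LAT[fill]  <- lookup$LAT[hit[fill]]

# Sanity check: Brazil lies entirely west of Greenwich, so longitudes must be negative.
all_west <- all(toy$LONG < 0)

# R has no decrement operator: two minus signs simply negate twice.
double_minus <- --51.4725
```

A check such as stopifnot(all(brazil$LONG < 0)) after the fill would flag a positive longitude immediately.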
Comparing the cities in brazil with those in mun, we found two municipalities that exist in brazil but not in mun: Santa Terezinha and São Caetano. We remove them for consistency. Note that the filter below matches on CITY alone, so it drops every row carrying either name, which is why the row count falls from 5573 to 5568 rather than to 5571.
brazil <- brazil %>%
  filter(CITY!="Santa Terezinha") %>%
  filter(CITY!="São Caetano")
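If only the two specific municipalities should go, the filter needs a second key. The sketch below, on a toy table, compares dropping by name with dropping by name and state. It assumes the intended rows are the BA and PE ones flagged in the NA check; the "XX" row is a placeholder for a same-named city in another state.

```r
toy <- data.frame(
  CITY  = c("Santa Terezinha", "Santa Terezinha", "São Caetano", "Abadia De Goiás"),
  STATE = c("BA", "XX", "PE", "GO"),   # "XX" is a placeholder state
  stringsAsFactors = FALSE
)

# Name-only match: removes every row that shares either name.
drop_by_name <- toy$CITY %in% c("Santa Terezinha", "São Caetano")

# Name-and-state match: removes only the two intended rows.
drop_by_state <- (toy$CITY == "Santa Terezinha" & toy$STATE == "BA") |
                 (toy$CITY == "São Caetano"     & toy$STATE == "PE")

kept_by_name  <- toy[!drop_by_name, ]
kept_by_state <- toy[!drop_by_state, ]
```

With dplyr, the second variant corresponds to filter(!(CITY == "Santa Terezinha" & STATE == "BA"), !(CITY == "São Caetano" & STATE == "PE")).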
Check the summary of brazil again
summary(brazil)
## CITY STATE CAPITAL IBGE_RES_POP
## Length:5568 MG : 853 0:5541 Min. : 805
## Class :character SP : 645 1: 27 1st Qu.: 5231
## Mode :character RS : 498 Median : 10936
## BA : 417 Mean : 34296
## PR : 399 3rd Qu.: 23513
## SC : 294 Max. :11253503
## (Other):2462 NA's :7
## IBGE_RES_POP_BRAS IBGE_RES_POP_ESTR IBGE_DU IBGE_DU_URBAN
## Min. : 805 Min. : 0.00 Min. : 239 Min. : 60
## 1st Qu.: 5223 1st Qu.: 0.00 1st Qu.: 1572 1st Qu.: 874
## Median : 10934 Median : 0.00 Median : 3178 Median : 1850
## Mean : 34218 Mean : 77.56 Mean : 10308 Mean : 8864
## 3rd Qu.: 23397 3rd Qu.: 10.00 3rd Qu.: 6727 3rd Qu.: 4628
## Max. :11133776 Max. :119727.00 Max. :3576148 Max. :3548433
## NA's :7 NA's :7 NA's :9 NA's :9
## IBGE_DU_RURAL IBGE_POP IBGE_1 IBGE_1-4
## Min. : 3.0 Min. : 174 Min. : 0.0 Min. : 5
## 1st Qu.: 486.8 1st Qu.: 2802 1st Qu.: 38.0 1st Qu.: 158
## Median : 931.0 Median : 6177 Median : 92.0 Median : 377
## Mean : 1462.6 Mean : 27612 Mean : 383.5 Mean : 1546
## 3rd Qu.: 1831.2 3rd Qu.: 15306 3rd Qu.: 232.0 3rd Qu.: 952
## Max. :33809.0 Max. :10463636 Max. :129464.0 Max. :514794
## NA's :80 NA's :7 NA's :7 NA's :7
## IBGE_5-9 IBGE_10-14 IBGE_15-59 IBGE_60+
## Min. : 7 Min. : 12 Min. : 94 Min. : 29
## 1st Qu.: 220 1st Qu.: 260 1st Qu.: 1735 1st Qu.: 341
## Median : 516 Median : 589 Median : 3842 Median : 723
## Mean : 2071 Mean : 2383 Mean : 18223 Mean : 3006
## 3rd Qu.: 1301 3rd Qu.: 1479 3rd Qu.: 9633 3rd Qu.: 1725
## Max. :684443 Max. :783702 Max. :7058221 Max. :1293012
## NA's :7 NA's :7 NA's :7 NA's :7
## IBGE_PLANTED_AREA IBGE_CROP_PRODUCTION_$ IDHM Ranking 2010 IDHM
## Min. : 0.0 Min. : 0 Min. : 1 Min. :0.4180
## 1st Qu.: 910.2 1st Qu.: 2328 1st Qu.:1391 1st Qu.:0.5990
## Median : 3471.5 Median : 13846 Median :2782 Median :0.6650
## Mean : 14180.2 Mean : 57389 Mean :2783 Mean :0.6592
## 3rd Qu.: 11173.2 3rd Qu.: 55594 3rd Qu.:4174 3rd Qu.:0.7180
## Max. :1205669.0 Max. :3274885 Max. :5565 Max. :0.8620
## NA's :2 NA's :2 NA's :6 NA's :6
## IDHM_Renda IDHM_Longevidade IDHM_Educacao LONG
## Min. :0.4000 Min. :0.6720 Min. :0.2070 Min. :-72.92
## 1st Qu.:0.5720 1st Qu.:0.7690 1st Qu.:0.4900 1st Qu.:-50.87
## Median :0.6540 Median :0.8080 Median :0.5600 Median :-46.52
## Mean :0.6429 Mean :0.8016 Mean :0.5591 Mean :-46.20
## 3rd Qu.:0.7070 3rd Qu.:0.8360 3rd Qu.:0.6310 3rd Qu.:-41.40
## Max. :0.8910 Max. :0.8940 Max. :0.8250 Max. : 51.47
## NA's :6 NA's :6 NA's :6
## LAT ALT PAY_TV FIXED_PHONES
## Min. :-33.688 Min. : 0.0 Min. : 1.0 Min. : 3
## 1st Qu.:-22.845 1st Qu.: 169.4 1st Qu.: 88.0 1st Qu.: 118
## Median :-18.107 Median : 406.4 Median : 247.0 Median : 328
## Mean :-16.457 Mean : 894.0 Mean : 3095.8 Mean : 6570
## 3rd Qu.: -8.495 3rd Qu.: 628.8 3rd Qu.: 815.5 3rd Qu.: 1151
## Max. : 4.585 Max. :874579.0 Max. :2047668.0 Max. :5543127
## NA's :7 NA's :1 NA's :1
## AREA REGIAO_TUR CATEGORIA_TUR
## Min. : 3.57 Corredores Das Águas: 59 A : 51
## 1st Qu.: 204.44 Vale Do Contestado : 45 B : 168
## Median : 415.86 Amazônia Atlântica : 40 C : 521
## Mean : 1517.07 Araguaia-Tocantins : 39 D :1890
## 3rd Qu.: 1026.57 Cariri : 37 E : 653
## Max. :159533.33 (Other) :3063 NA's:2285
## NA's :2 NA's :2285
## ESTIMATED_POP RURAL_URBAN GVA_AGROPEC GVA_INDUSTRY
## Min. : 786 Length:5568 Min. : 0 Min. : 1
## 1st Qu.: 5452 Class :character 1st Qu.: 4193 1st Qu.: 1725
## Median : 11591 Mode :character Median : 20432 Median : 7428
## Mean : 37447 Mean : 47285 Mean : 176050
## 3rd Qu.: 25301 3rd Qu.: 51227 3rd Qu.: 41240
## Max. :12176866 Max. :1402282 Max. :63306755
## NA's :1 NA's :2 NA's :2
## GVA_SERVICES GVA_PUBLIC GVA_TOTAL TAXES
## Min. : 2 Min. : 7 Min. : 17 Min. : -14159
## 1st Qu.: 10107 1st Qu.: 17254 1st Qu.: 42223 1st Qu.: 1305
## Median : 31214 Median : 35838 Median : 119492 Median : 5108
## Mean : 489787 Mean : 123829 Mean : 833504 Mean : 118947
## 3rd Qu.: 115503 3rd Qu.: 89301 3rd Qu.: 314139 3rd Qu.: 22208
## Max. :464656988 Max. :41902893 Max. :569910503 Max. :117125387
## NA's :2 NA's :2 NA's :2 NA's :2
## GDP POP_GDP GDP_CAPITA GVA_MAIN
## Min. : 15 Min. : 815 Min. : 3191 Length:5568
## 1st Qu.: 43706 1st Qu.: 5480 1st Qu.: 9062 Class :character
## Median : 125153 Median : 11584 Median : 15870 Mode :character
## Mean : 955185 Mean : 37018 Mean : 21132
## 3rd Qu.: 329764 3rd Qu.: 25098 3rd Qu.: 26156
## Max. :687035890 Max. :12038175 Max. :314638
## NA's :2 NA's :2 NA's :2
## MUN_EXPENDIT COMP_TOT COMP_A COMP_B
## Min. :1.421e+06 Min. : 6.0 Min. : 0.00 Min. : 0.000
## 1st Qu.:1.573e+07 1st Qu.: 68.0 1st Qu.: 1.00 1st Qu.: 0.000
## Median :2.748e+07 Median : 162.0 Median : 2.00 Median : 0.000
## Mean :1.044e+08 Mean : 907.3 Mean : 18.27 Mean : 1.853
## 3rd Qu.:5.678e+07 3rd Qu.: 448.8 3rd Qu.: 8.00 3rd Qu.: 2.000
## Max. :4.577e+10 Max. :530446.0 Max. :1948.00 Max. :274.000
## NA's :1491 NA's :2 NA's :2 NA's :2
## COMP_C COMP_D COMP_E COMP_F
## Min. : 0.00 Min. : 0.0000 Min. : 0.000 Min. : 0.00
## 1st Qu.: 3.00 1st Qu.: 0.0000 1st Qu.: 0.000 1st Qu.: 1.00
## Median : 11.00 Median : 0.0000 Median : 0.000 Median : 4.00
## Mean : 73.49 Mean : 0.4265 Mean : 2.031 Mean : 43.29
## 3rd Qu.: 39.00 3rd Qu.: 0.0000 3rd Qu.: 1.000 3rd Qu.: 15.00
## Max. :31566.00 Max. :332.0000 Max. :657.000 Max. :25222.00
## NA's :2 NA's :2 NA's :2 NA's :2
## COMP_G COMP_H COMP_I COMP_J
## Min. : 1.0 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 32.0 1st Qu.: 1.00 1st Qu.: 2.00 1st Qu.: 0.00
## Median : 75.0 Median : 7.00 Median : 7.00 Median : 1.00
## Mean : 348.2 Mean : 41.02 Mean : 55.91 Mean : 24.76
## 3rd Qu.: 199.8 3rd Qu.: 25.00 3rd Qu.: 24.00 3rd Qu.: 5.00
## Max. :150633.0 Max. :19515.00 Max. :29290.00 Max. :38720.00
## NA's :2 NA's :2 NA's :2 NA's :2
## COMP_K COMP_L COMP_M COMP_N
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.: 0.00 1st Qu.: 1.00 1st Qu.: 1.00
## Median : 0.00 Median : 0.00 Median : 4.00 Median : 4.00
## Mean : 15.56 Mean : 15.15 Mean : 51.33 Mean : 83.76
## 3rd Qu.: 2.00 3rd Qu.: 3.00 3rd Qu.: 13.00 3rd Qu.: 14.00
## Max. :23738.00 Max. :14003.00 Max. :49181.00 Max. :76757.00
## NA's :2 NA's :2 NA's :2 NA's :2
## COMP_O COMP_P COMP_Q COMP_R
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 2.00 1st Qu.: 2.00 1st Qu.: 1.00 1st Qu.: 0.00
## Median : 2.00 Median : 6.00 Median : 3.00 Median : 2.00
## Mean : 3.27 Mean : 30.98 Mean : 34.17 Mean : 12.19
## 3rd Qu.: 3.00 3rd Qu.: 17.00 3rd Qu.: 12.00 3rd Qu.: 6.00
## Max. :204.00 Max. :16030.00 Max. :22248.00 Max. :6687.00
## NA's :2 NA's :2 NA's :2 NA's :2
## COMP_S COMP_T COMP_U HOTELS
## Min. : 0.00 Min. :0 Min. : 0.00000 Min. : 1.000
## 1st Qu.: 5.00 1st Qu.:0 1st Qu.: 0.00000 1st Qu.: 1.000
## Median : 12.00 Median :0 Median : 0.00000 Median : 1.000
## Mean : 51.64 Mean :0 Mean : 0.05031 Mean : 3.131
## 3rd Qu.: 31.00 3rd Qu.:0 3rd Qu.: 0.00000 3rd Qu.: 3.000
## Max. :24832.00 Max. :0 Max. :123.00000 Max. :97.000
## NA's :2 NA's :2 NA's :2 NA's :4681
## BEDS Pr_Agencies Pu_Agencies Pr_Bank
## Min. : 2.0 Min. : 0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.: 40.0 1st Qu.: 0.000 1st Qu.: 1.000 1st Qu.: 0.000
## Median : 82.0 Median : 1.000 Median : 2.000 Median : 1.000
## Mean : 257.5 Mean : 3.383 Mean : 2.829 Mean : 1.312
## 3rd Qu.: 200.0 3rd Qu.: 2.000 3rd Qu.: 2.000 3rd Qu.: 2.000
## Max. :13247.0 Max. :1693.000 Max. :626.000 Max. :83.000
## NA's :4681 NA's :2226 NA's :2226 NA's :2226
## Pu_Bank Pr_Assets Pu_Assets Cars
## Min. :0.00 Min. :0.000e+00 Min. :0.000e+00 Min. : 2
## 1st Qu.:1.00 1st Qu.:0.000e+00 1st Qu.:4.047e+07 1st Qu.: 602
## Median :2.00 Median :3.231e+07 Median :1.339e+08 Median : 1440
## Mean :1.58 Mean :9.180e+09 Mean :6.005e+09 Mean : 9864
## 3rd Qu.:2.00 3rd Qu.:1.148e+08 3rd Qu.:4.970e+08 3rd Qu.: 4088
## Max. :8.00 Max. :1.947e+13 Max. :8.016e+12 Max. :5740995
## NA's :2226 NA's :2226 NA's :2226 NA's :9
## Motorcycles Wheeled_tractor UBER MAC
## Min. : 4 Min. : 0.000 1 : 125 Min. : 1.000
## 1st Qu.: 591 1st Qu.: 0.000 NA's:5443 1st Qu.: 1.000
## Median : 1285 Median : 0.000 Median : 2.000
## Mean : 4881 Mean : 5.756 Mean : 4.277
## 3rd Qu.: 3297 3rd Qu.: 1.000 3rd Qu.: 3.000
## Max. :1134570 Max. :3236.000 Max. :130.000
## NA's :9 NA's :9 NA's :5402
## WAL-MART POST_OFFICES
## Min. : 1.000 Min. : 1.000
## 1st Qu.: 1.000 1st Qu.: 1.000
## Median : 1.000 Median : 1.000
## Mean : 2.059 Mean : 2.081
## 3rd Qu.: 1.750 3rd Qu.: 2.000
## Max. :26.000 Max. :225.000
## NA's :5466 NA's :119
The summary above also shows that GDP_CAPITA has 2 NA values. We remove these rows, since there is no reliable way to assign a value to a missing GDP_CAPITA.
brazil <- brazil%>%
filter(!is.na(`GDP_CAPITA`))
summary(brazil)
## CITY STATE CAPITAL IBGE_RES_POP
## Length:5566 MG : 853 0:5539 Min. : 805
## Class :character SP : 645 1: 27 1st Qu.: 5231
## Mode :character RS : 497 Median : 10936
## BA : 416 Mean : 34296
## PR : 399 3rd Qu.: 23513
## SC : 294 Max. :11253503
## (Other):2462 NA's :5
## IBGE_RES_POP_BRAS IBGE_RES_POP_ESTR IBGE_DU IBGE_DU_URBAN
## Min. : 805 Min. : 0.00 Min. : 239 Min. : 60
## 1st Qu.: 5223 1st Qu.: 0.00 1st Qu.: 1572 1st Qu.: 874
## Median : 10934 Median : 0.00 Median : 3178 Median : 1850
## Mean : 34218 Mean : 77.56 Mean : 10308 Mean : 8864
## 3rd Qu.: 23397 3rd Qu.: 10.00 3rd Qu.: 6727 3rd Qu.: 4628
## Max. :11133776 Max. :119727.00 Max. :3576148 Max. :3548433
## NA's :5 NA's :5 NA's :7 NA's :7
## IBGE_DU_RURAL IBGE_POP IBGE_1 IBGE_1-4
## Min. : 3.0 Min. : 174 Min. : 0.0 Min. : 5
## 1st Qu.: 486.8 1st Qu.: 2802 1st Qu.: 38.0 1st Qu.: 158
## Median : 931.0 Median : 6177 Median : 92.0 Median : 377
## Mean : 1462.6 Mean : 27612 Mean : 383.5 Mean : 1546
## 3rd Qu.: 1831.2 3rd Qu.: 15306 3rd Qu.: 232.0 3rd Qu.: 952
## Max. :33809.0 Max. :10463636 Max. :129464.0 Max. :514794
## NA's :78 NA's :5 NA's :5 NA's :5
## IBGE_5-9 IBGE_10-14 IBGE_15-59 IBGE_60+
## Min. : 7 Min. : 12 Min. : 94 Min. : 29
## 1st Qu.: 220 1st Qu.: 260 1st Qu.: 1735 1st Qu.: 341
## Median : 516 Median : 589 Median : 3842 Median : 723
## Mean : 2071 Mean : 2383 Mean : 18223 Mean : 3006
## 3rd Qu.: 1301 3rd Qu.: 1479 3rd Qu.: 9633 3rd Qu.: 1725
## Max. :684443 Max. :783702 Max. :7058221 Max. :1293012
## NA's :5 NA's :5 NA's :5 NA's :5
## IBGE_PLANTED_AREA IBGE_CROP_PRODUCTION_$ IDHM Ranking 2010 IDHM
## Min. : 0.0 Min. : 0 Min. : 1 Min. :0.4180
## 1st Qu.: 910.2 1st Qu.: 2328 1st Qu.:1391 1st Qu.:0.5990
## Median : 3471.5 Median : 13846 Median :2782 Median :0.6650
## Mean : 14180.2 Mean : 57389 Mean :2782 Mean :0.6592
## 3rd Qu.: 11173.2 3rd Qu.: 55594 3rd Qu.:4173 3rd Qu.:0.7180
## Max. :1205669.0 Max. :3274885 Max. :5565 Max. :0.8620
## NA's :5 NA's :5
## IDHM_Renda IDHM_Longevidade IDHM_Educacao LONG
## Min. :0.4000 Min. :0.6720 Min. :0.2070 Min. :-72.92
## 1st Qu.:0.5720 1st Qu.:0.7690 1st Qu.:0.4900 1st Qu.:-50.87
## Median :0.6540 Median :0.8080 Median :0.5600 Median :-46.53
## Mean :0.6429 Mean :0.8016 Mean :0.5591 Mean :-46.22
## 3rd Qu.:0.7070 3rd Qu.:0.8360 3rd Qu.:0.6310 3rd Qu.:-41.41
## Max. :0.8910 Max. :0.8940 Max. :0.8250 Max. : 51.47
## NA's :5 NA's :5 NA's :5
## LAT ALT PAY_TV FIXED_PHONES
## Min. :-33.688 Min. : 0.0 Min. : 1.0 Min. : 3
## 1st Qu.:-22.843 1st Qu.: 169.4 1st Qu.: 88.0 1st Qu.: 118
## Median :-18.107 Median : 406.5 Median : 247.0 Median : 328
## Mean :-16.455 Mean : 894.2 Mean : 3096.3 Mean : 6572
## 3rd Qu.: -8.491 3rd Qu.: 628.9 3rd Qu.: 815.8 3rd Qu.: 1151
## Max. : 4.585 Max. :874579.0 Max. :2047668.0 Max. :5543127
## NA's :6
## AREA REGIAO_TUR CATEGORIA_TUR
## Min. : 3.57 Corredores Das Águas: 59 A : 51
## 1st Qu.: 204.43 Vale Do Contestado : 45 B : 168
## Median : 415.81 Amazônia Atlântica : 40 C : 521
## Mean : 1515.52 Araguaia-Tocantins : 39 D :1889
## 3rd Qu.: 1026.38 Cariri : 37 E : 653
## Max. :159533.33 (Other) :3062 NA's:2284
## NA's :1 NA's :2284
## ESTIMATED_POP RURAL_URBAN GVA_AGROPEC GVA_INDUSTRY
## Min. : 786 Length:5566 Min. : 0 Min. : 1
## 1st Qu.: 5451 Class :character 1st Qu.: 4193 1st Qu.: 1725
## Median : 11591 Mode :character Median : 20432 Median : 7428
## Mean : 37452 Mean : 47285 Mean : 176050
## 3rd Qu.: 25303 3rd Qu.: 51227 3rd Qu.: 41240
## Max. :12176866 Max. :1402282 Max. :63306755
##
## GVA_SERVICES GVA_PUBLIC GVA_TOTAL TAXES
## Min. : 2 Min. : 7 Min. : 17 Min. : -14159
## 1st Qu.: 10107 1st Qu.: 17254 1st Qu.: 42223 1st Qu.: 1305
## Median : 31214 Median : 35838 Median : 119492 Median : 5108
## Mean : 489787 Mean : 123829 Mean : 833504 Mean : 118947
## 3rd Qu.: 115503 3rd Qu.: 89301 3rd Qu.: 314139 3rd Qu.: 22208
## Max. :464656988 Max. :41902893 Max. :569910503 Max. :117125387
##
## GDP POP_GDP GDP_CAPITA GVA_MAIN
## Min. : 15 Min. : 815 Min. : 3191 Length:5566
## 1st Qu.: 43706 1st Qu.: 5480 1st Qu.: 9062 Class :character
## Median : 125153 Median : 11584 Median : 15870 Mode :character
## Mean : 955185 Mean : 37018 Mean : 21132
## 3rd Qu.: 329764 3rd Qu.: 25098 3rd Qu.: 26156
## Max. :687035890 Max. :12038175 Max. :314638
##
## MUN_EXPENDIT COMP_TOT COMP_A COMP_B
## Min. :1.421e+06 Min. : 6.0 Min. : 0.00 Min. : 0.000
## 1st Qu.:1.573e+07 1st Qu.: 68.0 1st Qu.: 1.00 1st Qu.: 0.000
## Median :2.749e+07 Median : 162.0 Median : 2.00 Median : 0.000
## Mean :1.044e+08 Mean : 907.3 Mean : 18.27 Mean : 1.853
## 3rd Qu.:5.679e+07 3rd Qu.: 448.8 3rd Qu.: 8.00 3rd Qu.: 2.000
## Max. :4.577e+10 Max. :530446.0 Max. :1948.00 Max. :274.000
## NA's :1490
## COMP_C COMP_D COMP_E COMP_F
## Min. : 0.00 Min. : 0.0000 Min. : 0.000 Min. : 0.00
## 1st Qu.: 3.00 1st Qu.: 0.0000 1st Qu.: 0.000 1st Qu.: 1.00
## Median : 11.00 Median : 0.0000 Median : 0.000 Median : 4.00
## Mean : 73.49 Mean : 0.4265 Mean : 2.031 Mean : 43.29
## 3rd Qu.: 39.00 3rd Qu.: 0.0000 3rd Qu.: 1.000 3rd Qu.: 15.00
## Max. :31566.00 Max. :332.0000 Max. :657.000 Max. :25222.00
##
## COMP_G COMP_H COMP_I COMP_J
## Min. : 1.0 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 32.0 1st Qu.: 1.00 1st Qu.: 2.00 1st Qu.: 0.00
## Median : 75.0 Median : 7.00 Median : 7.00 Median : 1.00
## Mean : 348.2 Mean : 41.02 Mean : 55.91 Mean : 24.76
## 3rd Qu.: 199.8 3rd Qu.: 25.00 3rd Qu.: 24.00 3rd Qu.: 5.00
## Max. :150633.0 Max. :19515.00 Max. :29290.00 Max. :38720.00
##
## COMP_K COMP_L COMP_M COMP_N
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.: 0.00 1st Qu.: 1.00 1st Qu.: 1.00
## Median : 0.00 Median : 0.00 Median : 4.00 Median : 4.00
## Mean : 15.56 Mean : 15.15 Mean : 51.33 Mean : 83.76
## 3rd Qu.: 2.00 3rd Qu.: 3.00 3rd Qu.: 13.00 3rd Qu.: 14.00
## Max. :23738.00 Max. :14003.00 Max. :49181.00 Max. :76757.00
##
## COMP_O COMP_P COMP_Q COMP_R
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 2.00 1st Qu.: 2.00 1st Qu.: 1.00 1st Qu.: 0.00
## Median : 2.00 Median : 6.00 Median : 3.00 Median : 2.00
## Mean : 3.27 Mean : 30.98 Mean : 34.17 Mean : 12.19
## 3rd Qu.: 3.00 3rd Qu.: 17.00 3rd Qu.: 12.00 3rd Qu.: 6.00
## Max. :204.00 Max. :16030.00 Max. :22248.00 Max. :6687.00
##
## COMP_S COMP_T COMP_U HOTELS
## Min. : 0.00 Min. :0 Min. : 0.00000 Min. : 1.000
## 1st Qu.: 5.00 1st Qu.:0 1st Qu.: 0.00000 1st Qu.: 1.000
## Median : 12.00 Median :0 Median : 0.00000 Median : 1.000
## Mean : 51.64 Mean :0 Mean : 0.05031 Mean : 3.131
## 3rd Qu.: 31.00 3rd Qu.:0 3rd Qu.: 0.00000 3rd Qu.: 3.000
## Max. :24832.00 Max. :0 Max. :123.00000 Max. :97.000
## NA's :4679
## BEDS Pr_Agencies Pu_Agencies Pr_Bank
## Min. : 2.0 Min. : 0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.: 40.0 1st Qu.: 0.000 1st Qu.: 1.000 1st Qu.: 0.000
## Median : 82.0 Median : 1.000 Median : 2.000 Median : 1.000
## Mean : 257.5 Mean : 3.383 Mean : 2.829 Mean : 1.312
## 3rd Qu.: 200.0 3rd Qu.: 2.000 3rd Qu.: 2.000 3rd Qu.: 2.000
## Max. :13247.0 Max. :1693.000 Max. :626.000 Max. :83.000
## NA's :4679 NA's :2224 NA's :2224 NA's :2224
## Pu_Bank Pr_Assets Pu_Assets Cars
## Min. :0.00 Min. :0.000e+00 Min. :0.000e+00 Min. : 2
## 1st Qu.:1.00 1st Qu.:0.000e+00 1st Qu.:4.047e+07 1st Qu.: 602
## Median :2.00 Median :3.231e+07 Median :1.339e+08 Median : 1440
## Mean :1.58 Mean :9.180e+09 Mean :6.005e+09 Mean : 9866
## 3rd Qu.:2.00 3rd Qu.:1.148e+08 3rd Qu.:4.970e+08 3rd Qu.: 4089
## Max. :8.00 Max. :1.947e+13 Max. :8.016e+12 Max. :5740995
## NA's :2224 NA's :2224 NA's :2224 NA's :8
## Motorcycles Wheeled_tractor UBER MAC
## Min. : 4 Min. : 0.000 1 : 125 Min. : 1.000
## 1st Qu.: 591 1st Qu.: 0.000 NA's:5441 1st Qu.: 1.000
## Median : 1285 Median : 0.000 Median : 2.000
## Mean : 4881 Mean : 5.757 Mean : 4.277
## 3rd Qu.: 3298 3rd Qu.: 1.000 3rd Qu.: 3.000
## Max. :1134570 Max. :3236.000 Max. :130.000
## NA's :8 NA's :8 NA's :5400
## WAL-MART POST_OFFICES
## Min. : 1.000 Min. : 1.000
## 1st Qu.: 1.000 1st Qu.: 1.000
## Median : 1.000 Median : 1.000
## Mean : 2.059 Mean : 2.081
## 3rd Qu.: 1.750 3rd Qu.: 2.000
## Max. :26.000 Max. :225.000
## NA's :5464 NA's :117
GDP_CAPITA no longer has any NA values, and 5566 rows remain.
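Scanning a long summary for one column is error-prone, so the same check can be done directly. The following is a minimal sketch on a made-up table (the data frame `toy` and its values are assumptions for illustration, not the real `brazil` data); it counts NAs per column before and after dropping the rows with a missing GDP_CAPITA.

```r
# Toy stand-in for the `brazil` table; all values are made up for illustration.
toy <- data.frame(
  CITY       = c("A", "B", "C", "D"),
  GDP_CAPITA = c(3190.6, NA, 15870, NA),
  TAXES      = c(10, 20, 30, 40)
)

# Count missing values per column before filtering.
na_before <- colSums(is.na(toy))

# Keep only rows where GDP_CAPITA is present
# (base-R equivalent of filter(!is.na(GDP_CAPITA))).
toy_clean <- toy[!is.na(toy$GDP_CAPITA), ]

# Count missing values per column after filtering.
na_after <- colSums(is.na(toy_clean))

print(na_before)
print(na_after)
```

On the real data, `sum(is.na(brazil$GDP_CAPITA))` would give the same single number without printing the full summary.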
To better understand which variables are needed to explain GDP_CAPITA, and why, we first explore the data.
### 5.1 EDA using statistical graphics
We can plot the distribution of GDP_CAPITA using a histogram.
ggplot(data=brazil, aes(x=`GDP_CAPITA`))+
geom_histogram(bins=20, color="black", fill="light blue")
The histogram above shows a right-skewed distribution: most municipalities have a relatively low GDP per capita, while a few have very high values. We reduce the skew by applying a log transformation.
brazil <- brazil%>%
mutate(`LOG_GDP_CAPITA`=log(GDP_CAPITA))
Then we plot the histogram of LOG_GDP_CAPITA.
ggplot(data=brazil, aes(x=`LOG_GDP_CAPITA`))+
geom_histogram(bins=20, color="black", fill="light blue")
The histogram is now much less skewed and roughly resembles a normal distribution.
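The visual impression can be backed by a number. The sketch below computes the moment-based sample skewness before and after a log transformation. It uses simulated log-normal values as a stand-in for GDP_CAPITA (the simulation parameters and the helper name `skewness` are assumptions for illustration, not part of the original analysis).

```r
# Moment-based sample skewness: third central moment divided by the
# second central moment raised to the power 1.5 (both dividing by n).
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

set.seed(42)
# Simulated right-skewed values standing in for GDP_CAPITA (not the real data).
gdp_capita_sim <- rlnorm(5000, meanlog = 9.7, sdlog = 0.7)

skew_raw <- skewness(gdp_capita_sim)       # clearly positive: right-skewed
skew_log <- skewness(log(gdp_capita_sim))  # close to zero: roughly symmetric

print(c(raw = skew_raw, log = skew_log))
```

A skewness near zero after the transformation is consistent with a roughly symmetric histogram, although it does not by itself establish normality.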
Next, we rename some variables whose original names are long or awkward to use in code, for example names that contain a dollar sign, a hyphen, or surrounding spaces.
names(brazil)[names(brazil) == "IBGE_CROP_PRODUCTION_$"] <- "IBGE_CROP_PRODUCTION"
names(brazil)[names(brazil) == " GVA_TOTAL "] <- "GVA_TOTAL"
names(brazil)[names(brazil) == "IBGE_15-59"] <- "Active_pop"
We draw a boxplot to show how GDP_CAPITA is distributed within each state of Brazil.
brazil_states <- brazil%>%
group_by(STATE)
ggplot(data=brazil_states, mapping=aes(x=STATE, y=GDP_CAPITA)) + geom_boxplot()+
ggtitle("Distribution of GDP per capita across states in Brazil")
A few states (for example BA and MS) have some outliers, while in other states, such as AC and RR, the values are well spread out.
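To list the outlying municipalities rather than read them off the plot, the usual boxplot rule (points more than 1.5 times the interquartile range beyond the hinges) can be applied per state. This is a minimal sketch on made-up values; the table `toy` is an assumption for illustration, and base R's `boxplot.stats()` computes hinges slightly differently from ggplot2, so results on real data may differ marginally from the plot.

```r
# Toy stand-in: two states with made-up GDP_CAPITA values.
toy <- data.frame(
  STATE      = c("BA", "BA", "BA", "BA", "BA", "AC", "AC", "AC", "AC", "AC"),
  GDP_CAPITA = c(8000, 9000, 10000, 11000, 90000,
                 9000, 12000, 15000, 18000, 21000)
)

# Split GDP_CAPITA by state, then keep the values that boxplot.stats()
# flags as lying beyond 1.5 * IQR from the hinges.
outliers_by_state <- lapply(
  split(toy$GDP_CAPITA, toy$STATE),
  function(x) boxplot.stats(x)$out
)

print(outliers_by_state)
```

In this toy table only BA has a flagged value, while AC has none because its values are evenly spread.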
Next, we extract the 10 municipalities with the highest GDP_CAPITA and look for variables that may have contributed to their high values.
brazil_gdp_capita <- brazil%>%
arrange(desc(GDP_CAPITA))%>%
top_n(n=10, wt=GDP_CAPITA)
head(brazil_gdp_capita)
## # A tibble: 6 x 82
## CITY STATE CAPITAL IBGE_RES_POP IBGE_RES_POP_BR~ IBGE_RES_POP_ES~ IBGE_DU
## <chr> <fct> <fct> <dbl> <dbl> <dbl> <dbl>
## 1 Paul~ SP 0 82146 81967 179 24311
## 2 Selv~ MS 0 6287 6287 0 2003
## 3 São ~ BA 0 33183 33183 0 9503
## 4 Triu~ RS 0 25793 25787 6 8635
## 5 Brej~ SP 0 2573 2565 8 822
## 6 Seba~ SP 0 3031 3031 0 1055
## # ... with 75 more variables: IBGE_DU_URBAN <dbl>, IBGE_DU_RURAL <dbl>,
## # IBGE_POP <dbl>, IBGE_1 <dbl>, `IBGE_1-4` <dbl>, `IBGE_5-9` <dbl>,
## # `IBGE_10-14` <dbl>, Active_pop <dbl>, `IBGE_60+` <dbl>,
## # IBGE_PLANTED_AREA <dbl>, IBGE_CROP_PRODUCTION <dbl>, `IDHM Ranking
## # 2010` <dbl>, IDHM <dbl>, IDHM_Renda <dbl>, IDHM_Longevidade <dbl>,
## # IDHM_Educacao <dbl>, LONG <dbl>, LAT <dbl>, ALT <dbl>, PAY_TV <dbl>,
## # FIXED_PHONES <dbl>, AREA <dbl>, REGIAO_TUR <fct>, CATEGORIA_TUR <fct>,
## # ESTIMATED_POP <dbl>, RURAL_URBAN <chr>, GVA_AGROPEC <dbl>,
## # GVA_INDUSTRY <dbl>, GVA_SERVICES <dbl>, GVA_PUBLIC <dbl>, GVA_TOTAL <dbl>,
## # TAXES <dbl>, GDP <dbl>, POP_GDP <dbl>, GDP_CAPITA <dbl>, GVA_MAIN <chr>,
## # MUN_EXPENDIT <dbl>, COMP_TOT <dbl>, COMP_A <dbl>, COMP_B <dbl>,
## # COMP_C <dbl>, COMP_D <dbl>, COMP_E <dbl>, COMP_F <dbl>, COMP_G <dbl>,
## # COMP_H <dbl>, COMP_I <dbl>, COMP_J <dbl>, COMP_K <dbl>, COMP_L <dbl>,
## # COMP_M <dbl>, COMP_N <dbl>, COMP_O <dbl>, COMP_P <dbl>, COMP_Q <dbl>,
## # COMP_R <dbl>, COMP_S <dbl>, COMP_T <dbl>, COMP_U <dbl>, HOTELS <dbl>,
## # BEDS <dbl>, Pr_Agencies <dbl>, Pu_Agencies <dbl>, Pr_Bank <dbl>,
## # Pu_Bank <dbl>, Pr_Assets <dbl>, Pu_Assets <dbl>, Cars <dbl>,
## # Motorcycles <dbl>, Wheeled_tractor <dbl>, UBER <fct>, MAC <dbl>,
## # `WAL-MART` <dbl>, POST_OFFICES <dbl>, LOG_GDP_CAPITA <dbl>
Several variables take high values in these municipalities with high GDP_CAPITA.
For example: IBGE_RES_POP, Active_pop, IBGE_CROP_PRODUCTION, IDHM, PAY_TV, FIXED_PHONES, AREA, GVA_TOTAL, TAXES, GDP, POP_GDP, MUN_EXPENDIT, COMP_TOT, Cars and Motorcycles. This suggests that these variables may be correlated with GDP_CAPITA.
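Looking at ten rows only hints at a relationship, so it helps to measure it over all municipalities. The sketch below computes Pearson correlations between candidate variables and GDP_CAPITA on simulated data. The table `toy`, its values, and the built-in relationship between TAXES and GDP_CAPITA are assumptions made for illustration; they say nothing about the real data.

```r
set.seed(1)
n <- 200

# Simulated stand-ins for two candidate variables (not the real data).
toy <- data.frame(
  TAXES = rlnorm(n, meanlog = 8, sdlog = 1),
  Cars  = rlnorm(n, meanlog = 7, sdlog = 1)
)

# By construction, GDP_CAPITA depends on TAXES but not on Cars.
toy$GDP_CAPITA <- 5000 + 2 * toy$TAXES + rnorm(n, sd = 1000)

# Introduce a few missing values, as in the real table.
toy$Cars[c(3, 10)] <- NA

# Correlation of each candidate with GDP_CAPITA, using all pairs of
# observations that are complete for that candidate.
cors <- cor(toy[, c("TAXES", "Cars")], toy$GDP_CAPITA,
            use = "pairwise.complete.obs")

print(cors)
```

Because many of the variables are strongly right-skewed, passing `method = "spearman"` to `cor()` may give a more robust picture than the default Pearson coefficient.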
### 5.4 Variables selection
Besides the variables identified in the analysis above, we also consider the expenditure equation for GDP, GDP = C + I + G + (X - M). Simply put, GDP is the sum of consumption, investment, government expenditure, and net exports (exports minus imports). We then identify variables that might affect these components before deciding which factors to use.
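As a quick worked example of the expenditure equation, the sketch below adds up the components and divides by population to obtain a per capita figure. The function name `gdp_expenditure` and all numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```r
# GDP by the expenditure approach: C + I + G + (X - M).
gdp_expenditure <- function(consumption, investment, government,
                            exports, imports) {
  consumption + investment + government + (exports - imports)
}

# Hypothetical component values in arbitrary currency units.
gdp <- gdp_expenditure(
  consumption = 600,
  investment  = 200,
  government  = 250,
  exports     = 150,
  imports     = 100
)

# Dividing by a hypothetical population gives GDP per capita.
population <- 50
gdp_per_capita <- gdp / population

print(c(gdp = gdp, gdp_per_capita = gdp_per_capita))
```

Each component suggests candidate variables in the data set; for instance, municipal spending relates to G and company counts relate to I.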
summary(brazil)
## CITY STATE CAPITAL IBGE_RES_POP
## Length:5566 MG : 853 0:5539 Min. : 805
## Class :character SP : 645 1: 27 1st Qu.: 5231
## Mode :character RS : 497 Median : 10936
## BA : 416 Mean : 34296
## PR : 399 3rd Qu.: 23513
## SC : 294 Max. :11253503
## (Other):2462 NA's :5
## IBGE_RES_POP_BRAS IBGE_RES_POP_ESTR IBGE_DU IBGE_DU_URBAN
## Min. : 805 Min. : 0.00 Min. : 239 Min. : 60
## 1st Qu.: 5223 1st Qu.: 0.00 1st Qu.: 1572 1st Qu.: 874
## Median : 10934 Median : 0.00 Median : 3178 Median : 1850
## Mean : 34218 Mean : 77.56 Mean : 10308 Mean : 8864
## 3rd Qu.: 23397 3rd Qu.: 10.00 3rd Qu.: 6727 3rd Qu.: 4628
## Max. :11133776 Max. :119727.00 Max. :3576148 Max. :3548433
## NA's :5 NA's :5 NA's :7 NA's :7
## IBGE_DU_RURAL IBGE_POP IBGE_1 IBGE_1-4
## Min. : 3.0 Min. : 174 Min. : 0.0 Min. : 5
## 1st Qu.: 486.8 1st Qu.: 2802 1st Qu.: 38.0 1st Qu.: 158
## Median : 931.0 Median : 6177 Median : 92.0 Median : 377
## Mean : 1462.6 Mean : 27612 Mean : 383.5 Mean : 1546
## 3rd Qu.: 1831.2 3rd Qu.: 15306 3rd Qu.: 232.0 3rd Qu.: 952
## Max. :33809.0 Max. :10463636 Max. :129464.0 Max. :514794
## NA's :78 NA's :5 NA's :5 NA's :5
## IBGE_5-9 IBGE_10-14 Active_pop IBGE_60+
## Min. : 7 Min. : 12 Min. : 94 Min. : 29
## 1st Qu.: 220 1st Qu.: 260 1st Qu.: 1735 1st Qu.: 341
## Median : 516 Median : 589 Median : 3842 Median : 723
## Mean : 2071 Mean : 2383 Mean : 18223 Mean : 3006
## 3rd Qu.: 1301 3rd Qu.: 1479 3rd Qu.: 9633 3rd Qu.: 1725
## Max. :684443 Max. :783702 Max. :7058221 Max. :1293012
## NA's :5 NA's :5 NA's :5 NA's :5
## IBGE_PLANTED_AREA IBGE_CROP_PRODUCTION IDHM Ranking 2010 IDHM
## Min. : 0.0 Min. : 0 Min. : 1 Min. :0.4180
## 1st Qu.: 910.2 1st Qu.: 2328 1st Qu.:1391 1st Qu.:0.5990
## Median : 3471.5 Median : 13846 Median :2782 Median :0.6650
## Mean : 14180.2 Mean : 57389 Mean :2782 Mean :0.6592
## 3rd Qu.: 11173.2 3rd Qu.: 55594 3rd Qu.:4173 3rd Qu.:0.7180
## Max. :1205669.0 Max. :3274885 Max. :5565 Max. :0.8620
## NA's :5 NA's :5
## IDHM_Renda IDHM_Longevidade IDHM_Educacao LONG
## Min. :0.4000 Min. :0.6720 Min. :0.2070 Min. :-72.92
## 1st Qu.:0.5720 1st Qu.:0.7690 1st Qu.:0.4900 1st Qu.:-50.87
## Median :0.6540 Median :0.8080 Median :0.5600 Median :-46.53
## Mean :0.6429 Mean :0.8016 Mean :0.5591 Mean :-46.22
## 3rd Qu.:0.7070 3rd Qu.:0.8360 3rd Qu.:0.6310 3rd Qu.:-41.41
## Max. :0.8910 Max. :0.8940 Max. :0.8250 Max. : 51.47
## NA's :5 NA's :5 NA's :5
## LAT ALT PAY_TV FIXED_PHONES
## Min. :-33.688 Min. : 0.0 Min. : 1.0 Min. : 3
## 1st Qu.:-22.843 1st Qu.: 169.4 1st Qu.: 88.0 1st Qu.: 118
## Median :-18.107 Median : 406.5 Median : 247.0 Median : 328
## Mean :-16.455 Mean : 894.2 Mean : 3096.3 Mean : 6572
## 3rd Qu.: -8.491 3rd Qu.: 628.9 3rd Qu.: 815.8 3rd Qu.: 1151
## Max. : 4.585 Max. :874579.0 Max. :2047668.0 Max. :5543127
## NA's :6
## AREA REGIAO_TUR CATEGORIA_TUR
## Min. : 3.57 Corredores Das Águas: 59 A : 51
## 1st Qu.: 204.43 Vale Do Contestado : 45 B : 168
## Median : 415.81 Amazônia Atlântica : 40 C : 521
## Mean : 1515.52 Araguaia-Tocantins : 39 D :1889
## 3rd Qu.: 1026.38 Cariri : 37 E : 653
## Max. :159533.33 (Other) :3062 NA's:2284
## NA's :1 NA's :2284
## ESTIMATED_POP RURAL_URBAN GVA_AGROPEC GVA_INDUSTRY
## Min. : 786 Length:5566 Min. : 0 Min. : 1
## 1st Qu.: 5451 Class :character 1st Qu.: 4193 1st Qu.: 1725
## Median : 11591 Mode :character Median : 20432 Median : 7428
## Mean : 37452 Mean : 47285 Mean : 176050
## 3rd Qu.: 25303 3rd Qu.: 51227 3rd Qu.: 41240
## Max. :12176866 Max. :1402282 Max. :63306755
##
## GVA_SERVICES GVA_PUBLIC GVA_TOTAL TAXES
## Min. : 2 Min. : 7 Min. : 17 Min. : -14159
## 1st Qu.: 10107 1st Qu.: 17254 1st Qu.: 42223 1st Qu.: 1305
## Median : 31214 Median : 35838 Median : 119492 Median : 5108
## Mean : 489787 Mean : 123829 Mean : 833504 Mean : 118947
## 3rd Qu.: 115503 3rd Qu.: 89301 3rd Qu.: 314139 3rd Qu.: 22208
## Max. :464656988 Max. :41902893 Max. :569910503 Max. :117125387
##
## GDP POP_GDP GDP_CAPITA GVA_MAIN
## Min. : 15 Min. : 815 Min. : 3191 Length:5566
## 1st Qu.: 43706 1st Qu.: 5480 1st Qu.: 9062 Class :character
## Median : 125153 Median : 11584 Median : 15870 Mode :character
## Mean : 955185 Mean : 37018 Mean : 21132
## 3rd Qu.: 329764 3rd Qu.: 25098 3rd Qu.: 26156
## Max. :687035890 Max. :12038175 Max. :314638
##
## MUN_EXPENDIT COMP_TOT COMP_A COMP_B
## Min. :1.421e+06 Min. : 6.0 Min. : 0.00 Min. : 0.000
## 1st Qu.:1.573e+07 1st Qu.: 68.0 1st Qu.: 1.00 1st Qu.: 0.000
## Median :2.749e+07 Median : 162.0 Median : 2.00 Median : 0.000
## Mean :1.044e+08 Mean : 907.3 Mean : 18.27 Mean : 1.853
## 3rd Qu.:5.679e+07 3rd Qu.: 448.8 3rd Qu.: 8.00 3rd Qu.: 2.000
## Max. :4.577e+10 Max. :530446.0 Max. :1948.00 Max. :274.000
## NA's :1490
## COMP_C COMP_D COMP_E COMP_F
## Min. : 0.00 Min. : 0.0000 Min. : 0.000 Min. : 0.00
## 1st Qu.: 3.00 1st Qu.: 0.0000 1st Qu.: 0.000 1st Qu.: 1.00
## Median : 11.00 Median : 0.0000 Median : 0.000 Median : 4.00
## Mean : 73.49 Mean : 0.4265 Mean : 2.031 Mean : 43.29
## 3rd Qu.: 39.00 3rd Qu.: 0.0000 3rd Qu.: 1.000 3rd Qu.: 15.00
## Max. :31566.00 Max. :332.0000 Max. :657.000 Max. :25222.00
##
## COMP_G COMP_H COMP_I COMP_J
## Min. : 1.0 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 32.0 1st Qu.: 1.00 1st Qu.: 2.00 1st Qu.: 0.00
## Median : 75.0 Median : 7.00 Median : 7.00 Median : 1.00
## Mean : 348.2 Mean : 41.02 Mean : 55.91 Mean : 24.76
## 3rd Qu.: 199.8 3rd Qu.: 25.00 3rd Qu.: 24.00 3rd Qu.: 5.00
## Max. :150633.0 Max. :19515.00 Max. :29290.00 Max. :38720.00
##
## COMP_K COMP_L COMP_M COMP_N
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.: 0.00 1st Qu.: 1.00 1st Qu.: 1.00
## Median : 0.00 Median : 0.00 Median : 4.00 Median : 4.00
## Mean : 15.56 Mean : 15.15 Mean : 51.33 Mean : 83.76
## 3rd Qu.: 2.00 3rd Qu.: 3.00 3rd Qu.: 13.00 3rd Qu.: 14.00
## Max. :23738.00 Max. :14003.00 Max. :49181.00 Max. :76757.00
##
## COMP_O COMP_P COMP_Q COMP_R
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 2.00 1st Qu.: 2.00 1st Qu.: 1.00 1st Qu.: 0.00
## Median : 2.00 Median : 6.00 Median : 3.00 Median : 2.00
## Mean : 3.27 Mean : 30.98 Mean : 34.17 Mean : 12.19
## 3rd Qu.: 3.00 3rd Qu.: 17.00 3rd Qu.: 12.00 3rd Qu.: 6.00
## Max. :204.00 Max. :16030.00 Max. :22248.00 Max. :6687.00
##
## COMP_S COMP_T COMP_U HOTELS
## Min. : 0.00 Min. :0 Min. : 0.00000 Min. : 1.000
## 1st Qu.: 5.00 1st Qu.:0 1st Qu.: 0.00000 1st Qu.: 1.000
## Median : 12.00 Median :0 Median : 0.00000 Median : 1.000
## Mean : 51.64 Mean :0 Mean : 0.05031 Mean : 3.131
## 3rd Qu.: 31.00 3rd Qu.:0 3rd Qu.: 0.00000 3rd Qu.: 3.000
## Max. :24832.00 Max. :0 Max. :123.00000 Max. :97.000
## NA's :4679
## BEDS Pr_Agencies Pu_Agencies Pr_Bank
## Min. : 2.0 Min. : 0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.: 40.0 1st Qu.: 0.000 1st Qu.: 1.000 1st Qu.: 0.000
## Median : 82.0 Median : 1.000 Median : 2.000 Median : 1.000
## Mean : 257.5 Mean : 3.383 Mean : 2.829 Mean : 1.312
## 3rd Qu.: 200.0 3rd Qu.: 2.000 3rd Qu.: 2.000 3rd Qu.: 2.000
## Max. :13247.0 Max. :1693.000 Max. :626.000 Max. :83.000
## NA's :4679 NA's :2224 NA's :2224 NA's :2224
## Pu_Bank Pr_Assets Pu_Assets Cars
## Min. :0.00 Min. :0.000e+00 Min. :0.000e+00 Min. : 2
## 1st Qu.:1.00 1st Qu.:0.000e+00 1st Qu.:4.047e+07 1st Qu.: 602
## Median :2.00 Median :3.231e+07 Median :1.339e+08 Median : 1440
## Mean :1.58 Mean :9.180e+09 Mean :6.005e+09 Mean : 9866
## 3rd Qu.:2.00 3rd Qu.:1.148e+08 3rd Qu.:4.970e+08 3rd Qu.: 4089
## Max. :8.00 Max. :1.947e+13 Max. :8.016e+12 Max. :5740995
## NA's :2224 NA's :2224 NA's :2224 NA's :8
## Motorcycles Wheeled_tractor UBER MAC
## Min. : 4 Min. : 0.000 1 : 125 Min. : 1.000
## 1st Qu.: 591 1st Qu.: 0.000 NA's:5441 1st Qu.: 1.000
## Median : 1285 Median : 0.000 Median : 2.000
## Mean : 4881 Mean : 5.757 Mean : 4.277
## 3rd Qu.: 3298 3rd Qu.: 1.000 3rd Qu.: 3.000
## Max. :1134570 Max. :3236.000 Max. :130.000
## NA's :8 NA's :8 NA's :5400
## WAL-MART POST_OFFICES LOG_GDP_CAPITA
## Min. : 1.000 Min. : 1.000 Min. : 8.068
## 1st Qu.: 1.000 1st Qu.: 1.000 1st Qu.: 9.112
## Median : 1.000 Median : 1.000 Median : 9.672
## Mean : 2.059 Mean : 2.081 Mean : 9.697
## 3rd Qu.: 1.750 3rd Qu.: 2.000 3rd Qu.:10.172
## Max. :26.000 Max. :225.000 Max. :12.659
## NA's :5464 NA's :117
Raw Variables that we will consider using:
IBGE_RES_POP
Active_pop
IBGE_CROP_PRODUCTION
IDHM_Longevidade
IDHM_Educacao
IDHM_Renda
PAY_TV
FIXED_PHONES
AREA
GVA_AGROPEC
GVA_INDUSTRY
GVA_SERVICES
TAXES
GDP
POP_GDP
GDP_CAPITA
MUN_EXPENDIT
COMP_TOT
Cars
Motorcycles
Pr_Assets
Pu_Assets
We choose IBGE_RES_POP because it is the total resident population of each municipality, covering both Brazilian and foreign residents.
IBGE_RES_POP has 5 NA values, which could be because no population was recorded for those municipalities. We remove these rows, since it is hard to assign a value to them and dropping 5 of 5566 rows should have little impact on the analysis of GDP_CAPITA.
After removing these 5 rows, the number of NAs in other variables decreases too. This suggests that the same municipalities have NA values for several variables.
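The idea that missing values cluster in the same rows can be checked directly. The sketch below, on a made-up table (the data frame `toy` and its values are assumptions for illustration), counts how many NAs in each column sit in the rows where IBGE_RES_POP is missing, and how many remain once those rows are dropped.

```r
# Toy stand-in with made-up values; NAs overlap in rows 2 and 4.
toy <- data.frame(
  IBGE_RES_POP = c(805, NA, 10936, NA, 23513),
  IBGE_DU      = c(239, NA, 3178, NA, NA),
  AREA         = c(3.57, 204.4, NA, 415.8, 1026.4)
)

# Rows where the population is missing.
missing_pop <- is.na(toy$IBGE_RES_POP)

# NAs per column that fall inside those rows (removed by the filter).
shared_na <- colSums(is.na(toy[missing_pop, , drop = FALSE]))

# NAs per column that remain after dropping those rows.
remaining_na <- colSums(is.na(toy[!missing_pop, , drop = FALSE]))

print(shared_na)
print(remaining_na)
```

Columns with a large `shared_na` count lose most of their missing values as a side effect of the filter, which matches the drop seen in the summary.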
brazil <- brazil%>%
filter(!is.na(`IBGE_RES_POP`))
summary(brazil)
## CITY STATE CAPITAL IBGE_RES_POP
## Length:5561 MG : 853 0:5534 Min. : 805
## Class :character SP : 645 1: 27 1st Qu.: 5231
## Mode :character RS : 496 Median : 10936
## BA : 416 Mean : 34296
## PR : 399 3rd Qu.: 23513
## SC : 292 Max. :11253503
## (Other):2460
## IBGE_RES_POP_BRAS IBGE_RES_POP_ESTR IBGE_DU IBGE_DU_URBAN
## Min. : 805 Min. : 0.00 Min. : 239 Min. : 60
## 1st Qu.: 5223 1st Qu.: 0.00 1st Qu.: 1572 1st Qu.: 874
## Median : 10934 Median : 0.00 Median : 3178 Median : 1850
## Mean : 34218 Mean : 77.56 Mean : 10308 Mean : 8864
## 3rd Qu.: 23397 3rd Qu.: 10.00 3rd Qu.: 6727 3rd Qu.: 4628
## Max. :11133776 Max. :119727.00 Max. :3576148 Max. :3548433
## NA's :2 NA's :2
## IBGE_DU_RURAL IBGE_POP IBGE_1 IBGE_1-4
## Min. : 3.0 Min. : 174 Min. : 0.0 Min. : 5
## 1st Qu.: 486.8 1st Qu.: 2802 1st Qu.: 38.0 1st Qu.: 158
## Median : 931.0 Median : 6177 Median : 92.0 Median : 377
## Mean : 1462.6 Mean : 27612 Mean : 383.5 Mean : 1546
## 3rd Qu.: 1831.2 3rd Qu.: 15306 3rd Qu.: 232.0 3rd Qu.: 952
## Max. :33809.0 Max. :10463636 Max. :129464.0 Max. :514794
## NA's :73
## IBGE_5-9 IBGE_10-14 Active_pop IBGE_60+
## Min. : 7 Min. : 12 Min. : 94 Min. : 29
## 1st Qu.: 220 1st Qu.: 260 1st Qu.: 1735 1st Qu.: 341
## Median : 516 Median : 589 Median : 3842 Median : 723
## Mean : 2071 Mean : 2383 Mean : 18223 Mean : 3006
## 3rd Qu.: 1301 3rd Qu.: 1479 3rd Qu.: 9633 3rd Qu.: 1725
## Max. :684443 Max. :783702 Max. :7058221 Max. :1293012
##
## IBGE_PLANTED_AREA IBGE_CROP_PRODUCTION IDHM Ranking 2010 IDHM
## Min. : 0 Min. : 0 Min. : 1 Min. :0.4180
## 1st Qu.: 911 1st Qu.: 2333 1st Qu.:1391 1st Qu.:0.5990
## Median : 3473 Median : 13845 Median :2782 Median :0.6650
## Mean : 14171 Mean : 57356 Mean :2782 Mean :0.6592
## 3rd Qu.: 11171 3rd Qu.: 55490 3rd Qu.:4173 3rd Qu.:0.7180
## Max. :1205669 Max. :3274885 Max. :5565 Max. :0.8620
##
## IDHM_Renda IDHM_Longevidade IDHM_Educacao LONG
## Min. :0.4000 Min. :0.6720 Min. :0.2070 Min. :-72.92
## 1st Qu.:0.5720 1st Qu.:0.7690 1st Qu.:0.4900 1st Qu.:-50.87
## Median :0.6540 Median :0.8080 Median :0.5600 Median :-46.52
## Mean :0.6429 Mean :0.8016 Mean :0.5591 Mean :-46.21
## 3rd Qu.:0.7070 3rd Qu.:0.8360 3rd Qu.:0.6310 3rd Qu.:-41.41
## Max. :0.8910 Max. :0.8940 Max. :0.8250 Max. : 51.47
##
## LAT ALT PAY_TV FIXED_PHONES
## Min. :-33.688 Min. : 0.0 Min. : 1 Min. : 3
## 1st Qu.:-22.841 1st Qu.: 169.4 1st Qu.: 88 1st Qu.: 118
## Median :-18.097 Median : 406.5 Median : 247 Median : 328
## Mean :-16.450 Mean : 894.2 Mean : 3099 Mean : 6577
## 3rd Qu.: -8.490 3rd Qu.: 628.9 3rd Qu.: 816 3rd Qu.: 1151
## Max. : 4.585 Max. :874579.0 Max. :2047668 Max. :5543127
## NA's :1
## AREA REGIAO_TUR CATEGORIA_TUR
## Min. : 3.57 Corredores Das Águas: 59 A : 51
## 1st Qu.: 204.54 Vale Do Contestado : 45 B : 168
## Median : 415.86 Amazônia Atlântica : 40 C : 521
## Mean : 1515.03 Araguaia-Tocantins : 39 D :1889
## 3rd Qu.: 1025.73 Cariri : 37 E : 648
## Max. :159533.33 (Other) :3057 NA's:2284
## NA's :1 NA's :2284
## ESTIMATED_POP RURAL_URBAN GVA_AGROPEC GVA_INDUSTRY
## Min. : 786 Length:5561 Min. : 0 Min. : 1
## 1st Qu.: 5450 Class :character 1st Qu.: 4193 1st Qu.: 1724
## Median : 11591 Mode :character Median : 20434 Median : 7432
## Mean : 37477 Mean : 47277 Mean : 176171
## 3rd Qu.: 25311 3rd Qu.: 51238 3rd Qu.: 41026
## Max. :12176866 Max. :1402282 Max. :63306755
##
## GVA_SERVICES GVA_PUBLIC GVA_TOTAL TAXES
## Min. : 2 Min. : 7 Min. : 17 Min. : -14159
## 1st Qu.: 10112 1st Qu.: 17252 1st Qu.: 42253 1st Qu.: 1303
## Median : 31216 Median : 35747 Median : 119481 Median : 5108
## Mean : 490191 Mean : 123904 Mean : 834110 Mean : 119046
## 3rd Qu.: 115644 3rd Qu.: 89363 3rd Qu.: 314190 3rd Qu.: 22251
## Max. :464656988 Max. :41902893 Max. :569910503 Max. :117125387
##
## GDP POP_GDP GDP_CAPITA GVA_MAIN
## Min. : 15 Min. : 815 Min. : 3191 Length:5561
## 1st Qu.: 43645 1st Qu.: 5481 1st Qu.: 9062 Class :character
## Median : 125111 Median : 11584 Median : 15866 Mode :character
## Mean : 955869 Mean : 37043 Mean : 21125
## 3rd Qu.: 329780 3rd Qu.: 25114 3rd Qu.: 26157
## Max. :687035890 Max. :12038175 Max. :314638
##
## MUN_EXPENDIT COMP_TOT COMP_A COMP_B
## Min. :1.421e+06 Min. : 6 Min. : 0.00 Min. : 0.000
## 1st Qu.:1.573e+07 1st Qu.: 68 1st Qu.: 1.00 1st Qu.: 0.000
## Median :2.749e+07 Median : 163 Median : 2.00 Median : 0.000
## Mean :1.045e+08 Mean : 908 Mean : 18.28 Mean : 1.854
## 3rd Qu.:5.681e+07 3rd Qu.: 450 3rd Qu.: 8.00 3rd Qu.: 2.000
## Max. :4.577e+10 Max. :530446 Max. :1948.00 Max. :274.000
## NA's :1489
## COMP_C COMP_D COMP_E COMP_F
## Min. : 0.00 Min. : 0.0000 Min. : 0.000 Min. : 0.00
## 1st Qu.: 3.00 1st Qu.: 0.0000 1st Qu.: 0.000 1st Qu.: 1.00
## Median : 11.00 Median : 0.0000 Median : 0.000 Median : 4.00
## Mean : 73.55 Mean : 0.4267 Mean : 2.032 Mean : 43.31
## 3rd Qu.: 39.00 3rd Qu.: 0.0000 3rd Qu.: 1.000 3rd Qu.: 15.00
## Max. :31566.00 Max. :332.0000 Max. :657.000 Max. :25222.00
##
## COMP_G COMP_H COMP_I COMP_J
## Min. : 1.0 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 32.0 1st Qu.: 1.00 1st Qu.: 2.00 1st Qu.: 0.00
## Median : 75.0 Median : 7.00 Median : 7.00 Median : 1.00
## Mean : 348.5 Mean : 41.05 Mean : 55.96 Mean : 24.78
## 3rd Qu.: 200.0 3rd Qu.: 25.00 3rd Qu.: 24.00 3rd Qu.: 5.00
## Max. :150633.0 Max. :19515.00 Max. :29290.00 Max. :38720.00
##
## COMP_K COMP_L COMP_M COMP_N
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.: 0.00 1st Qu.: 1.00 1st Qu.: 1.00
## Median : 0.00 Median : 0.00 Median : 4.00 Median : 4.00
## Mean : 15.58 Mean : 15.16 Mean : 51.37 Mean : 83.82
## 3rd Qu.: 2.00 3rd Qu.: 3.00 3rd Qu.: 13.00 3rd Qu.: 14.00
## Max. :23738.00 Max. :14003.00 Max. :49181.00 Max. :76757.00
##
## COMP_O COMP_P COMP_Q COMP_R
## Min. : 1.000 Min. : 0 Min. : 0.0 Min. : 0.0
## 1st Qu.: 2.000 1st Qu.: 2 1st Qu.: 1.0 1st Qu.: 0.0
## Median : 2.000 Median : 6 Median : 3.0 Median : 2.0
## Mean : 3.272 Mean : 31 Mean : 34.2 Mean : 12.2
## 3rd Qu.: 3.000 3rd Qu.: 17 3rd Qu.: 12.0 3rd Qu.: 6.0
## Max. :204.000 Max. :16030 Max. :22248.0 Max. :6687.0
##
## COMP_S COMP_T COMP_U HOTELS
## Min. : 0.00 Min. :0 Min. : 0.00000 Min. : 1.000
## 1st Qu.: 5.00 1st Qu.:0 1st Qu.: 0.00000 1st Qu.: 1.000
## Median : 12.00 Median :0 Median : 0.00000 Median : 1.000
## Mean : 51.68 Mean :0 Mean : 0.05035 Mean : 3.131
## 3rd Qu.: 31.00 3rd Qu.:0 3rd Qu.: 0.00000 3rd Qu.: 3.000
## Max. :24832.00 Max. :0 Max. :123.00000 Max. :97.000
## NA's :4674
## BEDS Pr_Agencies Pu_Agencies Pr_Bank
## Min. : 2.0 Min. : 0.000 Min. : 0.00 Min. : 0.000
## 1st Qu.: 40.0 1st Qu.: 0.000 1st Qu.: 1.00 1st Qu.: 0.000
## Median : 82.0 Median : 1.000 Median : 2.00 Median : 1.000
## Mean : 257.5 Mean : 3.384 Mean : 2.83 Mean : 1.312
## 3rd Qu.: 200.0 3rd Qu.: 2.000 3rd Qu.: 2.00 3rd Qu.: 2.000
## Max. :13247.0 Max. :1693.000 Max. :626.00 Max. :83.000
## NA's :4674 NA's :2220 NA's :2220 NA's :2220
## Pu_Bank Pr_Assets Pu_Assets Cars
## Min. :0.00 Min. :0.000e+00 Min. :0.000e+00 Min. : 2
## 1st Qu.:1.00 1st Qu.:0.000e+00 1st Qu.:4.048e+07 1st Qu.: 602
## Median :2.00 Median :3.234e+07 Median :1.339e+08 Median : 1440
## Mean :1.58 Mean :9.183e+09 Mean :6.007e+09 Mean : 9873
## 3rd Qu.:2.00 3rd Qu.:1.149e+08 3rd Qu.:4.976e+08 3rd Qu.: 4095
## Max. :8.00 Max. :1.947e+13 Max. :8.016e+12 Max. :5740995
## NA's :2220 NA's :2220 NA's :2220 NA's :8
## Motorcycles Wheeled_tractor UBER MAC
## Min. : 4 Min. : 0.000 1 : 125 Min. : 1.000
## 1st Qu.: 591 1st Qu.: 0.000 NA's:5436 1st Qu.: 1.000
## Median : 1285 Median : 0.000 Median : 2.000
## Mean : 4885 Mean : 5.761 Mean : 4.277
## 3rd Qu.: 3299 3rd Qu.: 1.000 3rd Qu.: 3.000
## Max. :1134570 Max. :3236.000 Max. :130.000
## NA's :8 NA's :8 NA's :5395
## WAL-MART POST_OFFICES LOG_GDP_CAPITA
## Min. : 1.000 Min. : 1.000 Min. : 8.068
## 1st Qu.: 1.000 1st Qu.: 1.000 1st Qu.: 9.112
## Median : 1.000 Median : 1.000 Median : 9.672
## Mean : 2.059 Mean : 2.081 Mean : 9.697
## 3rd Qu.: 1.750 3rd Qu.: 2.000 3rd Qu.:10.172
## Max. :26.000 Max. :225.000 Max. :12.659
## NA's :5459 NA's :117
IBGE_CROP_PRODUCTION is chosen since it represents the earnings from crop production, and these earnings contribute to the economy. Agriculture is also widely reported to be one of the principal bases of Brazil’s economy.
##### IDHM_Longevidade
IDHM is the Human Development Index, which is calculated from IDHM_Longevidade, IDHM_Educacao, and IDHM_Renda.
A higher HDI value indicates longer life expectancy (IDHM_Longevidade), a higher education level (IDHM_Educacao), and higher income (IDHM_Renda). These may be associated with higher economic output, possibly because more people are able to work and contribute to the economy.
We will extract these three variables and analyse each of them separately.
There is 1 NA value in IDHM_Longevidade. As it is hard to assign a value to this NA, we will drop this observation.
brazil <- brazil%>%
filter(!is.na(IDHM_Longevidade))
PAY_TV and FIXED_PHONES are selected because they are luxury goods that only better-off households can afford. A municipality with a higher number of PAY_TV and/or FIXED_PHONES may have a higher GDP, although this could also be due to other factors. For now, we will include both PAY_TV and FIXED_PHONES in our analysis.
There are no NA values in PAY_TV and FIXED_PHONES.
A municipality with a bigger area might have more space for economic development, and could result in a change in GDP too.
There is 1 NA value for AREA, and this could be due to the city not being well developed enough for AREA to be calculated. However, it is hard to assign a value to this NA, therefore we will drop this NA value.
brazil <- brazil%>%
filter(!is.na(`AREA`))
MUN_EXPENDIT records municipal expenditure. Higher expenditure might reflect a better-off municipality, and expenditure might affect GDP.
However, MUN_EXPENDIT has 1489 NA values, which is a substantial share of the observations. We will not use this variable, as dropping those rows would remove a large number of observations.
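To put the 1489 missing values in proportion, the share of missing rows can be computed directly. The sketch below uses the row count (5561) and NA count reported by `summary(brazil)` above; the toy data frame and the `na_share` name are illustrative assumptions, not part of the original analysis.

```r
# Share of rows with a missing MUN_EXPENDIT, using the counts printed above
n_rows <- 5561
n_missing <- 1489
share_missing <- n_missing / n_rows
print(round(100 * share_missing, 1))   # percentage of rows that would be lost

# The same idea for every column at once, on a toy data frame (made-up values)
toy <- data.frame(a = c(1, NA, 3, NA), b = c(1, 2, 3, 4))
na_share <- colMeans(is.na(toy))       # proportion of NA per column
print(na_share)
```

Roughly a quarter of the municipalities would be lost by filtering on this column, which supports leaving it out.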
Owning a vehicle such as cars or motorcycles is part of private consumption, and it might have an impact on GDP.
There are 8 NA values for Cars. This could be because these cities do not have any cars, but we are unable to assign a value to them with confidence. Hence, we will drop the NA values.
brazil <- brazil%>%
filter(!is.na(`Cars`))
The summary above reports 8 NA values for Motorcycles. This could be because these cities do not have any motorcycles, but we are unable to assign a value to them with confidence. Hence, we will drop any remaining NA values.
brazil <- brazil%>%
filter(!is.na(`Motorcycles`))
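The separate filters above can also be expressed as a single step. The following is a minimal base R sketch on a toy data frame (made-up values, not the `brazil` data) that keeps only rows with no NA in a chosen set of columns; the `required` vector is an assumption for illustration.

```r
# Toy data: rows B and C have missing values in the columns of interest
toy <- data.frame(
  CITY = c("A", "B", "C", "D", "E"),
  AREA = c(10, NA, 30, 40, 50),
  Cars = c(5, 6, NA, 8, 9),
  Motorcycles = c(1, 2, NA, 4, 5)
)

required <- c("AREA", "Cars", "Motorcycles")

# complete.cases() is TRUE for rows with no NA in the selected columns
keep <- complete.cases(toy[, required])
toy_clean <- toy[keep, ]

print(nrow(toy) - nrow(toy_clean))   # number of rows removed
```

In the toy data the Cars and Motorcycles gaps sit in the same row, so filtering on one column also removes the NA in the other.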
summary(brazil)
## CITY STATE CAPITAL IBGE_RES_POP
## Length:5552 MG : 852 0:5525 Min. : 805
## Class :character SP : 645 1: 27 1st Qu.: 5236
## Mode :character RS : 496 Median : 10943
## BA : 416 Mean : 34330
## PR : 398 3rd Qu.: 23646
## SC : 292 Max. :11253503
## (Other):2453
## IBGE_RES_POP_BRAS IBGE_RES_POP_ESTR IBGE_DU IBGE_DU_URBAN
## Min. : 805 Min. : 0.00 Min. : 239 Min. : 60
## 1st Qu.: 5231 1st Qu.: 0.00 1st Qu.: 1574 1st Qu.: 875
## Median : 10942 Median : 0.00 Median : 3182 Median : 1852
## Mean : 34253 Mean : 77.68 Mean : 10319 Mean : 8875
## 3rd Qu.: 23522 3rd Qu.: 10.00 3rd Qu.: 6728 3rd Qu.: 4628
## Max. :11133776 Max. :119727.00 Max. :3576148 Max. :3548433
## NA's :2 NA's :2
## IBGE_DU_RURAL IBGE_POP IBGE_1 IBGE_1-4
## Min. : 3 Min. : 174 Min. : 0.0 Min. : 5
## 1st Qu.: 487 1st Qu.: 2803 1st Qu.: 38.0 1st Qu.: 158
## Median : 931 Median : 6183 Median : 92.0 Median : 377
## Mean : 1463 Mean : 27642 Mean : 383.9 Mean : 1547
## 3rd Qu.: 1833 3rd Qu.: 15308 3rd Qu.: 232.0 3rd Qu.: 952
## Max. :33809 Max. :10463636 Max. :129464.0 Max. :514794
## NA's :73
## IBGE_5-9 IBGE_10-14 Active_pop IBGE_60+
## Min. : 7 Min. : 12.0 Min. : 94 Min. : 29
## 1st Qu.: 220 1st Qu.: 259.0 1st Qu.: 1735 1st Qu.: 341
## Median : 517 Median : 589.5 Median : 3848 Median : 724
## Mean : 2073 Mean : 2385.3 Mean : 18243 Mean : 3010
## 3rd Qu.: 1301 3rd Qu.: 1479.0 3rd Qu.: 9634 3rd Qu.: 1726
## Max. :684443 Max. :783702.0 Max. :7058221 Max. :1293012
##
## IBGE_PLANTED_AREA IBGE_CROP_PRODUCTION IDHM Ranking 2010 IDHM
## Min. : 0.0 Min. : 0 Min. : 1 Min. :0.4180
## 1st Qu.: 912.8 1st Qu.: 2332 1st Qu.:1389 1st Qu.:0.5990
## Median : 3477.5 Median : 13859 Median :2780 Median :0.6650
## Mean : 14164.2 Mean : 57327 Mean :2781 Mean :0.6593
## 3rd Qu.: 11171.8 3rd Qu.: 55564 3rd Qu.:4172 3rd Qu.:0.7200
## Max. :1205669.0 Max. :3274885 Max. :5565 Max. :0.8620
##
## IDHM_Renda IDHM_Longevidade IDHM_Educacao LONG
## Min. :0.400 Min. :0.6720 Min. :0.2070 Min. :-72.92
## 1st Qu.:0.572 1st Qu.:0.7690 1st Qu.:0.4900 1st Qu.:-50.87
## Median :0.654 Median :0.8080 Median :0.5600 Median :-46.52
## Mean :0.643 Mean :0.8016 Mean :0.5592 Mean :-46.21
## 3rd Qu.:0.707 3rd Qu.:0.8360 3rd Qu.:0.6310 3rd Qu.:-41.41
## Max. :0.891 Max. :0.8940 Max. :0.8250 Max. : 51.47
##
## LAT ALT PAY_TV FIXED_PHONES
## Min. :-33.688 Min. : 0.0 Min. : 1 Min. : 3
## 1st Qu.:-22.843 1st Qu.: 170.1 1st Qu.: 88 1st Qu.: 119
## Median :-18.107 Median : 406.6 Median : 248 Median : 328
## Mean :-16.459 Mean : 895.1 Mean : 3103 Mean : 6587
## 3rd Qu.: -8.508 3rd Qu.: 629.2 3rd Qu.: 817 3rd Qu.: 1156
## Max. : 4.585 Max. :874579.0 Max. :2047668 Max. :5543127
## NA's :1
## AREA REGIAO_TUR CATEGORIA_TUR
## Min. : 3.57 Corredores Das Águas: 59 A : 51
## 1st Qu.: 204.70 Vale Do Contestado : 45 B : 168
## Median : 415.86 Amazônia Atlântica : 40 C : 521
## Mean : 1515.99 Araguaia-Tocantins : 39 D :1888
## 3rd Qu.: 1025.17 Cariri : 37 E : 648
## Max. :159533.33 (Other) :3056 NA's:2276
## NA's :2276
## ESTIMATED_POP RURAL_URBAN GVA_AGROPEC GVA_INDUSTRY
## Min. : 786 Length:5552 Min. : 0 Min. : 1
## 1st Qu.: 5455 Class :character 1st Qu.: 4194 1st Qu.: 1726
## Median : 11602 Mode :character Median : 20440 Median : 7451
## Mean : 37516 Mean : 47279 Mean : 176424
## 3rd Qu.: 25328 3rd Qu.: 51243 3rd Qu.: 41097
## Max. :12176866 Max. :1402282 Max. :63306755
##
## GVA_SERVICES GVA_PUBLIC GVA_TOTAL TAXES
## Min. : 2 Min. : 7 Min. : 17 Min. : -14159
## 1st Qu.: 10110 1st Qu.: 17251 1st Qu.: 42447 1st Qu.: 1302
## Median : 31242 Median : 35778 Median : 119676 Median : 5114
## Mean : 490859 Mean : 124014 Mean : 835285 Mean : 119215
## 3rd Qu.: 115678 3rd Qu.: 89378 3rd Qu.: 314463 3rd Qu.: 22289
## Max. :464656988 Max. :41902893 Max. :569910503 Max. :117125387
##
## GDP POP_GDP GDP_CAPITA GVA_MAIN
## Min. : 15 Min. : 815 Min. : 3191 Length:5552
## 1st Qu.: 43691 1st Qu.: 5492 1st Qu.: 9064 Class :character
## Median : 125473 Median : 11588 Median : 15877 Mode :character
## Mean : 957210 Mean : 37081 Mean : 21136
## 3rd Qu.: 330284 3rd Qu.: 25120 3rd Qu.: 26168
## Max. :687035890 Max. :12038175 Max. :314638
##
## MUN_EXPENDIT COMP_TOT COMP_A COMP_B
## Min. :1.421e+06 Min. : 6.0 Min. : 0.0 Min. : 0.000
## 1st Qu.:1.574e+07 1st Qu.: 68.0 1st Qu.: 1.0 1st Qu.: 0.000
## Median :2.749e+07 Median : 163.0 Median : 2.0 Median : 0.000
## Mean :1.046e+08 Mean : 909.2 Mean : 18.3 Mean : 1.857
## 3rd Qu.:5.680e+07 3rd Qu.: 450.0 3rd Qu.: 8.0 3rd Qu.: 2.000
## Max. :4.577e+10 Max. :530446.0 Max. :1948.0 Max. :274.000
## NA's :1486
## COMP_C COMP_D COMP_E COMP_F
## Min. : 0.00 Min. : 0.0000 Min. : 0.000 Min. : 0.00
## 1st Qu.: 3.00 1st Qu.: 0.0000 1st Qu.: 0.000 1st Qu.: 1.00
## Median : 11.00 Median : 0.0000 Median : 0.000 Median : 4.00
## Mean : 73.65 Mean : 0.4274 Mean : 2.033 Mean : 43.37
## 3rd Qu.: 40.00 3rd Qu.: 0.0000 3rd Qu.: 1.000 3rd Qu.: 15.00
## Max. :31566.00 Max. :332.0000 Max. :657.000 Max. :25222.00
##
## COMP_G COMP_H COMP_I COMP_J
## Min. : 1.0 Min. : 0.0 Min. : 0.00 Min. : 0.00
## 1st Qu.: 32.0 1st Qu.: 1.0 1st Qu.: 2.00 1st Qu.: 0.00
## Median : 75.0 Median : 7.0 Median : 7.00 Median : 1.00
## Mean : 348.9 Mean : 41.1 Mean : 56.03 Mean : 24.82
## 3rd Qu.: 200.2 3rd Qu.: 25.0 3rd Qu.: 24.00 3rd Qu.: 5.00
## Max. :150633.0 Max. :19515.0 Max. :29290.00 Max. :38720.00
##
## COMP_K COMP_L COMP_M COMP_N
## Min. : 0.0 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.0 1st Qu.: 0.00 1st Qu.: 1.00 1st Qu.: 1.00
## Median : 0.0 Median : 0.00 Median : 4.00 Median : 4.00
## Mean : 15.6 Mean : 15.19 Mean : 51.45 Mean : 83.95
## 3rd Qu.: 2.0 3rd Qu.: 3.00 3rd Qu.: 13.00 3rd Qu.: 14.00
## Max. :23738.0 Max. :14003.00 Max. :49181.00 Max. :76757.00
##
## COMP_O COMP_P COMP_Q COMP_R
## Min. : 1.000 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 2.000 1st Qu.: 2.00 1st Qu.: 1.00 1st Qu.: 0.00
## Median : 2.000 Median : 6.00 Median : 3.00 Median : 2.00
## Mean : 3.272 Mean : 31.03 Mean : 34.25 Mean : 12.21
## 3rd Qu.: 3.000 3rd Qu.: 17.00 3rd Qu.: 12.00 3rd Qu.: 6.00
## Max. :204.000 Max. :16030.00 Max. :22248.00 Max. :6687.00
##
## COMP_S COMP_T COMP_U HOTELS
## Min. : 0.00 Min. :0 Min. : 0.00000 Min. : 1.000
## 1st Qu.: 5.00 1st Qu.:0 1st Qu.: 0.00000 1st Qu.: 1.000
## Median : 13.00 Median :0 Median : 0.00000 Median : 1.000
## Mean : 51.74 Mean :0 Mean : 0.05043 Mean : 3.134
## 3rd Qu.: 31.00 3rd Qu.:0 3rd Qu.: 0.00000 3rd Qu.: 3.000
## Max. :24832.00 Max. :0 Max. :123.00000 Max. :97.000
## NA's :4667
## BEDS Pr_Agencies Pu_Agencies Pr_Bank
## Min. : 2.0 Min. : 0.000 Min. : 0.00 Min. : 0.000
## 1st Qu.: 40.0 1st Qu.: 0.000 1st Qu.: 1.00 1st Qu.: 0.000
## Median : 82.0 Median : 1.000 Median : 2.00 Median : 1.000
## Mean : 257.9 Mean : 3.386 Mean : 2.83 Mean : 1.313
## 3rd Qu.: 200.0 3rd Qu.: 2.000 3rd Qu.: 2.00 3rd Qu.: 2.000
## Max. :13247.0 Max. :1693.000 Max. :626.00 Max. :83.000
## NA's :4667 NA's :2214 NA's :2214 NA's :2214
## Pu_Bank Pr_Assets Pu_Assets Cars
## Min. :0.00 Min. :0.000e+00 Min. :0.000e+00 Min. : 2
## 1st Qu.:1.00 1st Qu.:0.000e+00 1st Qu.:4.047e+07 1st Qu.: 602
## Median :2.00 Median :3.231e+07 Median :1.339e+08 Median : 1440
## Mean :1.58 Mean :9.191e+09 Mean :6.012e+09 Mean : 9875
## 3rd Qu.:2.00 3rd Qu.:1.150e+08 3rd Qu.:4.950e+08 3rd Qu.: 4096
## Max. :8.00 Max. :1.947e+13 Max. :8.016e+12 Max. :5740995
## NA's :2214 NA's :2214 NA's :2214
## Motorcycles Wheeled_tractor UBER MAC
## Min. : 4 Min. : 0.000 1 : 125 Min. : 1.000
## 1st Qu.: 591 1st Qu.: 0.000 NA's:5427 1st Qu.: 1.000
## Median : 1286 Median : 0.000 Median : 2.000
## Mean : 4886 Mean : 5.762 Mean : 4.277
## 3rd Qu.: 3300 3rd Qu.: 1.000 3rd Qu.: 3.000
## Max. :1134570 Max. :3236.000 Max. :130.000
## NA's :5386
## WAL-MART POST_OFFICES LOG_GDP_CAPITA
## Min. : 1.000 Min. : 1.000 Min. : 8.068
## 1st Qu.: 1.000 1st Qu.: 1.000 1st Qu.: 9.112
## Median : 1.000 Median : 1.000 Median : 9.673
## Mean : 2.059 Mean : 2.082 Mean : 9.698
## 3rd Qu.: 1.750 3rd Qu.: 2.000 3rd Qu.:10.172
## Max. :26.000 Max. :225.000 Max. :12.659
## NA's :5450 NA's :113
Pr_Assets and Pu_Assets are considered because assets can earn interest and fund investment, and investment is a factor that affects GDP.
The summary above shows 2214 NA values for each of Pr_Assets and Pu_Assets. We will not use these two variables, as doing so would remove a large number of observations.
After removing all the NA values, we can compute a correlation matrix.
We will select our variables first.
brazil3 <- brazil%>%
dplyr::select("IBGE_RES_POP", "Active_pop", "IBGE_CROP_PRODUCTION", "IDHM_Longevidade", "IDHM_Educacao", "IDHM_Renda", "PAY_TV", "FIXED_PHONES", "AREA", "GVA_AGROPEC", "GVA_INDUSTRY", "GVA_SERVICES", "TAXES", "GDP", "POP_GDP", "GDP_CAPITA", "COMP_TOT", "Cars", "Motorcycles")
Before doing so, we will double check the NAs in the variables
summary(brazil3)
## IBGE_RES_POP Active_pop IBGE_CROP_PRODUCTION IDHM_Longevidade
## Min. : 805 Min. : 94 Min. : 0 Min. :0.6720
## 1st Qu.: 5236 1st Qu.: 1735 1st Qu.: 2332 1st Qu.:0.7690
## Median : 10943 Median : 3848 Median : 13859 Median :0.8080
## Mean : 34330 Mean : 18243 Mean : 57327 Mean :0.8016
## 3rd Qu.: 23646 3rd Qu.: 9634 3rd Qu.: 55564 3rd Qu.:0.8360
## Max. :11253503 Max. :7058221 Max. :3274885 Max. :0.8940
## IDHM_Educacao IDHM_Renda PAY_TV FIXED_PHONES
## Min. :0.2070 Min. :0.400 Min. : 1 Min. : 3
## 1st Qu.:0.4900 1st Qu.:0.572 1st Qu.: 88 1st Qu.: 119
## Median :0.5600 Median :0.654 Median : 248 Median : 328
## Mean :0.5592 Mean :0.643 Mean : 3103 Mean : 6587
## 3rd Qu.:0.6310 3rd Qu.:0.707 3rd Qu.: 817 3rd Qu.: 1156
## Max. :0.8250 Max. :0.891 Max. :2047668 Max. :5543127
## AREA GVA_AGROPEC GVA_INDUSTRY GVA_SERVICES
## Min. : 3.57 Min. : 0 Min. : 1 Min. : 2
## 1st Qu.: 204.70 1st Qu.: 4194 1st Qu.: 1726 1st Qu.: 10110
## Median : 415.86 Median : 20440 Median : 7451 Median : 31242
## Mean : 1515.99 Mean : 47279 Mean : 176424 Mean : 490859
## 3rd Qu.: 1025.17 3rd Qu.: 51243 3rd Qu.: 41097 3rd Qu.: 115678
## Max. :159533.33 Max. :1402282 Max. :63306755 Max. :464656988
## TAXES GDP POP_GDP GDP_CAPITA
## Min. : -14159 Min. : 15 Min. : 815 Min. : 3191
## 1st Qu.: 1302 1st Qu.: 43691 1st Qu.: 5492 1st Qu.: 9064
## Median : 5114 Median : 125473 Median : 11588 Median : 15877
## Mean : 119215 Mean : 957210 Mean : 37081 Mean : 21136
## 3rd Qu.: 22289 3rd Qu.: 330284 3rd Qu.: 25120 3rd Qu.: 26168
## Max. :117125387 Max. :687035890 Max. :12038175 Max. :314638
## COMP_TOT Cars Motorcycles
## Min. : 6.0 Min. : 2 Min. : 4
## 1st Qu.: 68.0 1st Qu.: 602 1st Qu.: 591
## Median : 163.0 Median : 1440 Median : 1286
## Mean : 909.2 Mean : 9875 Mean : 4886
## 3rd Qu.: 450.0 3rd Qu.: 4096 3rd Qu.: 3300
## Max. :530446.0 Max. :5740995 Max. :1134570
After confirming there are no NA values, we can now plot the correlation matrix
corrplot(cor(brazil3), diag=FALSE, order="alphabet", tl.pos="td", tl.cex=0.5,number.cex=0.4, method="number", type="upper")
As seen in the correlation plot, many variables are highly correlated. To reduce this, we will derive new variables.
In addition, since IBGE_RES_POP and Active_pop have a correlation value of 1, we conclude that these two variables carry almost the same information. Both measure the population of the municipality, but we choose Active_pop because it is the economically active group, which earns income, spends income, and pays taxes. It contributes most to the economy and is hence likely to have the stronger relationship with GDP_CAPITA.
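Highly correlated pairs can also be listed programmatically instead of being read off the plot. This is a minimal sketch on simulated data; the 0.8 cut-off and the simulated columns are assumptions for illustration only.

```r
set.seed(1)
n <- 200
pop <- rexp(n, rate = 1 / 10000)

# Simulated stand-ins: Active_pop is built to track IBGE_RES_POP closely
toy <- data.frame(
  IBGE_RES_POP = pop,
  Active_pop = 0.5 * pop + rnorm(n, sd = 50),
  AREA = runif(n, 10, 1000)
)

cm <- cor(toy)

# Look only at the upper triangle so that each pair is reported once
idx <- which(upper.tri(cm) & abs(cm) > 0.8, arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r = cm[idx]
)
print(high_pairs)
```

Applied to `brazil3`, the same pattern would give a table of pairs to review before deciding which variable of each pair to keep.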
Since GDP_CAPITA is GDP divided by population, it is already scaled by population. For consistency, we will also divide several other variables by population.
We will divide PAY_TV by population to obtain the approximate number of PAY_TV per person.
brazil <- brazil %>%
mutate(PAY_TV_p = PAY_TV/POP_GDP)
We will divide FIXED_PHONES by population to obtain the approximate number of fixed phones per person.
brazil <- brazil %>%
mutate(FIXED_PHONES_p = FIXED_PHONES/POP_GDP)
We will divide Cars by population to obtain the approximate number of cars per person.
brazil <- brazil %>%
mutate(Cars_p =Cars/POP_GDP)
We will divide Motorcycles by population to obtain the approximate number of motorcycles per person.
brazil <- brazil %>%
mutate(Motorcycles_p = Motorcycles/POP_GDP)
GVA (Gross Value Added) tells us how much value is added or lost in a municipality, and it is used in the calculation of GDP. We divide GVA_AGROPEC by population to obtain the approximate gross value added per person.
brazil <- brazil %>%
mutate(GVA_AGROPEC_p = GVA_AGROPEC/POP_GDP)
We divide GVA_INDUSTRY by population to obtain the approximate gross value added per person.
brazil <- brazil %>%
mutate(GVA_INDUSTRY_p = GVA_INDUSTRY/POP_GDP)
We divide GVA_SERVICES by population to obtain the approximate gross value added per person.
brazil <- brazil %>%
mutate(GVA_SERVICES_p = GVA_SERVICES/POP_GDP)
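The per-capita variables above all follow the same pattern, so they can be derived in one loop. The sketch below is a base R illustration on a toy data frame with made-up values; the `count_vars` vector is an assumption, and the `_p` suffix follows the naming used above.

```r
# Toy data standing in for `brazil`
toy <- data.frame(
  POP_GDP = c(1000, 2000, 4000),
  PAY_TV = c(10, 50, 400),
  Cars = c(100, 500, 1000)
)

# Columns to be expressed per person
count_vars <- c("PAY_TV", "Cars")

for (v in count_vars) {
  # New column name: original name plus the "_p" suffix
  toy[[paste0(v, "_p")]] <- toy[[v]] / toy$POP_GDP
}

print(toy)
```

Keeping the list of variables in one vector makes it easier to see at a glance which columns have been scaled by population.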
We will also derive new variables such as pop_density. The higher the population density of a city, the more spending is likely to take place there, which may raise GDP_CAPITA. pop_density is calculated as population divided by area.
##### pop_density
brazil <- brazil %>%
mutate(pop_density = POP_GDP/AREA)
We will also derive a tax-to-GDP ratio, calculated as TAXES divided by GDP. It measures the size of tax revenue relative to the size of the economy. A higher ratio may suggest a greater ability to sustain economic growth.
##### tax_to_gdp
brazil <- brazil %>%
mutate(tax_to_gdp = TAXES/GDP)
Select variables first
brazil4 <- brazil%>%
dplyr::select("Active_pop", "IBGE_CROP_PRODUCTION", "IDHM_Longevidade", "IDHM_Educacao", "IDHM_Renda", "PAY_TV_p", "FIXED_PHONES_p", "GVA_AGROPEC_p", "GVA_INDUSTRY_p", "GVA_SERVICES_p", "tax_to_gdp", "GDP_CAPITA", "COMP_TOT", "Cars_p", "Motorcycles_p", "pop_density")
We will again confirm that there are no NA values in our data set.
summary(brazil4)
## Active_pop IBGE_CROP_PRODUCTION IDHM_Longevidade IDHM_Educacao
## Min. : 94 Min. : 0 Min. :0.6720 Min. :0.2070
## 1st Qu.: 1735 1st Qu.: 2332 1st Qu.:0.7690 1st Qu.:0.4900
## Median : 3848 Median : 13859 Median :0.8080 Median :0.5600
## Mean : 18243 Mean : 57327 Mean :0.8016 Mean :0.5592
## 3rd Qu.: 9634 3rd Qu.: 55564 3rd Qu.:0.8360 3rd Qu.:0.6310
## Max. :7058221 Max. :3274885 Max. :0.8940 Max. :0.8250
## IDHM_Renda PAY_TV_p FIXED_PHONES_p GVA_AGROPEC_p
## Min. :0.400 Min. :0.000196 Min. :0.0004817 Min. : 0.0000
## 1st Qu.:0.572 1st Qu.:0.011429 1st Qu.:0.0134803 1st Qu.: 0.3543
## Median :0.654 Median :0.024615 Median :0.0382001 Median : 1.4255
## Mean :0.643 Mean :0.037121 Mean :0.0605970 Mean : 3.7476
## 3rd Qu.:0.707 3rd Qu.:0.048320 3rd Qu.:0.0854910 3rd Qu.: 4.7287
## Max. :0.891 Max. :0.480541 Max. :1.0372617 Max. :121.0575
## GVA_INDUSTRY_p GVA_SERVICES_p tax_to_gdp GDP_CAPITA
## Min. : 0.0001 Min. : 0.00049 Min. : -2.0920 Min. : 3191
## 1st Qu.: 0.2607 1st Qu.: 1.61815 1st Qu.: 0.0297 1st Qu.: 9064
## Median : 0.7919 Median : 3.52467 Median : 0.0527 Median : 15877
## Mean : 3.3100 Mean : 5.61031 Mean : 8.5631 Mean : 21136
## 3rd Qu.: 2.5950 3rd Qu.: 7.73471 3rd Qu.: 0.0973 3rd Qu.: 26168
## Max. :254.9198 Max. :108.80127 Max. :324.6684 Max. :314638
## COMP_TOT Cars_p Motorcycles_p pop_density
## Min. : 6.0 Min. :0.0000529 Min. :0.0001059 Min. : 0.166
## 1st Qu.: 68.0 1st Qu.:0.0627272 1st Qu.:0.0872258 1st Qu.: 11.946
## Median : 163.0 Median :0.1731448 Median :0.1234463 Median : 25.309
## Mean : 909.2 Mean :0.1918811 Mean :0.1340853 Mean : 117.472
## 3rd Qu.: 450.0 3rd Qu.:0.3117910 3rd Qu.:0.1693006 3rd Qu.: 55.477
## Max. :530446.0 Max. :0.6440065 Max. :0.5500341 Max. :13533.497
corrplot(cor(brazil4), diag=FALSE, order="alphabet", tl.pos="td", tl.cex=0.5,number.cex=0.5, method="number", type="upper")
corrplot(cor(brazil4), diag=FALSE, order="alphabet", tl.pos="td", tl.cex=0.5,number.cex=0.5, method="square", type="upper")
We ordered the variables alphabetically for easy identification. From the correlation plot, we see that Active_pop is highly correlated with COMP_TOT (correlation value = 0.97). In view of this, we should include only one of them in the subsequent model building, so COMP_TOT is excluded.
corrplot(cor(brazil4[,c(1:12, 14:16)]), diag=FALSE, order="alphabet", tl.pos="td", tl.cex=0.5,number.cex=0.5, method="number", type="upper")
corrplot(cor(brazil4[,c(1:12,14:16)]), diag=FALSE, order="alphabet", tl.pos="td", tl.cex=0.5,number.cex=0.5, method="square", type="upper")
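The calls above exclude COMP_TOT by its column position (13). Dropping it by name is less fragile if the column order ever changes. A minimal sketch on a toy data frame (made-up values):

```r
# Toy data with the same idea: one column to be left out of the correlation matrix
toy <- data.frame(
  Active_pop = c(1, 2, 3, 4),
  COMP_TOT = c(2, 4, 6, 9),
  GDP_CAPITA = c(5, 3, 8, 1)
)

# Keep every column except COMP_TOT, selected by name rather than by index
keep_vars <- setdiff(names(toy), "COMP_TOT")
cm <- cor(toy[, keep_vars])

print(colnames(cm))
```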
Having selected the variables we need, we can now build the model.
We will fit a linear model of GDP_CAPITA on all the independent variables identified previously.
Before joining, we found that the municipality names in name_muni (in mun) and CITY (in brazil) are not written consistently. We will therefore convert both variables to upper case.
mun$name_muni <- toupper(mun$name_muni)
brazil$CITY <- toupper(brazil$CITY)
brazil_mun <- right_join(mun, brazil, by=c("name_muni"="CITY", "abbrev_state"="STATE"))
## Warning: Column `abbrev_state`/`STATE` joining character vector and factor,
## coercing into character vector
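Before relying on the join, it can help to list the keys in one table that have no partner in the other, since such rows would have no boundary to map. The sketch below uses two small made-up tables in base R; the column names mirror `mun` and `brazil`, but the values and the `|` separator are illustrative assumptions.

```r
# Toy stand-ins for the boundary table and the attribute table
mun_toy <- data.frame(
  name_muni = c("Campinas", "Rio De Janeiro", "Bonito"),
  abbrev_state = c("SP", "RJ", "MS"),
  stringsAsFactors = FALSE
)
brazil_toy <- data.frame(
  CITY = c("CAMPINAS", "Rio de Janeiro", "Bonito"),
  STATE = c("SP", "RJ", "BA"),
  GDP_CAPITA = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Build a combined key (upper-cased name plus state) for each table
key_mun <- paste(toupper(mun_toy$name_muni), mun_toy$abbrev_state, sep = "|")
key_brazil <- paste(toupper(brazil_toy$CITY), brazil_toy$STATE, sep = "|")

# Keys present in the attribute table but missing from the boundary table
unmatched <- setdiff(key_brazil, key_mun)
print(unmatched)
```

Here the same city name appears in two different states, which is why the join above uses the state abbreviation as a second key.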
Check for missing geometries
any(is.na(st_dimension(brazil_mun)))
## [1] FALSE
qtm(brazil_mun, "GDP_CAPITA", border=NULL, scale=0.6) + tm_legend(main.title="GDP_CAPITA in 2016", main.title.position="centre")
Select variables to use first
brazil6 <- brazil%>%
dplyr::select("Active_pop", "IBGE_CROP_PRODUCTION", "IDHM_Longevidade", "IDHM_Educacao", "IDHM_Renda", "PAY_TV_p", "FIXED_PHONES_p", "GVA_AGROPEC_p", "GVA_INDUSTRY_p", "GVA_SERVICES_p", "tax_to_gdp", "GDP_CAPITA", "Cars_p", "Motorcycles_p", "pop_density")
Before we conduct the linear regression, we will plot histograms of each of the variables, to have an idea of the distribution of the variables.
Active_pop <- ggplot(data=brazil6, aes(x=Active_pop))+
geom_histogram(bins=25, color="black", fill="light blue")
IBGE_CROP_PRODUCTION <- ggplot(data=brazil6, aes(x=IBGE_CROP_PRODUCTION))+
geom_histogram(bins=25, color="black", fill="light blue")
IDHM_Longevidade <- ggplot(data=brazil6, aes(x=IDHM_Longevidade))+
geom_histogram(bins=25, color="black", fill="light blue")
IDHM_Educacao <- ggplot(data=brazil6, aes(x=IDHM_Educacao))+
geom_histogram(bins=25, color="black", fill="light blue")
IDHM_Renda <- ggplot(data=brazil6, aes(x=IDHM_Renda))+
geom_histogram(bins=25, color="black", fill="light blue")
PAY_TV_p <- ggplot(data=brazil6, aes(x=PAY_TV_p))+
geom_histogram(bins=25, color="black", fill="light blue")
FIXED_PHONES_p <- ggplot(data=brazil6, aes(x=FIXED_PHONES_p))+
geom_histogram(bins=25, color="black", fill="light blue")
GVA_AGROPEC_p <- ggplot(data=brazil6, aes(x=GVA_AGROPEC_p))+
geom_histogram(bins=25, color="black", fill="light blue")
GVA_INDUSTRY_p <- ggplot(data=brazil6, aes(x=GVA_INDUSTRY_p))+
geom_histogram(bins=25, color="black", fill="light blue")
GVA_SERVICES_p <- ggplot(data=brazil6, aes(x=GVA_SERVICES_p))+
geom_histogram(bins=25, color="black", fill="light blue")
tax_to_gdp <- ggplot(data=brazil6, aes(x=tax_to_gdp))+
geom_histogram(bins=25, color="black", fill="light blue")
GDP_CAPITA <- ggplot(data=brazil6, aes(x=GDP_CAPITA))+
geom_histogram(bins=25, color="black", fill="light blue")
Cars_p <- ggplot(data=brazil6, aes(x=Cars_p))+
geom_histogram(bins=25, color="black", fill="light blue")
Motorcycles_p <- ggplot(data=brazil6, aes(x=Motorcycles_p))+
geom_histogram(bins=25, color="black", fill="light blue")
pop_density <- ggplot(data=brazil6, aes(x=pop_density))+
geom_histogram(bins=25, color="black", fill="light blue")
ggarrange(Active_pop, IBGE_CROP_PRODUCTION, IDHM_Longevidade, IDHM_Educacao, IDHM_Renda, PAY_TV_p, FIXED_PHONES_p, GVA_AGROPEC_p, GVA_INDUSTRY_p, GVA_SERVICES_p, tax_to_gdp, GDP_CAPITA, Cars_p, Motorcycles_p, pop_density, ncol=3, nrow=6)
We can tell from the histograms that most of the variables are skewed, either left or right. Some, however, resemble a normal distribution, for example IDHM_Longevidade, IDHM_Educacao, IDHM_Renda, and Motorcycles_p (even though it is slightly right skewed).
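The visual impression of skewness can be backed by a number. The sketch below defines a simple moment-based sample skewness (the `skewness` helper is written here for illustration and is not taken from any of the packages loaded above) and applies it to simulated data: values near 0 indicate a symmetric shape, and positive values indicate a right tail.

```r
# Moment-based sample skewness: third central moment over variance^(3/2)
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

set.seed(42)
symmetric <- rnorm(5000)      # roughly symmetric
right_skewed <- rexp(5000)    # long right tail

print(round(c(symmetric = skewness(symmetric),
              right_skewed = skewness(right_skewed)), 2))
```

The same function could be applied to each column of `brazil6` with `sapply()` to rank the variables by skewness.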
brazil_lm <- lm(GDP_CAPITA~., data=brazil6)
summary(brazil_lm)
##
## Call:
## lm(formula = GDP_CAPITA ~ ., data = brazil6)
##
## Residuals:
## Min 1Q Median 3Q Max
## -31274 -2969 -1027 1068 260244
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -7.692e+03 3.134e+03 -2.455 0.0141 *
## Active_pop -1.640e-03 1.027e-03 -1.598 0.1102
## IBGE_CROP_PRODUCTION 4.831e-03 8.530e-04 5.664 1.55e-08 ***
## IDHM_Longevidade -1.814e+03 4.676e+03 -0.388 0.6980
## IDHM_Educacao 2.279e+03 2.217e+03 1.028 0.3040
## IDHM_Renda 2.399e+04 3.961e+03 6.057 1.48e-09 ***
## PAY_TV_p 3.089e+03 3.702e+03 0.834 0.4041
## FIXED_PHONES_p 6.257e+03 3.111e+03 2.011 0.0443 *
## GVA_AGROPEC_p 9.139e+02 2.052e+01 44.540 < 2e-16 ***
## GVA_INDUSTRY_p 1.136e+03 1.234e+01 92.086 < 2e-16 ***
## GVA_SERVICES_p 9.654e+02 2.191e+01 44.071 < 2e-16 ***
## tax_to_gdp 1.097e+01 4.404e+00 2.491 0.0128 *
## Cars_p 3.006e+03 1.796e+03 1.674 0.0942 .
## Motorcycles_p -3.573e+03 1.796e+03 -1.989 0.0468 *
## pop_density 4.068e-01 2.155e-01 1.887 0.0592 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8386 on 5537 degrees of freedom
## Multiple R-squared: 0.8304, Adjusted R-squared: 0.8299
## F-statistic: 1936 on 14 and 5537 DF, p-value: < 2.2e-16
AIC(brazil_lm)
## [1] 116090
BIC(brazil_lm)
## [1] 116195.9
With reference to the report above, it is clear that not all the independent variables are statistically significant. We will revise the model by removing those variables which are not statistically significant (p > 0.05).
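The significant terms can also be picked out of the coefficient table rather than copied by hand. This is a sketch on simulated data (the variables `x1`, `x2`, `x3` and `y` are made up); it applies the same p < 0.05 rule in a single pass, which is a simple screening rule rather than a full model selection procedure.

```r
set.seed(7)
n <- 500
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- 3 * toy$x1 - 2 * toy$x2 + rnorm(n)   # x3 has no true effect

fit <- lm(y ~ ., data = toy)

# Coefficient table: one row per term, p-values in the "Pr(>|t|)" column
coefs <- summary(fit)$coefficients
p_values <- coefs[-1, "Pr(>|t|)"]             # drop the intercept row
keep <- names(p_values)[p_values < 0.05]

# Refit using only the retained terms
fit2 <- lm(reformulate(keep, response = "y"), data = toy)
print(keep)
```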
brazil_lm2 <- lm(GDP_CAPITA~ IBGE_CROP_PRODUCTION + IDHM_Renda + FIXED_PHONES_p + GVA_AGROPEC_p + GVA_INDUSTRY_p + GVA_SERVICES_p + tax_to_gdp + Motorcycles_p, data=brazil6)
summary(brazil_lm2)
##
## Call:
## lm(formula = GDP_CAPITA ~ IBGE_CROP_PRODUCTION + IDHM_Renda +
## FIXED_PHONES_p + GVA_AGROPEC_p + GVA_INDUSTRY_p + GVA_SERVICES_p +
## tax_to_gdp + Motorcycles_p, data = brazil6)
##
## Residuals:
## Min 1Q Median 3Q Max
## -31029 -2986 -1077 1048 260099
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.056e+04 1.293e+03 -8.170 3.78e-16 ***
## IBGE_CROP_PRODUCTION 4.437e-03 8.406e-04 5.279 1.35e-07 ***
## IDHM_Renda 2.909e+04 2.297e+03 12.664 < 2e-16 ***
## FIXED_PHONES_p 8.901e+03 2.696e+03 3.302 0.000966 ***
## GVA_AGROPEC_p 9.133e+02 2.031e+01 44.974 < 2e-16 ***
## GVA_INDUSTRY_p 1.136e+03 1.232e+01 92.252 < 2e-16 ***
## GVA_SERVICES_p 9.694e+02 2.168e+01 44.725 < 2e-16 ***
## tax_to_gdp 1.105e+01 4.400e+00 2.510 0.012094 *
## Motorcycles_p -3.852e+03 1.776e+03 -2.169 0.030108 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8389 on 5543 degrees of freedom
## Multiple R-squared: 0.8301, Adjusted R-squared: 0.8298
## F-statistic: 3385 on 8 and 5543 DF, p-value: < 2.2e-16
ols_regress(brazil_lm2)
## Model Summary
## --------------------------------------------------------------------
## R 0.911 RMSE 8388.797
## R-Squared 0.830 Coef. Var 39.689
## Adj. R-Squared 0.830 MSE 70371919.117
## Pred R-Squared 0.828 MAE 3618.252
## --------------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------------------------
## Regression 1.905457e+12 8 238182096351.380 3384.618 0.0000
## Residual 390071547665.115 5543 70371919.117
## Total 2.295528e+12 5551
## -------------------------------------------------------------------------------------
##
## Parameter Estimates
## ------------------------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## ------------------------------------------------------------------------------------------------------------
## (Intercept) -10563.125 1292.878 -8.170 0.000 -13097.672 -8028.578
## IBGE_CROP_PRODUCTION 0.004 0.001 0.032 5.279 0.000 0.003 0.006
## IDHM_Renda 29090.224 2297.138 0.115 12.664 0.000 24586.934 33593.514
## FIXED_PHONES_p 8901.109 2695.720 0.030 3.302 0.001 3616.440 14185.778
## GVA_AGROPEC_p 913.327 20.308 0.287 44.974 0.000 873.516 953.138
## GVA_INDUSTRY_p 1136.382 12.318 0.553 92.252 0.000 1112.233 1160.530
## GVA_SERVICES_p 969.445 21.676 0.330 44.725 0.000 926.952 1011.938
## tax_to_gdp 11.046 4.400 0.014 2.510 0.012 2.419 19.673
## Motorcycles_p -3851.553 1775.544 -0.013 -2.169 0.030 -7332.315 -370.792
## ------------------------------------------------------------------------------------------------------------
AIC(brazil_lm2)
## [1] 116087.7
BIC(brazil_lm2)
## [1] 116153.9
Adjusted R2 = 0.8298
AIC = 116087.7
BIC = 116153.9
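For reference, both criteria are computed from the log-likelihood: AIC = -2 logLik + 2k and BIC = -2 logLik + k log(n), where k counts the estimated parameters (the coefficients plus the error variance). The sketch below checks this on a small simulated model; the data are made up for illustration.

```r
set.seed(3)
toy <- data.frame(x = rnorm(100))
toy$y <- 1 + 2 * toy$x + rnorm(100)

fit <- lm(y ~ x, data = toy)

ll <- logLik(fit)
k <- attr(ll, "df")   # intercept, slope and error variance
n <- nobs(fit)

aic_manual <- -2 * as.numeric(ll) + 2 * k
bic_manual <- -2 * as.numeric(ll) + log(n) * k

print(c(manual = aic_manual, builtin = AIC(fit)))
print(c(manual = bic_manual, builtin = BIC(fit)))
```

For brazil_lm2, k = 10 and n = 5552, so BIC should exceed AIC by k(log(n) - 2), about 66.2, which agrees with the two values reported above.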
In order to use this model for the rest of our analysis, we need to check the assumptions of multiple linear regression.
##### 6.2.1 Check for multicollinearity
ols_vif_tol(brazil_lm)
## Variables Tolerance VIF
## 1 Active_pop 0.7651375 1.306955
## 2 IBGE_CROP_PRODUCTION 0.7928779 1.261228
## 3 IDHM_Longevidade 0.2900536 3.447638
## 4 IDHM_Educacao 0.2958672 3.379894
## 5 IDHM_Renda 0.1240375 8.062076
## 6 PAY_TV_p 0.5831230 1.714904
## 7 FIXED_PHONES_p 0.2842390 3.518166
## 8 GVA_AGROPEC_p 0.7358332 1.359004
## 9 GVA_INDUSTRY_p 0.8496988 1.176888
## 10 GVA_SERVICES_p 0.5511510 1.814385
## 11 tax_to_gdp 0.9731863 1.027552
## 12 Cars_p 0.2071776 4.826778
## 13 Motorcycles_p 0.8815642 1.134347
## 14 pop_density 0.7243592 1.380531
Since the VIFs of all independent variables are less than 10, there is no sign of serious multicollinearity among them. The highest value belongs to IDHM_Renda (VIF = 8.06), which is noticeably correlated with the other predictors but still below the usual threshold of 10, so it is not a significant cause for concern.
Next, we test the assumption of linearity and additivity of the relationship between the dependent and independent variables
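To make the tolerance and VIF columns above more concrete, here is a minimal sketch of how they can be computed by hand: each predictor is regressed on the remaining predictors, and VIF = 1 / (1 - R2) of that auxiliary regression. The helper `vif_manual()` is a hypothetical name introduced for illustration, and the built-in `mtcars` data stands in for the Brazil data.

```r
# Illustrative only: uses the built-in mtcars data, not the Brazil data set.
# vif_manual() is a hypothetical helper, not part of olsrr.
vif_manual <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    # Auxiliary regression of predictor p on all the other predictors
    aux <- lm(reformulate(others, response = p), data = data)
    # VIF is the reciprocal of the unexplained share of variance
    1 / (1 - summary(aux)$r.squared)
  })
}

vifs <- vif_manual(mtcars, c("wt", "hp", "disp"))
tolerance <- 1 / vifs  # tolerance is simply the reciprocal of the VIF
print(round(cbind(Tolerance = tolerance, VIF = vifs), 3))
```

A VIF close to 1 means the predictor is nearly uncorrelated with the others; values approaching 10 mean that about 90% of its variance is explained by the other predictors.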
ols_plot_resid_fit(brazil_lm2)
Looking at the residuals vs fitted values plot, the red line is approximately at 0. There is no pattern in the residual plot, suggesting that we can assume a linear relationship between the predictors and outcome variables.
ols_plot_resid_hist(brazil_lm2)
The figure reveals that the residuals of the multiple linear regression model resemble a normal distribution
We will run another test to further verify this assumption
fnorm <- fitdist(residuals(brazil_lm2), distr="norm")
summary(fnorm)
## Fitting of the distribution ' norm ' by maximum likelihood
## Parameters :
## estimate Std. Error
## mean 2.674790e-14 114.40902
## sd 8.381995e+03 79.49234
## Loglikelihood: -58033.83 AIC: 116071.7 BIC: 116084.9
## Correlation matrix:
## mean sd
## mean 1 0
## sd 0 1
plot(fnorm)
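The Q-Q plot and P-P plot show no obvious deviation from the normal distribution. From the gofstat() output below, at the 5% significance level, the KS statistic is smaller than the computed critical value (0.2266629 < 0.3511505), so we do not reject the null hypothesis: on this basis there is insufficient evidence that the residuals of the multiple linear regression model are not normally distributed. Note, however, that length(brazil6) returns the number of columns of the data frame (15) rather than the number of observations (5552). With n = 5552 the critical value 1.36/sqrt(n) would be roughly 0.018, which the KS statistic exceeds, so this normality conclusion should be treated with caution.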
par(mfrow=c(2,2))
plot(brazil_lm2)
gofstat(fnorm,discrete=FALSE)
## Goodness-of-fit statistics
## 1-mle-norm
## Kolmogorov-Smirnov statistic 0.2266629
## Cramer-von Mises statistic 112.2075310
## Anderson-Darling statistic Inf
##
## Goodness-of-fit criteria
## 1-mle-norm
## Akaike's Information Criterion 116071.7
## Bayesian Information Criterion 116084.9
KScritvalue <- 1.36/sqrt(length(brazil6))
KScritvalue
## [1] 0.3511505
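The critical value above depends on what `length()` returns for a data frame. As a small sketch on a toy data frame (the object `toy` and its columns are made up for illustration), `length()` counts columns while `nrow()` counts observations, and the large-sample 5% approximation 1.36/sqrt(n) is meant to use the number of observations.

```r
set.seed(42)
# Toy data frame: 200 rows (observations) and 3 columns (variables)
toy <- data.frame(a = rnorm(200), b = rnorm(200), c = rnorm(200))

n_cols <- length(toy)  # number of columns of the data frame
n_obs  <- nrow(toy)    # number of observations

# Large-sample 5% critical value of the KS statistic, 1.36 / sqrt(n)
crit_cols <- 1.36 / sqrt(n_cols)  # what length() would give
crit_obs  <- 1.36 / sqrt(n_obs)   # based on the sample size

print(c(n_cols = n_cols, n_obs = n_obs))
print(c(crit_cols = crit_cols, crit_obs = crit_obs))
```

This approximation also assumes a fully specified reference distribution. When the mean and standard deviation are estimated from the same residuals, as with fitdist(), the true critical value is smaller still (the Lilliefors correction), so the comparison is at best a rough guide.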
Looking at the spread-location plot, the variance of the residuals increases with the fitted values of the outcome variable, suggesting heteroscedasticity.
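The visual impression from the spread-location plot can be complemented with a formal check. The sketch below computes a studentized Breusch-Pagan (Koenker) statistic by hand in base R: the squared residuals are regressed on the predictors, and n times the R2 of that auxiliary regression is compared with a chi-squared distribution. It uses the built-in `mtcars` data and a made-up model purely for illustration. It is not part of the original analysis.

```r
# Illustrative only: a small model on mtcars, not the Brazil model.
fit <- lm(mpg ~ wt + hp, data = mtcars)

d <- mtcars
d$e2 <- residuals(fit)^2  # squared residuals of the main model

# Auxiliary regression of squared residuals on the same predictors
aux <- lm(e2 ~ wt + hp, data = d)

# Koenker's LM statistic: n * R^2, chi-squared with df = number of predictors
lm_stat <- nrow(d) * summary(aux)$r.squared
p_value <- pchisq(lm_stat, df = 2, lower.tail = FALSE)

print(c(LM = lm_stat, p.value = p_value))
```

A small p-value would indicate that the residual variance changes with the predictors, which is the pattern described above.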
As some of our assumptions are not met, we run a second specification, regressing log(GDP_CAPITA) on all the other variables, to reduce the influence of extreme values and the heteroscedasticity.
brazil_lglm <- lm(log(GDP_CAPITA)~., data=brazil6)
summary(brazil_lglm)
##
## Call:
## lm(formula = log(GDP_CAPITA) ~ ., data = brazil6)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.85604 -0.16554 -0.03425 0.13553 2.75065
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.504e+00 1.025e-01 63.467 < 2e-16 ***
## Active_pop -3.059e-08 3.357e-08 -0.911 0.362
## IBGE_CROP_PRODUCTION 2.670e-07 2.789e-08 9.574 < 2e-16 ***
## IDHM_Longevidade 6.452e-01 1.529e-01 4.219 2.49e-05 ***
## IDHM_Educacao 1.023e-01 7.250e-02 1.411 0.158
## IDHM_Renda 3.494e+00 1.295e-01 26.974 < 2e-16 ***
## PAY_TV_p -1.358e-01 1.211e-01 -1.122 0.262
## FIXED_PHONES_p 5.489e-02 1.017e-01 0.540 0.590
## GVA_AGROPEC_p 2.786e-02 6.710e-04 41.515 < 2e-16 ***
## GVA_INDUSTRY_p 1.880e-02 4.035e-04 46.600 < 2e-16 ***
## GVA_SERVICES_p 2.013e-02 7.164e-04 28.100 < 2e-16 ***
## tax_to_gdp 7.901e-04 1.440e-04 5.487 4.28e-08 ***
## Cars_p 4.289e-01 5.873e-02 7.302 3.24e-13 ***
## Motorcycles_p -6.957e-02 5.875e-02 -1.184 0.236
## pop_density -8.081e-07 7.049e-06 -0.115 0.909
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2742 on 5537 degrees of freedom
## Multiple R-squared: 0.8394, Adjusted R-squared: 0.839
## F-statistic: 2067 on 14 and 5537 DF, p-value: < 2.2e-16
AIC(brazil_lglm)
## [1] 1407.291
BIC(brazil_lglm)
## [1] 1513.241
With reference to the report above, it is clear that not all the independent variables are statistically significant. We will revise the model by removing those variables which are not statistically significant.
##### 6.3.1 Selecting statistically significant variables
brazil_lglm2 <- lm(log(GDP_CAPITA)~ IBGE_CROP_PRODUCTION + IDHM_Longevidade + IDHM_Renda + GVA_AGROPEC_p + GVA_INDUSTRY_p + GVA_SERVICES_p + tax_to_gdp + Cars_p, data=brazil6)
summary(brazil_lglm2)
##
## Call:
## lm(formula = log(GDP_CAPITA) ~ IBGE_CROP_PRODUCTION + IDHM_Longevidade +
## IDHM_Renda + GVA_AGROPEC_p + GVA_INDUSTRY_p + GVA_SERVICES_p +
## tax_to_gdp + Cars_p, data = brazil6)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.86710 -0.16500 -0.03439 0.13338 2.75733
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.512e+00 9.999e-02 65.126 < 2e-16 ***
## IBGE_CROP_PRODUCTION 2.663e-07 2.754e-08 9.671 < 2e-16 ***
## IDHM_Longevidade 6.639e-01 1.522e-01 4.362 1.32e-05 ***
## IDHM_Renda 3.523e+00 1.143e-01 30.807 < 2e-16 ***
## GVA_AGROPEC_p 2.787e-02 6.319e-04 44.104 < 2e-16 ***
## GVA_INDUSTRY_p 1.884e-02 4.022e-04 46.856 < 2e-16 ***
## GVA_SERVICES_p 2.010e-02 6.731e-04 29.859 < 2e-16 ***
## tax_to_gdp 7.833e-04 1.434e-04 5.461 4.93e-08 ***
## Cars_p 4.546e-01 5.691e-02 7.988 1.66e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2742 on 5543 degrees of freedom
## Multiple R-squared: 0.8392, Adjusted R-squared: 0.839
## F-statistic: 3616 on 8 and 5543 DF, p-value: < 2.2e-16
ols_regress(brazil_lglm2)
## Model Summary
## -------------------------------------------------------------
## R 0.916 RMSE 0.274
## R-Squared 0.839 Coef. Var 2.828
## Adj. R-Squared 0.839 MSE 0.075
## Pred R-Squared 0.836 MAE 0.197
## -------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## ------------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## ------------------------------------------------------------------------
## Regression 2175.562 8 271.945 3616.278 0.0000
## Residual 416.835 5543 0.075
## Total 2592.397 5551
## ------------------------------------------------------------------------
##
## Parameter Estimates
## ----------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## ----------------------------------------------------------------------------------------------
## (Intercept) 6.512 0.100 65.126 0.000 6.316 6.708
## IBGE_CROP_PRODUCTION 0.000 0.000 0.058 9.671 0.000 0.000 0.000
## IDHM_Longevidade 0.664 0.152 0.043 4.362 0.000 0.365 0.962
## IDHM_Renda 3.523 0.114 0.416 30.807 0.000 3.298 3.747
## GVA_AGROPEC_p 0.028 0.001 0.261 44.104 0.000 0.027 0.029
## GVA_INDUSTRY_p 0.019 0.000 0.273 46.856 0.000 0.018 0.020
## GVA_SERVICES_p 0.020 0.001 0.204 29.859 0.000 0.019 0.021
## tax_to_gdp 0.001 0.000 0.030 5.461 0.000 0.001 0.001
## Cars_p 0.455 0.057 0.092 7.988 0.000 0.343 0.566
## ----------------------------------------------------------------------------------------------
AIC(brazil_lglm2)
## [1] 1400.531
BIC(brazil_lglm2)
## [1] 1466.75
Adjusted R2=0.839
AIC = 1400.531
BIC = 1466.75
ols_plot_resid_hist(brazil_lglm2)
fnorm <- fitdist(residuals(brazil_lglm2), distr="norm")
summary(fnorm)
## Fitting of the distribution ' norm ' by maximum likelihood
## Parameters :
## estimate Std. Error
## mean -3.231877e-18 0.003677332
## sd 2.740044e-01 0.002600110
## Loglikelihood: -690.2656 AIC: 1384.531 BIC: 1397.775
## Correlation matrix:
## mean sd
## mean 1 0
## sd 0 1
plot(fnorm)
par(mfrow=c(2,2))
plot(brazil_lglm2)
gofstat(fnorm,discrete=FALSE)
## Goodness-of-fit statistics
## 1-mle-norm
## Kolmogorov-Smirnov statistic 0.06763436
## Cramer-von Mises statistic 10.70174891
## Anderson-Darling statistic Inf
##
## Goodness-of-fit criteria
## 1-mle-norm
## Akaike's Information Criterion 1384.531
## Bayesian Information Criterion 1397.775
KScritvalue <- 1.36/sqrt(length(brazil6))
KScritvalue
## [1] 0.3511505
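The plots are now more favourable and the adjusted R2 is slightly higher (0.839 against 0.8298). The AIC and BIC values are also much lower, but because the dependent variable is now log(GDP_CAPITA) rather than GDP_CAPITA, they are not directly comparable with those of the previous model. The significant variables have changed slightly too.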
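When one model is fitted to y and another to log(y), their AIC values are on different scales. A common adjustment is to add the Jacobian term 2 * sum(log(y)) to the AIC of the log model so that both refer to the likelihood of y itself. The sketch below shows this on simulated data. All object names and the simulated relationship are assumptions made for illustration, not results from the Brazil data.

```r
set.seed(1)
# Simulated positive response with multiplicative noise
x <- runif(100, 1, 10)
y <- exp(0.5 + 0.2 * x + rnorm(100, sd = 0.3))

fit_level <- lm(y ~ x)       # model for y
fit_log   <- lm(log(y) ~ x)  # model for log(y)

# Put the log model's AIC on the scale of y (Jacobian of the log transform)
aic_log_on_y_scale <- AIC(fit_log) + 2 * sum(log(y))

print(c(level = AIC(fit_level),
        log_raw = AIC(fit_log),
        log_adjusted = aic_log_on_y_scale))
```

Only the adjusted value should be compared with the AIC of the level model. The same shift applies to BIC, since the penalty term is unaffected by the transformation.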
summary(brazil_lglm2)
##
## Call:
## lm(formula = log(GDP_CAPITA) ~ IBGE_CROP_PRODUCTION + IDHM_Longevidade +
## IDHM_Renda + GVA_AGROPEC_p + GVA_INDUSTRY_p + GVA_SERVICES_p +
## tax_to_gdp + Cars_p, data = brazil6)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.86710 -0.16500 -0.03439 0.13338 2.75733
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.512e+00 9.999e-02 65.126 < 2e-16 ***
## IBGE_CROP_PRODUCTION 2.663e-07 2.754e-08 9.671 < 2e-16 ***
## IDHM_Longevidade 6.639e-01 1.522e-01 4.362 1.32e-05 ***
## IDHM_Renda 3.523e+00 1.143e-01 30.807 < 2e-16 ***
## GVA_AGROPEC_p 2.787e-02 6.319e-04 44.104 < 2e-16 ***
## GVA_INDUSTRY_p 1.884e-02 4.022e-04 46.856 < 2e-16 ***
## GVA_SERVICES_p 2.010e-02 6.731e-04 29.859 < 2e-16 ***
## tax_to_gdp 7.833e-04 1.434e-04 5.461 4.93e-08 ***
## Cars_p 4.546e-01 5.691e-02 7.988 1.66e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2742 on 5543 degrees of freedom
## Multiple R-squared: 0.8392, Adjusted R-squared: 0.839
## F-statistic: 3616 on 8 and 5543 DF, p-value: < 2.2e-16
The following equation includes variables that are significant in the log multiple linear regression model.
###### log(GDP_CAPITA) = 6.512 + 2.663e-07 IBGE_CROP_PRODUCTION + 0.6639 IDHM_Longevidade + 3.523 IDHM_Renda + 0.4546 Cars_p + 0.02787 GVA_AGROPEC_p + 0.01884 GVA_INDUSTRY_p + 0.02010 GVA_SERVICES_p + 7.833e-04 tax_to_gdp
We now visualise the residuals of the multiple linear regression model obtained above
In order to perform the spatial autocorrelation test, we will need to convert brazil into a SpatialPointsDataFrame
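Because the response in the lm() call above is log(GDP_CAPITA), each coefficient describes a proportional rather than an absolute change in GDP per capita. As a short sketch, a one-unit increase in a predictor with coefficient b multiplies GDP per capita by exp(b), holding the other variables constant. The snippet uses the GVA_INDUSTRY_p estimate reported in the summary, and the variable names in the code are made up for illustration.

```r
# Coefficient of GVA_INDUSTRY_p taken from the model summary above
b_industry <- 0.01884

# Multiplicative effect on GDP per capita of a one-unit increase
multiplier <- exp(b_industry)

# The same effect expressed as a percentage change
pct_change <- 100 * (multiplier - 1)

print(c(multiplier = multiplier, pct_change = pct_change))
```

For small coefficients the percentage change is close to 100 * b, which is why such coefficients are often read directly as approximate percentage effects.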
brazil.point.sf <- st_as_sf(brazil, coords=c("LONG", "LAT"), crs=4674)%>%
st_transform(crs=4674)
brazil.polygon.sf <- right_join(mun, brazil, by=c("name_muni" = "CITY", "abbrev_state" = "STATE"))
## Warning: Column `abbrev_state`/`STATE` joining character vector and factor,
## coercing into character vector
brazil.polygon.sf <- st_as_sf(brazil.polygon.sf, crs=4674) %>%
st_transform(crs=4674)
Next, we will export the residual of the model and save it as a separate data frame
mlr.output <- as.data.frame(brazil_lglm2$residuals)
We will then join this newly created mlr.output data frame with brazil.point.sf object and brazil.polygon.sf object
brazil.point.res.sf <- cbind(brazil.point.sf,
brazil_lglm2$residuals) %>%
rename(`MLR_RES` = `brazil_lglm2.residuals`)
brazil.polygon.res.sf <- cbind(brazil.polygon.sf,
brazil_lglm2$residuals) %>%
rename(`MLR_RES` = `brazil_lglm2.residuals`)
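Attaching residuals with cbind() relies on the rows of the spatial object being in the same order, and of the same number, as the rows used to fit the model. As a hedged sketch of a more defensive alternative, the residuals can be matched back by row name, which also copes with rows that lm() dropped because of missing values. The toy data frame below is invented for illustration. Whether the Brazil objects need this safeguard depends on how brazil6 was built, which is not shown in this section.

```r
# Toy data with one missing response, so lm() silently drops row 3
toy <- data.frame(id = c("a", "b", "c", "d", "e"),
                  y  = c(1.0, 2.1, NA, 4.2, 4.9),
                  x  = c(1, 2, 3, 4, 5))

fit <- lm(y ~ x, data = toy)
res <- residuals(fit)  # named by the row names of the rows actually used

# Start with NA everywhere, then fill only the rows that were in the fit
toy$MLR_RES <- NA_real_
toy[names(res), "MLR_RES"] <- res

print(toy)
```

With this approach a dropped or reordered row results in an NA or a correctly matched value, instead of a residual silently attached to the wrong municipality.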
We will now inspect the spatial point and spatial polygon objects respectively
brazil.point.res.sf
## Simple feature collection with 5552 features and 90 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: -72.9165 ymin: -33.68757 xmax: 51.4725 ymax: 4.58544
## geographic CRS: SIRGAS 2000
## First 10 features:
## CITY STATE CAPITAL IBGE_RES_POP IBGE_RES_POP_BRAS
## 1 ABADIA DE GOIÁS GO 0 6876 6876
## 2 ABADIA DOS DOURADOS MG 0 6704 6704
## 3 ABADIÂNIA GO 0 15757 15609
## 4 ABAETÉ MG 0 22690 22690
## 5 ABAETETUBA PA 0 141100 141040
## 6 ABAIARA CE 0 10496 10496
## 7 ABAÍRA BA 0 8316 8316
## 8 ABARÉ BA 0 17064 17064
## 9 ABATIÁ PR 0 7764 7764
## 10 ABDON BATISTA SC 0 2653 2653
## IBGE_RES_POP_ESTR IBGE_DU IBGE_DU_URBAN IBGE_DU_RURAL IBGE_POP IBGE_1
## 1 0 2137 1546 591 5300 69
## 2 0 2328 1481 847 4154 38
## 3 148 4655 3233 1422 10656 139
## 4 0 7694 6667 1027 18464 176
## 5 60 31061 19057 12004 82956 1354
## 6 0 2791 1251 1540 4538 98
## 7 0 2572 1193 1379 3725 37
## 8 0 4332 2379 1953 8994 167
## 9 0 2499 1877 622 5685 69
## 10 0 848 234 614 724 12
## IBGE_1.4 IBGE_5.9 IBGE_10.14 Active_pop IBGE_60. IBGE_PLANTED_AREA
## 1 318 438 517 3542 416 319
## 2 207 260 351 2709 589 4479
## 3 650 894 1087 6896 990 10307
## 4 856 1233 1539 11979 2681 1862
## 5 5567 7618 8905 53516 5996 25200
## 6 323 421 483 2631 582 2598
## 7 156 263 277 2319 673 895
## 8 733 978 927 5386 803 2058
## 9 302 370 483 3650 811 1197
## 10 32 49 63 479 89 5502
## IBGE_CROP_PRODUCTION IDHM.Ranking.2010 IDHM IDHM_Renda IDHM_Longevidade
## 1 1843 1689 0.708 0.687 0.830
## 2 18017 2207 0.690 0.693 0.839
## 3 33085 2202 0.690 0.671 0.841
## 4 7502 1994 0.698 0.720 0.848
## 5 700872 3530 0.628 0.579 0.798
## 6 5234 3522 0.628 0.540 0.748
## 7 3999 4086 0.603 0.577 0.746
## 8 22761 4756 0.575 0.533 0.776
## 9 9943 2258 0.687 0.676 0.804
## 10 26195 2092 0.690 0.660 0.812
## IDHM_Educacao ALT PAY_TV FIXED_PHONES AREA
## 1 0.622 893.60 360 842 147.26
## 2 0.563 753.12 77 296 881.06
## 3 0.579 1017.55 227 720 1045.13
## 4 0.556 644.74 1230 1716 1817.07
## 5 0.537 10.12 3389 1218 1610.65
## 6 0.612 403.11 29 34 180.08
## 7 0.510 674.22 952 335 538.68
## 8 0.460 316.38 51 222 1604.92
## 9 0.596 579.30 55 392 228.72
## 10 0.625 720.98 109 260 237.16
## REGIAO_TUR CATEGORIA_TUR ESTIMATED_POP
## 1 <NA> <NA> 8583
## 2 Caminhos Do Cerrado D 6972
## 3 Região Turística Do Ouro E Cristais C 19614
## 4 Lago De Três Marias D 23223
## 5 Araguaia-Tocantins D 156292
## 6 <NA> <NA> 11663
## 7 Chapada Diamantina D 8767
## 8 <NA> <NA> 19814
## 9 <NA> <NA> 7507
## 10 Vale Do Contestado D 2577
## RURAL_URBAN GVA_AGROPEC GVA_INDUSTRY GVA_SERVICES GVA_PUBLIC GVA_TOTAL
## 1 Urbano 6.20 27991.25 74750.32 36915.04 145857.60
## 2 Rural Adjacente 50524.57 25917.70 62689.23 28083.79 167215.28
## 3 Rural Adjacente 42.84 16728.30 138198.58 63396.20 261161.91
## 4 Urbano 113824.60 31002.62 172.33 86081.41 403241.27
## 5 Urbano 140463.72 58610.00 468128.69 486872.40 1154074.81
## 6 Rural Adjacente 4435.16 5.88 22.81 35989.96 69108.67
## 7 Rural Remoto 12.41 3437.43 17990.74 28463.74 62304.83
## 8 Rural Remoto 9176.40 6.70 36921.84 65.75 118.55
## 9 Rural Adjacente 73340.52 8839.71 42999.37 34103.49 159283.08
## 10 Rural Adjacente 24996.75 3578.87 16011.10 17842.64 62429.36
## TAXES GDP POP_GDP GDP_CAPITA
## 1 20554.20 166.41 8053 20664.57
## 2 12873.50 180.09 7037 25591.70
## 3 26822.58 287984.49 18427 15628.40
## 4 26994.09 430235.36 23574 18250.42
## 5 95180.48 1249255.29 151934 8222.36
## 6 4042.79 73151.46 11483 6370.41
## 7 2019.77 64324.59 9212 6982.70
## 8 6.21 124754.26 19939 6256.80
## 9 5.77 165048.21 7795 21173.60
## 10 2312.65 64742.01 2617 24739.02
## GVA_MAIN
## 1 Demais serviços
## 2 Demais serviços
## 3 Demais serviços
## 4 Demais serviços
## 5 Administração, defesa, educação e saúde públicas e seguridade social
## 6 Administração, defesa, educação e saúde públicas e seguridade social
## 7 Administração, defesa, educação e saúde públicas e seguridade social
## 8 Administração, defesa, educação e saúde públicas e seguridade social
## 9 Agricultura, inclusive apoio à agricultura e a pós colheita
## 10 Administração, defesa, educação e saúde públicas e seguridade social
## MUN_EXPENDIT COMP_TOT COMP_A COMP_B COMP_C COMP_D COMP_E COMP_F COMP_G
## 1 28227691 284 5 1 56 0 2 29 110
## 2 17909274 476 6 6 30 1 2 34 190
## 3 37513019 288 5 9 26 0 2 7 117
## 4 NA 621 18 1 40 0 1 20 303
## 5 NA 931 4 2 43 0 1 27 500
## 6 NA 86 1 0 4 0 0 6 48
## 7 NA 191 6 0 8 0 1 4 97
## 8 NA 87 2 0 3 0 0 0 71
## 9 NA 285 5 0 20 0 1 10 133
## 10 19506956 69 2 0 4 0 0 2 35
## COMP_H COMP_I COMP_J COMP_K COMP_L COMP_M COMP_N COMP_O COMP_P COMP_Q COMP_R
## 1 26 4 5 0 2 10 12 4 6 6 1
## 2 70 28 11 0 4 15 29 2 9 14 6
## 3 12 57 2 1 0 7 15 3 11 5 1
## 4 62 30 9 6 4 28 27 2 15 19 9
## 5 16 31 6 1 1 22 16 2 155 33 15
## 6 2 10 2 0 0 2 3 2 0 2 0
## 7 5 5 3 1 0 5 5 2 8 1 2
## 8 0 1 1 0 0 0 1 2 0 2 0
## 9 18 14 8 0 4 11 26 2 8 9 4
## 10 8 3 1 1 0 4 0 2 1 3 0
## COMP_S COMP_T COMP_U HOTELS BEDS Pr_Agencies Pu_Agencies Pr_Bank Pu_Bank
## 1 5 0 0 NA NA NA NA NA NA
## 2 19 0 0 NA NA NA NA NA NA
## 3 8 0 0 1 34 1 1 1 1
## 4 27 0 0 NA NA 2 2 2 2
## 5 56 0 0 NA NA 2 4 2 4
## 6 4 0 0 NA NA NA NA NA NA
## 7 38 0 0 1 24 NA NA NA NA
## 8 4 0 0 NA NA 1 0 1 0
## 9 12 0 0 NA NA 0 1 0 1
## 10 3 0 0 NA NA 0 1 0 1
## Pr_Assets Pu_Assets Cars Motorcycles Wheeled_tractor UBER MAC WAL.MART
## 1 NA NA 2158 1246 0 <NA> NA NA
## 2 NA NA 2227 1142 0 <NA> NA NA
## 3 33724584 67091904 2838 1426 0 <NA> NA NA
## 4 44974716 371922572 6928 2953 0 <NA> NA NA
## 5 76181384 800078483 5277 25661 0 <NA> NA NA
## 6 NA NA 553 1674 0 <NA> NA NA
## 7 NA NA 896 696 0 <NA> NA NA
## 8 21823314 0 613 1532 0 <NA> NA NA
## 9 0 45976288 2168 912 0 <NA> NA NA
## 10 0 42909056 976 345 2 <NA> NA NA
## POST_OFFICES LOG_GDP_CAPITA PAY_TV_p FIXED_PHONES_p Cars_p
## 1 1 9.936176 0.044703837 0.104557308 0.26797467
## 2 1 10.150023 0.010942163 0.042063379 0.31647009
## 3 3 9.656845 0.012318880 0.039073099 0.15401313
## 4 4 9.811943 0.052176126 0.072792059 0.29388309
## 5 2 9.014613 0.022305738 0.008016639 0.03473219
## 6 1 8.759419 0.002525472 0.002960899 0.04815815
## 7 1 8.851191 0.103343465 0.036365610 0.09726444
## 8 1 8.741424 0.002557801 0.011133959 0.03074377
## 9 1 9.960510 0.007055805 0.050288647 0.27812700
## 10 1 10.116137 0.041650745 0.099350401 0.37294612
## Motorcycles_p GVA_AGROPEC_p GVA_INDUSTRY_p GVA_SERVICES_p pop_density
## 1 0.15472495 0.0007698994 3.4758785546 9.282294797 54.68559
## 2 0.16228506 7.1798451044 3.6830609635 8.908516413 7.98697
## 3 0.07738644 0.0023248494 0.9078146199 7.499787269 17.63130
## 4 0.12526512 4.8283956902 1.3151191991 0.007310172 12.97363
## 5 0.16889570 0.9245048508 0.3857596061 3.081131873 94.33086
## 6 0.14578072 0.3862370461 0.0005120613 0.001986415 63.76610
## 7 0.07555363 0.0013471559 0.3731469822 1.952967868 17.10106
## 8 0.07683434 0.4602236822 0.0003360249 1.851739806 12.42367
## 9 0.11699808 9.4086619628 1.1340230917 5.516275818 34.08097
## 10 0.13183034 9.5516813145 1.3675468093 6.118112342 11.03474
## tax_to_gdp MLR_RES geometry
## 1 1.235154e+02 -0.01765106 POINT (-49.44055 -16.75881)
## 2 7.148370e+01 -0.01299481 POINT (-47.39683 -18.48756)
## 3 9.313897e-02 -0.02357896 POINT (-48.71881 -16.18267)
## 4 6.274261e-02 -0.09409796 POINT (-45.44619 -19.15585)
## 5 7.618978e-02 -0.36386803 POINT (-48.8844 -1.72347)
## 6 5.526602e-02 -0.18516124 POINT (-39.04755 -7.356977)
## 7 3.139966e-02 -0.27987653 POINT (-41.66161 -13.25353)
## 8 4.977786e-05 -0.23302640 POINT (-39.11659 -8.723418)
## 9 3.495948e-05 0.01027696 POINT (-50.31253 -23.30049)
## 10 3.572101e-02 0.14900808 POINT (-51.02527 -27.60899)
brazil.polygon.res.sf
## Simple feature collection with 5552 features and 94 fields
## geometry type: GEOMETRY
## dimension: XY
## bbox: xmin: -73.99045 ymin: -33.75118 xmax: -28.83594 ymax: 5.271841
## geographic CRS: SIRGAS 2000
## First 10 features:
## code_muni name_muni code_state abbrev_state CAPITAL IBGE_RES_POP
## 1 5200050 ABADIA DE GOIÁS 52 GO 0 6876
## 2 3100104 ABADIA DOS DOURADOS 31 MG 0 6704
## 3 5200100 ABADIÂNIA 52 GO 0 15757
## 4 3100203 ABAETÉ 31 MG 0 22690
## 5 1500107 ABAETETUBA 15 PA 0 141100
## 6 2300101 ABAIARA 23 CE 0 10496
## 7 2900108 ABAÍRA 29 BA 0 8316
## 8 2900207 ABARÉ 29 BA 0 17064
## 9 4100103 ABATIÁ 41 PR 0 7764
## 10 4200051 ABDON BATISTA 42 SC 0 2653
## IBGE_RES_POP_BRAS IBGE_RES_POP_ESTR IBGE_DU IBGE_DU_URBAN IBGE_DU_RURAL
## 1 6876 0 2137 1546 591
## 2 6704 0 2328 1481 847
## 3 15609 148 4655 3233 1422
## 4 22690 0 7694 6667 1027
## 5 141040 60 31061 19057 12004
## 6 10496 0 2791 1251 1540
## 7 8316 0 2572 1193 1379
## 8 17064 0 4332 2379 1953
## 9 7764 0 2499 1877 622
## 10 2653 0 848 234 614
## IBGE_POP IBGE_1 IBGE_1.4 IBGE_5.9 IBGE_10.14 Active_pop IBGE_60.
## 1 5300 69 318 438 517 3542 416
## 2 4154 38 207 260 351 2709 589
## 3 10656 139 650 894 1087 6896 990
## 4 18464 176 856 1233 1539 11979 2681
## 5 82956 1354 5567 7618 8905 53516 5996
## 6 4538 98 323 421 483 2631 582
## 7 3725 37 156 263 277 2319 673
## 8 8994 167 733 978 927 5386 803
## 9 5685 69 302 370 483 3650 811
## 10 724 12 32 49 63 479 89
## IBGE_PLANTED_AREA IBGE_CROP_PRODUCTION IDHM.Ranking.2010 IDHM IDHM_Renda
## 1 319 1843 1689 0.708 0.687
## 2 4479 18017 2207 0.690 0.693
## 3 10307 33085 2202 0.690 0.671
## 4 1862 7502 1994 0.698 0.720
## 5 25200 700872 3530 0.628 0.579
## 6 2598 5234 3522 0.628 0.540
## 7 895 3999 4086 0.603 0.577
## 8 2058 22761 4756 0.575 0.533
## 9 1197 9943 2258 0.687 0.676
## 10 5502 26195 2092 0.690 0.660
## IDHM_Longevidade IDHM_Educacao LONG LAT ALT PAY_TV
## 1 0.830 0.622 -49.44055 -16.758812 893.60 360
## 2 0.839 0.563 -47.39683 -18.487565 753.12 77
## 3 0.841 0.579 -48.71881 -16.182672 1017.55 227
## 4 0.848 0.556 -45.44619 -19.155848 644.74 1230
## 5 0.798 0.537 -48.88440 -1.723470 10.12 3389
## 6 0.748 0.612 -39.04755 -7.356977 403.11 29
## 7 0.746 0.510 -41.66161 -13.253532 674.22 952
## 8 0.776 0.460 -39.11659 -8.723418 316.38 51
## 9 0.804 0.596 -50.31253 -23.300494 579.30 55
## 10 0.812 0.625 -51.02527 -27.608987 720.98 109
## FIXED_PHONES AREA REGIAO_TUR CATEGORIA_TUR
## 1 842 147.26 <NA> <NA>
## 2 296 881.06 Caminhos Do Cerrado D
## 3 720 1045.13 Região Turística Do Ouro E Cristais C
## 4 1716 1817.07 Lago De Três Marias D
## 5 1218 1610.65 Araguaia-Tocantins D
## 6 34 180.08 <NA> <NA>
## 7 335 538.68 Chapada Diamantina D
## 8 222 1604.92 <NA> <NA>
## 9 392 228.72 <NA> <NA>
## 10 260 237.16 Vale Do Contestado D
## ESTIMATED_POP RURAL_URBAN GVA_AGROPEC GVA_INDUSTRY GVA_SERVICES
## 1 8583 Urbano 6.20 27991.25 74750.32
## 2 6972 Rural Adjacente 50524.57 25917.70 62689.23
## 3 19614 Rural Adjacente 42.84 16728.30 138198.58
## 4 23223 Urbano 113824.60 31002.62 172.33
## 5 156292 Urbano 140463.72 58610.00 468128.69
## 6 11663 Rural Adjacente 4435.16 5.88 22.81
## 7 8767 Rural Remoto 12.41 3437.43 17990.74
## 8 19814 Rural Remoto 9176.40 6.70 36921.84
## 9 7507 Rural Adjacente 73340.52 8839.71 42999.37
## 10 2577 Rural Adjacente 24996.75 3578.87 16011.10
## GVA_PUBLIC GVA_TOTAL TAXES GDP POP_GDP GDP_CAPITA
## 1 36915.04 145857.60 20554.20 166.41 8053 20664.57
## 2 28083.79 167215.28 12873.50 180.09 7037 25591.70
## 3 63396.20 261161.91 26822.58 287984.49 18427 15628.40
## 4 86081.41 403241.27 26994.09 430235.36 23574 18250.42
## 5 486872.40 1154074.81 95180.48 1249255.29 151934 8222.36
## 6 35989.96 69108.67 4042.79 73151.46 11483 6370.41
## 7 28463.74 62304.83 2019.77 64324.59 9212 6982.70
## 8 65.75 118.55 6.21 124754.26 19939 6256.80
## 9 34103.49 159283.08 5.77 165048.21 7795 21173.60
## 10 17842.64 62429.36 2312.65 64742.01 2617 24739.02
## GVA_MAIN
## 1 Demais serviços
## 2 Demais serviços
## 3 Demais serviços
## 4 Demais serviços
## 5 Administração, defesa, educação e saúde públicas e seguridade social
## 6 Administração, defesa, educação e saúde públicas e seguridade social
## 7 Administração, defesa, educação e saúde públicas e seguridade social
## 8 Administração, defesa, educação e saúde públicas e seguridade social
## 9 Agricultura, inclusive apoio à agricultura e a pós colheita
## 10 Administração, defesa, educação e saúde públicas e seguridade social
## MUN_EXPENDIT COMP_TOT COMP_A COMP_B COMP_C COMP_D COMP_E COMP_F COMP_G
## 1 28227691 284 5 1 56 0 2 29 110
## 2 17909274 476 6 6 30 1 2 34 190
## 3 37513019 288 5 9 26 0 2 7 117
## 4 NA 621 18 1 40 0 1 20 303
## 5 NA 931 4 2 43 0 1 27 500
## 6 NA 86 1 0 4 0 0 6 48
## 7 NA 191 6 0 8 0 1 4 97
## 8 NA 87 2 0 3 0 0 0 71
## 9 NA 285 5 0 20 0 1 10 133
## 10 19506956 69 2 0 4 0 0 2 35
## COMP_H COMP_I COMP_J COMP_K COMP_L COMP_M COMP_N COMP_O COMP_P COMP_Q COMP_R
## 1 26 4 5 0 2 10 12 4 6 6 1
## 2 70 28 11 0 4 15 29 2 9 14 6
## 3 12 57 2 1 0 7 15 3 11 5 1
## 4 62 30 9 6 4 28 27 2 15 19 9
## 5 16 31 6 1 1 22 16 2 155 33 15
## 6 2 10 2 0 0 2 3 2 0 2 0
## 7 5 5 3 1 0 5 5 2 8 1 2
## 8 0 1 1 0 0 0 1 2 0 2 0
## 9 18 14 8 0 4 11 26 2 8 9 4
## 10 8 3 1 1 0 4 0 2 1 3 0
## COMP_S COMP_T COMP_U HOTELS BEDS Pr_Agencies Pu_Agencies Pr_Bank Pu_Bank
## 1 5 0 0 NA NA NA NA NA NA
## 2 19 0 0 NA NA NA NA NA NA
## 3 8 0 0 1 34 1 1 1 1
## 4 27 0 0 NA NA 2 2 2 2
## 5 56 0 0 NA NA 2 4 2 4
## 6 4 0 0 NA NA NA NA NA NA
## 7 38 0 0 1 24 NA NA NA NA
## 8 4 0 0 NA NA 1 0 1 0
## 9 12 0 0 NA NA 0 1 0 1
## 10 3 0 0 NA NA 0 1 0 1
## Pr_Assets Pu_Assets Cars Motorcycles Wheeled_tractor UBER MAC WAL.MART
## 1 NA NA 2158 1246 0 <NA> NA NA
## 2 NA NA 2227 1142 0 <NA> NA NA
## 3 33724584 67091904 2838 1426 0 <NA> NA NA
## 4 44974716 371922572 6928 2953 0 <NA> NA NA
## 5 76181384 800078483 5277 25661 0 <NA> NA NA
## 6 NA NA 553 1674 0 <NA> NA NA
## 7 NA NA 896 696 0 <NA> NA NA
## 8 21823314 0 613 1532 0 <NA> NA NA
## 9 0 45976288 2168 912 0 <NA> NA NA
## 10 0 42909056 976 345 2 <NA> NA NA
## POST_OFFICES LOG_GDP_CAPITA PAY_TV_p FIXED_PHONES_p Cars_p
## 1 1 9.936176 0.044703837 0.104557308 0.26797467
## 2 1 10.150023 0.010942163 0.042063379 0.31647009
## 3 3 9.656845 0.012318880 0.039073099 0.15401313
## 4 4 9.811943 0.052176126 0.072792059 0.29388309
## 5 2 9.014613 0.022305738 0.008016639 0.03473219
## 6 1 8.759419 0.002525472 0.002960899 0.04815815
## 7 1 8.851191 0.103343465 0.036365610 0.09726444
## 8 1 8.741424 0.002557801 0.011133959 0.03074377
## 9 1 9.960510 0.007055805 0.050288647 0.27812700
## 10 1 10.116137 0.041650745 0.099350401 0.37294612
## Motorcycles_p GVA_AGROPEC_p GVA_INDUSTRY_p GVA_SERVICES_p pop_density
## 1 0.15472495 0.0007698994 3.4758785546 9.282294797 54.68559
## 2 0.16228506 7.1798451044 3.6830609635 8.908516413 7.98697
## 3 0.07738644 0.0023248494 0.9078146199 7.499787269 17.63130
## 4 0.12526512 4.8283956902 1.3151191991 0.007310172 12.97363
## 5 0.16889570 0.9245048508 0.3857596061 3.081131873 94.33086
## 6 0.14578072 0.3862370461 0.0005120613 0.001986415 63.76610
## 7 0.07555363 0.0013471559 0.3731469822 1.952967868 17.10106
## 8 0.07683434 0.4602236822 0.0003360249 1.851739806 12.42367
## 9 0.11699808 9.4086619628 1.1340230917 5.516275818 34.08097
## 10 0.13183034 9.5516813145 1.3675468093 6.118112342 11.03474
## tax_to_gdp MLR_RES geom
## 1 1.235154e+02 -0.01765106 MULTIPOLYGON (((-49.4444 -1...
## 2 7.148370e+01 -0.01299481 POLYGON ((-47.4384 -18.1657...
## 3 9.313897e-02 -0.02357896 MULTIPOLYGON (((-48.84178 -...
## 4 6.274261e-02 -0.09409796 POLYGON ((-45.16777 -18.890...
## 5 7.618978e-02 -0.36386803 MULTIPOLYGON (((-48.83139 -...
## 6 5.526602e-02 -0.18516124 MULTIPOLYGON (((-39.01733 -...
## 7 3.139966e-02 -0.27987653 MULTIPOLYGON (((-41.64722 -...
## 8 4.977786e-05 -0.23302640 MULTIPOLYGON (((-39.35855 -...
## 9 3.495948e-05 0.01027696 POLYGON ((-50.22465 -23.226...
## 10 3.572101e-02 0.14900808 MULTIPOLYGON (((-51.03724 -...
Next, we will convert the brazil.point.res.sf simple feature object into a SpatialPointsDataFrame by using as_Spatial()
brazil.point.sp <- as_Spatial(brazil.point.res.sf)
brazil.point.sp
## class : SpatialPointsDataFrame
## features : 5552
## extent : -72.9165, 51.4725, -33.68757, 4.58544 (xmin, xmax, ymin, ymax)
## Warning in proj4string(x): CRS object has comment, which is lost in output
## crs : +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
## variables : 90
## names : CITY, STATE, CAPITAL, IBGE_RES_POP, IBGE_RES_POP_BRAS, IBGE_RES_POP_ESTR, IBGE_DU, IBGE_DU_URBAN, IBGE_DU_RURAL, IBGE_POP, IBGE_1, IBGE_1.4, IBGE_5.9, IBGE_10.14, Active_pop, ...
## min values : ABADIA DE GOIÁS, AC, 0, 805, 805, 0, 239, 60, 3, 174, 0, 5, 7, 12, 94, ...
## max values : ZORTÉA, TO, 1, 11253503, 11133776, 119727, 3576148, 3548433, 33809, 10463636, 129464, 514794, 684443, 783702, 7058221, ...
brazil.polygon.res.sf <- st_make_valid(brazil.polygon.res.sf)
qtm(brazil.polygon.res.sf, "GDP_CAPITA", borders=NULL, scale=0.7)+
tm_legend(main.title="GDP_CAPITA",
main.title.position="centre")
brazil.polygon.res.sf <- st_make_valid(brazil.polygon.res.sf)
qtm(brazil.polygon.res.sf, "MLR_RES", borders=NULL,scale = 0.7) + tm_legend(
main.title = "Residuals",
main.title.position = "centre")
## Variable(s) "MLR_RES" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
tm_shape(mun)+
tm_polygons()+
tm_shape(brazil.point.res.sf)+
tm_dots(col="MLR_RES", alpha=0.6, style="quantile")
## Variable(s) "MLR_RES" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
The figures above reveal signs of spatial autocorrelation.
We will now perform Moran’s I test to confirm this observation
The hypotheses are as follows:
H0: The residuals of the regression model are randomly distributed
H1: The residuals of the regression model are not randomly distributed
Confidence level: 95%
The code chunk below gives the upper limit for the distance band
coords <- coordinates(brazil.point.sp)
k <- knn2nb(knearneigh(coords))
kdists <- unlist(nbdists(k, coords, longlat=FALSE))
summary(kdists)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00548 0.08484 0.12164 0.17221 0.18149 88.21169
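The output above tells us that the maximum nearest-neighbour distance is 88.21169. Because the coordinates are in decimal degrees and longlat=FALSE is used, this value is in degrees rather than kilometres. Using it as the upper threshold ensures that every unit has at least one neighbour. The maximum is far larger than the third quartile (0.18149), which suggests at least one outlying point; this is consistent with the xmax of 51.4725 in the bounding box of brazil.point.res.sf.
Computing the Global Moran’s I test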
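Since the points are stored as longitude and latitude, a distance in kilometres is often easier to interpret than one in degrees. The sketch below computes great-circle distances with the haversine formula in base R. The function name `haversine_km()` and the Earth radius of 6371 km are assumptions introduced for illustration, and the two coordinates are the first and third points printed in brazil.point.res.sf above.

```r
# Great-circle distance in km between points given in decimal degrees.
# haversine_km() is a hypothetical helper; 6371 km is a mean Earth radius.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# ABADIA DE GOIÁS and ABADIÂNIA, coordinates as printed above
d_km <- haversine_km(-49.44055, -16.75881, -48.71881, -16.18267)
print(d_km)
```

In spdep, the longlat argument of nbdists() is documented as switching to great-circle distances in kilometres when set to TRUE, which may be a simpler route than a hand-written function.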
wm_knn6 <- knn2nb(knearneigh(coordinates(brazil.point.sp), k=6))
wm_knn6
## Neighbour list object:
## Number of regions: 5552
## Number of nonzero links: 33312
## Percentage nonzero weights: 0.1080692
## Average number of links: 6
## Non-symmetric neighbours list
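To show what the k-nearest-neighbour list is used for, the following sketch computes Moran's I by hand in base R with row-standardised k-nearest-neighbour weights. It uses eight made-up points on a line rather than the Brazil data, and the function `moran_i()` is a hypothetical helper, not the spdep implementation.

```r
# Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2
moran_i <- function(values, W) {
  z <- values - mean(values)
  n <- length(values)
  s0 <- sum(W)  # equals n when W is row-standardised
  (n / s0) * sum(W * outer(z, z)) / sum(z^2)
}

# Eight toy locations on a line, each linked to its k = 2 nearest neighbours
pos <- 1:8
d <- as.matrix(dist(pos))
k <- 2
W <- matrix(0, length(pos), length(pos))
for (i in seq_along(pos)) {
  nn <- order(d[i, ])[2:(k + 1)]  # position 1 is the point itself
  W[i, nn] <- 1 / k               # row-standardised weights
}

i_trend <- moran_i(c(1, 2, 3, 4, 5, 6, 7, 8), W)      # similar neighbours
i_alt   <- moran_i(c(1, -1, 1, -1, 1, -1, 1, -1), W)  # dissimilar neighbours
print(c(trend = i_trend, alternating = i_alt))
```

Values that rise smoothly along the line give a positive statistic, while alternating values give a negative one; the test on the residuals asks whether the observed statistic is further from its expected value than chance would allow.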
We will now need to convert the output neighbour lists into spatial weights
wm_lw <- nb2listw(wm_knn6, style="W")
summary(wm_lw)
## Characteristics of weights list object:
## Neighbour list object:
## Number of regions: 5552
## Number of nonzero links: 33312
## Percentage nonzero weights: 0.1080692
## Average number of links: 6
## Non-symmetric neighbours list
## Link number distribution:
##
## 6
## 5552
## 5552 least connected regions:
## 1 2 3 4 5 ... 5548 5549 5550 5551 5552 with 6 links
## 5552 most connected regions:
## 1 2 3 4 5 ... 2616 2617 2618 2619 2620 
2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 2697 2698 2699 2700 2701 2702 2703 2704 2705 2706 2707 2708 2709 2710 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865 2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879 2880 2881 2882 2883 2884 2885 2886 2887 2888 2889 2890 2891 2892 2893 2894 2895 2896 2897 2898 2899 2900 2901 2902 2903 2904 2905 2906 2907 2908 2909 2910 2911 2912 2913 2914 2915 2916 2917 2918 2919 2920 2921 2922 2923 2924 2925 2926 2927 2928 2929 2930 2931 2932 2933 2934 2935 2936 2937 2938 2939 2940 2941 2942 2943 2944 2945 2946 2947 2948 2949 2950 2951 2952 2953 2954 2955 2956 2957 2958 2959 2960 2961 2962 2963 2964 2965 2966 2967 2968 2969 2970 2971 2972 2973 2974 2975 2976 2977 2978 2979 2980 2981 2982 2983 2984 2985 2986 2987 2988 2989 2990 2991 2992 2993 2994 2995 2996 2997 2998 2999 3000 3001 3002 3003 3004 3005 3006 3007 3008 3009 3010 3011 3012 3013 3014 3015 3016 3017 3018 3019 3020 
3021 3022 3023 3024 3025 3026 3027 3028 3029 3030 3031 3032 3033 3034 3035 3036 3037 3038 3039 3040 3041 3042 3043 3044 3045 3046 3047 3048 3049 3050 3051 3052 3053 3054 3055 3056 3057 3058 3059 3060 3061 3062 3063 3064 3065 3066 3067 3068 3069 3070 3071 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 3104 3105 3106 3107 3108 3109 3110 3111 3112 3113 3114 3115 3116 3117 3118 3119 3120 3121 3122 3123 3124 3125 3126 3127 3128 3129 3130 3131 3132 3133 3134 3135 3136 3137 3138 3139 3140 3141 3142 3143 3144 3145 3146 3147 3148 3149 3150 3151 3152 3153 3154 3155 3156 3157 3158 3159 3160 3161 3162 3163 3164 3165 3166 3167 3168 3169 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3185 3186 3187 3188 3189 3190 3191 3192 3193 3194 3195 3196 3197 3198 3199 3200 3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220 3221 3222 3223 3224 3225 3226 3227 3228 3229 3230 3231 3232 3233 3234 3235 3236 3237 3238 3239 3240 3241 3242 3243 3244 3245 3246 3247 3248 3249 3250 3251 3252 3253 3254 3255 3256 3257 3258 3259 3260 3261 3262 3263 3264 3265 3266 3267 3268 3269 3270 3271 3272 3273 3274 3275 3276 3277 3278 3279 3280 3281 3282 3283 3284 3285 3286 3287 3288 3289 3290 3291 3292 3293 3294 3295 3296 3297 3298 3299 3300 3301 3302 3303 3304 3305 3306 3307 3308 3309 3310 3311 3312 3313 3314 3315 3316 3317 3318 3319 3320 3321 3322 3323 3324 3325 3326 3327 3328 3329 3330 3331 3332 3333 3334 3335 3336 3337 3338 3339 3340 3341 3342 3343 3344 3345 3346 3347 3348 3349 3350 3351 3352 3353 3354 3355 3356 3357 3358 3359 3360 3361 3362 3363 3364 3365 3366 3367 3368 3369 3370 3371 3372 3373 3374 3375 3376 3377 3378 3379 3380 3381 3382 3383 3384 3385 3386 3387 3388 3389 3390 3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 3401 3402 3403 3404 3405 3406 3407 3408 3409 3410 3411 3412 3413 3414 3415 3416 3417 3418 3419 3420 
3421 3422 3423 3424 3425 3426 3427 3428 3429 3430 3431 3432 3433 3434 3435 3436 3437 3438 3439 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455 3456 3457 3458 3459 3460 3461 3462 3463 3464 3465 3466 3467 3468 3469 3470 3471 3472 3473 3474 3475 3476 3477 3478 3479 3480 3481 3482 3483 3484 3485 3486 3487 3488 3489 3490 3491 3492 3493 3494 3495 3496 3497 3498 3499 3500 3501 3502 3503 3504 3505 3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516 3517 3518 3519 3520 3521 3522 3523 3524 3525 3526 3527 3528 3529 3530 3531 3532 3533 3534 3535 3536 3537 3538 3539 3540 3541 3542 3543 3544 3545 3546 3547 3548 3549 3550 3551 3552 3553 3554 3555 3556 3557 3558 3559 3560 3561 3562 3563 3564 3565 3566 3567 3568 3569 3570 3571 3572 3573 3574 3575 3576 3577 3578 3579 3580 3581 3582 3583 3584 3585 3586 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 3614 3615 3616 3617 3618 3619 3620 3621 3622 3623 3624 3625 3626 3627 3628 3629 3630 3631 3632 3633 3634 3635 3636 3637 3638 3639 3640 3641 3642 3643 3644 3645 3646 3647 3648 3649 3650 3651 3652 3653 3654 3655 3656 3657 3658 3659 3660 3661 3662 3663 3664 3665 3666 3667 3668 3669 3670 3671 3672 3673 3674 3675 3676 3677 3678 3679 3680 3681 3682 3683 3684 3685 3686 3687 3688 3689 3690 3691 3692 3693 3694 3695 3696 3697 3698 3699 3700 3701 3702 3703 3704 3705 3706 3707 3708 3709 3710 3711 3712 3713 3714 3715 3716 3717 3718 3719 3720 3721 3722 3723 3724 3725 3726 3727 3728 3729 3730 3731 3732 3733 3734 3735 3736 3737 3738 3739 3740 3741 3742 3743 3744 3745 3746 3747 3748 3749 3750 3751 3752 3753 3754 3755 3756 3757 3758 3759 3760 3761 3762 3763 3764 3765 3766 3767 3768 3769 3770 3771 3772 3773 3774 3775 3776 3777 3778 3779 3780 3781 3782 3783 3784 3785 3786 3787 3788 3789 3790 3791 3792 3793 3794 3795 3796 3797 3798 3799 3800 3801 3802 3803 3804 3805 3806 3807 3808 3809 3810 3811 3812 3813 3814 3815 3816 3817 3818 3819 3820 
3821 3822 3823 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 3838 3839 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 3854 3855 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 3884 3885 3886 3887 3888 3889 3890 3891 3892 3893 3894 3895 3896 3897 3898 3899 3900 3901 3902 3903 3904 3905 3906 3907 3908 3909 3910 3911 3912 3913 3914 3915 3916 3917 3918 3919 3920 3921 3922 3923 3924 3925 3926 3927 3928 3929 3930 3931 3932 3933 3934 3935 3936 3937 3938 3939 3940 3941 3942 3943 3944 3945 3946 3947 3948 3949 3950 3951 3952 3953 3954 3955 3956 3957 3958 3959 3960 3961 3962 3963 3964 3965 3966 3967 3968 3969 3970 3971 3972 3973 3974 3975 3976 3977 3978 3979 3980 3981 3982 3983 3984 3985 3986 3987 3988 3989 3990 3991 3992 3993 3994 3995 3996 3997 3998 3999 4000 4001 4002 4003 4004 4005 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 4078 4079 4080 4081 4082 4083 4084 4085 4086 4087 4088 4089 4090 4091 4092 4093 4094 4095 4096 4097 4098 4099 4100 4101 4102 4103 4104 4105 4106 4107 4108 4109 4110 4111 4112 4113 4114 4115 4116 4117 4118 4119 4120 4121 4122 4123 4124 4125 4126 4127 4128 4129 4130 4131 4132 4133 4134 4135 4136 4137 4138 4139 4140 4141 4142 4143 4144 4145 4146 4147 4148 4149 4150 4151 4152 4153 4154 4155 4156 4157 4158 4159 4160 4161 4162 4163 4164 4165 4166 4167 4168 4169 4170 4171 4172 4173 4174 4175 4176 4177 4178 4179 4180 4181 4182 4183 4184 4185 4186 4187 4188 4189 4190 4191 4192 4193 4194 4195 4196 4197 4198 4199 4200 4201 4202 4203 4204 4205 4206 4207 4208 4209 4210 4211 4212 4213 4214 4215 4216 4217 4218 4219 4220 
4221 4222 4223 4224 4225 4226 4227 4228 4229 4230 4231 4232 4233 4234 4235 4236 4237 4238 4239 4240 4241 4242 4243 4244 4245 4246 4247 4248 4249 4250 4251 4252 4253 4254 4255 4256 4257 4258 4259 4260 4261 4262 4263 4264 4265 4266 4267 4268 4269 4270 4271 4272 4273 4274 4275 4276 4277 4278 4279 4280 4281 4282 4283 4284 4285 4286 4287 4288 4289 4290 4291 4292 4293 4294 4295 4296 4297 4298 4299 4300 4301 4302 4303 4304 4305 4306 4307 4308 4309 4310 4311 4312 4313 4314 4315 4316 4317 4318 4319 4320 4321 4322 4323 4324 4325 4326 4327 4328 4329 4330 4331 4332 4333 4334 4335 4336 4337 4338 4339 4340 4341 4342 4343 4344 4345 4346 4347 4348 4349 4350 4351 4352 4353 4354 4355 4356 4357 4358 4359 4360 4361 4362 4363 4364 4365 4366 4367 4368 4369 4370 4371 4372 4373 4374 4375 4376 4377 4378 4379 4380 4381 4382 4383 4384 4385 4386 4387 4388 4389 4390 4391 4392 4393 4394 4395 4396 4397 4398 4399 4400 4401 4402 4403 4404 4405 4406 4407 4408 4409 4410 4411 4412 4413 4414 4415 4416 4417 4418 4419 4420 4421 4422 4423 4424 4425 4426 4427 4428 4429 4430 4431 4432 4433 4434 4435 4436 4437 4438 4439 4440 4441 4442 4443 4444 4445 4446 4447 4448 4449 4450 4451 4452 4453 4454 4455 4456 4457 4458 4459 4460 4461 4462 4463 4464 4465 4466 4467 4468 4469 4470 4471 4472 4473 4474 4475 4476 4477 4478 4479 4480 4481 4482 4483 4484 4485 4486 4487 4488 4489 4490 4491 4492 4493 4494 4495 4496 4497 4498 4499 4500 4501 4502 4503 4504 4505 4506 4507 4508 4509 4510 4511 4512 4513 4514 4515 4516 4517 4518 4519 4520 4521 4522 4523 4524 4525 4526 4527 4528 4529 4530 4531 4532 4533 4534 4535 4536 4537 4538 4539 4540 4541 4542 4543 4544 4545 4546 4547 4548 4549 4550 4551 4552 4553 4554 4555 4556 4557 4558 4559 4560 4561 4562 4563 4564 4565 4566 4567 4568 4569 4570 4571 4572 4573 4574 4575 4576 4577 4578 4579 4580 4581 4582 4583 4584 4585 4586 4587 4588 4589 4590 4591 4592 4593 4594 4595 4596 4597 4598 4599 4600 4601 4602 4603 4604 4605 4606 4607 4608 4609 4610 4611 4612 4613 4614 4615 4616 4617 4618 4619 4620 
4621 4622 4623 4624 4625 4626 4627 4628 4629 4630 4631 4632 4633 4634 4635 4636 4637 4638 4639 4640 4641 4642 4643 4644 4645 4646 4647 4648 4649 4650 4651 4652 4653 4654 4655 4656 4657 4658 4659 4660 4661 4662 4663 4664 4665 4666 4667 4668 4669 4670 4671 4672 4673 4674 4675 4676 4677 4678 4679 4680 4681 4682 4683 4684 4685 4686 4687 4688 4689 4690 4691 4692 4693 4694 4695 4696 4697 4698 4699 4700 4701 4702 4703 4704 4705 4706 4707 4708 4709 4710 4711 4712 4713 4714 4715 4716 4717 4718 4719 4720 4721 4722 4723 4724 4725 4726 4727 4728 4729 4730 4731 4732 4733 4734 4735 4736 4737 4738 4739 4740 4741 4742 4743 4744 4745 4746 4747 4748 4749 4750 4751 4752 4753 4754 4755 4756 4757 4758 4759 4760 4761 4762 4763 4764 4765 4766 4767 4768 4769 4770 4771 4772 4773 4774 4775 4776 4777 4778 4779 4780 4781 4782 4783 4784 4785 4786 4787 4788 4789 4790 4791 4792 4793 4794 4795 4796 4797 4798 4799 4800 4801 4802 4803 4804 4805 4806 4807 4808 4809 4810 4811 4812 4813 4814 4815 4816 4817 4818 4819 4820 4821 4822 4823 4824 4825 4826 4827 4828 4829 4830 4831 4832 4833 4834 4835 4836 4837 4838 4839 4840 4841 4842 4843 4844 4845 4846 4847 4848 4849 4850 4851 4852 4853 4854 4855 4856 4857 4858 4859 4860 4861 4862 4863 4864 4865 4866 4867 4868 4869 4870 4871 4872 4873 4874 4875 4876 4877 4878 4879 4880 4881 4882 4883 4884 4885 4886 4887 4888 4889 4890 4891 4892 4893 4894 4895 4896 4897 4898 4899 4900 4901 4902 4903 4904 4905 4906 4907 4908 4909 4910 4911 4912 4913 4914 4915 4916 4917 4918 4919 4920 4921 4922 4923 4924 4925 4926 4927 4928 4929 4930 4931 4932 4933 4934 4935 4936 4937 4938 4939 4940 4941 4942 4943 4944 4945 4946 4947 4948 4949 4950 4951 4952 4953 4954 4955 4956 4957 4958 4959 4960 4961 4962 4963 4964 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974 4975 4976 4977 4978 4979 4980 4981 4982 4983 4984 4985 4986 4987 4988 4989 4990 4991 4992 4993 4994 4995 4996 4997 4998 4999 5000 5001 5002 5003 5004 5005 5006 5007 5008 5009 5010 5011 5012 5013 5014 5015 5016 5017 5018 5019 5020 
5021 5022 5023 5024 5025 5026 5027 5028 5029 5030 5031 5032 5033 5034 5035 5036 5037 5038 5039 5040 5041 5042 5043 5044 5045 5046 5047 5048 5049 5050 5051 5052 5053 5054 5055 5056 5057 5058 5059 5060 5061 5062 5063 5064 5065 5066 5067 5068 5069 5070 5071 5072 5073 5074 5075 5076 5077 5078 5079 5080 5081 5082 5083 5084 5085 5086 5087 5088 5089 5090 5091 5092 5093 5094 5095 5096 5097 5098 5099 5100 5101 5102 5103 5104 5105 5106 5107 5108 5109 5110 5111 5112 5113 5114 5115 5116 5117 5118 5119 5120 5121 5122 5123 5124 5125 5126 5127 5128 5129 5130 5131 5132 5133 5134 5135 5136 5137 5138 5139 5140 5141 5142 5143 5144 5145 5146 5147 5148 5149 5150 5151 5152 5153 5154 5155 5156 5157 5158 5159 5160 5161 5162 5163 5164 5165 5166 5167 5168 5169 5170 5171 5172 5173 5174 5175 5176 5177 5178 5179 5180 5181 5182 5183 5184 5185 5186 5187 5188 5189 5190 5191 5192 5193 5194 5195 5196 5197 5198 5199 5200 5201 5202 5203 5204 5205 5206 5207 5208 5209 5210 5211 5212 5213 5214 5215 5216 5217 5218 5219 5220 5221 5222 5223 5224 5225 5226 5227 5228 5229 5230 5231 5232 5233 5234 5235 5236 5237 5238 5239 5240 5241 5242 5243 5244 5245 5246 5247 5248 5249 5250 5251 5252 5253 5254 5255 5256 5257 5258 5259 5260 5261 5262 5263 5264 5265 5266 5267 5268 5269 5270 5271 5272 5273 5274 5275 5276 5277 5278 5279 5280 5281 5282 5283 5284 5285 5286 5287 5288 5289 5290 5291 5292 5293 5294 5295 5296 5297 5298 5299 5300 5301 5302 5303 5304 5305 5306 5307 5308 5309 5310 5311 5312 5313 5314 5315 5316 5317 5318 5319 5320 5321 5322 5323 5324 5325 5326 5327 5328 5329 5330 5331 5332 5333 5334 5335 5336 5337 5338 5339 5340 5341 5342 5343 5344 5345 5346 5347 5348 5349 5350 5351 5352 5353 5354 5355 5356 5357 5358 5359 5360 5361 5362 5363 5364 5365 5366 5367 5368 5369 5370 5371 5372 5373 5374 5375 5376 5377 5378 5379 5380 5381 5382 5383 5384 5385 5386 5387 5388 5389 5390 5391 5392 5393 5394 5395 5396 5397 5398 5399 5400 5401 5402 5403 5404 5405 5406 5407 5408 5409 5410 5411 5412 5413 5414 5415 5416 5417 5418 5419 5420 
5421 5422 5423 5424 5425 5426 5427 5428 5429 5430 5431 5432 5433 5434 5435 5436 5437 5438 5439 5440 5441 5442 5443 5444 5445 5446 5447 5448 5449 5450 5451 5452 5453 5454 5455 5456 5457 5458 5459 5460 5461 5462 5463 5464 5465 5466 5467 5468 5469 5470 5471 5472 5473 5474 5475 5476 5477 5478 5479 5480 5481 5482 5483 5484 5485 5486 5487 5488 5489 5490 5491 5492 5493 5494 5495 5496 5497 5498 5499 5500 5501 5502 5503 5504 5505 5506 5507 5508 5509 5510 5511 5512 5513 5514 5515 5516 5517 5518 5519 5520 5521 5522 5523 5524 5525 5526 5527 5528 5529 5530 5531 5532 5533 5534 5535 5536 5537 5538 5539 5540 5541 5542 5543 5544 5545 5546 5547 5548 5549 5550 5551 5552 with 6 links
##
## Weights style: W
## Weights constants summary:
## n nn S0 S1 S2
## W 5552 30824704 5552 1670.611 22701.22
Use lm.morantest() to perform Moran’s I test
lm.morantest(brazil_lglm2, wm_lw)
##
## Global Moran I for regression residuals
##
## data:
## model: lm(formula = log(GDP_CAPITA) ~ IBGE_CROP_PRODUCTION +
## IDHM_Longevidade + IDHM_Renda + GVA_AGROPEC_p + GVA_INDUSTRY_p +
## GVA_SERVICES_p + tax_to_gdp + Cars_p, data = brazil6)
## weights: wm_lw
##
## Moran I statistic standard deviate = 26.248, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I Expectation Variance
## 1.922521e-01 -5.926404e-04 5.397861e-05
Since the p-value (< 2.2e-16) is smaller than the alpha value of 0.05, we reject the null hypothesis that the residuals are randomly distributed in space. There is sufficient evidence to conclude that the residuals are spatially autocorrelated. Note that this test is about spatial randomness only; the normality of the residuals was examined in the earlier analysis and is not something Moran's I can establish.
Since the observed Moran's I = 0.1922521 is greater than 0, we can infer that the residuals are clustered (positive spatial autocorrelation).
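As a quick sanity check, the standard deviate reported by lm.morantest() can be recomputed from the three sample estimates printed above. The sketch below uses only those printed numbers. The variable names are illustrative, and the one-sided normal approximation is an assumption chosen to match the "alternative hypothesis: greater" line.

```r
# Sample estimates copied from the lm.morantest() output above
observed_i <- 1.922521e-01
expected_i <- -5.926404e-04
variance_i <- 5.397861e-05

# Standard deviate: how many standard deviations the observed I lies above its expectation
z_score <- (observed_i - expected_i) / sqrt(variance_i)

# One-sided p-value under a normal approximation (alternative: greater)
p_value <- pnorm(z_score, lower.tail = FALSE)

print(round(z_score, 3))
print(p_value < 2.2e-16)
```

The recomputed deviate should agree with the reported value of 26.248 up to rounding.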
We will use the cross-validation (CV) approach with a Gaussian kernel and an adaptive bandwidth.
##### 7.1.1 Computing Adaptive Bandwidth GWR Model
We set adaptive = TRUE because we are calculating an adaptive bandwidth.
dmat <- gw.dist(dp.locat=coordinates(brazil.point.sp))
bw.adaptive <- bw.gwr(log(GDP_CAPITA)~ IBGE_CROP_PRODUCTION + IDHM_Longevidade + IDHM_Renda + GVA_AGROPEC_p + GVA_INDUSTRY_p + GVA_SERVICES_p + tax_to_gdp + Cars_p, data=brazil.point.sp, approach = "CV", kernel="gaussian", adaptive = TRUE, longlat=FALSE, dMat=dmat)
## Take a cup of tea and have a break, it will take a few minutes.
## -----A kind suggestion from GWmodel development group
## Adaptive bandwidth: 3438 CV score: 412.7773
## Adaptive bandwidth: 2133 CV score: 403.4622
## Adaptive bandwidth: 1324 CV score: 391.1685
## Adaptive bandwidth: 827 CV score: 389.4456
## Adaptive bandwidth: 516 CV score: 401.3057
## Adaptive bandwidth: 1015 CV score: 388.4169
## Adaptive bandwidth: 1135 CV score: 388.8933
## Adaptive bandwidth: 944 CV score: 388.7654
## Adaptive bandwidth: 1062 CV score: 388.389
## Adaptive bandwidth: 1088 CV score: 388.5388
## Adaptive bandwidth: 1043 CV score: 388.2676
## Adaptive bandwidth: 1034 CV score: 388.2838
## Adaptive bandwidth: 1051 CV score: 388.3163
## Adaptive bandwidth: 1040 CV score: 388.2508
## Adaptive bandwidth: 1036 CV score: 388.2812
## Adaptive bandwidth: 1040 CV score: 388.2508
The result shows that the recommended adaptive bandwidth is 1040 data points (nearest neighbours), which is the candidate with the lowest CV score (388.2508).
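The search trace printed by bw.gwr() can be tabulated to confirm which candidate bandwidth gives the lowest CV score. This is only a bookkeeping sketch built from the values printed above; the data frame cv_trace and its column names are made up for illustration.

```r
# Candidate bandwidths and CV scores copied from the bw.gwr() trace above
cv_trace <- data.frame(
  bandwidth = c(3438, 2133, 1324, 827, 516, 1015, 1135, 944,
                1062, 1088, 1043, 1034, 1051, 1040, 1036, 1040),
  cv_score  = c(412.7773, 403.4622, 391.1685, 389.4456, 401.3057, 388.4169,
                388.8933, 388.7654, 388.389, 388.5388, 388.2676, 388.2838,
                388.3163, 388.2508, 388.2812, 388.2508)
)

# Row with the smallest CV score (which.min returns the first minimum)
best <- cv_trace[which.min(cv_trace$cv_score), ]
print(best)
```

The CV score changes very little between roughly 1000 and 1100 neighbours, so the minimum is quite flat around the selected value.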
Constructing the adaptive bandwidth gwr model
gwr.adaptive <- gwr.basic(log(GDP_CAPITA)~ IBGE_CROP_PRODUCTION + IDHM_Longevidade + IDHM_Renda + GVA_AGROPEC_p + GVA_INDUSTRY_p + GVA_SERVICES_p + tax_to_gdp + Cars_p, data=brazil.point.sp, bw=bw.adaptive, kernel="gaussian", longlat=FALSE)
## Warning in proj4string(data): CRS object has comment, which is lost in output
Display the model output
gwr.adaptive
## ***********************************************************************
## * Package GWmodel *
## ***********************************************************************
## Program starts at: 2020-05-31 22:53:50
## Call:
## gwr.basic(formula = log(GDP_CAPITA) ~ IBGE_CROP_PRODUCTION +
## IDHM_Longevidade + IDHM_Renda + GVA_AGROPEC_p + GVA_INDUSTRY_p +
## GVA_SERVICES_p + tax_to_gdp + Cars_p, data = brazil.point.sp,
## bw = bw.adaptive, kernel = "gaussian", longlat = FALSE)
##
## Dependent (y) variable: GDP_CAPITA
## Independent variables: IBGE_CROP_PRODUCTION IDHM_Longevidade IDHM_Renda GVA_AGROPEC_p GVA_INDUSTRY_p GVA_SERVICES_p tax_to_gdp Cars_p
## Number of data points: 5552
## ***********************************************************************
## * Results of Global Regression *
## ***********************************************************************
##
## Call:
## lm(formula = formula, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.86710 -0.16500 -0.03439 0.13338 2.75733
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.512e+00 9.999e-02 65.126 < 2e-16 ***
## IBGE_CROP_PRODUCTION 2.663e-07 2.754e-08 9.671 < 2e-16 ***
## IDHM_Longevidade 6.639e-01 1.522e-01 4.362 1.32e-05 ***
## IDHM_Renda 3.523e+00 1.143e-01 30.807 < 2e-16 ***
## GVA_AGROPEC_p 2.787e-02 6.319e-04 44.104 < 2e-16 ***
## GVA_INDUSTRY_p 1.884e-02 4.022e-04 46.856 < 2e-16 ***
## GVA_SERVICES_p 2.010e-02 6.731e-04 29.859 < 2e-16 ***
## tax_to_gdp 7.833e-04 1.434e-04 5.461 4.93e-08 ***
## Cars_p 4.546e-01 5.691e-02 7.988 1.66e-15 ***
##
## ---Significance stars
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 0.2742 on 5543 degrees of freedom
## Multiple R-squared: 0.8392
## Adjusted R-squared: 0.839
## F-statistic: 3616 on 8 and 5543 DF, p-value: < 2.2e-16
## ***Extra Diagnostic information
## Residual sum of squares: 416.8353
## Sigma(hat): 0.2740538
## AIC: 1400.531
## AICc: 1400.571
## ***********************************************************************
## * Results of Geographically Weighted Regression *
## ***********************************************************************
##
## *********************Model calibration information*********************
## Kernel function: gaussian
## Fixed bandwidth: 1040
## Regression points: the same locations as observations are used.
## Distance metric: Euclidean distance metric is used.
##
## ****************Summary of GWR coefficient estimates:******************
## Min. 1st Qu. Median 3rd Qu. Max.
## Intercept 6.5115e+00 6.5117e+00 6.5117e+00 6.5117e+00 6.5120
## IBGE_CROP_PRODUCTION 2.6631e-07 2.6633e-07 2.6634e-07 2.6634e-07 0.0000
## IDHM_Longevidade 6.6336e-01 6.6382e-01 6.6384e-01 6.6386e-01 0.6640
## IDHM_Renda 3.5225e+00 3.5226e+00 3.5226e+00 3.5226e+00 3.5226
## GVA_AGROPEC_p 2.7871e-02 2.7871e-02 2.7871e-02 2.7872e-02 0.0279
## GVA_INDUSTRY_p 1.8844e-02 1.8845e-02 1.8845e-02 1.8845e-02 0.0188
## GVA_SERVICES_p 2.0096e-02 2.0096e-02 2.0096e-02 2.0097e-02 0.0201
## tax_to_gdp 7.8329e-04 7.8332e-04 7.8333e-04 7.8335e-04 0.0008
## Cars_p 4.5454e-01 4.5461e-01 4.5464e-01 4.5465e-01 0.4548
## ************************Diagnostic information*************************
## Number of data points: 5552
## Effective number of parameters (2trace(S) - trace(S'S)): 9.001199
## Effective degrees of freedom (n-2trace(S) + trace(S'S)): 5542.999
## AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1400.526
## AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1389.485
## Residual sum of squares: 416.8318
## R-square value: 0.8392099
## Adjusted R-square value: 0.8389487
##
## ***********************************************************************
## Program stops at: 2020-05-31 22:54:38
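The adjusted R2 of the GWR model is 0.8389487, and the p-value of the global regression's F-test is < 2.2e-16. Note that the calibration information reports "Fixed bandwidth: 1040": gwr.basic() was called without adaptive = TRUE, so it fell back to its default (adaptive = FALSE) and treated 1040 as a fixed distance rather than as a number of nearest neighbours. This is consistent with the local coefficient estimates being almost identical to the global ones.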
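The effect of reading 1040 as a distance instead of a neighbour count can be illustrated with simulated points. The sketch below assumes the coordinates are in decimal degrees, as the bounding box printed further down suggests, and it uses a textbook Gaussian kernel, not GWmodel's internal code. All names (gaussian_weight, coords, k) are hypothetical.

```r
# Textbook Gaussian kernel: weight decays with distance d relative to bandwidth bw
gaussian_weight <- function(d, bw) exp(-0.5 * (d / bw)^2)

set.seed(42)
n_points <- 200
# Simulated coordinates in decimal degrees, roughly covering Brazil's extent (assumption)
coords <- cbind(lon = runif(n_points, -74, -29), lat = runif(n_points, -34, 5))

# Euclidean distances from the first point to every point (including itself)
d <- as.matrix(dist(coords))[1, ]

# Fixed bandwidth: 1040 is read as a distance in coordinate units
w_fixed <- gaussian_weight(d, bw = 1040)

# Adaptive bandwidth: the bandwidth is the distance to the k-th nearest neighbour
k <- 40
bw_adaptive <- sort(d)[k + 1]  # position 1 is the point itself (distance 0)
w_adaptive <- gaussian_weight(d, bw = bw_adaptive)

print(range(w_fixed))     # all weights close to 1: every fit is nearly the global fit
print(range(w_adaptive))  # weights fall off with distance: fits are genuinely local
```

If an adaptive bandwidth was intended, passing adaptive = TRUE to gwr.basic() alongside bw = bw.adaptive would make the model interpret 1040 as a neighbour count. The output above would then need to be regenerated.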
We will now assess the fit of the GWR model calibrated with the CV approach and Gaussian kernel.
We compare models on their adjusted R2 values: the higher the adjusted R2, the better the fit.
This is because the adjusted R2 corrects R2 for the number of explanatory variables in the model, so it does not increase simply because more variables are added.
A higher adjusted R2 means that a larger percentage of the variation in log(GDP_CAPITA) is explained by the regression model.
Adjusted R2 = 0.8389487
The adjusted R2 above is high: the model explains about 84% of the variation in log(GDP_CAPITA). It is, however, practically identical to the adjusted R2 of the global regression (0.839) reported in the same output, so this GWR fit offers essentially no improvement over the multiple linear regression.
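The adjusted R2 figures in the output can be reproduced from the printed R2 values and degrees of freedom. The sketch below uses the usual adjustment formula. Treating the GWR denominator as the effective degrees of freedom minus one is an inference from the printed numbers, not something stated in the output, and the helper name adjusted_r2 is hypothetical.

```r
# Adjusted R-squared: penalise R-squared by the ratio of total to residual degrees of freedom
adjusted_r2 <- function(r2, n, residual_df) 1 - (1 - r2) * (n - 1) / residual_df

n <- 5552

# Global model: 8 predictors plus intercept, so residual df = 5543 (as printed)
global_adj <- adjusted_r2(r2 = 0.8392, n = n, residual_df = 5543)

# GWR model: effective degrees of freedom 5542.999 (as printed), minus one (assumption)
gwr_adj <- adjusted_r2(r2 = 0.8392099, n = n, residual_df = 5542.999 - 1)

print(round(c(global = global_adj, gwr = gwr_adj), 4))
```

Both values round to about 0.839, which is why the two models are effectively indistinguishable on this measure.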
In addition to regression residuals, the output feature class table includes fields for observed and predicted y values, condition number, local R2, residuals, and explanatory variable coefficients and standard errors.
We will now visualise the output of the GWR model calibrated above (bandwidth selected with the CV approach, Gaussian kernel).
To visualise the SDF, we first need to convert it into an sf data.frame.
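# Convert the GWR SDF to an sf object and reproject it to SIRGAS 2000 (EPSG:4674)
brazil.sf.adaptive <- st_as_sf(gwr.adaptive$SDF) %>%
  st_transform(crs = 4674)
# Flatten the SDF to a plain data frame and bind its columns to the municipality polygons
gwr.adaptive.output <- as.data.frame(gwr.adaptive$SDF)
# Note: this assignment overwrites the point-based sf object created above
brazil.sf.adaptive <- cbind(brazil.polygon.res.sf, as.matrix(gwr.adaptive.output))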
brazil.sf.adaptive
## Simple feature collection with 5552 features and 129 fields
## geometry type: GEOMETRY
## dimension: XY
## bbox: xmin: -73.99045 ymin: -33.75118 xmax: -28.83594 ymax: 5.271841
## geographic CRS: SIRGAS 2000
## First 10 features:
## code_muni name_muni code_state abbrev_state CAPITAL IBGE_RES_POP
## 1 5200050 ABADIA DE GOIÁS 52 GO 0 6876
## 2 3100104 ABADIA DOS DOURADOS 31 MG 0 6704
## 3 5200100 ABADIÂNIA 52 GO 0 15757
## 4 3100203 ABAETÉ 31 MG 0 22690
## 5 1500107 ABAETETUBA 15 PA 0 141100
## 6 2300101 ABAIARA 23 CE 0 10496
## 7 2900108 ABAÍRA 29 BA 0 8316
## 8 2900207 ABARÉ 29 BA 0 17064
## 9 4100103 ABATIÁ 41 PR 0 7764
## 10 4200051 ABDON BATISTA 42 SC 0 2653
## IBGE_RES_POP_BRAS IBGE_RES_POP_ESTR IBGE_DU IBGE_DU_URBAN IBGE_DU_RURAL
## 1 6876 0 2137 1546 591
## 2 6704 0 2328 1481 847
## 3 15609 148 4655 3233 1422
## 4 22690 0 7694 6667 1027
## 5 141040 60 31061 19057 12004
## 6 10496 0 2791 1251 1540
## 7 8316 0 2572 1193 1379
## 8 17064 0 4332 2379 1953
## 9 7764 0 2499 1877 622
## 10 2653 0 848 234 614
## IBGE_POP IBGE_1 IBGE_1.4 IBGE_5.9 IBGE_10.14 Active_pop IBGE_60.
## 1 5300 69 318 438 517 3542 416
## 2 4154 38 207 260 351 2709 589
## 3 10656 139 650 894 1087 6896 990
## 4 18464 176 856 1233 1539 11979 2681
## 5 82956 1354 5567 7618 8905 53516 5996
## 6 4538 98 323 421 483 2631 582
## 7 3725 37 156 263 277 2319 673
## 8 8994 167 733 978 927 5386 803
## 9 5685 69 302 370 483 3650 811
## 10 724 12 32 49 63 479 89
## IBGE_PLANTED_AREA IBGE_CROP_PRODUCTION IDHM.Ranking.2010 IDHM IDHM_Renda
## 1 319 1843 1689 0.708 0.687
## 2 4479 18017 2207 0.690 0.693
## 3 10307 33085 2202 0.690 0.671
## 4 1862 7502 1994 0.698 0.720
## 5 25200 700872 3530 0.628 0.579
## 6 2598 5234 3522 0.628 0.540
## 7 895 3999 4086 0.603 0.577
## 8 2058 22761 4756 0.575 0.533
## 9 1197 9943 2258 0.687 0.676
## 10 5502 26195 2092 0.690 0.660
## IDHM_Longevidade IDHM_Educacao LONG LAT ALT PAY_TV
## 1 0.830 0.622 -49.44055 -16.758812 893.60 360
## 2 0.839 0.563 -47.39683 -18.487565 753.12 77
## 3 0.841 0.579 -48.71881 -16.182672 1017.55 227
## 4 0.848 0.556 -45.44619 -19.155848 644.74 1230
## 5 0.798 0.537 -48.88440 -1.723470 10.12 3389
## 6 0.748 0.612 -39.04755 -7.356977 403.11 29
## 7 0.746 0.510 -41.66161 -13.253532 674.22 952
## 8 0.776 0.460 -39.11659 -8.723418 316.38 51
## 9 0.804 0.596 -50.31253 -23.300494 579.30 55
## 10 0.812 0.625 -51.02527 -27.608987 720.98 109
## FIXED_PHONES AREA REGIAO_TUR CATEGORIA_TUR
## 1 842 147.26 <NA> <NA>
## 2 296 881.06 Caminhos Do Cerrado D
## 3 720 1045.13 Região Turística Do Ouro E Cristais C
## 4 1716 1817.07 Lago De Três Marias D
## 5 1218 1610.65 Araguaia-Tocantins D
## 6 34 180.08 <NA> <NA>
## 7 335 538.68 Chapada Diamantina D
## 8 222 1604.92 <NA> <NA>
## 9 392 228.72 <NA> <NA>
## 10 260 237.16 Vale Do Contestado D
## ESTIMATED_POP RURAL_URBAN GVA_AGROPEC GVA_INDUSTRY GVA_SERVICES
## 1 8583 Urbano 6.20 27991.25 74750.32
## 2 6972 Rural Adjacente 50524.57 25917.70 62689.23
## 3 19614 Rural Adjacente 42.84 16728.30 138198.58
## 4 23223 Urbano 113824.60 31002.62 172.33
## 5 156292 Urbano 140463.72 58610.00 468128.69
## 6 11663 Rural Adjacente 4435.16 5.88 22.81
## 7 8767 Rural Remoto 12.41 3437.43 17990.74
## 8 19814 Rural Remoto 9176.40 6.70 36921.84
## 9 7507 Rural Adjacente 73340.52 8839.71 42999.37
## 10 2577 Rural Adjacente 24996.75 3578.87 16011.10
## GVA_PUBLIC GVA_TOTAL TAXES GDP POP_GDP GDP_CAPITA
## 1 36915.04 145857.60 20554.20 166.41 8053 20664.57
## 2 28083.79 167215.28 12873.50 180.09 7037 25591.70
## 3 63396.20 261161.91 26822.58 287984.49 18427 15628.40
## 4 86081.41 403241.27 26994.09 430235.36 23574 18250.42
## 5 486872.40 1154074.81 95180.48 1249255.29 151934 8222.36
## 6 35989.96 69108.67 4042.79 73151.46 11483 6370.41
## 7 28463.74 62304.83 2019.77 64324.59 9212 6982.70
## 8 65.75 118.55 6.21 124754.26 19939 6256.80
## 9 34103.49 159283.08 5.77 165048.21 7795 21173.60
## 10 17842.64 62429.36 2312.65 64742.01 2617 24739.02
## GVA_MAIN
## 1 Demais serviços
## 2 Demais serviços
## 3 Demais serviços
## 4 Demais serviços
## 5 Administração, defesa, educação e saúde públicas e seguridade social
## 6 Administração, defesa, educação e saúde públicas e seguridade social
## 7 Administração, defesa, educação e saúde públicas e seguridade social
## 8 Administração, defesa, educação e saúde públicas e seguridade social
## 9 Agricultura, inclusive apoio à agricultura e a pós colheita
## 10 Administração, defesa, educação e saúde públicas e seguridade social
## MUN_EXPENDIT COMP_TOT COMP_A COMP_B COMP_C COMP_D COMP_E COMP_F COMP_G
## 1 28227691 284 5 1 56 0 2 29 110
## 2 17909274 476 6 6 30 1 2 34 190
## 3 37513019 288 5 9 26 0 2 7 117
## 4 NA 621 18 1 40 0 1 20 303
## 5 NA 931 4 2 43 0 1 27 500
## 6 NA 86 1 0 4 0 0 6 48
## 7 NA 191 6 0 8 0 1 4 97
## 8 NA 87 2 0 3 0 0 0 71
## 9 NA 285 5 0 20 0 1 10 133
## 10 19506956 69 2 0 4 0 0 2 35
## COMP_H COMP_I COMP_J COMP_K COMP_L COMP_M COMP_N COMP_O COMP_P COMP_Q COMP_R
## 1 26 4 5 0 2 10 12 4 6 6 1
## 2 70 28 11 0 4 15 29 2 9 14 6
## 3 12 57 2 1 0 7 15 3 11 5 1
## 4 62 30 9 6 4 28 27 2 15 19 9
## 5 16 31 6 1 1 22 16 2 155 33 15
## 6 2 10 2 0 0 2 3 2 0 2 0
## 7 5 5 3 1 0 5 5 2 8 1 2
## 8 0 1 1 0 0 0 1 2 0 2 0
## 9 18 14 8 0 4 11 26 2 8 9 4
## 10 8 3 1 1 0 4 0 2 1 3 0
## COMP_S COMP_T COMP_U HOTELS BEDS Pr_Agencies Pu_Agencies Pr_Bank Pu_Bank
## 1 5 0 0 NA NA NA NA NA NA
## 2 19 0 0 NA NA NA NA NA NA
## 3 8 0 0 1 34 1 1 1 1
## 4 27 0 0 NA NA 2 2 2 2
## 5 56 0 0 NA NA 2 4 2 4
## 6 4 0 0 NA NA NA NA NA NA
## 7 38 0 0 1 24 NA NA NA NA
## 8 4 0 0 NA NA 1 0 1 0
## 9 12 0 0 NA NA 0 1 0 1
## 10 3 0 0 NA NA 0 1 0 1
## Pr_Assets Pu_Assets Cars Motorcycles Wheeled_tractor UBER MAC WAL.MART
## 1 NA NA 2158 1246 0 <NA> NA NA
## 2 NA NA 2227 1142 0 <NA> NA NA
## 3 33724584 67091904 2838 1426 0 <NA> NA NA
## 4 44974716 371922572 6928 2953 0 <NA> NA NA
## 5 76181384 800078483 5277 25661 0 <NA> NA NA
## 6 NA NA 553 1674 0 <NA> NA NA
## 7 NA NA 896 696 0 <NA> NA NA
## 8 21823314 0 613 1532 0 <NA> NA NA
## 9 0 45976288 2168 912 0 <NA> NA NA
## 10 0 42909056 976 345 2 <NA> NA NA
## POST_OFFICES LOG_GDP_CAPITA PAY_TV_p FIXED_PHONES_p Cars_p
## 1 1 9.936176 0.044703837 0.104557308 0.26797467
## 2 1 10.150023 0.010942163 0.042063379 0.31647009
## 3 3 9.656845 0.012318880 0.039073099 0.15401313
## 4 4 9.811943 0.052176126 0.072792059 0.29388309
## 5 2 9.014613 0.022305738 0.008016639 0.03473219
## 6 1 8.759419 0.002525472 0.002960899 0.04815815
## 7 1 8.851191 0.103343465 0.036365610 0.09726444
## 8 1 8.741424 0.002557801 0.011133959 0.03074377
## 9 1 9.960510 0.007055805 0.050288647 0.27812700
## 10 1 10.116137 0.041650745 0.099350401 0.37294612
## Motorcycles_p GVA_AGROPEC_p GVA_INDUSTRY_p GVA_SERVICES_p pop_density
## 1 0.15472495 0.0007698994 3.4758785546 9.282294797 54.68559
## 2 0.16228506 7.1798451044 3.6830609635 8.908516413 7.98697
## 3 0.07738644 0.0023248494 0.9078146199 7.499787269 17.63130
## 4 0.12526512 4.8283956902 1.3151191991 0.007310172 12.97363
## 5 0.16889570 0.9245048508 0.3857596061 3.081131873 94.33086
## 6 0.14578072 0.3862370461 0.0005120613 0.001986415 63.76610
## 7 0.07555363 0.0013471559 0.3731469822 1.952967868 17.10106
## 8 0.07683434 0.4602236822 0.0003360249 1.851739806 12.42367
## 9 0.11699808 9.4086619628 1.1340230917 5.516275818 34.08097
## 10 0.13183034 9.5516813145 1.3675468093 6.118112342 11.03474
## tax_to_gdp MLR_RES Intercept IBGE_CROP_PRODUCTION.1 IDHM_Longevidade.1
## 1 1.235154e+02 -0.01765106 6.511676 2.663345e-07 0.6638586
## 2 7.148370e+01 -0.01299481 6.511691 2.663372e-07 0.6638425
## 3 9.313897e-02 -0.02357896 6.511676 2.663346e-07 0.6638579
## 4 6.274261e-02 -0.09409796 6.511700 2.663390e-07 0.6638312
## 5 7.618978e-02 -0.36386803 6.511606 2.663236e-07 0.6639198
## 6 5.526602e-02 -0.18516124 6.511662 2.663345e-07 0.6638534
## 7 3.139966e-02 -0.27987653 6.511683 2.663372e-07 0.6638398
## 8 4.977786e-05 -0.23302640 6.511669 2.663355e-07 0.6638480
## 9 3.495948e-05 0.01027696 6.511705 2.663389e-07 0.6638347
## 10 3.572101e-02 0.14900808 6.511724 2.663416e-07 0.6638195
## IDHM_Renda.1 GVA_AGROPEC_p.1 GVA_INDUSTRY_p.1 GVA_SERVICES_p.1 tax_to_gdp.1
## 1 3.522576 0.02787146 0.01884477 0.02009643 0.0007833277
## 2 3.522571 0.02787146 0.01884477 0.02009643 0.0007833274
## 3 3.522578 0.02787150 0.01884479 0.02009645 0.0007833297
## 4 3.522568 0.02787149 0.01884479 0.02009644 0.0007833289
## 5 3.522619 0.02787192 0.01884504 0.02009665 0.0007833549
## 6 3.522600 0.02787199 0.01884510 0.02009669 0.0007833583
## 7 3.522584 0.02787175 0.01884495 0.02009658 0.0007833444
## 8 3.522596 0.02787195 0.01884508 0.02009667 0.0007833558
## 9 3.522558 0.02787125 0.01884464 0.02009632 0.0007833150
## 10 3.522546 0.02787111 0.01884455 0.02009625 0.0007833065
## Cars_p.1 y yhat residual CV_Score Stud_residual
## 1 0.4546260 9.936176 9.953826 -0.01764977 0 -0.06448664
## 2 0.4546347 10.150023 10.163017 -0.01299326 0 -0.04741797
## 3 0.4546249 9.656845 9.680424 -0.02357919 0 -0.08602561
## 4 0.4546396 9.811943 9.906040 -0.09409690 0 -0.34328575
## 5 0.4545740 9.014613 9.378475 -0.36386277 0 -1.33036648
## 6 0.4546067 8.759419 8.944573 -0.18515435 0 -0.67536221
## 7 0.4546239 8.851191 9.131064 -0.27987286 0 -1.02086747
## 8 0.4546114 8.741424 8.974444 -0.23301998 0 -0.85005300
## 9 0.4546478 9.960510 9.950234 0.01027673 0 0.03748658
## 10 0.4546620 10.116137 9.967132 0.14900468 0 0.54374552
## Intercept_SE IBGE_CROP_PRODUCTION_SE IDHM_Longevidade_SE IDHM_Renda_SE
## 1 0.09998539 2.753994e-08 0.1522088 0.1143442
## 2 0.09998539 2.753994e-08 0.1522088 0.1143442
## 3 0.09998539 2.753994e-08 0.1522088 0.1143442
## 4 0.09998539 2.753994e-08 0.1522088 0.1143442
## 5 0.09998539 2.753994e-08 0.1522088 0.1143442
## 6 0.09998539 2.753994e-08 0.1522088 0.1143442
## 7 0.09998539 2.753994e-08 0.1522088 0.1143442
## 8 0.09998539 2.753994e-08 0.1522088 0.1143442
## 9 0.09998539 2.753994e-08 0.1522088 0.1143442
## 10 0.09998539 2.753994e-08 0.1522088 0.1143442
## GVA_AGROPEC_p_SE GVA_INDUSTRY_p_SE GVA_SERVICES_p_SE tax_to_gdp_SE
## 1 0.0006319413 0.0004021848 0.0006730499 0.0001434313
## 2 0.0006319413 0.0004021848 0.0006730499 0.0001434313
## 3 0.0006319413 0.0004021848 0.0006730499 0.0001434313
## 4 0.0006319413 0.0004021848 0.0006730499 0.0001434313
## 5 0.0006319413 0.0004021848 0.0006730499 0.0001434313
## 6 0.0006319413 0.0004021848 0.0006730499 0.0001434313
## 7 0.0006319413 0.0004021848 0.0006730499 0.0001434313
## 8 0.0006319413 0.0004021848 0.0006730499 0.0001434313
## 9 0.0006319413 0.0004021848 0.0006730499 0.0001434313
## 10 0.0006319413 0.0004021848 0.0006730499 0.0001434313
## Cars_p_SE Intercept_TV IBGE_CROP_PRODUCTION_TV IDHM_Longevidade_TV
## 1 0.05691429 65.12628 9.670845 4.361500
## 2 0.05691429 65.12642 9.670943 4.361394
## 3 0.05691429 65.12627 9.670847 4.361495
## 4 0.05691429 65.12651 9.671009 4.361320
## 5 0.05691429 65.12558 9.670448 4.361902
## 6 0.05691429 65.12614 9.670843 4.361466
## 7 0.05691429 65.12634 9.670941 4.361376
## 8 0.05691429 65.12620 9.670879 4.361430
## 9 0.05691429 65.12657 9.671003 4.361343
## 10 0.05691429 65.12675 9.671103 4.361243
## IDHM_Renda_TV GVA_AGROPEC_p_TV GVA_INDUSTRY_p_TV GVA_SERVICES_p_TV
## 1 30.80678 44.10451 46.85598 29.85875
## 2 30.80673 44.10451 46.85599 29.85875
## 3 30.80679 44.10457 46.85604 29.85878
## 4 30.80671 44.10456 46.85604 29.85877
## 5 30.80716 44.10523 46.85667 29.85908
## 6 30.80698 44.10534 46.85682 29.85914
## 7 30.80684 44.10497 46.85646 29.85897
## 8 30.80695 44.10528 46.85676 29.85911
## 9 30.80662 44.10418 46.85566 29.85859
## 10 30.80651 44.10395 46.85544 29.85849
## tax_to_gdp_TV Cars_p_TV Local_R2 coords.x1 coords.x2
## 1 5.461343 7.987906 0.8392090 -49.44055 -16.758812
## 2 5.461341 7.988060 0.8392090 -47.39683 -18.487565
## 3 5.461357 7.987887 0.8392090 -48.71881 -16.182672
## 4 5.461351 7.988146 0.8392091 -45.44619 -19.155848
## 5 5.461533 7.986993 0.8392100 -48.88440 -1.723470
## 6 5.461556 7.987566 0.8392102 -39.04755 -7.356977
## 7 5.461459 7.987869 0.8392097 -41.66161 -13.253532
## 8 5.461538 7.987649 0.8392101 -39.11659 -8.723418
## 9 5.461255 7.988289 0.8392085 -50.31253 -23.300494
## 10 5.461195 7.988538 0.8392082 -51.02527 -27.608987
## geom
## 1 MULTIPOLYGON (((-49.4444 -1...
## 2 POLYGON ((-47.4384 -18.1657...
## 3 MULTIPOLYGON (((-48.84178 -...
## 4 POLYGON ((-45.16777 -18.890...
## 5 MULTIPOLYGON (((-48.83139 -...
## 6 MULTIPOLYGON (((-39.01733 -...
## 7 MULTIPOLYGON (((-41.64722 -...
## 8 MULTIPOLYGON (((-39.35855 -...
## 9 POLYGON ((-50.22465 -23.226...
## 10 MULTIPOLYGON (((-51.03724 -...
summary(gwr.adaptive$SDF$yhat)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 8.498 9.163 9.678 9.698 10.097 15.499
The maximum predicted value (yhat) is 15.499, well above the third quartile of 10.097.
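To put this maximum in context, note that the fitted values are on the natural-log scale (the y column matches LOG_GDP_CAPITA in the output above), so they can be compared with the log of the largest observed GDP per capita. The sketch below is a minimal check. It assumes the R$314638 figure quoted in the introduction still applies to the cleaned data, and the variable names are illustrative only.

```r
# Largest observed GDP per capita quoted in the introduction (R$).
# Assumption: this municipality is still present in the cleaned data.
max_observed_gdp_capita <- 314638

# Largest fitted value reported by summary(gwr.adaptive$SDF$yhat), on the log scale.
max_yhat <- 15.499

# Put the observed maximum on the same natural-log scale as yhat.
max_observed_log <- log(max_observed_gdp_capita)

# TRUE means the largest fitted value lies beyond the largest observed value.
exceeds_observed <- max_yhat > max_observed_log

print(max_observed_log)
print(exceeds_observed)
```

If the assumption holds and exceeds_observed is TRUE, at least one municipality is over-predicted by a wide margin and is worth inspecting individually.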
Remove the code_muni column from the brazil.sf.adaptive data frame.
brazil.sf.adaptive <- subset(brazil.sf.adaptive, select= -code_muni)
The code chunk below plots a choropleth map to visualise the local R2.
qtm(brazil.sf.adaptive, "Local_R2", border=NULL)
The local R2 values are all around 0.84 (0.8392 in the rows printed above), which is relatively high.
The upper parts of Brazil are shaded darker, which suggests that the model fits slightly better there, i.e. the relationship between GDP_CAPITA and the independent variables is stronger. The differences are very small, however: in the rows printed above, local R2 only changes from the sixth decimal place onwards.
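As a rough check on how much local R2 actually varies, the sketch below computes the range of the ten Local_R2 values printed in the output above. It only uses those ten rows, so it is an illustration and not a summary of all municipalities; the variable names are illustrative.

```r
# Local_R2 for the ten municipalities printed above (copied from the output).
local_r2 <- c(0.8392090, 0.8392090, 0.8392090, 0.8392091, 0.8392100,
              0.8392102, 0.8392097, 0.8392101, 0.8392085, 0.8392082)

# Smallest and largest local R2 in this sample.
r2_range <- range(local_r2)

# Width of the interval covered by the sample.
r2_spread <- diff(r2_range)

print(r2_range)
print(r2_spread)
```

Running summary() on the Local_R2 column of brazil.sf.adaptive would give the same kind of check for the full data set.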
The code chunk below plots a choropleth map to visualise the intercept of the regression model.
qtm(brazil.sf.adaptive, "Intercept", border=NULL)
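We can see that the intercept varies only slightly around 6.51. The intercept is the predicted log GDP per capita when all explanatory variables are zero. A positive intercept does not tell us the sign of the slopes; those are given by the individual coefficient estimates, all of which are positive in the rows printed above.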
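The output above also carries a standard error (Intercept_SE) and a t-value (Intercept_TV) for each local estimate. As a minimal sketch, assuming the t-value is simply the estimate divided by its standard error, the first row can be checked by hand. The variable names are illustrative.

```r
# Values copied from row 1 of the printed output.
intercept    <- 6.511676
intercept_se <- 0.09998539
intercept_tv <- 65.12628   # t-value reported in the output

# t-value recomputed as estimate / standard error (assumed definition).
tv_check <- intercept / intercept_se

# Compare the recomputed value with the reported one, allowing for rounding.
tv_matches <- abs(tv_check - intercept_tv) < 1e-3

print(tv_check)
print(tv_matches)
```

The same division can be applied to any coefficient and its matching _SE column to see how strongly each local estimate differs from zero.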
The residual is the difference between the observed and the predicted GDP_CAPITA (both on the log scale). The residual is 0 when the observed and predicted values of GDP_CAPITA are the same.
qtm(brazil.sf.adaptive,"residual", border=NULL)
## Variable(s) "residual" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
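The definition of the residual can be verified against the printed output. The sketch below takes y and yhat from the first three rows shown earlier and recomputes the residual as observed minus fitted; small differences from the reported values come from the rounding of the printed numbers. The variable names are illustrative.

```r
# Observed (y) and fitted (yhat) values from rows 1-3 of the printed output.
y    <- c(9.936176, 10.150023, 9.656845)
yhat <- c(9.953826, 10.163017, 9.680424)

# Residual = observed minus fitted.
residual_check <- y - yhat

# Residuals reported in the output for the same rows.
residual_reported <- c(-0.01764977, -0.01299326, -0.02357919)

# TRUE if every recomputed residual agrees with the reported one up to rounding.
residuals_agree <- all(abs(residual_check - residual_reported) < 1e-5)

print(residual_check)
print(residuals_agree)
```

All three residuals are negative, meaning the model slightly over-predicts GDP_CAPITA for these municipalities.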
The code chunk below allows us to visualise the y hat of the regression model.
Y-hat (yhat) is the value of the dependent variable predicted by the regression model for each municipality. Comparing yhat with the observed values of GDP_CAPITA shows where the model over-predicts or under-predicts.
qtm(brazil.sf.adaptive, "yhat", border=NULL)
Overall, with local R2 values of around 0.84, our regression model performs well in explaining the factors that affect GDP_CAPITA in Brazil.