1 The summarytools Library

library(summarytools)

1.1 Loading the Data

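# read.csv2 defaults to ";" as separator and "," as decimal mark;
# sep = "," overrides the separator because the file is comma-delimited
dados <- read.csv2("Geral.csv", sep = ",")
# Keep columns 3 to 13, then drop q8
dados <- dados[, 3:13]
dados$q8 <- NULL
head(dados)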

1.1.1 Frequency Table

print(freq(tobacco$gender), method = 'render')

Frequencies
tobacco$gender
Type: Factor

                   Valid            Total
gender   Freq       %   % Cum.       %   % Cum.
F         489   50.00    50.00   48.90    48.90
M         489   50.00   100.00   48.90    97.80
<NA>       22                     2.20   100.00
Total    1000  100.00   100.00  100.00   100.00

Generated by summarytools 0.9.9 (R version 4.0.3)
2021-07-21
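
The Valid and Total columns differ only in their denominator: Valid percentages divide by the number of non-missing values, while Total percentages divide by the number of rows. The following is a minimal base-R sketch of that arithmetic. It assumes a vector rebuilt from the counts printed above; the name gender is a stand-in and not the tobacco data itself.

```r
# Stand-in vector rebuilt from the printed counts (489 F, 489 M, 22 missing)
gender <- factor(rep(c("F", "M", NA), times = c(489, 489, 22)))

counts_valid <- table(gender)                   # NA excluded by default
counts_total <- table(gender, useNA = "ifany")  # NA kept as its own category

pct_valid <- 100 * counts_valid / sum(counts_valid)  # denominator: non-missing values
pct_total <- 100 * counts_total / length(gender)     # denominator: all rows

round(pct_valid, 2)
round(pct_total, 2)
```

The rounded results can be compared directly with the % columns of the table above.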

1.1.2 Contingency Table

print(ctable(dados$q1, dados$q2), method = 'render')

Cross-Tabulation, Row Proportions
q1 * q2
Data Frame: dados

       q2
q1                1             2             3             4         Total
1        0 (  0.0%)    0 (  0.0%)    2 (100.0%)    0 (  0.0%)    2 (100.0%)
2        2 ( 15.4%)    6 ( 46.2%)    3 ( 23.1%)    2 ( 15.4%)   13 (100.0%)
3        4 ( 15.4%)    8 ( 30.8%)    9 ( 34.6%)    5 ( 19.2%)   26 (100.0%)
4        5 ( 25.0%)   10 ( 50.0%)    5 ( 25.0%)    0 (  0.0%)   20 (100.0%)
5       10 ( 31.2%)   14 ( 43.8%)    5 ( 15.6%)    3 (  9.4%)   32 (100.0%)
6        1 ( 12.5%)    5 ( 62.5%)    1 ( 12.5%)    1 ( 12.5%)    8 (100.0%)
Total   22 ( 21.8%)   43 ( 42.6%)   25 ( 24.8%)   11 ( 10.9%)  101 (100.0%)

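
Row proportions mean that each count is divided by its row total, so every row adds up to 100%. A rough base-R sketch of the same computation follows. The vectors q1 and q2 here are small made-up examples, not the contents of Geral.csv.

```r
# Hypothetical answers to two survey items (illustration only)
q1 <- c(1, 1, 2, 2, 2, 3)
q2 <- c(1, 2, 1, 1, 2, 2)

counts  <- table(q1, q2)                          # joint counts
row_pct <- 100 * prop.table(counts, margin = 1)   # divide each cell by its row total

addmargins(counts)   # counts with row and column totals
round(row_pct, 1)    # percentages; each row sums to 100
```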

1.1.3 Descriptive Statistics Table

print(descr(dados), method = 'render', table.classes = 'st-small')

Descriptive Statistics
dados
N: 101

                  q1     q10     q11      q2      q3         q4       q5        q6       q7      q9
Mean            3.90    1.58    1.56    2.25    2.26   23349.93   735.04   1091.07    27.42    1.07
Std.Dev         1.26    0.50    0.50    0.92    0.77   70373.14   799.27   2900.66   140.72    0.26
Min             1.00    1.00    1.00    1.00    1.00     351.00    35.00      0.00     0.00    1.00
Q1              3.00    1.00    1.00    2.00    2.00    2041.00   286.00     54.00     1.00    1.00
Median          4.00    2.00    2.00    2.00    2.00    5326.00   487.00    156.00     4.00    1.00
Q3              5.00    2.00    2.00    3.00    3.00   14084.00   797.00    610.00    11.00    1.00
Max             6.00    2.00    2.00    4.00    3.00  613011.00  3717.00  22000.00  1400.00    2.00
MAD             1.48    0.00    0.00    1.48    1.48    5232.10   386.96    177.91     5.93    0.00
IQR             2.00    1.00    1.00    1.00    1.00   12043.00   511.00    556.00    10.00    0.00
CV              0.32    0.31    0.32    0.41    0.34       3.01     1.09      2.66     5.13    0.24
Skewness       -0.20   -0.34   -0.26    0.34   -0.47       6.51     2.17      4.72     9.21    3.34
SE.Skewness     0.24    0.24    0.24    0.24    0.24       0.24     0.24      0.24     0.24    0.24
Kurtosis       -0.92   -1.91   -1.95   -0.73   -1.19      48.29     4.19     26.86    86.64    9.26
N.Valid          101     101     101     101     101        101      101       101      101     101
Pct.Valid     100.00  100.00  100.00  100.00  100.00     100.00   100.00    100.00   100.00  100.00

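
Three of the rows above are derived from the others: CV is the standard deviation divided by the mean, IQR is Q3 minus Q1, and the MAD values are consistent with the median absolute deviation scaled by 1.4826 (the default of base R's mad()). A small sketch on a made-up vector x, offered as an illustration rather than as the package's internal code:

```r
# Hypothetical sample (illustration only, not taken from Geral.csv)
x <- c(1, 2, 2, 3, 4, 4, 5, 9)

cv    <- sd(x) / mean(x)  # coefficient of variation
mad_x <- mad(x)           # median absolute deviation, scaled by 1.4826 by default
iqr_x <- IQR(x)           # third quartile minus first quartile

# The same ratio applied to the printed q1 column: Std.Dev / Mean
round(1.26 / 3.90, 2)

c(CV = cv, MAD = mad_x, IQR = iqr_x)
```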

1.1.4 Data Frame Summary Table

print(dfSummary(dados, graph.magnif = 0.75), method = 'render')

Data Frame Summary
dados
Dimensions: 101 x 10
Duplicates: 0
[Graph column: bar charts and histograms omitted]

No  Variable       Stats / Values                        Freqs (% of Valid)   Valid          Missing
1   q1 [integer]   Mean (sd) : 3.9 (1.3)                 1:  2 ( 2.0%)        101 (100.0%)   0 (0.0%)
                   min < med < max: 1 < 4 < 6            2: 13 (12.9%)
                   IQR (CV) : 2 (0.3)                    3: 26 (25.7%)
                                                         4: 20 (19.8%)
                                                         5: 32 (31.7%)
                                                         6:  8 ( 7.9%)
2   q2 [integer]   Mean (sd) : 2.2 (0.9)                 1: 22 (21.8%)        101 (100.0%)   0 (0.0%)
                   min < med < max: 1 < 2 < 4            2: 43 (42.6%)
                   IQR (CV) : 1 (0.4)                    3: 25 (24.8%)
                                                         4: 11 (10.9%)
3   q3 [integer]   Mean (sd) : 2.3 (0.8)                 1: 20 (19.8%)        101 (100.0%)   0 (0.0%)
                   min < med < max: 1 < 2 < 3            2: 35 (34.7%)
                   IQR (CV) : 1 (0.3)                    3: 46 (45.5%)
4   q4 [integer]   Mean (sd) : 23349.9 (70373.1)         100 distinct values  101 (100.0%)   0 (0.0%)
                   min < med < max: 351 < 5326 < 613011
                   IQR (CV) : 12043 (3)
5   q5 [integer]   Mean (sd) : 735 (799.3)               98 distinct values   101 (100.0%)   0 (0.0%)
                   min < med < max: 35 < 487 < 3717
                   IQR (CV) : 511 (1.1)
6   q6 [integer]   Mean (sd) : 1091.1 (2900.7)           86 distinct values   101 (100.0%)   0 (0.0%)
                   min < med < max: 0 < 156 < 22000
                   IQR (CV) : 556 (2.7)
7   q7 [integer]   Mean (sd) : 27.4 (140.7)              34 distinct values   101 (100.0%)   0 (0.0%)
                   min < med < max: 0 < 4 < 1400
                   IQR (CV) : 10 (5.1)
8   q9 [integer]   Min : 1                               1: 94 (93.1%)        101 (100.0%)   0 (0.0%)
                   Mean : 1.1                            2:  7 ( 6.9%)
                   Max : 2
9   q10 [integer]  Min : 1                               1: 42 (41.6%)        101 (100.0%)   0 (0.0%)
                   Mean : 1.6                            2: 59 (58.4%)
                   Max : 2
10  q11 [integer]  Min : 1                               1: 44 (43.6%)        101 (100.0%)   0 (0.0%)
                   Mean : 1.6                            2: 57 (56.4%)
                   Max : 2


2 The modelsummary Library

library(remotes)
library(modelsummary)

2.1 Combining Linear Regression and Logit Results

models <- list()
models[['RegLinear']] <- lm(mpg ~ factor(cyl), mtcars)
models[['Logit']] <- glm(am ~ factor(cyl), mtcars, family = binomial)

library(tibble)
## 
## Attaching package: 'tibble'
## The following object is masked from 'package:summarytools':
## 
##     view
rows <- tribble(~term,          ~RegLinear,  ~Logit,
                'Classe(cyl)4', '-',   '-',
                'Informação',         '???', 'XYZ')
attr(rows, 'position') <- c(3, 9)

modelsummary(models, add_rows = rows)

                RegLinear      Logit
(Intercept)        26.664      0.981
                  (0.972)    (0.677)
Classe(cyl)4
factor(cyl)6       -6.921     -1.269
                  (1.558)    (1.021)
factor(cyl)8      -11.564     -2.773
                  (1.299)    (1.021)
Num.Obs.               32         32
Informação            ???        XYZ
R2                  0.732
R2 Adj.             0.714
AIC                 170.6       39.9
BIC                 176.4       44.3
Log.Lik.          -81.282    -16.967
F                  39.698

2.2 Data Summary

load("dat.Rda")
# Note: despite its name, Small is TRUE for rows with above-median Pop1831
dat$Small <- dat$Pop1831 > median(dat$Pop1831)
datasummary_skim(dat)

                 Unique (#)  Missing (%)      Mean        SD      Min    Median       Max
X                        86            0      43.5      25.0      1.0      43.5      86.0
dept                     86            0      46.9      30.4      1.0      45.5     200.0
Crime_pers               85            0   19754.4    7504.7   2199.0   18748.5   37014.0
Crime_prop               86            0    7843.1    3051.4   1368.0    7595.0   20235.0
Literacy                 50            0      39.3      17.4     12.0      38.0      74.0
Donations                85            0    7075.5    5834.6   1246.0    5020.0   37015.0
Infants                  86            0   19049.9    8820.2   2660.0   17141.5   62486.0
Suicides                 86            0   36522.6   31312.5   3460.0   26743.5  163241.0
Wealth                   86            0      43.5      25.0      1.0      43.5      86.0
Commerce                 84            0      42.8      25.0      1.0      42.5      86.0
Clergy                   85            0      43.4      25.0      1.0      43.5      86.0
Crime_parents            86            0      43.5      25.0      1.0      43.5      86.0
Infanticide              81            0      43.5      24.9      1.0      43.5      86.0
Donation_clergy          86            0      43.5      25.0      1.0      43.5      86.0
Lottery                  86            0      43.5      25.0      1.0      43.5      86.0
Desertion                86            0      43.5      25.0      1.0      43.5      86.0
Instruction              82            0      43.1      24.8      1.0      41.5      86.0
Prostitutes              63            0     141.9     521.0      0.0      33.0    4744.0
Distance                 86            0     208.0     109.3      0.0     200.6     539.2
Area                     84            0    6147.0    1398.2    762.0    6070.5   10000.0
Pop1831                  86            0     378.6     148.8    129.1     346.2     989.9

2.3 Statistics by Subgroup

datasummary_balance(~Small, dat)
## Warning in sanitize_datasummary_balance_data(formula, data): These variables
## were omitted because they include more than 50 levels: Department.
## Warning in datasummary_balance(~Small, dat): Please install the `estimatr` package or set `dinm=FALSE` to
##              suppress this warning.

                            FALSE (N=43)           TRUE (N=43)
                              Mean  Std. Dev.       Mean  Std. Dev.
X                             41.4       27.4       45.6       22.3
dept                          46.0       36.6       47.7       23.0
Crime_pers                 18040.6     7638.4    21468.2     7044.3
Crime_prop                  8422.5     3406.7     7263.7     2559.3
Literacy                      37.9       19.1       40.6       15.6
Donations                   7258.5     6194.1     6892.6     5519.0
Infants                    20790.2     9363.5    17309.6     7973.0
Suicides                   42565.4    37074.1    30479.8    23130.9
Wealth                        51.0       23.9       36.0       23.9
Commerce                      42.7       24.6       43.0       25.7
Clergy                        39.1       26.7       47.7       22.7
Crime_parents                 54.2       25.2       32.8       19.9
Infanticide                   37.9       25.7       49.1       23.1
Donation_clergy               52.3       24.0       34.7       22.9
Lottery                       54.8       23.0       32.2       21.7
Desertion                     41.7       25.9       45.3       24.2
Instruction                   46.7       26.7       39.6       22.5
Prostitutes                   52.8       93.1      230.9      724.1
Distance                     228.7      116.7      187.2       98.4
Area                        5989.0     1142.8     6305.0     1612.4
Pop1831                      272.4       53.4      484.8      137.3

                                 N          %          N          %
Region           C              13       30.2          4        9.3
                 E               9       20.9          8       18.6
                 N               4        9.3         13       30.2
                 S              12       27.9          5       11.6
                 W               4        9.3         13       30.2
                 NA              1        2.3          0        0.0
MainCity         1:Sm           10       23.3          0        0.0
                 2:Med          33       76.7         33       76.7
                 3:Lg            0        0.0         10       23.3

2.4 Correlation Matrix

datasummary_correlation(dat)
X dept Crime_pers Crime_prop Literacy Donations Infants Suicides Wealth Commerce Clergy Crime_parents Infanticide Donation_clergy Lottery Desertion Instruction Prostitutes Distance Area Pop1831
X 1 . . . . . . . . . . . . . . . . . . . .
dept .92 1 . . . . . . . . . . . . . . . . . . .
Crime_pers -.12 -.20 1 . . . . . . . . . . . . . . . . . .
Crime_prop -.28 -.29 .27 1 . . . . . . . . . . . . . . . . .
Literacy .09 .10 -.04 -.37 1 . . . . . . . . . . . . . . . .
Donations .06 .27 -.04 -.13 -.13 1 . . . . . . . . . . . . . . .
Infants -.07 -.03 -.04 .27 -.41 .17 1 . . . . . . . . . . . . . .
Suicides -.18 -.15 -.13 .52 -.37 -.03 .29 1 . . . . . . . . . . . . .
Wealth -.08 -.08 -.12 .46 -.28 .08 .34 .42 1 . . . . . . . . . . . .
Commerce -.03 .05 .05 .41 -.58 .30 .39 .48 .48 1 . . . . . . . . . . .
Clergy .15 .06 .26 -.07 -.17 .09 -.06 -.32 -.11 -.12 1 . . . . . . . . . .
Crime_parents -.16 -.07 -.20 .36 -.20 -.02 .06 .35 .22 .18 -.18 1 . . . . . . . . .
Infanticide .01 -.07 .27 -.13 .32 -.15 -.24 -.08 -.22 -.28 -.01 -.09 1 . . . . . . . .
Donation_clergy -.08 .00 -.18 .30 -.38 .25 .10 .19 .34 .18 .30 .29 -.23 1 . . . . . . .
Lottery -.21 -.11 .00 .43 -.36 .15 .42 .49 .48 .45 -.28 .28 -.35 .36 1 . . . . . .
Desertion .10 .03 .33 -.26 .40 -.04 .00 -.47 -.23 -.36 .25 -.39 .11 -.41 -.30 1 . . . . .
Instruction -.10 -.11 .05 .39 -.98 .14 .43 .36 .31 .59 .21 .21 -.32 .40 .37 -.37 1 . . . .
Prostitutes .17 .14 -.05 -.33 .30 -.07 -.28 -.21 -.32 -.27 .20 -.03 .16 -.04 -.28 .04 -.26 1 . . .
Distance -.10 .04 -.51 .25 -.28 .08 .23 .41 .40 .38 -.31 .26 -.16 .28 .28 -.44 .24 -.37 1 . .
Area -.18 -.08 .22 .09 -.23 .18 .16 .00 .06 .18 .08 -.20 -.23 .02 .23 .04 .20 -.41 .06 1 .
Pop1831 .17 .09 .27 -.26 .09 .00 -.23 -.17 -.31 -.05 .29 -.40 .34 -.22 -.47 .11 -.11 .48 -.37 -.01 1

2.5 Nested Statistics

datasummary(Literacy + Commerce ~ Small * (mean + sd), dat)

               FALSE            TRUE
              mean      sd    mean      sd
Literacy     37.88   19.08   40.63   15.57
Commerce     42.65   24.59   42.95   25.75
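
In the formula, Small * (mean + sd) nests the two statistics inside each level of Small, which is why mean and sd appear once under FALSE and once under TRUE. As a rough illustration, the same grouping can be sketched in base R with split() and sapply(). The data frame df below is made up and only stands in for dat.

```r
# Hypothetical data frame standing in for dat (values are made up)
df <- data.frame(
  Literacy = c(20, 30, 40, 50, 60, 70),
  Small    = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)
)

# Split the variable by group, then compute both statistics per group.
# The result is a matrix: rows are statistics, columns are the groups.
stats <- sapply(split(df$Literacy, df$Small),
                function(v) c(mean = mean(v), sd = sd(v)))
stats
```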

2.6 Linear Regression Summary

mod <- lm(Donations ~ Crime_prop, data = dat)
modelsummary(mod)

                   Model 1
(Intercept)       9065.287
                (1738.926)
Crime_prop          -0.254
                   (0.207)
Num.Obs.                86
R2                   0.018
R2 Adj.              0.006
AIC                 1739.0
BIC                 1746.4
Log.Lik.          -866.516
F                    1.505
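
By default, modelsummary prints each coefficient estimate with its standard error in parentheses underneath. The sketch below pulls the same quantities out of a fitted lm object with base R. Because dat.Rda is not available here, it uses the mtcars model from section 2.1 as a stand-in, so the numbers can be compared with the RegLinear column there.

```r
# Same formula as the RegLinear model in section 2.1 (mtcars ships with R)
fit <- lm(mpg ~ factor(cyl), data = mtcars)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- summary(fit)$coefficients
round(coef_table[, c("Estimate", "Std. Error")], 3)

# Goodness-of-fit figures of the kind shown in the lower part of the table
round(summary(fit)$r.squared, 3)
round(AIC(fit), 1)
```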

2.7 Comparing Different Regressions

modelo <- list(
  "RegLinear 1" = lm(Donations ~ Literacy + Clergy, data = dat),
  "Poisson 1"   = glm(Donations ~ Literacy + Commerce, family = poisson, data = dat),
  "RegLinear 2" = lm(Crime_pers ~ Literacy + Clergy, data = dat),
  "Poisson 2"   = glm(Crime_pers ~ Literacy + Commerce, family = poisson, data = dat),
  "RegLinear 3" = lm(Crime_prop ~ Literacy + Clergy, data = dat)
)

modelsummary(modelo)

               RegLinear 1    Poisson 1  RegLinear 2    Poisson 2  RegLinear 3
(Intercept)       7948.667        8.241    16259.384        9.876    11243.544
                (2078.276)      (0.006)   (2611.140)      (0.003)   (1011.240)
Literacy           -39.121        0.003        3.680        0.000      -68.507
                  (37.052)      (0.000)     (46.552)      (0.000)     (18.029)
Clergy              15.257                    77.148                   -16.376
                  (25.735)                  (32.334)                  (12.522)
Commerce                          0.011                     0.001
                                (0.000)                   (0.000)
Num.Obs.                86           86           86           86           86
R2                   0.020                     0.065                     0.152
R2 Adj.             -0.003                     0.043                     0.132
AIC                 1740.8     274160.8       1780.0     257564.4       1616.9
BIC                 1750.6     274168.2       1789.9     257571.7       1626.7
Log.Lik.          -866.392  -137077.401     -886.021  -128779.186     -804.441
F                    0.866                     2.903                     7.441

modelsummary(modelo, fmt = 4)

               RegLinear 1    Poisson 1  RegLinear 2    Poisson 2  RegLinear 3
(Intercept)      7948.6672       8.2410   16259.3843       9.8758   11243.5443
               (2078.2761)     (0.0058)  (2611.1396)     (0.0034)  (1011.2402)
Literacy          -39.1209       0.0030       3.6800      -0.0003     -68.5065
                 (37.0520)     (0.0001)    (46.5521)     (0.0001)    (18.0286)
Clergy             15.2567                   77.1481                  -16.3758
                 (25.7354)                 (32.3339)                 (12.5222)
Commerce                         0.0111                    0.0006
                               (0.0001)                  (0.0000)
Num.Obs.                86           86           86           86           86
R2                   0.020                     0.065                     0.152
R2 Adj.             -0.003                     0.043                     0.132
AIC                 1740.8     274160.8       1780.0     257564.4       1616.9
BIC                 1750.6     274168.2       1789.9     257571.7       1626.7
Log.Lik.          -866.392  -137077.401     -886.021  -128779.186     -804.441
F                    0.866                     2.903                     7.441

modelsummary(modelo, fmt = "%.4f")

               RegLinear 1    Poisson 1  RegLinear 2    Poisson 2  RegLinear 3
(Intercept)      7948.6672       8.2410   16259.3843       9.8758   11243.5443
               (2078.2761)     (0.0058)  (2611.1396)     (0.0034)  (1011.2402)
Literacy          -39.1209       0.0030       3.6800      -0.0003     -68.5065
                 (37.0520)     (0.0001)    (46.5521)     (0.0001)    (18.0286)
Clergy             15.2567                   77.1481                  -16.3758
                 (25.7354)                 (32.3339)                 (12.5222)
Commerce                         0.0111                    0.0006
                               (0.0001)                  (0.0000)
Num.Obs.                86           86           86           86           86
R2                   0.020                     0.065                     0.152
R2 Adj.             -0.003                     0.043                     0.132
AIC                 1740.8     274160.8       1780.0     257564.4       1616.9
BIC                 1750.6     274168.2       1789.9     257571.7       1626.7
Log.Lik.          -866.392  -137077.401     -886.021  -128779.186     -804.441
F                    0.866                     2.903                     7.441
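
The last two calls give the same table because an integer fmt is read as a number of decimal places, while a character fmt is treated as a sprintf-style format string. The equivalence of the two notations can be checked in base R. The values in est are made-up stand-ins for raw estimates, not numbers extracted from the models.

```r
# Hypothetical raw estimates (illustration only)
est <- c(7948.66724, 0.00581, -0.00031)

by_digits <- formatC(est, digits = 4, format = "f")  # four decimal places
by_string <- sprintf("%.4f", est)                    # sprintf-style format string

by_digits
identical(by_digits, by_string)
```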