Based on this information, a descriptive analysis of the variables is carried out, together with the tools used for that interpretation.
The dataset used is associated with the articles cited below:
- I-Cheng Yeh, “Modeling of strength of high performance concrete using artificial neural networks,” Cement and Concrete Research, Vol. 28, No. 12, pp. 1797-1808 (1998).
- I-Cheng Yeh, “Modeling Concrete Strength with Augment-Neuron Networks,” J. of Materials in Civil Engineering, ASCE, Vol. 10, No. 4, pp. 263-268 (1998).
- I-Cheng Yeh, “Design of High Performance Concrete Mixture Using Neural Networks,” J. of Computing in Civil Engineering, ASCE, Vol. 13, No. 1, pp. 36-42 (1999).
- I-Cheng Yeh, “Prediction of Strength of Fly Ash and Slag Concrete By The Use of Artificial Neural Networks,” Journal of the Chinese Institute of Civil and Hydraulic Engineering, Vol. 15, No. 4, pp. 659-663 (2003).
- I-Cheng Yeh, “A mix Proportioning Methodology for Fly Ash and Slag Concrete Using Artificial Neural Networks,” Chung Hua Journal of Science and Engineering, Vol. 1, No. 1, pp. 77-84 (2003).
- I-Cheng Yeh, “Analysis of strength of concrete using design of experiments and neural networks,” Journal of Materials in Civil Engineering, ASCE, Vol. 18, No. 4, pp. 597-604 (2006).
To download the dataset:
http://archive.ics.uci.edu/ml/datasets/concrete+compressive+strength
The variables in this dataset are as follows:
library(knitr)
library(data.table)
library(lattice)
library(ggplot2)
library(bpca)
library(Amelia)
library(corrplot)
library(factoextra)
library(dplyr)
library(car)
library(caret)
library(Metrics)
library(MLmetrics)
library(tidyverse)
library(kableExtra)
library(tibble)
library(gridExtra)
library(plot3D)
library(FactoMineR)
library(GGally)
library(PerformanceAnalytics)
library(cowplot)
| cement | blast_furnace_slag | fly_ash | water | superplasticizer | coarse_aggregate | fine_aggregate | age | CCS | w.c | s.c |
|---|---|---|---|---|---|---|---|---|---|---|
| 540.0 | 0.0 | 0 | 162 | 2.5 | 1040.0 | 676.0 | 28 | 79.99 | 0.3000000 | 1.251852 |
| 540.0 | 0.0 | 0 | 162 | 2.5 | 1055.0 | 676.0 | 28 | 61.89 | 0.3000000 | 1.251852 |
| 332.5 | 142.5 | 0 | 228 | 0.0 | 932.0 | 594.0 | 270 | 40.27 | 0.6857143 | 1.786466 |
| 332.5 | 142.5 | 0 | 228 | 0.0 | 932.0 | 594.0 | 365 | 41.05 | 0.6857143 | 1.786466 |
| 198.6 | 132.4 | 0 | 192 | 0.0 | 978.4 | 825.5 | 360 | 44.30 | 0.9667674 | 4.156596 |
| 266.0 | 114.0 | 0 | 228 | 0.0 | 932.0 | 670.0 | 90 | 47.03 | 0.8571429 | 2.518797 |
| 380.0 | 95.0 | 0 | 228 | 0.0 | 932.0 | 594.0 | 365 | 43.70 | 0.6000000 | 1.563158 |
| 380.0 | 95.0 | 0 | 228 | 0.0 | 932.0 | 594.0 | 28 | 36.45 | 0.6000000 | 1.563158 |
| 266.0 | 114.0 | 0 | 228 | 0.0 | 932.0 | 670.0 | 28 | 45.85 | 0.8571429 | 2.518797 |
| 475.0 | 0.0 | 0 | 228 | 0.0 | 932.0 | 594.0 | 28 | 39.29 | 0.4800000 | 1.250526 |
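The last two columns are not described in the text. Judging from the rows shown, `w.c` matches water divided by cement and `s.c` matches fine aggregate divided by cement. This is an inference from the displayed values, not something the source states. A minimal sketch that reproduces the first and third rows under that assumption:

```r
# Assumption: w.c = water / cement and s.c = fine_aggregate / cement,
# inferred from the rows displayed above (not stated in the source).
mix <- data.frame(
  cement         = c(540.0, 332.5),
  water          = c(162, 228),
  fine_aggregate = c(676.0, 594.0)
)
mix$w.c <- mix$water / mix$cement
mix$s.c <- mix$fine_aggregate / mix$cement
print(mix)
```

If the ratios were computed differently in the original pipeline, only the two assignment lines would need to change.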
The univariate statistics are presented below.
summary(dados_expl)
## cement blast_furnace_slag fly_ash water
## Min. :102.0 Min. : 0.0 Min. : 0.00 Min. :121.8
## 1st Qu.:190.7 1st Qu.: 0.0 1st Qu.: 0.00 1st Qu.:165.6
## Median :273.0 Median : 24.0 Median : 0.00 Median :185.7
## Mean :280.4 Mean : 74.0 Mean : 57.25 Mean :182.3
## 3rd Qu.:350.0 3rd Qu.:142.5 3rd Qu.:118.60 3rd Qu.:193.0
## Max. :540.0 Max. :359.4 Max. :260.00 Max. :252.5
## NA's :211 NA's :231 NA's :234 NA's :211
## superplasticizer coarse_aggregate fine_aggregate age
## Min. : 0.000 Min. : 708.0 Min. :594.0 Min. : 1.00
## 1st Qu.: 0.000 1st Qu.: 931.2 1st Qu.:724.3 1st Qu.: 14.00
## Median : 6.500 Median : 967.1 Median :778.5 Median : 28.00
## Mean : 6.274 Mean : 969.8 Mean :772.3 Mean : 43.88
## 3rd Qu.:10.200 3rd Qu.:1028.4 3rd Qu.:822.0 3rd Qu.: 28.00
## Max. :32.200 Max. :1145.0 Max. :992.6 Max. :365.00
## NA's :229 NA's :211 NA's :211
## CCS w.c s.c
## Min. : 1.50 Min. :0.2600 Min. :0.000
## 1st Qu.:23.52 1st Qu.:0.7178 1st Qu.:2.084
## Median :33.73 Median :1.1060 Median :2.782
## Mean :35.08 Mean :1.2381 Mean :3.067
## 3rd Qu.:44.56 3rd Qu.:1.6599 3rd Qu.:4.031
## Max. :82.60 Max. :3.7468 Max. :9.235
##
Merging several datasets into a single file may produce duplicated records and missing values (NAs).
The following steps are taken for duplicated records:
# number of observations initially loaded
nrow(dados_expl)
## [1] 3427
# remove duplicated observations
dados_expl_1 <- unique(dados_expl)
# final number of observations without duplicates
nrow(dados_expl_1)
## [1] 2656
print(paste0(nrow(dados_expl) - nrow(dados_expl_1), " duplicated observations were removed."))
## [1] "771 duplicated observations were removed."
print(paste0("The new dataset (dados_expl_1) contains ", nrow(dados_expl_1), " observations."))
## [1] "The new dataset (dados_expl_1) contains 2656 observations."
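The deduplication step above relies on `unique()`. As a small illustration on hypothetical toy values, `duplicated()` flags the same rows explicitly, which makes it easy to count or inspect them before dropping:

```r
# Toy data frame (hypothetical values) with one fully repeated row.
toy <- data.frame(cement = c(540, 540, 332.5),
                  CCS    = c(79.99, 79.99, 40.27))
is_dup     <- duplicated(toy)   # TRUE for the second and later copies of a row
n_removed  <- sum(is_dup)       # how many rows would be dropped
toy_unique <- toy[!is_dup, ]    # the same rows that unique(toy) keeps
print(n_removed)
```

Only exact matches across all columns are flagged, so two mixes that differ in a single measurement are both kept.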
For missing observations, the following steps are taken:
# Missing-value map (univariate).
missmap(dados_expl, col=c("black", "grey"), legend=FALSE)
NAs <- round(colSums(is.na(dados_expl_1))*100/nrow(dados_expl_1), 2)
NAs
## cement blast_furnace_slag fly_ash water
## 7.64 8.40 8.51 7.64
## superplasticizer coarse_aggregate fine_aggregate age
## 8.32 7.64 7.64 0.00
## CCS w.c s.c
## 0.00 0.00 0.00
dados_expl_2 <- dados_expl_1[!is.na(dados_expl_1$fly_ash),]
NAs <- round(colSums(is.na(dados_expl_2))*100/nrow(dados_expl_2), 2)
NAs
## cement blast_furnace_slag fly_ash water
## 0 0 0 0
## superplasticizer coarse_aggregate fine_aggregate age
## 0 0 0 0
## CCS w.c s.c
## 0 0 0
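In the output above, filtering on `fly_ash` alone happened to clear the missing values in every other column as well. A hedged sketch, on hypothetical toy values, of a filter that does not depend on that coincidence: `complete.cases()` keeps only rows with no NA in any column.

```r
# Toy data (hypothetical values) with NAs spread over two columns.
toy <- data.frame(
  cement  = c(540, NA, 332.5, 198.6),
  fly_ash = c(0, NA, NA, 0),
  CCS     = c(79.99, 61.89, 40.27, 44.30)
)
na_pct       <- round(colMeans(is.na(toy)) * 100, 2)  # percent of NAs per column
toy_complete <- toy[complete.cases(toy), ]            # rows with no NA anywhere
print(na_pct)
```

Here `colMeans(is.na(...)) * 100` gives the same percentages as the `colSums(...) * 100 / nrow(...)` expression used in the text.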
# Missing-value map (univariate).
missmap(dados_expl_2, col=c("black", "grey"), legend=FALSE)
glimpse(dados_expl_2)
## Rows: 2,430
## Columns: 11
## $ cement <dbl> 540.0, 540.0, 332.5, 332.5, 198.6, 266.0, 380.0, 38~
## $ blast_furnace_slag <dbl> 0.0, 0.0, 142.5, 142.5, 132.4, 114.0, 95.0, 95.0, 1~
## $ fly_ash <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ~
## $ water <dbl> 162, 162, 228, 228, 192, 228, 228, 228, 228, 228, 1~
## $ superplasticizer <dbl> 2.5, 2.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0~
## $ coarse_aggregate <dbl> 1040.0, 1055.0, 932.0, 932.0, 978.4, 932.0, 932.0, ~
## $ fine_aggregate <dbl> 676.0, 676.0, 594.0, 594.0, 825.5, 670.0, 594.0, 59~
## $ age <dbl> 28, 28, 270, 365, 360, 90, 365, 28, 28, 28, 90, 28,~
## $ CCS <dbl> 79.99, 61.89, 40.27, 41.05, 44.30, 47.03, 43.70, 36~
## $ w.c <dbl> 0.3000000, 0.3000000, 0.6857143, 0.6857143, 0.96676~
## $ s.c <dbl> 1.251852, 1.251852, 1.786466, 1.786466, 4.156596, 2~
summary(dados_expl_2)
## cement blast_furnace_slag fly_ash water
## Min. :102.0 Min. : 0.00 Min. : 0.00 Min. :121.8
## 1st Qu.:190.3 1st Qu.: 0.00 1st Qu.: 0.00 1st Qu.:164.0
## Median :252.0 Median : 19.00 Median : 81.90 Median :182.9
## Mean :273.7 Mean : 65.05 Mean : 67.53 Mean :181.0
## 3rd Qu.:337.9 3rd Qu.:129.80 3rd Qu.:123.00 3rd Qu.:192.0
## Max. :540.0 Max. :359.40 Max. :260.00 Max. :247.0
## superplasticizer coarse_aggregate fine_aggregate age
## Min. : 0.000 Min. : 708.0 Min. :594.0 Min. : 1.00
## 1st Qu.: 0.000 1st Qu.: 932.0 1st Qu.:737.0 1st Qu.: 14.00
## Median : 6.860 Median : 971.8 Median :780.1 Median : 28.00
## Mean : 6.452 Mean : 974.5 Mean :775.5 Mean : 43.96
## 3rd Qu.:10.100 3rd Qu.:1040.0 3rd Qu.:824.0 3rd Qu.: 56.00
## Max. :32.200 Max. :1145.0 Max. :992.6 Max. :365.00
## CCS w.c s.c
## Min. : 2.33 Min. :0.2669 Min. :1.135
## 1st Qu.:24.00 1st Qu.:0.7282 1st Qu.:2.231
## Median :34.34 Median :1.0996 Median :3.016
## Mean :35.47 Mean :1.2055 Mean :3.272
## 3rd Qu.:45.03 3rd Qu.:1.5376 3rd Qu.:4.217
## Max. :82.60 Max. :3.7468 Max. :9.235
# Standardize the variables (zero mean, unit variance)
dado.padronizado <- scale(dados_expl_2)
# Create the clusters
concreto.k2 <- kmeans(dado.padronizado, centers = 2)
concreto.k3 <- kmeans(dado.padronizado, centers = 3)
concreto.k4 <- kmeans(dado.padronizado, centers = 4)
concreto.k5 <- kmeans(dado.padronizado, centers = 5)
concreto.k6 <- kmeans(dado.padronizado, centers = 6)
concreto.k7 <- kmeans(dado.padronizado, centers = 7)
# Create a cluster plot for each candidate number of clusters
G1 <- fviz_cluster(concreto.k2, geom = "point", data = dado.padronizado) + ggtitle("k = 2")
G2 <- fviz_cluster(concreto.k3, geom = "point", data = dado.padronizado) + ggtitle("k = 3")
G3 <- fviz_cluster(concreto.k4, geom = "point", data = dado.padronizado) + ggtitle("k = 4")
G4 <- fviz_cluster(concreto.k5, geom = "point", data = dado.padronizado) + ggtitle("k = 5")
G5 <- fviz_cluster(concreto.k6, geom = "point", data = dado.padronizado) + ggtitle("k = 6")
G6 <- fviz_cluster(concreto.k7, geom = "point", data = dado.padronizado) + ggtitle("k = 7")
# Print the plots on the same screen
grid.arrange(G1, G2, G3, G4, G5, G6, ncol = 2)
# CHECKING ELBOW AND SILHOUETTE
fviz_nbclust(dado.padronizado, kmeans, method = "wss")
fviz_nbclust(dado.padronizado, kmeans, method = "silhouette")
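The `kmeans()` calls above start from random centers, so cluster labels and sizes can change between runs. A minimal sketch, on hypothetical simulated data, of making the fit repeatable with `set.seed()` and several random starts (`nstart`), and of computing by hand the total within-cluster sum of squares that an elbow plot displays:

```r
set.seed(123)
# Hypothetical toy data: two well-separated groups in two standardized variables.
x <- scale(rbind(
  matrix(rnorm(100, mean = 0), ncol = 2),
  matrix(rnorm(100, mean = 5), ncol = 2)
))
# Total within-cluster sum of squares for k = 1..6.
wss <- sapply(1:6, function(k) {
  set.seed(123)  # same seed, so the random starts are the same on every run
  kmeans(x, centers = k, nstart = 25)$tot.withinss
})
print(round(wss, 2))
```

With one cluster the value equals the total sum of squares of the standardized data; the "elbow" is the k after which further drops become small.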
# Following the recommendation of the "silhouette" plot, 7 clusters are adopted.
dados_expl_2$cluster <- concreto.k7$cluster
# Viewing the dataset
kable(dados_expl_2[1:10,]) %>%
kable_styling(bootstrap_options = "striped",
full_width = T,
font_size = 12)
| cement | blast_furnace_slag | fly_ash | water | superplasticizer | coarse_aggregate | fine_aggregate | age | CCS | w.c | s.c | cluster |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 540.0 | 0.0 | 0 | 162 | 2.5 | 1040.0 | 676.0 | 28 | 79.99 | 0.3000000 | 1.251852 | 7 |
| 540.0 | 0.0 | 0 | 162 | 2.5 | 1055.0 | 676.0 | 28 | 61.89 | 0.3000000 | 1.251852 | 1 |
| 332.5 | 142.5 | 0 | 228 | 0.0 | 932.0 | 594.0 | 270 | 40.27 | 0.6857143 | 1.786466 | 2 |
| 332.5 | 142.5 | 0 | 228 | 0.0 | 932.0 | 594.0 | 365 | 41.05 | 0.6857143 | 1.786466 | 2 |
| 198.6 | 132.4 | 0 | 192 | 0.0 | 978.4 | 825.5 | 360 | 44.30 | 0.9667674 | 4.156596 | 2 |
| 266.0 | 114.0 | 0 | 228 | 0.0 | 932.0 | 670.0 | 90 | 47.03 | 0.8571429 | 2.518797 | 4 |
| 380.0 | 95.0 | 0 | 228 | 0.0 | 932.0 | 594.0 | 365 | 43.70 | 0.6000000 | 1.563158 | 2 |
| 380.0 | 95.0 | 0 | 228 | 0.0 | 932.0 | 594.0 | 28 | 36.45 | 0.6000000 | 1.563158 | 4 |
| 266.0 | 114.0 | 0 | 228 | 0.0 | 932.0 | 670.0 | 28 | 45.85 | 0.8571429 | 2.518797 | 4 |
| 475.0 | 0.0 | 0 | 228 | 0.0 | 932.0 | 594.0 | 28 | 39.29 | 0.4800000 | 1.250526 | 1 |
## the cluster also serves as a categorical variable for the exploratory analysis
## First, the correlations between the variables are checked
# The ggpairs() function from the GGally package shows the distributions of the variables,
# scatter plots, correlation values and their significance levels
ggpairs(dados_expl_2[,1:8])
# The chart.Correlation() function from the PerformanceAnalytics package also shows the
# distributions of the variables, scatter plots, correlation values and their
# significance levels
chart.Correlation((dados_expl_2[,1:8]), histogram = TRUE)
correlacao <- cor(dados_expl_2[,1:8])
cores <- colorRampPalette(c("red", "white", "blue"))
corrplot(correlacao, order="AOE", method="square", col=cores(20), tl.srt=45, tl.cex=0.75, tl.col="black")
corrplot(correlacao, add=TRUE, type="lower", method="number", order="AOE", col="black", diag=FALSE, tl.pos="n", cl.pos="n", number.cex=0.75)
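A correlation plot is easiest to read alongside a ranked list of the strongest pairs. The sketch below shows one way to extract such a list from any correlation matrix. It uses the built-in `mtcars` data purely as a stand-in, and the name `pair_tab` is hypothetical.

```r
# Built-in mtcars data used only as a stand-in for the concrete mixtures.
cm  <- cor(mtcars[, c("mpg", "disp", "hp", "wt")])
idx <- which(upper.tri(cm), arr.ind = TRUE)   # each pair once, no diagonal
pair_tab <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)
pair_tab <- pair_tab[order(-abs(pair_tab$r)), ]  # strongest pairs first
print(pair_tab)
```

Applied to the `correlacao` matrix from the text, the same steps would list the eight mixture variables pair by pair.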
# selecting the variables for the PCA
dados_expl_3 <- dados_expl_2[c(1,2,3,4,5,6,7,8)]
plot(bpca(dados_expl_3))
res.pca <- PCA(dados_expl_3, graph = FALSE)
# Extract eigenvalues/variances
get_eig(res.pca)
## eigenvalue variance.percent cumulative.variance.percent
## Dim.1 2.27634802 28.4543502 28.45435
## Dim.2 1.39429097 17.4286371 45.88299
## Dim.3 1.27749929 15.9687412 61.85173
## Dim.4 1.01180933 12.6476166 74.49935
## Dim.5 0.96983672 12.1229590 86.62230
## Dim.6 0.83664602 10.4580753 97.08038
## Dim.7 0.20157156 2.5196446 99.60002
## Dim.8 0.03199809 0.3999761 100.00000
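The percentage columns in the table above follow directly from the eigenvalues: each eigenvalue is divided by their sum, which equals the number of standardized variables (8). The sketch below recomputes them from the printed values. The eigenvalue-greater-than-one count is a common rule of thumb (the Kaiser criterion) added here for illustration; the source does not invoke it.

```r
# Eigenvalues copied from the get_eig() output above.
eig <- c(2.27634802, 1.39429097, 1.27749929, 1.01180933,
         0.96983672, 0.83664602, 0.20157156, 0.03199809)
variance_percent   <- eig / sum(eig) * 100
cumulative_percent <- cumsum(variance_percent)
n_kaiser <- sum(eig > 1)  # components with eigenvalue > 1 (rule of thumb)
print(round(cbind(eig, variance_percent, cumulative_percent), 3))
```

Under that rule of thumb the first four components would be retained, covering about 74.5% of the variance.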
# Visualize eigenvalues/variances
fviz_screeplot(res.pca, addlabels = TRUE, ylim = c(0, 50))
# Extract the results for variables
var <- get_pca_var(res.pca)
var
## Principal Component Analysis Results for variables
## ===================================================
## Name Description
## 1 "$coord" "Coordinates for the variables"
## 2 "$cor" "Correlations between variables and dimensions"
## 3 "$cos2" "Cos2 for the variables"
## 4 "$contrib" "contributions of the variables"
# Coordinates of variables
head(var$coord)
## Dim.1 Dim.2 Dim.3 Dim.4 Dim.5
## cement -0.2305701 0.4252639 0.8399345852 -0.01944138 0.16221684
## blast_furnace_slag -0.3462523 -0.7686563 0.0005870709 -0.44748003 -0.06080994
## fly_ash 0.6315563 -0.1298539 -0.3752186984 0.36231207 0.52032135
## water -0.8132067 -0.1619217 -0.1953682521 0.35891444 0.05495657
## superplasticizer 0.7203124 -0.2863339 0.4143713141 -0.04263821 0.28008634
## coarse_aggregate 0.0876480 0.6628116 -0.4694115845 -0.53926663 0.08908167
# Contribution of variables
head(var$contrib)
## Dim.1 Dim.2 Dim.3 Dim.4 Dim.5
## cement 2.3354326 12.970707 5.522431e+01 0.03735558 2.7132715
## blast_furnace_slag 5.2667973 42.375126 2.697866e-05 19.79012962 0.3812857
## fly_ash 17.5220755 1.209363 1.102068e+01 12.97379218 27.9154521
## water 29.0511446 1.880428 2.987771e+00 12.73160552 0.3114157
## superplasticizer 22.7930828 5.880201 1.344060e+01 0.17967976 8.0888211
## coarse_aggregate 0.3374779 31.508430 1.724833e+01 28.74143262 0.8182351
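The contribution values can be checked against the coordinates printed earlier: a variable's contribution to a component is its squared coordinate (cos2) divided by the component's eigenvalue, expressed as a percentage. A short sketch reproducing the Dim.1 column from the numbers above:

```r
# Dim.1 coordinates and eigenvalue copied from the outputs above.
coord_dim1 <- c(cement = -0.2305701, blast_furnace_slag = -0.3462523,
                fly_ash = 0.6315563, water = -0.8132067,
                superplasticizer = 0.7203124, coarse_aggregate = 0.0876480)
eig_dim1 <- 2.27634802
cos2_dim1    <- coord_dim1^2                # quality of representation
contrib_dim1 <- cos2_dim1 / eig_dim1 * 100  # contribution in percent
print(round(contrib_dim1, 4))
```

Water, superplasticizer and fly ash together account for roughly 69% of the first component in this partial listing.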
# Graph of variables: default plot
fviz_pca_var(res.pca, col.var = "black")
# Control variable colors using their contributions
fviz_pca_var(res.pca, col.var="contrib",
gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
repel = TRUE # Avoid text overlapping
)
# Contributions of variables to PC1
fviz_contrib(res.pca, choice = "var", axes = 1, top = 8)
# Contributions of variables to PC2
fviz_contrib(res.pca, choice = "var", axes = 2, top = 8)
# Contributions of variables to PC3
fviz_contrib(res.pca, choice = "var", axes = 3, top = 8)
# Contributions of variables to PC4
fviz_contrib(res.pca, choice = "var", axes = 4, top = 8)
# Contributions of variables to PC5
fviz_contrib(res.pca, choice = "var", axes = 5, top = 8)
fviz_pca_ind(res.pca, col.ind = "cos2",
gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
repel = TRUE # Avoid text overlapping (slow if many points)
)
## Warning: ggrepel: 2398 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
# Biplot of individuals and variables
fviz_pca_biplot(res.pca, repel = TRUE)
## Warning: ggrepel: 2389 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
## for comparison, the relationships are shown again after the PCA treatment: cos2 (squared variable-component correlations) and variable contributions
corrplot(var$cos2, is.corr=FALSE)
corrplot(var$contrib, is.corr=FALSE)
Analysis of the independent variables in relation to the dependent variable.
plot_grid(
ggplot(data = dados_expl_2, aes(x = CCS, y = cement))+
geom_point()+
geom_smooth(method = "loess")+
labs(x = "Compressive strength", y = "Cement"),
ggplot(data = dados_expl_2, aes(x = CCS, y = water))+
geom_point()+
geom_smooth(method = "loess")+
labs(x = "Compressive strength", y = "Water"),
ggplot(data = dados_expl_2, aes(x = CCS, y = blast_furnace_slag))+
geom_point()+
geom_smooth(method = "loess")+
labs(x = "Compressive strength", y = "Blast furnace slag"),
ggplot(data = dados_expl_2, aes(x = CCS, y = fly_ash))+
geom_point()+
geom_smooth(method = "loess")+
labs(x = "Compressive strength", y = "Fly ash"),
ggplot(data = dados_expl_2, aes(x = CCS, y = superplasticizer))+
geom_point()+
geom_smooth(method = "loess")+
labs(x = "Compressive strength", y = "Superplasticizer"),
ggplot(data = dados_expl_2, aes(x = CCS, y = coarse_aggregate))+
geom_point()+
geom_smooth(method = "loess")+
labs(x = "Compressive strength", y = "Coarse aggregate (crushed stone)"),
ggplot(data = dados_expl_2, aes(x = CCS, y = fine_aggregate))+
geom_point()+
geom_smooth(method = "loess")+
labs(x = "Compressive strength", y = "Fine aggregate (sand)"),
ggplot(data = dados_expl_2, aes(x = CCS, y = age))+
geom_point()+
geom_smooth(method = "loess")+
labs(x = "Compressive strength", y = "Age"),
nrow = 3,
label_x = 0, label_y = 0,
hjust = -0.5, vjust = -0.5)
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## note that some plots show null (zero) values;
# these must be excluded at analysis time, because those components were not used in the concrete mix
### THIS ITEM SHOULD BE ADDED IN THE FUTURE
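The note above leaves the zero-value filtering as future work. One possible way to do it, sketched on hypothetical toy values and not taken from the source, is to subset each component to the mixes that actually contain it before plotting:

```r
# Hypothetical toy rows; a zero means the component was absent from the mix.
toy <- data.frame(
  fly_ash = c(0, 0, 118.6, 81.9, 0, 123.0),
  CCS     = c(79.99, 61.89, 40.27, 41.05, 44.30, 47.03)
)
used         <- subset(toy, fly_ash > 0)  # keep only mixes that contain fly ash
share_unused <- mean(toy$fly_ash == 0)    # fraction of mixes without it
print(used)
```

Because each component has its own set of zero rows, the filter would be applied per plot rather than once to the whole dataset.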
The plots below are generated from the clusters.
layout(matrix(1:6, ncol = 3))
boxplot(data = dados_expl_2, water ~ cluster, main = "Water")
boxplot(data = dados_expl_2, cement ~ cluster, main = "Cement")
boxplot(data = dados_expl_2, fine_aggregate ~ cluster, main = "Sand")
boxplot(data = dados_expl_2, coarse_aggregate ~ cluster, main = "Coarse aggregate")
boxplot(data = dados_expl_2, age ~ cluster, main = "Age")
boxplot(data = dados_expl_2, CCS ~ cluster, main = "Compressive strength")
layout(matrix(1:1, ncol = 1))
The plots below are generated by combining the variables in long format.
library(data.table)
## converting the variables to long format
long <- melt(setDT(dados_expl_2[,c(1,4,6,7,8,12)]), id.vars = c("cluster"), variable.name = c("componente"))
# Basic box plot - the red point marks the mean
ggplot(long, aes(x = componente, y = value,
color = componente)) +
geom_boxplot(width = .95, outlier.colour = NA, coef = 100, alpha=0.7) +
stat_summary(fun=mean, geom="point", shape=20, size=5, color="red", fill="red", alpha=0.9)
# Basic box plot - showing the distribution of the data for each variable
ggplot(long, aes(x = componente, y = value,
color = componente)) +
geom_boxplot(width = .95, outlier.colour = NA, coef = 100, alpha=0.7) +
geom_jitter(
width = .05,
alpha = .9,
size = 1,
color = "orange")
# Basic box plot - showing the outliers
ggplot(long, aes(x = componente, y = value,
color = componente)) +
geom_boxplot()
# Basic box plot - analysing the components within each cluster
ggplot(long, aes(x = componente, y = value,
color = componente )) +
geom_boxplot() +
facet_wrap(~cluster, scales="free")
# Basic box plot - each variable individually
# cluster is converted to a factor so that one box is drawn per cluster
ggplot(long, aes(x = factor(cluster), y = value,
color = factor(cluster))) +
geom_boxplot() +
facet_wrap(~componente, scales="free") +
labs(x = "cluster", color = "cluster") +
ggtitle("Components by cluster")
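The long format used for the box plots above stacks the selected columns into one `value` column, with a `componente` column recording where each value came from. A base-R sketch on hypothetical toy values shows the reshaping that `melt()` performs (the object names `wide` and `long_base` are illustrative only):

```r
# Toy wide table (hypothetical values): one row per mix, one column per component.
wide <- data.frame(cluster = c(1, 2),
                   cement  = c(540, 332.5),
                   water   = c(162, 228))
vars <- c("cement", "water")
long_base <- data.frame(
  cluster    = rep(wide$cluster, times = length(vars)),         # id repeated per variable
  componente = factor(rep(vars, each = nrow(wide)), levels = vars),
  value      = unlist(wide[vars], use.names = FALSE)            # columns stacked in order
)
print(long_base)
```

Each original row appears once per melted variable, so a table with n rows and p melted columns yields n × p long rows.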
The plots below are generated by combining the two models, with the variables taken together.
layout(matrix(1:2, ncol = 1))
## Frequency
hist(dados_expl_2$CCS, freq = TRUE, labels = TRUE)
## Density
hist(dados_expl_2$CCS, freq = FALSE, labels = TRUE)
layout(matrix(1:1, ncol = 1))
# Histogram (univariate).
par(mfrow=c(3,3))
for(i in 1:9) {
# [[i]] extracts the column as a plain vector, whether the object is a data.frame or a tibble
hist(dados_expl_2[[i]], main=names(dados_expl_2)[i])
}
# Density plot (univariate).
par(mfrow=c(3,3))
for(i in 1:9) {
plot(density(dados_expl_2[[i]]), main=names(dados_expl_2)[i])
}
layout(matrix(1:1, ncol = 1))
The plots below are generated by combining the two models, with the variables taken individually.
dados <- dados_expl_2
# Histogram on the density scale with a normal curve (same mean and sd) overlaid
hist_normal <- function(df, var) {
  x <- df[[var]]
  ggplot(df, aes(x = .data[[var]])) +
    geom_histogram(aes(y = after_stat(density)), bins = 20,
                   colour = "blue",
                   fill = "cornflowerblue") +
    stat_function(fun = dnorm,
                  args = list(mean = mean(x), sd = sd(x)),
                  col = "red", lwd = 1.2) +
    theme_light()
}
# One plot per variable, in the same order as before
for (v in c("water", "cement", "coarse_aggregate", "fine_aggregate",
            "age", "w.c", "s.c")) {
  print(hist_normal(dados, v))
}
The plots below are generated from the variables individually.
a <- ggplot(dados_expl_2, aes(x = CCS))
# y axis scale = ..density.. (default behaviour)
a + geom_density() +
geom_vline(aes(xintercept = mean(CCS)),
linetype = "dashed", size = 0.6)
# Change y axis to count instead of density
a + geom_density(aes(y = ..count..), fill = "lightgray") +
geom_vline(aes(xintercept = mean(CCS)),
linetype = "dashed", size = 0.6,
color = "#FC4E07")
The plots below are generated from the variables taken together.
# Change outline and fill colors by groups
# Use a custom palette
# the cluster variable becomes categorical
dados_expl_2$cluster <- as.factor(dados_expl_2$cluster)
# Change point shapes by groups
ggplot(dados_expl_2, aes(sample = CCS)) +
stat_qq(aes(color = cluster)) +
scale_color_manual(values = c("#868686FF", "#EFC000FF", "#FF0000", "#00FF00", "#0000FF", "#ffea00", "#ff00ae"))+
labs(y = "CCS")
library(ggridges)
ggplot(dados_expl_2, aes(x = CCS, y = cluster)) +
geom_density_ridges(aes(fill = cluster)) +
scale_fill_manual(values = c("#868686FF", "#EFC000FF", "#FF0000", "#00FF00", "#0000FF", "#ffea00", "#ff00ae"))
## Picking joint bandwidth of 3.66
ggplot(dados_expl_2, aes(x = cement, y = cluster)) +
geom_density_ridges(aes(fill = cluster)) +
scale_fill_manual(values = c("#868686FF", "#EFC000FF", "#FF0000", "#00FF00", "#0000FF", "#ffea00", "#ff00ae"))
## Picking joint bandwidth of 17.2
ggplot(dados_expl_2, aes(x = water, y = cluster)) +
geom_density_ridges(aes(fill = cluster)) +
scale_fill_manual(values = c("#868686FF", "#EFC000FF", "#FF0000", "#00FF00", "#0000FF", "#ffea00", "#ff00ae"))
## Picking joint bandwidth of 4.21
ggplot(dados_expl_2, aes(x = coarse_aggregate, y = cluster)) +
geom_density_ridges(aes(fill = cluster)) +
scale_fill_manual(values = c("#868686FF", "#EFC000FF", "#FF0000", "#00FF00", "#0000FF", "#ffea00", "#ff00ae"))
## Picking joint bandwidth of 16.8
ggplot(dados_expl_2, aes(x = fine_aggregate, y = cluster)) +
geom_density_ridges(aes(fill = cluster)) +
scale_fill_manual(values = c("#868686FF", "#EFC000FF", "#FF0000", "#00FF00", "#0000FF", "#ffea00", "#ff00ae"))
## Picking joint bandwidth of 19.1
ggplot(dados_expl_2, aes(x = fly_ash, y = cluster)) +
geom_density_ridges(aes(fill = cluster)) +
scale_fill_manual(values = c("#868686FF", "#EFC000FF", "#FF0000", "#00FF00", "#0000FF", "#ffea00", "#ff00ae"))
## Picking joint bandwidth of 9.75
The plots below are generated from the variables individually.
library(ggpubr)
##
## Attaching package: 'ggpubr'
## The following object is masked from 'package:cowplot':
##
## get_legend
ggplot(dados_expl_2, aes(cluster)) +
geom_bar(fill = "#0073C2FF") +
theme_pubclean()
# Tabulating the frequency of each cluster
df <- dados_expl_2 %>%
group_by(cluster) %>%
summarise(counts = n())
df
## # A tibble: 7 x 2
## cluster counts
## <fct> <int>
## 1 1 389
## 2 2 95
## 3 3 233
## 4 4 218
## 5 5 822
## 6 6 303
## 7 7 370
# Create the bar plot. Use theme_pubclean() [in ggpubr]
ggplot(df, aes(x = cluster, y = counts)) +
geom_bar(fill = "#0073C2FF", stat = "identity") +
geom_text(aes(label = counts), vjust = -0.3) +
theme_pubclean()
counts <- table(dados_expl_2$cluster)
barplot(counts, main="Distribution of clusters",
xlab="Cluster")
# Stacked Bar Plot with Colors and Legend
# rows of counts are clusters, columns are the distinct cement contents
counts <- table(dados_expl_2$cluster, dados_expl_2$cement)
barplot(counts, main="Distribution of clusters by cement content",
xlab="Cement content", col=rainbow(nrow(counts)),
legend = rownames(counts))
# Component labels, repeated once per cluster
med <- rep(c("Cement", "Water", "Coarse aggregate", "Sand", "Age"), times = 5)
# Hard-coded mean values for five clusters, in the same order as the labels
val <- c(203.4354, 167.9896, 1040.671, 773.8045, 38.93557,
255.5413, 195.1651, 866.1292, 749.5544, 27.16923,
297.1801, 195.5849, 1004.152, 742.0937, 63.77561,
380.8699, 159.3411, 938.4668, 791.905, 31.15119,
232.0802, 173.2611, 967.8469, 881.7636, 38.13882)
clus <- rep(1:5, each = 5)
grupos <- data.frame(clus, val, med)
# Create stacked bar chart
ggp <- ggplot(grupos, aes(x = as.factor(clus), y = val, fill = med, label = val)) +
geom_bar(stat = "identity", width=0.9) +
# Add the values inside the bar segments
geom_text(size = 5, position = position_stack(vjust = 0.5)) +
xlab("Clusters") +
ylab("Means") +
ggtitle("Mean distribution of the components per cluster") +
labs(fill = "Components")
ggp
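The per-cluster means in the chart above are typed in by hand. A hedged sketch of computing such means directly from the data with `aggregate()`, shown on hypothetical toy values so that it runs on its own:

```r
# Toy data (hypothetical values): two clusters, two components.
toy <- data.frame(
  cluster = c(1, 1, 2, 2),
  cement  = c(200, 210, 380, 390),
  water   = c(170, 166, 160, 158)
)
# Mean of each component within each cluster.
means <- aggregate(cbind(cement, water) ~ cluster, data = toy, FUN = mean)
print(means)
```

Computing the table this way keeps the chart in step with the clustering if the number of clusters or the data change.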
The plots below are generated from several variables.
INSERT THE CAPTIONS
ggplot(data = dados_expl_2, aes(cement, water))+
geom_point(aes(colour = cluster))
ggplot(data = dados_expl_2, aes(x = CCS, y = water, colour = as.factor(cluster), shape = as.factor(age), size = cement))+
geom_point()
ggplot(data = dados_expl_2, aes(age, CCS))+
geom_point(aes(colour = as.factor(cluster)))
The plots below are generated from the variables and their internal subsets.
INSERT THE CAPTIONS
# data preparation
dados_expl_2 %>%
filter(cluster == 1) %>%
summary()
## cement blast_furnace_slag fly_ash water
## Min. :200.0 Min. : 0.000 Min. : 0.0000 Min. :142.0
## 1st Qu.:296.0 1st Qu.: 0.000 1st Qu.: 0.0000 1st Qu.:186.0
## Median :339.0 Median : 0.000 Median : 0.0000 Median :192.0
## Mean :345.9 Mean : 1.185 Mean : 0.3059 Mean :192.7
## 3rd Qu.:382.5 3rd Qu.: 0.000 3rd Qu.: 0.0000 3rd Qu.:193.0
## Max. :540.0 Max. :50.000 Max. :60.0000 Max. :234.0
##
## superplasticizer coarse_aggregate fine_aggregate age
## Min. :0.0000 Min. : 838.4 Min. :594.0 Min. : 1.00
## 1st Qu.:0.0000 1st Qu.: 968.0 1st Qu.:754.0 1st Qu.: 7.00
## Median :0.0000 Median :1012.0 Median :784.0 Median : 28.00
## Mean :0.2612 Mean :1017.5 Mean :776.4 Mean : 35.96
## 3rd Qu.:0.0000 3rd Qu.:1075.0 3rd Qu.:821.0 3rd Qu.: 28.00
## Max. :8.0000 Max. :1125.0 Max. :945.0 Max. :180.00
##
## CCS w.c s.c cluster
## Min. : 6.27 Min. :0.2989 Min. :1.135 1:389
## 1st Qu.:17.54 1st Qu.:0.5766 1st Qu.:1.959 2: 0
## Median :26.06 Median :0.8220 Median :2.318 3: 0
## Mean :27.61 Mean :1.1479 Mean :2.386 4: 0
## 3rd Qu.:36.94 3rd Qu.:1.7240 3rd Qu.:2.705 5: 0
## Max. :71.99 Max. :2.7778 Max. :4.225 6: 0
## 7: 0
dados_expl_2 %>%
filter(cluster == 2) %>%
summary()
## cement blast_furnace_slag fly_ash water
## Min. :139.6 Min. : 0.0 Min. :0 Min. :173.0
## 1st Qu.:266.0 1st Qu.: 0.0 1st Qu.:0 1st Qu.:197.0
## Median :339.0 Median : 38.0 Median :0 Median :228.0
## Mean :343.0 Mean : 66.6 Mean :0 Mean :214.9
## 3rd Qu.:380.0 3rd Qu.:114.0 3rd Qu.:0 3rd Qu.:228.0
## Max. :540.0 Max. :237.5 Max. :0 Max. :228.0
##
## superplasticizer coarse_aggregate fine_aggregate age
## Min. :0 Min. : 932.0 Min. :594.0 Min. :180.0
## 1st Qu.:0 1st Qu.: 932.0 1st Qu.:594.0 1st Qu.:180.0
## Median :0 Median : 932.0 Median :670.0 Median :270.0
## Mean :0 Mean : 967.4 Mean :677.9 Mean :281.9
## 3rd Qu.:0 3rd Qu.: 968.0 3rd Qu.:722.5 3rd Qu.:365.0
## Max. :0 Max. :1125.0 Max. :885.0 Max. :365.0
##
## CCS w.c s.c cluster
## Min. :25.08 Min. :0.3204 Min. :1.135 1: 0
## 1st Qu.:40.02 1st Qu.:0.6240 1st Qu.:1.563 2:95
## Median :43.70 Median :0.9668 Median :2.204 3: 0
## Mean :46.14 Mean :1.1378 Mean :2.212 4: 0
## 3rd Qu.:52.91 3rd Qu.:1.5453 3rd Qu.:2.519 5: 0
## Max. :74.17 Max. :3.1214 Max. :5.780 6: 0
## 7: 0
# library
library(treemap)
# treemap
treemap(dados_expl_2,
index="cluster",
vSize="water",
type="index")
# treemap
treemap(grupos,
index=c("clus","med"),
vSize="val",
fontsize.labels=c(20, 16),
type="index")
library(treemapify)
ggplot(dados_expl_2, aes(area = cement, fill = cluster)) +
geom_treemap()
ggplot(grupos, aes(area = val, fill = clus)) +
geom_treemap()+
facet_wrap( ~ clus) +
labs(
title = "Title",
caption = "Caption",
fill = "Cluster")
names(dados_expl_2)
## [1] "cement" "blast_furnace_slag" "fly_ash"
## [4] "water" "superplasticizer" "coarse_aggregate"
## [7] "fine_aggregate" "age" "CCS"
## [10] "w.c" "s.c" "cluster"
library(data.table)
long <- melt(setDT(dados_expl_2[,c(1,4,6,7,8,12)]), id.vars = c("cluster"), variable.name = c("componente"))
# treemap
treemap(long,
index=c("cluster","componente"),
vSize="value",
fontsize.labels=c(20, 16),
type="index",
vColor = "cluster",
palette="Pastel1")
## palette options => https://colorbrewer2.org/#type=qualitative&scheme=Pastel1&n=5
The plots below are generated from decision trees.
Although trees are predictive techniques (classification and regression), and are often used as the building blocks of “ensemble” methods, they can help a great deal in understanding and exploring the data.
library(rpart)
library(rpart.plot)
library(rattle)
## Loading required package: bitops
## Rattle: A free graphical interface for data science with R.
## Version 5.4.0 Copyright (c) 2006-2020 Togaware Pty Ltd.
## Type 'rattle()' to shake, rattle, and roll your data.
library(RColorBrewer)
library(party)
## Loading required package: grid
## Loading required package: mvtnorm
## Loading required package: modeltools
## Loading required package: stats4
##
## Attaching package: 'modeltools'
## The following object is masked from 'package:car':
##
## Predict
## Loading required package: strucchange
## Loading required package: sandwich
##
## Attaching package: 'strucchange'
## The following object is masked from 'package:stringr':
##
## boundary
# Split the data into training and test set
pima.data <- na.omit(dados_expl_2[c(-12)])
set.seed(123)
# take the indices from the same object that is subset below, so rows and indices stay aligned
training.samples <- pima.data$CCS %>%
createDataPartition(p = 0.75, list = FALSE)
train.data <- pima.data[training.samples, ]
test.data <- pima.data[-training.samples, ]
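The split above uses caret's `createDataPartition()`, which for a numeric outcome samples within groups based on percentiles of the outcome. For comparison, a minimal base-R sketch of a plain random 75/25 split on hypothetical simulated data (no stratification, so it is a simplification, not an equivalent):

```r
set.seed(123)
# Hypothetical toy data: 100 observations with a simulated strength value.
toy <- data.frame(id = 1:100, CCS = runif(100, min = 2, max = 82))
train_idx <- sample(seq_len(nrow(toy)), size = floor(0.75 * nrow(toy)))
train <- toy[train_idx, ]    # 75% of the rows
test  <- toy[-train_idx, ]   # the remaining 25%
print(c(train = nrow(train), test = nrow(test)))
```

The negative index guarantees that no observation ends up in both sets.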
library(party)
set.seed(123)
model <- train(
CCS ~., data = train.data, method = "ctree2",
trControl = trainControl("cv", number = 10),
tuneGrid = expand.grid(maxdepth = 3, mincriterion = 0.95 )
)
plot(model$finalModel)
# Create a decision tree model
tree <- rpart(CCS~., data = dados_expl_2[c(-12)], cp=.02)
# Visualize the decision tree with rpart.plot
rpart.plot(tree, box.palette="RdBu", shadow.col="gray", nn=TRUE)
fancyRpartPlot(tree, caption = NULL)
plotcp(tree)
fit <- rpart(CCS ~ ., method = "anova", data = dados_expl_2[c(-12)])
plot(fit, uniform = T, main="Regression Tree for CCS")
text(fit, use.n = TRUE, all = TRUE, cex = .8)
fit <- ctree(CCS ~ ., data = dados_expl_2[c(-12)])
plot(fit, main="Conditional Inference Tree for CCS")
gtree <- ctree(CCS ~ ., data = dados_expl_2[c(-12)])
plot(gtree)
# Note: method="class" treats every distinct CCS value as a class label
a <- rpart(CCS ~., data = dados_expl_2[c(-12)], method="class", cp=.001, minsplit=5)
rpart.plot(a, type=0, extra=4, under=TRUE, branch=.3)
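The `cp` (complexity parameter) values passed to `rpart()` above control how far a tree is grown. A common workflow, sketched here as an assumption rather than as the author's procedure, is to grow a large tree with a small `cp`, read the cross-validated error (`xerror`) from the fitted object's `cptable`, and prune back to the `cp` with the lowest error. The built-in `mtcars` data stand in for the concrete data, with `mpg` playing the role of CCS.

```r
library(rpart)
set.seed(123)  # cross-validation folds are random
# Built-in mtcars data as a stand-in; mpg plays the role of CCS.
big_tree <- rpart(mpg ~ ., data = mtcars, method = "anova",
                  control = rpart.control(cp = 0.0001, minsplit = 5))
cp_tab  <- big_tree$cptable
best_cp <- cp_tab[which.min(cp_tab[, "xerror"]), "CP"]  # cp with lowest CV error
pruned  <- prune(big_tree, cp = best_cp)
print(c(nodes_before = nrow(big_tree$frame), nodes_after = nrow(pruned$frame)))
```

A more conservative variant picks the simplest tree whose `xerror` is within one `xstd` of the minimum.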
# Step1: Begin with a small cp.
set.seed(123)
tree <- rpart(CCS ~ ., data = dados_expl_2[c(-12)],
control = rpart.control(cp = 0.0001))
printcp(tree)
##
## Regression tree:
## rpart(formula = CCS ~ ., data = dados_expl_2[c(-12)], control = rpart.control(cp = 1e-04))
##
## Variables actually used in tree construction:
## [1] age blast_furnace_slag cement coarse_aggregate
## [5] fine_aggregate fly_ash s.c superplasticizer
## [9] water
##
## Root node error: 610152/2430 = 251.09
##
## n= 2430
##
## CP nsplit rel error xerror xstd
## 1 0.26907068 0 1.000000 1.000889 0.0267347
## 2 0.13667044 1 0.730929 0.732221 0.0210169
## 3 0.07018919 2 0.594259 0.597817 0.0175576
## 4 0.05361979 3 0.524070 0.528792 0.0154163
## 5 0.04345721 4 0.470450 0.477827 0.0141349
## 6 0.02997631 5 0.426993 0.433772 0.0127434
## 7 0.02880672 6 0.397016 0.407046 0.0124516
## 8 0.02332465 7 0.368210 0.374631 0.0118797
## 9 0.02203690 8 0.344885 0.358359 0.0113467
## 10 0.01481403 9 0.322848 0.332636 0.0101677
## 11 0.01258723 10 0.308034 0.320823 0.0101306
## 12 0.00884509 11 0.295447 0.303074 0.0098086
## 13 0.00880508 12 0.286602 0.295473 0.0099057
## 14 0.00854518 13 0.277797 0.294884 0.0098997
## 15 0.00847648 14 0.269251 0.293064 0.0097912
## 16 0.00825518 15 0.260775 0.291545 0.0097303
## 17 0.00802649 16 0.252520 0.284042 0.0094680
## 18 0.00791332 17 0.244493 0.275804 0.0092073
## 19 0.00685384 18 0.236580 0.263352 0.0084330
## 20 0.00548474 19 0.229726 0.244359 0.0079897
## 21 0.00530424 20 0.224241 0.243421 0.0080357
## 22 0.00490457 21 0.218937 0.240568 0.0079490
## 23 0.00471301 22 0.214033 0.235440 0.0077831
## 24 0.00443272 23 0.209320 0.228660 0.0076866
## 25 0.00428865 24 0.204887 0.218450 0.0073296
## 26 0.00422457 26 0.196310 0.216649 0.0072702
## 27 0.00384991 27 0.192085 0.214947 0.0072619
## 28 0.00384954 28 0.188235 0.212443 0.0072337
## 29 0.00362786 29 0.184386 0.211036 0.0072247
## 30 0.00343244 30 0.180758 0.207461 0.0072618
## 31 0.00333735 32 0.173893 0.205230 0.0072278
## 32 0.00329594 33 0.170555 0.204730 0.0072243
## 33 0.00306182 35 0.163964 0.203532 0.0072163
## 34 0.00296884 36 0.160902 0.198300 0.0070531
## 35 0.00259727 37 0.157933 0.194256 0.0071361
## 36 0.00258373 38 0.155336 0.187874 0.0068979
## 37 0.00255136 40 0.150168 0.185232 0.0068321
## 38 0.00252734 41 0.147617 0.183781 0.0068189
## 39 0.00250391 42 0.145090 0.183094 0.0068116
## 40 0.00247208 43 0.142586 0.180857 0.0067146
## 41 0.00244862 44 0.140114 0.179872 0.0066687
## 42 0.00244128 45 0.137665 0.179088 0.0066567
## 43 0.00235057 46 0.135224 0.174671 0.0065958
## 44 0.00213024 47 0.132873 0.169311 0.0062737
## 45 0.00210225 49 0.128613 0.165840 0.0060708
## 46 0.00209625 51 0.124408 0.166084 0.0061255
## 47 0.00189329 52 0.122312 0.164831 0.0061407
## 48 0.00174604 53 0.120419 0.161902 0.0060586
## 49 0.00173567 54 0.118672 0.156030 0.0058408
## 50 0.00163052 55 0.116937 0.154258 0.0058271
## 51 0.00157897 56 0.115306 0.149849 0.0056940
## 52 0.00155800 57 0.113727 0.149910 0.0057243
## 53 0.00152325 59 0.110611 0.148937 0.0057173
## 54 0.00150656 60 0.109088 0.148414 0.0057103
## 55 0.00149121 61 0.107582 0.146648 0.0055398
## 56 0.00146684 62 0.106090 0.145814 0.0055301
## 57 0.00144448 63 0.104623 0.144027 0.0054979
## 58 0.00144297 64 0.103179 0.143356 0.0054999
## 59 0.00142606 65 0.101736 0.143022 0.0054823
## 60 0.00142193 66 0.100310 0.142861 0.0054827
## 61 0.00141828 67 0.098888 0.142757 0.0054803
## 62 0.00139428 70 0.094569 0.141481 0.0054724
## 63 0.00137856 73 0.090386 0.141012 0.0054707
## 64 0.00123616 74 0.089007 0.138237 0.0054379
## 65 0.00120596 76 0.086535 0.134080 0.0052977
## 66 0.00119621 77 0.085329 0.131957 0.0052770
## 67 0.00105951 78 0.084133 0.127098 0.0052033
## 68 0.00104530 79 0.083073 0.125338 0.0051449
## 69 0.00093666 80 0.082028 0.122969 0.0050327
## 70 0.00084541 81 0.081091 0.119073 0.0049569
## 71 0.00084279 82 0.080246 0.116876 0.0049005
## 72 0.00079491 83 0.079403 0.115771 0.0048804
## 73 0.00072182 84 0.078608 0.113672 0.0047505
## 74 0.00072065 85 0.077886 0.111894 0.0046391
## 75 0.00071339 86 0.077166 0.111782 0.0046393
## 76 0.00070541 87 0.076452 0.111612 0.0046358
## 77 0.00067232 88 0.075747 0.110276 0.0046341
## 78 0.00064813 89 0.075075 0.109373 0.0046448
## 79 0.00063945 90 0.074426 0.109481 0.0046456
## 80 0.00062175 91 0.073787 0.108888 0.0046308
## 81 0.00060900 92 0.073165 0.108742 0.0046310
## 82 0.00060653 93 0.072556 0.108338 0.0046268
## 83 0.00057929 94 0.071950 0.108531 0.0046244
## 84 0.00054857 95 0.071370 0.106417 0.0045389
## 85 0.00054270 96 0.070822 0.105866 0.0045349
## 86 0.00053774 98 0.069736 0.105508 0.0045347
## 87 0.00053463 100 0.068661 0.105067 0.0045319
## 88 0.00053332 102 0.067592 0.104806 0.0045268
## 89 0.00051173 103 0.067058 0.104135 0.0045203
## 90 0.00050639 104 0.066547 0.103624 0.0045207
## 91 0.00049746 105 0.066040 0.103397 0.0045228
## 92 0.00046981 106 0.065543 0.102105 0.0044889
## 93 0.00046845 107 0.065073 0.101153 0.0044791
## 94 0.00046329 108 0.064604 0.100791 0.0044773
## 95 0.00045663 109 0.064141 0.100449 0.0044717
## 96 0.00044203 110 0.063685 0.100492 0.0044762
## 97 0.00043694 111 0.063243 0.099136 0.0044418
## 98 0.00043674 112 0.062806 0.098949 0.0044422
## 99 0.00042992 113 0.062369 0.098647 0.0044412
## 100 0.00042269 114 0.061939 0.098613 0.0044454
## 101 0.00040872 115 0.061516 0.098348 0.0044495
## 102 0.00040768 116 0.061108 0.097861 0.0044330
## 103 0.00040624 117 0.060700 0.097626 0.0044322
## 104 0.00039624 119 0.059887 0.097381 0.0044311
## 105 0.00036548 120 0.059491 0.096397 0.0044284
## 106 0.00034583 121 0.059126 0.095818 0.0044232
## 107 0.00033607 122 0.058780 0.095435 0.0044198
## 108 0.00033452 123 0.058444 0.095015 0.0043990
## 109 0.00033190 124 0.058109 0.094864 0.0043986
## 110 0.00032988 125 0.057777 0.094755 0.0043988
## 111 0.00032532 127 0.057118 0.094813 0.0044008
## 112 0.00031721 128 0.056792 0.094685 0.0044012
## 113 0.00029800 129 0.056475 0.094435 0.0043952
## 114 0.00029314 130 0.056177 0.093832 0.0043800
## 115 0.00028457 132 0.055591 0.092951 0.0043757
## 116 0.00028147 133 0.055306 0.092687 0.0043740
## 117 0.00026273 134 0.055025 0.091994 0.0043700
## 118 0.00026151 137 0.054237 0.091358 0.0043608
## 119 0.00025999 138 0.053975 0.091358 0.0043608
## 120 0.00025819 139 0.053715 0.091304 0.0043618
## 121 0.00024384 140 0.053457 0.090822 0.0043601
## 122 0.00023650 141 0.053213 0.090164 0.0043586
## 123 0.00023385 142 0.052976 0.089884 0.0043483
## 124 0.00023312 143 0.052743 0.089832 0.0043480
## 125 0.00022673 144 0.052510 0.089772 0.0043472
## 126 0.00021980 145 0.052283 0.089302 0.0043453
## 127 0.00021952 146 0.052063 0.089226 0.0043462
## 128 0.00021418 147 0.051843 0.089145 0.0043465
## 129 0.00021100 148 0.051629 0.089100 0.0043461
## 130 0.00021075 149 0.051418 0.089181 0.0043462
## 131 0.00020179 150 0.051208 0.088957 0.0043322
## 132 0.00019441 151 0.051006 0.088296 0.0043131
## 133 0.00017584 152 0.050811 0.087830 0.0043113
## 134 0.00016983 154 0.050460 0.087522 0.0043130
## 135 0.00016944 155 0.050290 0.087361 0.0043100
## 136 0.00016589 156 0.050120 0.087194 0.0043107
## 137 0.00016322 157 0.049954 0.087170 0.0043108
## 138 0.00016250 158 0.049791 0.087110 0.0043125
## 139 0.00015946 159 0.049629 0.086651 0.0043033
## 140 0.00015259 160 0.049469 0.086642 0.0043014
## 141 0.00014679 161 0.049317 0.086496 0.0043004
## 142 0.00014343 162 0.049170 0.086307 0.0043004
## 143 0.00014284 163 0.049026 0.086288 0.0043004
## 144 0.00014094 164 0.048884 0.086202 0.0043008
## 145 0.00013741 165 0.048743 0.086092 0.0042999
## 146 0.00013389 166 0.048605 0.085715 0.0042959
## 147 0.00011307 167 0.048471 0.084939 0.0042904
## 148 0.00010345 168 0.048358 0.084606 0.0042892
## 149 0.00010000 169 0.048255 0.084560 0.0042883
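# Step 2: Pick the cp with the lowest cross-validated error (xerror) and prune.
# In the table above the minimum xerror is on the last row (cp = 1e-04),
# so this call should leave the tree unpruned at 169 splits.
bestcp <- tree$cptable[which.min(tree$cptable[, "xerror"]), "CP"]
tree.pruned <- prune(tree, cp = bestcp)
plot(tree.pruned)
text(tree.pruned, cex = 0.8, use.n = TRUE, xpd = TRUE)
prp(tree.pruned, faclen = 0, cex = 0.8, extra = 1)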
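In the printcp table the lowest xerror falls on the last row, so pruning at that cp keeps a very large tree. A common alternative is the one-standard-error rule: take the smallest tree whose xerror is within one xstd of the minimum. The sketch below applies it to a synthetic data set. The data frame `toy` and all object names are hypothetical, and the rule is offered as an option rather than as the method used in this analysis.

```r
library(rpart)

set.seed(123)
n <- 500
# Synthetic regression data (hypothetical, not the concrete data set)
toy <- data.frame(x1 = runif(n), x2 = runif(n))
toy$y <- 10 * toy$x1 + 5 * (toy$x2 > 0.5) + rnorm(n)

# Grow a deliberately large tree
big_tree <- rpart(y ~ ., data = toy, method = "anova",
                  control = rpart.control(cp = 0.0001))
cpt <- big_tree$cptable

# Row with the lowest cross-validated error
i_min <- which.min(cpt[, "xerror"])
# One-standard-error threshold
threshold <- cpt[i_min, "xerror"] + cpt[i_min, "xstd"]
# First row (fewest splits) whose xerror is under the threshold
i_1se <- which(cpt[, "xerror"] <= threshold)[1]

pruned_1se <- prune(big_tree, cp = cpt[i_1se, "CP"])

# Compare the number of splits before and after
c(full = cpt[nrow(cpt), "nsplit"], one_se = cpt[i_1se, "nsplit"])
```

The rule trades a small amount of cross-validated error for a tree that is easier to plot and read.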
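# Node label: split label plus the number of observations in the node
tot_count <- function(x, labs, digits, varlen) {
  paste(labs, "\n\nn =", x$frame$n)
}
prp(tree.pruned, faclen = 0, cex = 0.8, node.fun = tot_count)
# Node label: number of observations only
only_count <- function(x, labs, digits, varlen) {
  paste(x$frame$n)
}
# CCS is numeric, so colour each node by its predicted value relative to
# the overall mean (the root node's yval) rather than indexing by yval
root_mean <- tree.pruned$frame$yval[1]
boxcols <- ifelse(tree.pruned$frame$yval < root_mean, "pink", "palegreen3")
par(xpd = TRUE)
prp(tree.pruned, faclen = 0, cex = 0.8, node.fun = only_count, box.col = boxcols)
legend("bottomleft", legend = c("below mean CCS", "at or above mean CCS"),
       fill = c("pink", "palegreen3"), title = "Predicted CCS")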
# Both fits are regression trees; the first uses a coarser cp = 0.02
coarse.model <- rpart(CCS ~ ., data = dados_expl_2[c(-12)], cp = 0.02)
rpart.plot(coarse.model)
anova.model <- rpart(CCS ~ ., data = dados_expl_2[c(-12)])
rpart.plot(anova.model)
m1 <- rpart(
  formula = CCS ~ .,
  data = dados_expl_2[c(-12)],
  method = "anova"
)
m1
## n= 2430
##
## node), split, n, deviance, yval
## * denotes terminal node
##
## 1) root 2430 610151.700 35.46851
##   2) age< 21 758 103497.700 23.26084
##     4) cement< 354.5 562 45897.870 19.38107
##       8) age< 10.5 397 19549.520 15.77574 *
##       9) age>=10.5 165 8771.887 28.05570 *
##     5) cement>=354.5 196 24883.640 34.38551 *
##   3) age>=21 1672 342480.000 41.00284
##     6) cement< 354.5 1342 205711.500 37.50082
##       12) cement< 164.8 295 23568.650 26.85841
##         24) blast_furnace_slag< 0.15 62 3373.118 16.12774 *
##         25) blast_furnace_slag>=0.15 233 11156.730 29.71378 *
##       13) cement>=164.8 1047 139316.800 40.49940
##         26) water>=175.98 643 60500.700 36.51042
##           52) blast_furnace_slag< 13 336 18321.210 32.13935 *
##           53) blast_furnace_slag>=13 307 28733.640 41.29440 *
##         27) water< 175.98 404 52300.590 46.84819
##           54) blast_furnace_slag< 47.63 265 22795.440 42.54966
##             108) s.c>=3.984321 117 5204.612 36.49487 *
##             109) s.c< 3.984321 148 9910.706 47.33622 *
##           55) blast_furnace_slag>=47.63 139 15273.580 55.04324 *
##     7) cement>=354.5 330 53378.840 55.24439
##       14) water>=183.05 143 17260.350 46.73098 *
##       15) water< 183.05 187 17828.400 61.75465 *
rpart.plot(m1)
plotcp(m1)
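# cp = 0 grows the tree without a complexity penalty;
# xval = 10 requests 10-fold cross-validation
m2 <- rpart(
  formula = CCS ~ .,
  data = dados_expl_2[c(-12)],
  method = "anova",
  control = list(cp = 0, xval = 10)
)
rpart.plot(m2)
plotcp(m2)
abline(v = 12, lty = "dashed")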
## Tuning
m3 <- rpart(
  formula = CCS ~ .,
  data = dados_expl_2[c(-12)],
  method = "anova",
  control = list(minsplit = 10, maxdepth = 12, xval = 10)
)
m3$cptable
## CP nsplit rel error xerror xstd
## 1 0.26907068 0 1.0000000 1.0012305 0.02674818
## 2 0.13667044 1 0.7309293 0.7319427 0.02099423
## 3 0.07018919 2 0.5942589 0.5977102 0.01754427
## 4 0.05361979 3 0.5240697 0.5312068 0.01546320
## 5 0.04345721 4 0.4704499 0.4809029 0.01427868
## 6 0.02997631 5 0.4269927 0.4421931 0.01321163
## 7 0.02880672 6 0.3970164 0.4213547 0.01302233
## 8 0.02332465 7 0.3682097 0.3762610 0.01185103
## 9 0.02203690 8 0.3448850 0.3673092 0.01151320
## 10 0.01481403 9 0.3228481 0.3439957 0.01065425
## 11 0.01258723 10 0.3080341 0.3305256 0.01056481
## 12 0.01000000 11 0.2954468 0.3134969 0.01014127
rpart.plot(m3)
plotcp(m3)
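The rel error and xerror columns of the cptable are scaled by the root node error, so multiplying them by that value gives a mean squared error in the units of CCS squared. The short calculation below uses only numbers printed in this section (root deviance 610151.7, n = 2430, and the nsplit = 11 row of m3$cptable). The variable names are illustrative, and the square root of the cross-validated value is only a rough RMSE estimate.

```r
root_deviance <- 610151.7   # deviance of the root node in the printed tree
n_obs <- 2430               # number of observations
rel_error_11 <- 0.2954468   # rel error at nsplit = 11 in m3$cptable
xerror_11 <- 0.3134969      # xerror at nsplit = 11 in m3$cptable

root_mse <- root_deviance / n_obs            # root node error, about 251.09
train_rmse <- sqrt(rel_error_11 * root_mse)  # in-sample RMSE, about 8.61
cv_rmse <- sqrt(xerror_11 * root_mse)        # cross-validated, about 8.87

round(c(root_mse = root_mse, train_rmse = train_rmse, cv_rmse = cv_rmse), 2)
```

Read this way, the 12-leaf tree cuts the error from about 15.8 (the square root of the root node error) to roughly 8.6 to 8.9.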
# Candidate values: 16 minsplit settings x 8 maxdepth settings = 128 combinations
hyper_grid <- expand.grid(
  minsplit = seq(5, 20, 1),
  maxdepth = seq(8, 15, 1)
)
head(hyper_grid)
## minsplit maxdepth
## 1 5 8
## 2 6 8
## 3 7 8
## 4 8 8
## 5 9 8
## 6 10 8
nrow(hyper_grid)
## [1] 128
models <- list()
for (i in seq_len(nrow(hyper_grid))) {
  # get minsplit, maxdepth values at row i
  minsplit <- hyper_grid$minsplit[i]
  maxdepth <- hyper_grid$maxdepth[i]
  # train a model and store in the list
  models[[i]] <- rpart(
    formula = CCS ~ .,
    data = dados_expl_2[c(-12)],
    method = "anova",
    control = list(minsplit = minsplit, maxdepth = maxdepth)
  )
}
# function to get optimal cp
get_cp <- function(x) {
  best <- which.min(x$cptable[, "xerror"])
  x$cptable[best, "CP"]
}
# function to get minimum error
get_min_error <- function(x) {
  best <- which.min(x$cptable[, "xerror"])
  x$cptable[best, "xerror"]
}
# five combinations with the lowest cross-validated error
hyper_grid %>%
  mutate(
    cp = purrr::map_dbl(models, get_cp),
    error = purrr::map_dbl(models, get_min_error)
  ) %>%
  arrange(error) %>%
  top_n(-5, wt = error)
## minsplit maxdepth cp error
## 1 17 8 0.01 0.2991030
## 2 18 8 0.01 0.2995492
## 3 20 14 0.01 0.2996629
## 4 9 15 0.01 0.3015332
## 5 5 15 0.01 0.3018493
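The five best rows all select cp = 0.01 and differ only in the third decimal of the error. A quick check, using only values printed in this section, compares that spread with the xstd reported for the 11-split row of m3$cptable. The variable names are illustrative, and using that single xstd as a noise yardstick is an assumption.

```r
# Cross-validated errors of the five best grid rows (copied from the output)
top5_error <- c(0.2991030, 0.2995492, 0.2996629, 0.3015332, 0.3018493)
# xstd at nsplit = 11 in m3$cptable (copied from the output)
xstd_11_splits <- 0.01014127

# Difference between the best and the fifth-best error
spread <- diff(range(top5_error))
spread

# TRUE when the spread is smaller than one standard error
spread < xstd_11_splits
```

Because the spread (about 0.0027) is well under one standard error, the ranking of these settings may change with a different seed, so the grid gives little evidence for preferring one of them.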
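# Refit with the best row of the grid above (minsplit = 17, maxdepth = 8, cp = 0.01)
optimal_tree <- rpart(
  formula = CCS ~ .,
  data = dados_expl_2[c(-12)],
  method = "anova",
  control = list(minsplit = 17, maxdepth = 8, cp = 0.01)
)
# In-sample RMSE: predictions are made on the same rows used to fit the tree
pred <- predict(optimal_tree, newdata = dados_expl_2[c(-12)])
RMSE(pred, dados_expl_2$CCS)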
## [1] 8.61302
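The RMSE above is computed on the rows that were also used to fit the tree, so it tends to be optimistic. A hold-out estimate fits on one part of the data and scores on the rest. The sketch below shows the pattern on synthetic data. The data frame `toy`, its coefficients, and the helper `rmse` are hypothetical and stand in for the real data and for the RMSE function used above.

```r
library(rpart)

set.seed(123)
n <- 600
# Synthetic stand-in for the concrete data (hypothetical columns and effects)
toy <- data.frame(
  age = sample(c(3, 7, 14, 28, 56, 90), n, replace = TRUE),
  cement = runif(n, 100, 540),
  water = runif(n, 120, 250)
)
toy$CCS <- 10 + 8 * log(toy$age) + 0.08 * toy$cement -
  0.1 * toy$water + rnorm(n, sd = 5)

# 75% of the rows for fitting, the remaining 25% for evaluation
train_rows <- sample(n, size = round(0.75 * n))
train <- toy[train_rows, ]
test <- toy[-train_rows, ]

fit <- rpart(CCS ~ ., data = train, method = "anova",
             control = rpart.control(cp = 0.01))

# Root mean squared error between observed and predicted values
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

rmse_train <- rmse(train$CCS, predict(fit, newdata = train))
rmse_test <- rmse(test$CCS, predict(fit, newdata = test))
c(train = rmse_train, test = rmse_test)
```

Reporting both numbers side by side makes it easier to see how much of the in-sample fit carries over to unseen mixtures.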
library(DataExplorer)
plot_str(dados_expl_2[1:9], type = "r", max_level = 1, print_network = TRUE,
         fontSize = 40, width = 1000, margin = 10)
plot_str(dados_expl_2[1:9], type = "d", max_level = 1, print_network = TRUE,
         fontSize = 40, width = 800, margin = 10)
# install.packages("xray")
library(xray)
anomalies(dados_expl_2)
## $variables
## Variable q qNA pNA qZero pZero qBlank pBlank qInf pInf
## 1 blast_furnace_slag 2430 0 - 1149 47.28% 0 - 0 -
## 2 fly_ash 2430 0 - 1104 45.43% 0 - 0 -
## 3 superplasticizer 2430 0 - 758 31.19% 0 - 0 -
## 4 cluster 2430 0 - 0 - 0 - 0 -
## 5 age 2430 0 - 0 - 0 - 0 -
## 6 water 2430 0 - 0 - 0 - 0 -
## 7 coarse_aggregate 2430 0 - 0 - 0 - 0 -
## 8 cement 2430 0 - 0 - 0 - 0 -
## 9 fine_aggregate 2430 0 - 0 - 0 - 0 -
## 10 s.c 2430 0 - 0 - 0 - 0 -
## 11 w.c 2430 0 - 0 - 0 - 0 -
## 12 CCS 2430 0 - 0 - 0 - 0 -
## qDistinct type anomalous_percent
## 1 238 Numeric 47.28%
## 2 236 Numeric 45.43%
## 3 176 Numeric 31.19%
## 4 7 Factor -
## 5 14 Numeric -
## 6 274 Numeric -
## 7 346 Numeric -
## 8 348 Numeric -
## 9 380 Numeric -
## 10 508 Numeric -
## 11 824 Numeric -
## 12 917 Numeric -
##
## $problem_variables
## [1] Variable q qNA pNA
## [5] qZero pZero qBlank pBlank
## [9] qInf pInf qDistinct type
## [13] anomalous_percent problems
## <0 rows> (or 0-length row.names)
distributions(dados_expl_2)
## Variable p_1 p_10 p_25 p_50 p_75 p_90 p_99
## 1 blast_furnace_slag 0 0 0 19 129.8 189 282.8
## 2 fly_ash 0 0 0 81.9 123 161 237.923
## 3 superplasticizer 0 0 0 6.86 10.1 12 23.4
## 4 age 3 3 14 28 56 100 365
## 5 water 126.716 155.52 164 182.9 192 203.5 228
## 6 coarse_aggregate 801 852.1 932 971.8 1040 1076.2 1125
## 7 cement 122.6 154.8 190.3 252 337.9 425 531.3
## 8 fine_aggregate 594 670 737 780.1 824 875.949 943.1
## 9 s.c 1.1676 1.7585 2.2312 3.0163 4.2165 4.9681 6.5261
## 10 w.c 0.3204 0.5333 0.7282 1.0996 1.5376 2.0699 3.1214
## 11 CCS 6.94 14.64 24 34.345 45.0275 56.62 74.99
# install.packages("visdat")
library(visdat)
vis_dat(dados_expl_2)
vis_guess(dados_expl_2)
vis_miss(dados_expl_2)
vis_cor(dados_expl_2[,1:8])
# install.packages("dlookr")
library(dlookr)
eda_report(dados_expl_2, "CCS", output_format = "html")
# install.packages("DataExplorer")
library(DataExplorer)
introduce(dados_expl_2[c(1,2,3,4,5,6,7,8)])
## rows columns discrete_columns continuous_columns all_missing_columns
## 1 2430 8 0 8 0
## total_missing_values complete_rows total_observations memory_usage
## 1 0 2430 19440 158120
plot_intro(dados_expl_2[c(1,2,3,4,5,6,7,8)])
plot_missing(dados_expl_2[c(1,2,3,4,5,6,7,8)])
profile_missing(dados_expl_2[c(1,2,3,4,5,6,7,8)])
## feature num_missing pct_missing
## 1 cement 0 0
## 2 blast_furnace_slag 0 0
## 3 fly_ash 0 0
## 4 water 0 0
## 5 superplasticizer 0 0
## 6 coarse_aggregate 0 0
## 7 fine_aggregate 0 0
## 8 age 0 0
plot_histogram(dados_expl_2[c(1,2,3,4,5,6,7,8)])
plot_density(dados_expl_2[c(1,2,3,4,5,6,7,8)])
#plot_bar(dados_expl_2[c(1,2,3,4,5,6,7,8)])
plot_qq(dados_expl_2[c(1,2,3,4,5,6,7,8)])
plot_correlation(dados_expl_2[c(1,2,3,4,5,6,7,8)])
plot_prcomp(dados_expl_2[c(1,2,3,4,5,6,7,8)])
plot_scatterplot(dados_expl_2[c(1,2,3,4,5,6,7,8,9)], by = "CCS")
plot_boxplot(dados_expl_2[c(1,2,3,4,5,6,7,8,9)], by = "CCS")
plot_str(dados_expl_2[c(1,2,3,4,5,6,7,8,9)])
create_report(dados_expl_2[c(1,2,3,4,5,6,7,8,9)])
by Ramires Engenharia