This report applies multivariate analysis techniques to agrometeorological data: descriptive statistics, correlation analysis, Principal Component Analysis (PCA), clustering methods, and two- and three-dimensional visualizations, with emphasis on the integrated interpretation of climatic and production variables.
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
library(FactoMineR)
library(factoextra)
library(cluster)
library(corrplot)
library(plotly)
library(htmlwidgets)
dados <- read.csv("https://raw.githubusercontent.com/leocbc/ANALISE-MULTIVARIADA/main/dataset_agroamazonia_novo500.csv")
head(dados)
summary(dados)
## Area_ID Latitude Longitude Especie
## Length:500 Min. :-3.500 Min. :-60.50 Length:500
## Class :character 1st Qu.:-3.120 1st Qu.:-59.95 Class :character
## Mode :character Median :-2.739 Median :-59.48 Mode :character
## Mean :-2.743 Mean :-59.50
## 3rd Qu.:-2.341 3rd Qu.:-59.02
## Max. :-2.000 Max. :-58.50
## Solo Prod_Anual_Ton Biomassa_t_ha Teor_Carbono_pct
## Length:500 Min. : 1.382 Min. : 5.741 Min. :36.26
## Class :character 1st Qu.:24.387 1st Qu.:16.472 1st Qu.:45.83
## Mode :character Median :28.663 Median :19.360 Median :49.02
## Mean :28.785 Mean :19.213 Mean :49.06
## 3rd Qu.:33.034 3rd Qu.:21.670 3rd Qu.:52.48
## Max. :48.246 Max. :30.950 Max. :61.89
## Indice_Sustentabilidade Custo_Producao_R..ha Preco_Venda_R..Ton
## Min. :0.1522 Min. :1581 Min. : 527.4
## 1st Qu.:0.5511 1st Qu.:3161 1st Qu.: 819.1
## Median :0.6795 Median :3593 Median : 890.7
## Mean :0.6770 Mean :3619 Mean : 896.2
## 3rd Qu.:0.8094 3rd Qu.:4125 3rd Qu.: 967.0
## Max. :1.0000 Max. :5674 Max. :1247.9
## CH4_emissao_kg_ha N2O_emissao_kg_ha Temp_Solo_C Umidade_Solo_pct
## Min. :24.94 Min. :0.9063 Min. :22.09 Min. :20.36
## 1st Qu.:52.68 1st Qu.:3.7445 1st Qu.:26.57 1st Qu.:39.19
## Median :61.24 Median :4.6303 Median :28.02 Median :43.86
## Mean :60.90 Mean :4.6472 Mean :28.02 Mean :43.62
## 3rd Qu.:68.88 3rd Qu.:5.5133 3rd Qu.:29.35 3rd Qu.:48.18
## Max. :97.19 Max. :8.5953 Max. :34.20 Max. :63.93
## Precipitacao_mm
## Min. :1490
## 1st Qu.:2188
## Median :2390
## Mean :2381
## 3rd Qu.:2570
## Max. :3204
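Results
The descriptive statistics show wide variability across the
agrometeorological and production variables. Annual production
(Prod_Anual_Ton), for example, ranges from 1.38 to 48.25 t with a median
of 28.66 t, and precipitation (Precipitacao_mm) from 1490 to 3204 mm,
which points to differing environmental conditions and production
contexts within the data set.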
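One way to put a number on this variability is the coefficient of variation (standard deviation divided by the absolute mean) of each numeric column. The sketch below is illustrative only: `coef_var` is a hypothetical helper that is not part of the original analysis, and the built-in `mtcars` data set stands in for `dados` so that the block runs on its own.

```r
# Hypothetical helper: coefficient of variation (sd / |mean|), ignoring NAs.
coef_var <- function(x) {
  stats::sd(x, na.rm = TRUE) / abs(mean(x, na.rm = TRUE))
}

# Stand-in data (assumption): mtcars replaces `dados` in this sketch.
example <- mtcars

# Keep numeric columns only, then compute the CV of each one.
num_cols <- example[, vapply(example, is.numeric, logical(1)), drop = FALSE]
cv <- sort(vapply(num_cols, coef_var, numeric(1)), decreasing = TRUE)

# Named numeric vector, largest relative spread first.
print(round(cv, 3))
```

The coefficient of variation is only meaningful for ratio-scale variables with a mean well away from zero, so columns such as Latitude and Longitude would probably be better left out when applying this to `dados`.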
dados_num <- dados %>% select(where(is.numeric))
cor_mat <- cor(dados_num, use = "pairwise.complete.obs")
corrplot(cor_mat, method = "color", type = "upper",
tl.col = "black", tl.srt = 45)
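A correlation plot is easier to discuss alongside a ranked list of the strongest pairs. The following is a minimal sketch of how such a list could be extracted from a correlation matrix like `cor_mat`; `top_correlations` is a hypothetical helper, and `mtcars` is used as stand-in data so that the example is self-contained.

```r
# Hypothetical helper: list the n strongest off-diagonal correlations.
top_correlations <- function(cor_mat, n = 5) {
  # Indices of the upper triangle, so each pair appears exactly once.
  idx <- which(upper.tri(cor_mat), arr.ind = TRUE)
  pairs <- data.frame(
    var1 = rownames(cor_mat)[idx[, "row"]],
    var2 = colnames(cor_mat)[idx[, "col"]],
    r    = cor_mat[idx],
    stringsAsFactors = FALSE
  )
  # Rank by absolute value so strong negative correlations are kept too.
  pairs <- pairs[order(abs(pairs$r), decreasing = TRUE), ]
  rownames(pairs) <- NULL
  head(pairs, n)
}

# Stand-in data (assumption): mtcars replaces `dados_num` in this sketch.
example_cor <- cor(mtcars, use = "pairwise.complete.obs")
print(top_correlations(example_cor, n = 5))
```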
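# Standardize once (mean 0, sd 1); the scaled matrix is reused for clustering.
dados_scaled <- scale(dados_num)
# The input is already standardized, so prcomp() needs no extra scaling.
pca <- prcomp(dados_scaled)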
summary(pca)
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 1.10775 1.09433 1.08755 1.07550 1.04113 1.02688 0.99308
## Proportion of Variance 0.09439 0.09212 0.09098 0.08898 0.08338 0.08111 0.07586
## Cumulative Proportion 0.09439 0.18651 0.27749 0.36647 0.44985 0.53097 0.60683
## PC8 PC9 PC10 PC11 PC12 PC13
## Standard deviation 0.96723 0.94552 0.93910 0.90844 0.89405 0.88046
## Proportion of Variance 0.07196 0.06877 0.06784 0.06348 0.06149 0.05963
## Cumulative Proportion 0.67879 0.74756 0.81540 0.87888 0.94037 1.00000
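To decide how many components to retain, two common rules of thumb can be checked directly against the standard deviations printed above: the number of components needed to reach 80% cumulative variance, and the Kaiser criterion (eigenvalue greater than 1). The sketch below copies the thirteen values from the output so that it runs on its own; in the report itself the same computation would presumably start from `pca$sdev`. The 80% threshold is an assumption chosen for illustration.

```r
# Standard deviations of PC1..PC13, copied from the summary(pca) output.
sdev <- c(1.10775, 1.09433, 1.08755, 1.07550, 1.04113, 1.02688, 0.99308,
          0.96723, 0.94552, 0.93910, 0.90844, 0.89405, 0.88046)

var_exp <- sdev^2 / sum(sdev^2)  # proportion of variance per component
var_cum <- cumsum(var_exp)       # cumulative proportion

# Smallest number of components whose cumulative share reaches 80%.
n_80 <- which(var_cum >= 0.80)[1]

# Kaiser criterion: components with eigenvalue (sdev^2) above 1.
n_kaiser <- sum(sdev^2 > 1)

print(c(n_80 = n_80, n_kaiser = n_kaiser))
```

With these values the 80% rule needs 10 components and the Kaiser criterion keeps 6, which reflects how evenly the variance is spread across the components.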
fviz_pca_biplot(pca, label = "var", repel = TRUE)
pca_3d <- plot_ly(
x = pca$x[,1],
y = pca$x[,2],
z = pca$x[,3],
type = "scatter3d",
mode = "markers"
)
pca_3d
saveWidget(pca_3d, "PCA_3D.html", selfcontained = TRUE)
dist_eucl <- dist(dados_scaled)
hc_ward <- hclust(dist_eucl, method = "ward.D2")
plot(hc_ward)
rect.hclust(hc_ward, k = 3, border = 2:4)
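The rectangles drawn on the dendrogram correspond to cutting the tree into three groups, and `cutree()` returns that membership as a vector that can be tabulated or summarized. The sketch below is a minimal illustration under one assumption: the built-in `USArrests` data set, standardized, stands in for `dados_scaled` so that the block is self-contained.

```r
# Stand-in data (assumption): scaled USArrests replaces `dados_scaled`.
x <- scale(USArrests)

# Ward linkage on Euclidean distances, as in the report.
hc <- hclust(dist(x), method = "ward.D2")

# Integer cluster label (1..3) for every row.
groups <- cutree(hc, k = 3)

# Cluster sizes, then per-cluster means of the standardized variables.
print(table(groups))
cluster_means <- aggregate(as.data.frame(x), by = list(cluster = groups), FUN = mean)
print(round(cluster_means, 2))
```

Applied to the report's objects, `table(cutree(hc_ward, k = 3), k3$cluster)` would cross-tabulate the hierarchical and k-means partitions and show how far the two methods agree.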
set.seed(123)
k3 <- kmeans(dados_scaled, centers = 3, nstart = 25)
k3
## K-means clustering with 3 clusters of sizes 166, 148, 186
##
## Cluster means:
## Latitude Longitude Prod_Anual_Ton Biomassa_t_ha Teor_Carbono_pct
## 1 0.06462893 0.2902814 0.7226764 0.09378667 -0.1915090
## 2 0.22487249 0.2300694 -0.5299062 0.15100698 -0.3201724
## 3 -0.23661039 -0.4421343 -0.2233234 -0.20385817 0.4256775
## Indice_Sustentabilidade Custo_Producao_R..ha Preco_Venda_R..Ton
## 1 0.04005055 -0.08266543 0.41671988
## 2 -0.11145546 0.58544398 0.08873687
## 3 0.05294095 -0.39206047 -0.44251912
## CH4_emissao_kg_ha N2O_emissao_kg_ha Temp_Solo_C Umidade_Solo_pct
## 1 -0.1777679 -0.2175820 0.5847690 0.2918041
## 2 -0.3960442 -0.1397046 -0.3930965 -0.6316615
## 3 0.4737850 0.3053489 -0.2091041 0.2421851
## Precipitacao_mm
## 1 -0.2722695
## 2 0.1507802
## 3 0.1230176
##
## Clustering vector:
## [1] 1 1 3 2 3 3 3 3 3 3 3 1 3 2 1 2 3 1 3 3 3 3 2 2 2 3 2 2 1 3 3 1 1 3 3 1 1
## [38] 1 1 2 2 3 3 2 2 1 1 2 1 1 2 2 3 2 3 2 1 1 2 3 3 3 2 1 1 2 3 3 3 3 1 1 3 3
## [75] 1 2 3 2 1 3 1 3 2 3 3 3 3 2 2 2 3 3 1 3 2 2 1 1 1 3 1 2 2 1 1 2 1 1 2 3 3
## [112] 3 1 1 2 1 3 3 3 3 3 1 1 2 3 1 2 1 2 2 1 2 1 1 3 1 1 1 2 1 3 2 2 2 3 3 1 2
## [149] 1 3 3 3 3 3 3 2 3 3 1 2 3 2 1 1 1 1 2 3 1 1 3 1 3 3 1 2 3 1 2 1 2 2 3 1 1
## [186] 3 2 3 1 1 2 2 3 3 2 2 1 1 1 2 2 1 3 1 1 2 2 2 1 1 3 3 2 1 1 3 1 1 3 2 2 3
## [223] 2 3 1 2 3 1 1 1 3 3 2 1 3 2 1 1 3 1 2 3 1 2 2 1 2 3 3 3 3 1 1 2 3 2 1 3 2
## [260] 3 1 2 3 1 1 3 2 3 3 1 1 3 2 3 2 3 1 3 3 1 3 1 3 3 3 3 3 1 2 3 3 3 3 1 1 1
## [297] 1 3 3 3 3 2 2 2 3 1 1 3 3 1 2 1 2 3 3 1 1 1 2 1 3 2 3 2 3 2 1 2 2 1 1 3 3
## [334] 3 3 1 2 2 3 1 2 3 3 1 2 2 2 3 2 3 1 1 2 1 3 2 2 2 1 2 3 2 2 1 1 3 3 2 1 3
## [371] 3 1 2 3 3 1 2 1 3 2 2 3 1 3 3 3 2 1 2 3 1 2 3 2 3 2 2 2 3 1 2 2 1 3 3 1 3
## [408] 1 1 3 1 2 2 3 2 1 1 2 1 3 1 3 3 3 2 2 3 3 1 2 1 1 3 2 3 3 1 2 3 3 2 1 1 3
## [445] 2 2 3 2 3 1 1 1 1 2 1 1 1 1 2 2 2 1 2 3 1 2 3 1 2 3 3 3 2 2 2 1 3 2 3 2 3
## [482] 3 1 1 2 3 1 1 1 3 2 3 3 3 3 1 1 2 1 2
##
## Within cluster sum of squares by cluster:
## [1] 1999.289 1659.492 2108.247
## (between_SS / total_SS = 11.1 %)
##
## Available components:
##
## [1] "cluster" "centers" "totss" "withinss" "tot.withinss"
## [6] "betweenss" "size" "iter" "ifault"
fviz_cluster(k3, data = dados_scaled)
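Because between-cluster variation is only 11.1% of the total, it is worth checking whether k = 3 is supported by the data. One common check is the average silhouette width over a range of k values. The sketch below is illustrative only: `avg_silhouette` is a hypothetical helper, the range 2 to 6 is an arbitrary choice, and standardized `USArrests` stands in for `dados_scaled`.

```r
library(cluster)  # provides silhouette()

# Hypothetical helper: average silhouette width of a k-means partition.
avg_silhouette <- function(x, k, nstart = 25) {
  km <- kmeans(x, centers = k, nstart = nstart)
  sil <- silhouette(km$cluster, dist(x))
  mean(sil[, "sil_width"])
}

# Stand-in data (assumption): scaled USArrests replaces `dados_scaled`.
x <- scale(USArrests)

set.seed(123)
ks <- 2:6
widths <- vapply(ks, function(k) avg_silhouette(x, k), numeric(1))
names(widths) <- ks

# Higher average width indicates better-separated clusters.
print(round(widths, 3))
```

Average widths close to zero for every k would suggest that the data have no strong cluster structure, whichever k is chosen.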
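The results do not show the agrometeorological and production variability collapsing onto a few latent factors: PC1 explains only 9.4% of the variance, the first three components together 27.7%, and ten of the thirteen components are needed to exceed 80%. The three-cluster k-means solution is similarly weak, with between-cluster variation at 11.1% of the total, so any reading of the components or clusters in terms of climatic conditions and production responses should be treated as tentative.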
The multivariate techniques applied proved effective for the integrated analysis of complex agrometeorological data.