Partitioning the Principal Components - crops

Author

Luana

Ecosystem service: agriculture (provisioning)

The data were obtained from the interactive map of the 2017 Censo Agropecuário (Brazilian Agricultural Census).

  • The shapefile for the state of Paraná, with municipal boundaries, was extracted, and its attribute table was exported with QGIS. The data table was downloaded from the Censo Agropecuário 2017 web page.

  • The attribute table holds the mean yield (kg/ha) of rice (arroz), sugarcane (cana), cassava (mandioca), maize (milho), soybean (soja), wheat (trigo), cocoa (cacau), coffee (café), orange (laranja), and grape (uva). For beans (feijão), production and harvested area were extracted with the site's query tool and the yield was computed from them (produção_kg / area colhida_ha).

  • Mean yield = production divided by harvested area.
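The yield calculation for beans can be sketched as follows. The column names `producao_kg` and `area_colhida_ha` and all values are hypothetical, and assigning a yield of 0 where nothing was harvested is an assumption (it matches the zeros that appear in the crop table later on).

```r
# Hypothetical bean data for three municipalities (illustrative values only)
feijao_raw <- data.frame(
  municipio       = c("A", "B", "C"),
  producao_kg     = c(120000, 0, 45000),
  area_colhida_ha = c(100, 0, 50)
)

# Mean yield (kg/ha) = production / harvested area.
# Municipalities with no harvested area get 0 instead of NaN (0 / 0).
feijao_raw$feijao <- ifelse(
  feijao_raw$area_colhida_ha > 0,
  feijao_raw$producao_kg / feijao_raw$area_colhida_ha,
  0
)

feijao_raw$feijao
```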

Dimensionality reduction

Reduce the number of variables: try to represent the 11 variables with one or two components.
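Before the PCA, the code that follows applies a log(x + 1) transformation and then standardizes each variable. A minimal sketch of those two steps on simulated data, using base R's `scale()` as a stand-in for vegan's `decostand()` with the standardize method (treated here as equivalent, which is an assumption worth checking against the vegan documentation):

```r
set.seed(1)  # simulated yields, not the census data
x <- data.frame(
  a = rexp(50, rate = 1 / 3000),
  b = rexp(50, rate = 1 / 500)
)
x$a[1:5] <- 0  # some municipalities do not grow the crop

x_log <- log(x + 1)    # log(x + 1) keeps zeros at zero
x_std <- scale(x_log)  # center each column to mean 0, scale to sd 1

round(colMeans(x_std), 10)
apply(x_std, 2, sd)
```

After this step every variable contributes the same variance, so no crop dominates the components merely because its yields are numerically larger.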

setwd("~/Mestrado PPG ECO/Dissertação")
library(readr)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.2     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidyr)
library(vegan)
Warning: package 'vegan' was built under R version 4.3.1
Loading required package: permute
Warning: package 'permute' was built under R version 4.3.1
Loading required package: lattice
This is vegan 2.6-4
library(ggplot2)


cultivos_num<-readr::read_delim("pr_limpo_delim_num.csv") # ";"-delimited file, saved without special characters
Rows: 399 Columns: 11
── Column specification ────────────────────────────────────────────────────────
Delimiter: ";"
dbl (11): municipio, arroz, cana, mandioca, milho, soja, trigo, cacau, café,...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# log(x + 1) transformation; [-1] drops the municipio column
cultivos.log <- cbind(log(cultivos_num[-1] + 1))


# standardization to zero mean and unit variance
cultivos.std <- decostand(cultivos.log, method = "stand")

# check: the sd of every variable should be 1
sd(cultivos.std$arroz)
[1] 1
sd(cultivos.std$laranja)
[1] 1
pca.std<-princomp(cultivos.std)
pca.std
Call:
princomp(x = cultivos.std)

Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6    Comp.7    Comp.8 
1.5275814 1.2500436 1.0660519 1.0159031 0.9748792 0.8988507 0.8322728 0.7457837 
   Comp.9   Comp.10 
0.7057169 0.6364497 

 10  variables and  399 observations.
# standard deviation: spread of the scores along each component (square root of its eigenvalue)
# proportion of variance: share of the total variance summarized by each axis
# cumulative proportion: share of the total variance captured by the components up to and including that one
summary(pca.std)
Importance of components:
                          Comp.1    Comp.2    Comp.3    Comp.4     Comp.5
Standard deviation     1.5275814 1.2500436 1.0660519 1.0159031 0.97487915
Proportion of Variance 0.2339368 0.1566535 0.1139322 0.1034652 0.09527773
Cumulative Proportion  0.2339368 0.3905903 0.5045225 0.6079877 0.70326546
                           Comp.6     Comp.7     Comp.8     Comp.9   Comp.10
Standard deviation     0.89885072 0.83227281 0.74578367 0.70571686 0.6364497
Proportion of Variance 0.08099626 0.06944184 0.05575907 0.04992876 0.0406086
Cumulative Proportion  0.78426172 0.85370356 0.90946264 0.95939140 1.0000000
# loadings: weight of each variable in each component
loadings(pca.std)

Loadings:
         Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9 Comp.10
arroz     0.255         0.667  0.211         0.409  0.154  0.314  0.381        
cana     -0.208 -0.522  0.315  0.187  0.122  0.151  0.218 -0.671               
mandioca -0.179 -0.384         0.662  0.133 -0.257         0.501 -0.171        
milho     0.449 -0.297 -0.218  0.157 -0.210        -0.280         0.206 -0.685 
soja      0.419 -0.407 -0.239        -0.108        -0.105         0.288  0.695 
trigo     0.488               -0.106 -0.162  0.201  0.374  0.116 -0.726        
cacau    -0.178         0.224  0.215 -0.903 -0.137                       0.114 
café     -0.293 -0.354 -0.172 -0.422 -0.249         0.557  0.296  0.301 -0.151 
laranja  -0.103 -0.425  0.420 -0.456               -0.539  0.273 -0.222        
uva       0.341         0.292 -0.114        -0.814  0.292         0.148        

               Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
SS loadings       1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0
Proportion Var    0.1    0.1    0.1    0.1    0.1    0.1    0.1    0.1    0.1
Cumulative Var    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
               Comp.10
SS loadings        1.0
Proportion Var     0.1
Cumulative Var     1.0
# eigenvalues: variance captured by each axis; dividing by their sum gives the proportion explained
eigenvals(pca.std) 
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6    Comp.7    Comp.8 
2.3335048 1.5626089 1.1364666 1.0320592 0.9503894 0.8079326 0.6926780 0.5561933 
   Comp.9   Comp.10 
0.4980363 0.4050683 
bstick(pca.std) # eigenvalues expected under the broken-stick null model
    Comp.1     Comp.2     Comp.3     Comp.4     Comp.5     Comp.6     Comp.7 
2.92162748 1.92413375 1.42538688 1.09288897 0.84351553 0.64401679 0.47776783 
    Comp.8     Comp.9    Comp.10 
0.33526873 0.21058201 0.09974937 
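# broken-stick criterion: a component is interpreted when its eigenvalue exceeds the null-model value
# here Comp.1 to Comp.4 fall below it (e.g. 2.33 < 2.92), so the leading axes do not pass this criterion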


biplot(pca.std)

# Pearson correlation between the Comp.1 and Comp.2 scores: zero up to floating-point error, because principal components are uncorrelated by construction

cor(pca.std$scores[,1],pca.std$scores[,2]) 
[1] 1.823879e-17
#pca.std$scores 
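The broken-stick comparison above can be reproduced by hand. This is a sketch using the eigenvalues printed by `eigenvals(pca.std)`; the formula is the usual broken-stick expectation rescaled by the total variance, which reproduces the printed `bstick()` values to rounding.

```r
# Eigenvalues copied from the eigenvals(pca.std) output above
eig <- c(2.3335048, 1.5626089, 1.1364666, 1.0320592, 0.9503894,
         0.8079326, 0.6926780, 0.5561933, 0.4980363, 0.4050683)
p <- length(eig)

# Broken-stick expectation for component k: (1/p) * sum_{i=k}^{p} 1/i,
# rescaled by the total variance so it is comparable to the eigenvalues
bs <- rev(cumsum(1 / (p:1))) / p * sum(eig)

data.frame(component    = seq_len(p),
           eigenvalue   = eig,
           broken_stick = round(bs, 4),
           exceeds      = eig > bs)
```

With these numbers the first four eigenvalues are below the broken-stick values, and only Comp.5 onward exceed them.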

New attempt: without cocoa (cacau) and adding beans (feijao)

cultivos_new<-readr::read_delim("cultivos por mun PR_edit2.csv")
Rows: 399 Columns: 11
── Column specification ────────────────────────────────────────────────────────
Delimiter: ";"
chr  (1): municipio
dbl (10): arroz, cana, mandioca, milho, soja, trigo, café, laranja, uva, feijao

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
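# municipio holds municipality names in this file, so coercing it to numeric yields NA (hence the warning); the column is dropped in the next step
dados.num <- mutate_if(cultivos_new, is.character, as.numeric)
Warning: There was 1 warning in `mutate()`.
ℹ In argument: `municipio = .Primitive("as.double")(municipio)`.
Caused by warning:
! NAs introduced by coercion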
# log(x + 1) transformation; [-1] drops the municipio column
cultivos.log <- cbind(log(dados.num[-1] + 1))

# standardization to zero mean and unit variance
cultivos.std <- decostand(cultivos.log, method = "stand")

# check: the sd of every variable should be 1
sd(cultivos.std$arroz)
[1] 1
sd(cultivos.std$laranja)
[1] 1
sd(cultivos.std$feijao)
[1] 1
pca.std<-princomp(cultivos.std)


pca.std
Call:
princomp(x = cultivos.std)

Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6    Comp.7    Comp.8 
1.5792313 1.2478330 1.1217892 1.0138262 0.9278813 0.8477622 0.7731567 0.7474448 
   Comp.9   Comp.10 
0.7032528 0.6379262 

 10  variables and  399 observations.
# standard deviation: spread of the scores along each component (square root of its eigenvalue)
# proportion of variance: share of the total variance summarized by each axis
# cumulative proportion: share of the total variance captured by the components up to and including that one
summary(pca.std)
Importance of components:
                          Comp.1    Comp.2    Comp.3    Comp.4    Comp.5
Standard deviation     1.5792313 1.2478330 1.1217892 1.0138262 0.9278813
Proportion of Variance 0.2500238 0.1561000 0.1261573 0.1030426 0.0863127
Cumulative Proportion  0.2500238 0.4061237 0.5322810 0.6353236 0.7216363
                           Comp.6     Comp.7     Comp.8     Comp.9    Comp.10
Standard deviation     0.84776224 0.77315666 0.74744484 0.70325282 0.63792615
Proportion of Variance 0.07205066 0.05992732 0.05600775 0.04958071 0.04079723
Cumulative Proportion  0.79368699 0.85361431 0.90962206 0.95920277 1.00000000
# loadings: weight of each variable in each component
loadings(pca.std)

Loadings:
         Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9 Comp.10
arroz     0.305         0.554  0.223         0.181  0.540  0.373  0.267        
cana     -0.176 -0.518  0.355  0.211                0.240 -0.665         0.109 
mandioca -0.185 -0.381         0.672 -0.149 -0.210 -0.190  0.475 -0.171        
milho     0.407 -0.317 -0.330  0.103         0.231 -0.135         0.285  0.676 
soja      0.363 -0.418 -0.338                0.147  0.128         0.219 -0.694 
trigo     0.476               -0.138  0.207 -0.222  0.210        -0.769        
café     -0.284 -0.347        -0.455  0.302 -0.512  0.199  0.320  0.282  0.120 
laranja         -0.423  0.362 -0.447 -0.287  0.474 -0.258  0.255 -0.211        
uva       0.327               -0.116 -0.783 -0.473                0.152        
feijao    0.352         0.440         0.355 -0.287 -0.654         0.154 -0.134 

               Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
SS loadings       1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0
Proportion Var    0.1    0.1    0.1    0.1    0.1    0.1    0.1    0.1    0.1
Cumulative Var    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
               Comp.10
SS loadings        1.0
Proportion Var     0.1
Cumulative Var     1.0
# eigenvalues: variance captured by each axis; dividing by their sum gives the proportion explained
eigenvals(pca.std) 
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6    Comp.7    Comp.8 
2.4939716 1.5570873 1.2584110 1.0278435 0.8609637 0.7187008 0.5977712 0.5586738 
   Comp.9   Comp.10 
0.4945645 0.4069498 
bstick(pca.std) # eigenvalues expected under the broken-stick null model
    Comp.1     Comp.2     Comp.3     Comp.4     Comp.5     Comp.6     Comp.7 
2.92162748 1.92413375 1.42538688 1.09288897 0.84351553 0.64401679 0.47776783 
    Comp.8     Comp.9    Comp.10 
0.33526873 0.21058201 0.09974937 
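# broken-stick criterion: a component is interpreted when its eigenvalue exceeds the null-model value
# here Comp.1 to Comp.4 fall below it (e.g. 2.49 < 2.92), so the leading axes do not pass this criterion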


biplot(pca.std)

# Pearson correlation between the Comp.1 and Comp.2 scores: zero up to floating-point error, because principal components are uncorrelated by construction (not because the data were standardized and log-transformed)

cor(pca.std$scores[,1],pca.std$scores[,2]) 
[1] 8.96492e-17
#pca.std$scores
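The near-zero correlation between component scores does not depend on how correlated the input variables are. A small sketch on simulated data (not the census table) illustrates this: two inputs are strongly correlated, yet the scores of the first two components are not.

```r
set.seed(42)  # simulated data, not the census table
n  <- 200
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.5)  # built to correlate strongly with x1
x3 <- rnorm(n)
dat <- scale(cbind(x1, x2, x3))

r_inputs <- cor(dat[, "x1"], dat[, "x2"])
r_inputs   # clearly non-zero

pca <- princomp(dat)
r_scores <- cor(pca$scores[, 1], pca$scores[, 2])
r_scores   # zero up to floating-point error
```

The check is therefore a sanity test of the PCA itself, and it says nothing about redundancy among the original crops; the correlation matrix of the raw variables is the place to look for that.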

Stepping back: check for possible correlation among the variables

cultivos_num<-readr::read_delim("cultivos por mun PR_edit3.csv")
Rows: 399 Columns: 11
── Column specification ────────────────────────────────────────────────────────
Delimiter: ";"
dbl (11): municipio, arroz, cana, mandioca, milho, soja, trigo, café, laranj...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
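# note: municipio is a numeric ID in this file, so its row and column in the matrix are not meaningful
cor2<-cor(cultivos_num)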
cor2
            municipio        arroz        cana     mandioca        milho
municipio  1.00000000  0.023677423  0.03625259  0.078877131 -0.020736036
arroz      0.02367742  1.000000000 -0.02387765 -0.006789495 -0.008719485
cana       0.03625259 -0.023877653  1.00000000  0.310009773 -0.239604681
mandioca   0.07887713 -0.006789495  0.31000977  1.000000000 -0.179414937
milho     -0.02073604 -0.008719485 -0.23960468 -0.179414937  1.000000000
soja       0.01291827 -0.061174944 -0.05815697 -0.180677302  0.575027528
trigo     -0.04465805 -0.020602614 -0.25380511 -0.354991887  0.563563984
café       0.01785955 -0.042380786  0.07552606 -0.022911559 -0.011784067
laranja    0.08505643 -0.036783876  0.08880063  0.029528183 -0.008225061
uva       -0.09037241  0.042667902 -0.04296605 -0.139536963  0.125972612
feijao    -0.01923369  0.053294214 -0.14820777 -0.192037252  0.321807945
                  soja        trigo        café      laranja         uva
municipio  0.012918274 -0.044658046  0.01785955  0.085056430 -0.09037241
arroz     -0.061174944 -0.020602614 -0.04238079 -0.036783876  0.04266790
cana      -0.058156970 -0.253805110  0.07552606  0.088800633 -0.04296605
mandioca  -0.180677302 -0.354991887 -0.02291156  0.029528183 -0.13953696
milho      0.575027528  0.563563984 -0.01178407 -0.008225061  0.12597261
soja       1.000000000  0.526352539 -0.02186843  0.007833834  0.11313889
trigo      0.526352539  1.000000000 -0.08077504  0.009606607  0.16421666
café      -0.021868435 -0.080775041  1.00000000  0.052642446 -0.04427910
laranja    0.007833834  0.009606607  0.05264245  1.000000000 -0.05077613
uva        0.113138885  0.164216662 -0.04427910 -0.050776127  1.00000000
feijao     0.263962162  0.354718477 -0.05657509 -0.051731091  0.16334022
               feijao
municipio -0.01923369
arroz      0.05329421
cana      -0.14820777
mandioca  -0.19203725
milho      0.32180794
soja       0.26396216
trigo      0.35471848
café      -0.05657509
laranja   -0.05173109
uva        0.16334022
feijao     1.00000000
library(corrplot)
Warning: package 'corrplot' was built under R version 4.3.1
corrplot 0.92 loaded
corrplot(cor2, method="circle")

Maize (milho), soybean (soja), and wheat (trigo) are correlated with one another (r between 0.53 and 0.58).
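The strongest pairs can also be pulled out of the correlation matrix programmatically rather than read off the plot. A sketch using a sub-matrix copied (rounded) from the `cor2` output above; the 0.5 cut-off is a hypothetical threshold chosen for illustration, not one stated in the analysis.

```r
# Sub-matrix copied (rounded to 3 decimals) from the cor2 output above
vars <- c("milho", "soja", "trigo", "feijao")
r <- matrix(c(1.000, 0.575, 0.564, 0.322,
              0.575, 1.000, 0.526, 0.264,
              0.564, 0.526, 1.000, 0.355,
              0.322, 0.264, 0.355, 1.000),
            nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# Keep each pair once (upper triangle) and list those with |r| > 0.5
idx <- which(upper.tri(r) & abs(r) > 0.5, arr.ind = TRUE)
strong_pairs <- data.frame(var1 = rownames(r)[idx[, "row"]],
                           var2 = colnames(r)[idx[, "col"]],
                           r    = r[idx])
strong_pairs
```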

PCA without wheat and soybean

  1. The correlation of maize with wheat and with soybean was moderate (r = 0.56 and r = 0.58, respectively).

  2. The wheat and soybean data were removed.
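The analysis reads a separately edited CSV file for each subset of crops. An alternative, sketched here on a toy table with made-up values, is to drop the columns in R so that every subset derives from a single source file. The object names `cultivos`, `drop_vars`, and `cultivos_sts_alt` are hypothetical.

```r
# Toy table with crop-style column names (all values are made up)
cultivos <- data.frame(
  municipio = 1:3,
  milho = c(4000, 1500, 6000),
  soja  = c(3000, 2500, 3200),
  trigo = c(2000, 0, 2800),
  uva   = c(0, 0, 7000)
)

# Remove the variables judged redundant, keeping everything else
drop_vars <- c("trigo", "soja")
cultivos_sts_alt <- cultivos[, !(names(cultivos) %in% drop_vars), drop = FALSE]

names(cultivos_sts_alt)
```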

cultivos_sts<-readr::read_delim("cultivos por mun PR_edit4.csv")
Rows: 399 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ";"
dbl (9): municipio, arroz, cana, mandioca, milho, café, laranja, uva, feijao

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
str(cultivos_sts)
spc_tbl_ [399 × 9] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ municipio: num [1:399] 1 2 3 4 5 6 7 8 9 10 ...
 $ arroz    : num [1:399] 496 959 248 0 496 ...
 $ cana     : num [1:399] 34846 26137 0 0 1136 ...
 $ mandioca : num [1:399] 5384 2908 1197 3910 3127 ...
 $ milho    : num [1:399] 4596 1701 5912 5907 5455 ...
 $ café     : num [1:399] 2056 236 0 0 0 ...
 $ laranja  : num [1:399] 15625 6128 0 0 8683 ...
 $ uva      : num [1:399] 0 0 7749 2066 789 ...
 $ feijao   : num [1:399] 868 358 1409 957 957 ...
 - attr(*, "spec")=
  .. cols(
  ..   municipio = col_double(),
  ..   arroz = col_double(),
  ..   cana = col_double(),
  ..   mandioca = col_double(),
  ..   milho = col_double(),
  ..   café = col_double(),
  ..   laranja = col_double(),
  ..   uva = col_double(),
  ..   feijao = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 
cor3<-cor(cultivos_sts)
cor3
            municipio        arroz        cana     mandioca        milho
municipio  1.00000000  0.023677423  0.03625259  0.078877131 -0.020736036
arroz      0.02367742  1.000000000 -0.02387765 -0.006789495 -0.008719485
cana       0.03625259 -0.023877653  1.00000000  0.310009773 -0.239604681
mandioca   0.07887713 -0.006789495  0.31000977  1.000000000 -0.179414937
milho     -0.02073604 -0.008719485 -0.23960468 -0.179414937  1.000000000
café       0.01785955 -0.042380786  0.07552606 -0.022911559 -0.011784067
laranja    0.08505643 -0.036783876  0.08880063  0.029528183 -0.008225061
uva       -0.09037241  0.042667902 -0.04296605 -0.139536963  0.125972612
feijao    -0.01923369  0.053294214 -0.14820777 -0.192037252  0.321807945
                 café      laranja         uva      feijao
municipio  0.01785955  0.085056430 -0.09037241 -0.01923369
arroz     -0.04238079 -0.036783876  0.04266790  0.05329421
cana       0.07552606  0.088800633 -0.04296605 -0.14820777
mandioca  -0.02291156  0.029528183 -0.13953696 -0.19203725
milho     -0.01178407 -0.008225061  0.12597261  0.32180794
café       1.00000000  0.052642446 -0.04427910 -0.05657509
laranja    0.05264245  1.000000000 -0.05077613 -0.05173109
uva       -0.04427910 -0.050776127  1.00000000  0.16334022
feijao    -0.05657509 -0.051731091  0.16334022  1.00000000
library(corrplot)
corrplot(cor3, method="circle")

# log(x + 1) transformation; [-1] drops the municipio column
cultivos.log_sts <- cbind(log(cultivos_sts[-1] + 1))

# standardization to zero mean and unit variance
cultivos.std_sts <- decostand(cultivos.log_sts, method = "stand")

# check: the sd of every variable should be 1
sd(cultivos.std_sts$arroz)
[1] 1
sd(cultivos.std_sts$laranja)
[1] 1
sd(cultivos.std_sts$feijao)
[1] 1
pca.std_sts<-princomp(cultivos.std_sts)

pca.std_sts
Call:
princomp(x = cultivos.std_sts)

Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6    Comp.7    Comp.8 
1.3944369 1.1809533 1.0260774 0.9779766 0.8971533 0.8357102 0.7560618 0.7460876 

 8  variables and  399 observations.
# standard deviation: spread of the scores along each component (square root of its eigenvalue)
# proportion of variance: share of the total variance summarized by each axis
# cumulative proportion: share of the total variance captured by the components up to and including that one
summary(pca.std_sts)
Importance of components:
                          Comp.1    Comp.2    Comp.3    Comp.4    Comp.5
Standard deviation     1.3944369 1.1809533 1.0260774 0.9779766 0.8971533
Proportion of Variance 0.2436675 0.1747694 0.1319350 0.1198552 0.1008633
Cumulative Proportion  0.2436675 0.4184368 0.5503719 0.6702270 0.7710903
                           Comp.6     Comp.7     Comp.8
Standard deviation     0.83571019 0.75606183 0.74608758
Proportion of Variance 0.08752079 0.07163322 0.06975566
Cumulative Proportion  0.85861112 0.93024434 1.00000000
# loadings: weight of each variable in each component
loadings(pca.std_sts)

Loadings:
         Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
arroz     0.413  0.351  0.130  0.454  0.110  0.135  0.573  0.353
cana     -0.316  0.576 -0.102  0.211                0.163 -0.692
mandioca -0.297  0.257 -0.692  0.170  0.205 -0.117 -0.229  0.479
milho     0.292  0.195 -0.490 -0.477 -0.536  0.249  0.238       
café     -0.461  0.131  0.272 -0.232 -0.359 -0.524  0.369  0.320
laranja  -0.191  0.514  0.394 -0.349  0.123  0.519 -0.283  0.239
uva       0.368  0.185        -0.522  0.588 -0.450              
feijao    0.414  0.355  0.152  0.211 -0.407 -0.393 -0.561       

               Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
Proportion Var  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
Cumulative Var  0.125  0.250  0.375  0.500  0.625  0.750  0.875  1.000
# eigenvalues: variance captured by each axis; dividing by their sum gives the proportion explained
eigenvals(pca.std_sts) 
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6    Comp.7    Comp.8 
1.9444542 1.3946508 1.0528349 0.9564382 0.8048841 0.6984115 0.5716295 0.5566467 
bstick(pca.std_sts) # eigenvalues expected under the broken-stick null model
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6    Comp.7    Comp.8 
2.7110455 1.7135517 1.2148049 0.8823070 0.6329335 0.4334348 0.2671858 0.1246867 
biplot(pca.std_sts)

# Pearson correlation between the Comp.1 and Comp.2 scores: zero up to floating-point error, because principal components are uncorrelated by construction

cor(pca.std_sts$scores[,1],pca.std_sts$scores[,2]) 
[1] -1.561922e-16
#pca.std_sts$scores
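The eigenvalues above sum to slightly less than 8 even though each of the 8 variables was standardized to unit variance. A sketch of why, using the printed eigenvalues: `princomp()` computes variances with divisor n, while the standardization used the usual n - 1 divisor.

```r
# Eigenvalues copied from the eigenvals(pca.std_sts) output above
eig <- c(1.9444542, 1.3946508, 1.0528349, 0.9564382,
         0.8048841, 0.6984115, 0.5716295, 0.5566467)
p <- length(eig)  # 8 variables
n <- 399          # municipalities

# With divisor n in the PCA and n - 1 in the standardization,
# the total variance is p * (n - 1) / n rather than exactly p
total_expected <- p * (n - 1) / n
c(sum_eigenvalues = sum(eig), expected = total_expected)

# Proportion of variance = eigenvalue / total variance
prop <- eig / sum(eig)
round(prop, 4)
```

The proportions are unaffected by the divisor, since it cancels in the ratio; only the absolute eigenvalues shift by the factor (n - 1) / n.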

Beans (feijao) were also correlated with maize (milho), though more weakly (r = 0.32).

PCA without wheat, soybean, and beans

cultivos_sf<-readr::read_delim("cultivos por mun PR_edit5.csv")
Rows: 399 Columns: 8
── Column specification ────────────────────────────────────────────────────────
Delimiter: ";"
dbl (8): municipio, arroz, cana, mandioca, milho, café, laranja, uva

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
cor4<-cor(cultivos_sf)
cor4
            municipio        arroz        cana     mandioca        milho
municipio  1.00000000  0.023677423  0.03625259  0.078877131 -0.020736036
arroz      0.02367742  1.000000000 -0.02387765 -0.006789495 -0.008719485
cana       0.03625259 -0.023877653  1.00000000  0.310009773 -0.239604681
mandioca   0.07887713 -0.006789495  0.31000977  1.000000000 -0.179414937
milho     -0.02073604 -0.008719485 -0.23960468 -0.179414937  1.000000000
café       0.01785955 -0.042380786  0.07552606 -0.022911559 -0.011784067
laranja    0.08505643 -0.036783876  0.08880063  0.029528183 -0.008225061
uva       -0.09037241  0.042667902 -0.04296605 -0.139536963  0.125972612
                 café      laranja         uva
municipio  0.01785955  0.085056430 -0.09037241
arroz     -0.04238079 -0.036783876  0.04266790
cana       0.07552606  0.088800633 -0.04296605
mandioca  -0.02291156  0.029528183 -0.13953696
milho     -0.01178407 -0.008225061  0.12597261
café       1.00000000  0.052642446 -0.04427910
laranja    0.05264245  1.000000000 -0.05077613
uva       -0.04427910 -0.050776127  1.00000000
corrplot(cor4, method="circle")

# log(x + 1) transformation; [-1] drops the municipio column
cultivos.log_sf <- cbind(log(cultivos_sf[-1] + 1))


# standardization to zero mean and unit variance
cultivos.std_sf <- decostand(cultivos.log_sf, method = "stand")

pca.std_sf<-princomp(cultivos.std_sf)


pca.std_sf
Call:
princomp(x = cultivos.std_sf)

Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6    Comp.7 
1.3312714 1.1292297 1.0198270 0.9672906 0.8710623 0.8024318 0.7461022 

 7  variables and  399 observations.
summary(pca.std_sf)
Importance of components:
                          Comp.1    Comp.2    Comp.3    Comp.4    Comp.5
Standard deviation     1.3312714 1.1292297 1.0198270 0.9672906 0.8710623
Proportion of Variance 0.2538195 0.1826234 0.1489515 0.1340003 0.1086651
Cumulative Proportion  0.2538195 0.4364429 0.5853943 0.7193946 0.8280597
                           Comp.6     Comp.7
Standard deviation     0.80243178 0.74610225
Proportion of Variance 0.09221637 0.07972389
Cumulative Proportion  0.92027611 1.00000000

Cassava (mandioca) was weakly correlated with sugarcane (cana) (r = 0.31).
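Each round so far repeats the same steps: drop some columns, apply log(x + 1), standardize, and run `princomp()`. A helper makes that sequence explicit and easier to rerun. This is a sketch on simulated data; the function name `run_pca` and its arguments are hypothetical, and base R's `scale()` stands in for vegan's `decostand()`.

```r
# Hypothetical helper: drop columns, log(x + 1), standardize, run the PCA,
# and return the cumulative proportion of variance per component
run_pca <- function(df, drop = character(0), id_col = "municipio") {
  keep <- setdiff(names(df), c(id_col, drop))
  x    <- scale(log(as.matrix(df[, keep, drop = FALSE]) + 1))
  pca  <- princomp(x)
  cumsum(pca$sdev^2) / sum(pca$sdev^2)
}

set.seed(7)  # simulated yields, not the census data
sim <- data.frame(
  municipio = 1:100,
  arroz = rexp(100, 1 / 500),
  milho = rexp(100, 1 / 5000),
  soja  = rexp(100, 1 / 3000),
  trigo = rexp(100, 1 / 2500),
  uva   = rexp(100, 1 / 1000)
)

full    <- run_pca(sim)
reduced <- run_pca(sim, drop = c("trigo", "soja"))
round(full, 3)
round(reduced, 3)
```

With fewer variables each component necessarily accounts for a larger share of the total, so a rise in the cumulative proportion after dropping variables is not by itself evidence of a better ordination.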

PCA without wheat, soybean, beans, and cassava

cultivos_sm<-readr::read_delim("cultivos por mun PR_edit6.csv")
Rows: 399 Columns: 7
── Column specification ────────────────────────────────────────────────────────
Delimiter: ";"
dbl (7): municipio, arroz, cana, milho, café, laranja, uva

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
cor5<-cor(cultivos_sm)
cor5
            municipio        arroz        cana        milho        café
municipio  1.00000000  0.023677423  0.03625259 -0.020736036  0.01785955
arroz      0.02367742  1.000000000 -0.02387765 -0.008719485 -0.04238079
cana       0.03625259 -0.023877653  1.00000000 -0.239604681  0.07552606
milho     -0.02073604 -0.008719485 -0.23960468  1.000000000 -0.01178407
café       0.01785955 -0.042380786  0.07552606 -0.011784067  1.00000000
laranja    0.08505643 -0.036783876  0.08880063 -0.008225061  0.05264245
uva       -0.09037241  0.042667902 -0.04296605  0.125972612 -0.04427910
               laranja         uva
municipio  0.085056430 -0.09037241
arroz     -0.036783876  0.04266790
cana       0.088800633 -0.04296605
milho     -0.008225061  0.12597261
café       0.052642446 -0.04427910
laranja    1.000000000 -0.05077613
uva       -0.050776127  1.00000000
corrplot(cor5, method="circle")

# log(x + 1) transformation; [-1] drops the municipio column
cultivos.log_sm <- cbind(log(cultivos_sm[-1] + 1))


# standardization to zero mean and unit variance
cultivos.std_sm <- decostand(cultivos.log_sm, method = "stand")

pca.std_sm<-princomp(cultivos.std_sm)


pca.std_sm
Call:
princomp(x = cultivos.std_sm)

Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6 
1.2949127 1.1190960 0.9683417 0.9048732 0.8215723 0.7901426 

 6  variables and  399 observations.
summary(pca.std_sm)
Importance of components:
                          Comp.1    Comp.2    Comp.3    Comp.4    Comp.5
Standard deviation     1.2949127 1.1190960 0.9683417 0.9048732 0.8215723
Proportion of Variance 0.2801687 0.2092537 0.1566736 0.1368088 0.1127795
Cumulative Proportion  0.2801687 0.4894224 0.6460960 0.7829048 0.8956843
                          Comp.6
Standard deviation     0.7901426
Proportion of Variance 0.1043157
Cumulative Proportion  1.0000000
biplot(pca.std_sm)

Sugarcane (cana) was weakly correlated with maize (milho) (r = -0.24).

PCA without wheat, soybean, beans, cassava, and sugarcane

cultivos_sc<-readr::read_delim("cultivos por mun PR_edit7.csv")
Rows: 399 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ";"
dbl (6): municipio, arroz, milho, café, laranja, uva

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
cor6<-cor(cultivos_sc)
cor6
            municipio        arroz        milho        café      laranja
municipio  1.00000000  0.023677423 -0.020736036  0.01785955  0.085056430
arroz      0.02367742  1.000000000 -0.008719485 -0.04238079 -0.036783876
milho     -0.02073604 -0.008719485  1.000000000 -0.01178407 -0.008225061
café       0.01785955 -0.042380786 -0.011784067  1.00000000  0.052642446
laranja    0.08505643 -0.036783876 -0.008225061  0.05264245  1.000000000
uva       -0.09037241  0.042667902  0.125972612 -0.04427910 -0.050776127
                  uva
municipio -0.09037241
arroz      0.04266790
milho      0.12597261
café      -0.04427910
laranja   -0.05077613
uva        1.00000000
corrplot(cor6, method="circle")

# log(x + 1) transformation; [-1] drops the municipio column
cultivos.log_sc <- cbind(log(cultivos_sc[-1] + 1))


# standardization to zero mean and unit variance
cultivos.std_sc <- decostand(cultivos.log_sc, method = "stand")

pca.std_sc<-princomp(cultivos.std_sc)

loadings(pca.std_sc)

Loadings:
        Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
arroz    0.487  0.156  0.631  0.357  0.462
milho    0.413  0.213 -0.718  0.513       
café    -0.575  0.319 -0.165         0.735
laranja -0.192  0.828  0.195  0.156 -0.464
uva      0.474  0.378 -0.144 -0.765  0.161

               Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
SS loadings       1.0    1.0    1.0    1.0    1.0
Proportion Var    0.2    0.2    0.2    0.2    0.2
Cumulative Var    0.2    0.4    0.6    0.8    1.0
pca.std_sc
Call:
princomp(x = cultivos.std_sc)

Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5 
1.2501002 1.0450334 0.9588598 0.8811290 0.7980119 

 5  variables and  399 observations.
summary(pca.std_sc)
Importance of components:
                          Comp.1    Comp.2    Comp.3    Comp.4    Comp.5
Standard deviation     1.2501002 1.0450334 0.9588598 0.8811290 0.7980119
Proportion of Variance 0.3133354 0.2189677 0.1843444 0.1556678 0.1276846
Cumulative Proportion  0.3133354 0.5323031 0.7166476 0.8723154 1.0000000
biplot(pca.std_sc)
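To read the first axis of this last PCA, the variables can be ranked by the absolute size of their loadings. A sketch using the Comp.1 column of the `loadings(pca.std_sc)` output above; the name `cafe` is written without the accent only to keep the snippet plain ASCII (the column in the data is `café`).

```r
# Comp.1 loadings copied from the loadings(pca.std_sc) output above
comp1 <- c(arroz = 0.487, milho = 0.413, cafe = -0.575,
           laranja = -0.192, uva = 0.474)

# Each loading vector has unit length, so the squared loadings sum to about 1
sum(comp1^2)

# Variables ordered by the absolute size of their loading on Comp.1
ranked <- comp1[order(abs(comp1), decreasing = TRUE)]
ranked
```

On Comp.1, coffee loads negatively while rice, grape, and maize load positively, so the axis contrasts coffee with those three crops; orange contributes little.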

What now?

Components 1, 2, and 3 together = 72%

Components 1 and 2 together = 53%

Component 1 = 31%
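These percentages follow directly from the standard deviations printed for `pca.std_sc`. A sketch that recomputes them and finds how many components are needed to reach a target share of variance; the 70% target is a hypothetical threshold, not one taken from the analysis.

```r
# Standard deviations copied from the pca.std_sc output above
sdev <- c(1.2501002, 1.0450334, 0.9588598, 0.8811290, 0.7980119)

# Cumulative proportion of variance, in percent
cum_prop <- cumsum(sdev^2) / sum(sdev^2)
round(100 * cum_prop, 1)

# Smallest number of components reaching the target share of variance
target <- 0.70  # hypothetical threshold
k <- which(cum_prop >= target)[1]
k
```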