HTML de saída dados Píaui

Para iniciar a nossa ánalise, iremos carregar nossos pacotes.

library(geoR)
library(MASS)
library(moments)
library(fBasics)
library(akima)
library(readxl)
library(skimr)
library(terra)
library(rmdformats)

Importando os dados

solo = read_xlsx("Dados.xlsx")
head(solo)
## # A tibble: 6 × 9
##     X...1    Y...2 X...3 Y...4  X_km  Y_km    Ca Carbono    pH
##     <dbl>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl> <dbl>
## 1 468785. 9157565.  85.5  147. -7.62  56.7 0       1.50   4.22
## 2 460290. 9136197.  85.5  147. -7.81  56.6 0       0.908  4.4 
## 3 733695. 9071127.  85.5  147. -8.40  59.1 0       0.832  3.8 
## 4 789261. 8962017.  85.5  147. -9.38  59.6 0.532   0.397  5.16
## 5 783997. 8965400.  85.5  147. -9.35  59.6 1.72    0.457  5.45
## 6 772256. 8981080.  85.5  147. -9.21  59.5 0.835   0.525  5.43

Explorando os dados

dim(solo)
## [1] 243   9
str(solo)
## tibble [243 × 9] (S3: tbl_df/tbl/data.frame)
##  $ X...1  : num [1:243] 468785 460290 733695 789261 783997 ...
##  $ Y...2  : num [1:243] 9157565 9136197 9071127 8962017 8965400 ...
##  $ X...3  : num [1:243] 85.5 85.5 85.5 85.5 85.5 ...
##  $ Y...4  : num [1:243] 147 147 147 147 147 ...
##  $ X_km   : num [1:243] -7.62 -7.81 -8.4 -9.38 -9.35 ...
##  $ Y_km   : num [1:243] 56.7 56.6 59.1 59.6 59.6 ...
##  $ Ca     : num [1:243] 0 0 0 0.532 1.716 ...
##  $ Carbono: num [1:243] 1.496 0.908 0.832 0.397 0.457 ...
##  $ pH     : num [1:243] 4.22 4.4 3.8 5.16 5.45 5.43 4.97 5 5.77 5.51 ...
skim(solo)
Data summary
Name solo
Number of rows 243
Number of columns 9
_______________________
Column type frequency:
numeric 9
________________________
Group variables None

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
X…1 0 1 738572.54 161806.63 420797.88 575551.50 787054.27 871241.31 985985.89 ▅▅▃▇▆
Y…2 0 1 9216843.17 247552.56 8806075.86 9034581.69 9157607.27 9441701.97 9686915.78 ▃▇▅▂▅
X…3 0 1 85.53 0.00 85.52 85.52 85.53 85.53 85.53 ▃▅▂▆▇
Y…4 0 1 146.82 0.03 146.77 146.79 146.81 146.84 146.87 ▃▇▅▂▅
X_km 0 1 -6.69 6.35 -10.80 -8.73 -7.60 -5.02 85.53 ▇▁▁▁▁
Y_km 0 1 59.52 5.81 56.28 57.71 59.60 60.36 146.86 ▇▁▁▁▁
Ca 0 1 1.64 3.01 0.00 0.00 0.40 1.90 22.89 ▇▁▁▁▁
Carbono 0 1 0.87 0.68 0.01 0.38 0.64 1.28 2.50 ▇▇▂▂▂
pH 0 1 4.72 1.08 3.11 3.80 4.55 5.42 8.57 ▇▇▅▂▁
class(solo)
## [1] "tbl_df"     "tbl"        "data.frame"
#View(Solo)

Análise Descritiva Profunda

geosolo <- as.geodata(solo, coords.col=c(1, 2), data.col=7)
class(geosolo)
## [1] "geodata"
attach(geosolo)
skim(geosolo)
Data summary
Name geosolo
Number of rows 243
Number of columns 3
_______________________
Column type frequency:
numeric 3
________________________
Group variables None

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
X…1 0 1 738572.54 161806.63 420797.9 575551.5 787054.3 871241.3 985985.89 ▅▅▃▇▆
Y…2 0 1 9216843.17 247552.56 8806075.9 9034581.7 9157607.3 9441702.0 9686915.78 ▃▇▅▂▅
data 0 1 1.64 3.01 0.0 0.0 0.4 1.9 22.89 ▇▁▁▁▁
plot(geosolo, low = T, trend = "2nd")

Podemos notar pelos dados que não temos tendência nos nossos dados.

Testando a tendencia dos dados para coordenada X

geosolo <- as.geodata(solo, coords.col=c(1, 2), data.col=7)
class(geosolo)
## [1] "geodata"
attach(geosolo)
skim(geosolo)
Data summary
Name geosolo
Number of rows 243
Number of columns 3
_______________________
Column type frequency:
numeric 3
________________________
Group variables None

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
X…1 0 1 738572.54 161806.63 420797.9 575551.5 787054.3 871241.3 985985.89 ▅▅▃▇▆
Y…2 0 1 9216843.17 247552.56 8806075.9 9034581.7 9157607.3 9441702.0 9686915.78 ▃▇▅▂▅
data 0 1 1.64 3.01 0.0 0.0 0.4 1.9 22.89 ▇▁▁▁▁
plot(geosolo, low = T, trend = ~X...1)

Colocando a relação a coordenada X podemos notar que nossos dados começam a apresentar tendência

Testando a tendencia dos dados para coordenada Y

geosolo <- as.geodata(solo, coords.col=c(1, 2), data.col=7)
class(geosolo)
## [1] "geodata"
attach(geosolo)
skim(geosolo)
Data summary
Name geosolo
Number of rows 243
Number of columns 3
_______________________
Column type frequency:
numeric 3
________________________
Group variables None

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
X…1 0 1 738572.54 161806.63 420797.9 575551.5 787054.3 871241.3 985985.89 ▅▅▃▇▆
Y…2 0 1 9216843.17 247552.56 8806075.9 9034581.7 9157607.3 9441702.0 9686915.78 ▃▇▅▂▅
data 0 1 1.64 3.01 0.0 0.0 0.4 1.9 22.89 ▇▁▁▁▁
plot(geosolo, low = T, trend = ~Y...2)

Na cordenada Y podemos notar uma situação um pouco melhor que na coordenada X, entretanto, a partir das distâncias finais apresentam têndencia.

Testando a tendencia dos dados para coordenada X + Y

geosolo <- as.geodata(solo, coords.col=c(1, 2), data.col=7)
class(geosolo)
## [1] "geodata"
attach(geosolo)
## The following objects are masked from geosolo (pos = 3):
## 
##     coords, data
## The following objects are masked from geosolo (pos = 4):
## 
##     coords, data
## The following objects are masked from geosolo (pos = 5):
## 
##     coords, data
skim(geosolo)
Data summary
Name geosolo
Number of rows 243
Number of columns 3
_______________________
Column type frequency:
numeric 3
________________________
Group variables None

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
X…1 0 1 738572.54 161806.63 420797.9 575551.5 787054.3 871241.3 985985.89 ▅▅▃▇▆
Y…2 0 1 9216843.17 247552.56 8806075.9 9034581.7 9157607.3 9441702.0 9686915.78 ▃▇▅▂▅
data 0 1 1.64 3.01 0.0 0.0 0.4 1.9 22.89 ▇▁▁▁▁
plot(geosolo, low = T, trend = ~X...1 + Y...2)

Podemos notar uma leve têdencia para as coordenadas X & Y.

Colocando a relação as coordenadas X & Y podemos notar que nossos dados começam a apresentar tendência

Definindo uma borda

borda = read.table('bord.txt')
geosolo$borders <- with(borda, cbind(V1,V2))
points(geosolo, col=gray(seq(1,0.1,l=90)), xlab="Sul", ylab="oeste")
polygon(borda)

Descritivas

Mean_ <- mean(solo$Ca)
Minimum_ <- min(solo$Ca)
Maximum_ <- max(solo$Ca)
SD_ <- sd(solo$Ca)
CoV_ <- 100*(sd(solo$Ca)/mean(solo$Ca))
Kur_ <- kurtosis(solo$Ca)
Skew_ <- skewness(solo$Ca)
Analise_descritiva <- data.frame(Mean_, Minimum_, Maximum_, SD_, CoV_, Kur_, Skew_)
Analise_descritiva
##      Mean_ Minimum_ Maximum_      SD_     CoV_    Kur_    Skew_
## 1 1.644601        0    22.89 3.005855 182.7711 18.2141 3.652054

Mais manipulações

#View(geosolo)
cord = solo[3:4]
#View(cord)
coords = data.matrix(cord)
#View(coords)

Variograma

v1 = variog(geosolo, coords = coords , max.dist=300000)
## variog: computing omnidirectional variogram
plot(v1)

Ajuste (visual) de modelo para o variograma

#ef1_circular = eyefit(v1)
#summary(ef1_circular)

Estimação dos Parâmetros

ml1  <- likfit(geosolo, ini=c(0.07,400), nug=0.05,cov.model="spherical")
## kappa not used for the spherical correlation function
## ---------------------------------------------------------------
## likfit: likelihood maximisation using the function optim.
## likfit: Use control() to pass additional
##          arguments for the maximisation function.
##         For further details see documentation for optim.
## likfit: It is highly advisable to run this function several
##         times with different initial values for the parameters.
## likfit: WARNING: This step can be time demanding!
## ---------------------------------------------------------------
## likfit: end of numerical maximisation.
ml1
## likfit: estimated model parameters:
##       beta      tausq    sigmasq        phi 
## "   1.609" "   0.000" "   8.892" "7698.396" 
## Practical Range with cor=0.05 for asymptotic range: 7698.396
## 
## likfit: maximised log-likelihood = -604.6
ml2  <- likfit(geosolo, ini=c(0.05,400), nug=0.05, kappa=0.5, cov.model="matern")
## ---------------------------------------------------------------
## likfit: likelihood maximisation using the function optim.
## likfit: Use control() to pass additional
##          arguments for the maximisation function.
##         For further details see documentation for optim.
## likfit: It is highly advisable to run this function several
##         times with different initial values for the parameters.
## likfit: WARNING: This step can be time demanding!
## ---------------------------------------------------------------
## likfit: end of numerical maximisation.
ml2
## likfit: estimated model parameters:
##       beta      tausq    sigmasq        phi 
## "   1.575" "   0.000" "   9.108" "5507.003" 
## Practical Range with cor=0.05 for asymptotic range: 16497.51
## 
## likfit: maximised log-likelihood = -604.2
ml3  <- likfit(geosolo, ini=c(0.05,400), nug=0.05, cov.model="gaussian")
## kappa not used for the gaussian correlation function
## ---------------------------------------------------------------
## likfit: likelihood maximisation using the function optim.
## likfit: Use control() to pass additional
##          arguments for the maximisation function.
##         For further details see documentation for optim.
## likfit: It is highly advisable to run this function several
##         times with different initial values for the parameters.
## likfit: WARNING: This step can be time demanding!
## ---------------------------------------------------------------
## likfit: end of numerical maximisation.
ml3
## likfit: estimated model parameters:
##        beta       tausq     sigmasq         phi 
## "   1.6259" "   0.0293" "   8.9207" "2347.7468" 
## Practical Range with cor=0.05 for asymptotic range: 4063.523
## 
## likfit: maximised log-likelihood = -605

Checando os modelos

print("====================")
## [1] "===================="
print("=========ML1=======")
## [1] "=========ML1======="
print("====================")
## [1] "===================="
summary(ml1)
## Summary of the parameter estimation
## -----------------------------------
## Estimation method: maximum likelihood 
## 
## Parameters of the mean component (trend):
##   beta 
## 1.6095 
## 
## Parameters of the spatial component:
##    correlation function: spherical
##       (estimated) variance parameter sigmasq (partial sill) =  8.892
##       (estimated) cor. fct. parameter phi (range parameter)  =  7698
##    anisotropy parameters:
##       (fixed) anisotropy angle = 0  ( 0 degrees )
##       (fixed) anisotropy ratio = 1
## 
## Parameter of the error component:
##       (estimated) nugget =  0
## 
## Transformation parameter:
##       (fixed) Box-Cox parameter = 1 (no transformation)
## 
## Practical Range with cor=0.05 for asymptotic range: 7698.396
## 
## Maximised Likelihood:
##    log.L n.params      AIC      BIC 
## "-604.6"      "4"   "1217"   "1231" 
## 
## non spatial model:
##    log.L n.params      AIC      BIC 
## "-611.7"      "2"   "1227"   "1234" 
## 
## Call:
## likfit(geodata = geosolo, ini.cov.pars = c(0.07, 400), nugget = 0.05, 
##     cov.model = "spherical")
print("====================")
## [1] "===================="
print("=========ML2=======")
## [1] "=========ML2======="
print("====================")
## [1] "===================="
summary(ml2)
## Summary of the parameter estimation
## -----------------------------------
## Estimation method: maximum likelihood 
## 
## Parameters of the mean component (trend):
##   beta 
## 1.5754 
## 
## Parameters of the spatial component:
##    correlation function: exponential
##       (estimated) variance parameter sigmasq (partial sill) =  9.108
##       (estimated) cor. fct. parameter phi (range parameter)  =  5507
##    anisotropy parameters:
##       (fixed) anisotropy angle = 0  ( 0 degrees )
##       (fixed) anisotropy ratio = 1
## 
## Parameter of the error component:
##       (estimated) nugget =  0
## 
## Transformation parameter:
##       (fixed) Box-Cox parameter = 1 (no transformation)
## 
## Practical Range with cor=0.05 for asymptotic range: 16497.51
## 
## Maximised Likelihood:
##    log.L n.params      AIC      BIC 
## "-604.2"      "4"   "1216"   "1230" 
## 
## non spatial model:
##    log.L n.params      AIC      BIC 
## "-611.7"      "2"   "1227"   "1234" 
## 
## Call:
## likfit(geodata = geosolo, ini.cov.pars = c(0.05, 400), nugget = 0.05, 
##     kappa = 0.5, cov.model = "matern")
print("====================")
## [1] "===================="
print("=========ML3=======")
## [1] "=========ML3======="
print("====================")
## [1] "===================="
summary(ml3)
## Summary of the parameter estimation
## -----------------------------------
## Estimation method: maximum likelihood 
## 
## Parameters of the mean component (trend):
##   beta 
## 1.6259 
## 
## Parameters of the spatial component:
##    correlation function: gaussian
##       (estimated) variance parameter sigmasq (partial sill) =  8.921
##       (estimated) cor. fct. parameter phi (range parameter)  =  2348
##    anisotropy parameters:
##       (fixed) anisotropy angle = 0  ( 0 degrees )
##       (fixed) anisotropy ratio = 1
## 
## Parameter of the error component:
##       (estimated) nugget =  0.0293
## 
## Transformation parameter:
##       (fixed) Box-Cox parameter = 1 (no transformation)
## 
## Practical Range with cor=0.05 for asymptotic range: 4063.523
## 
## Maximised Likelihood:
##    log.L n.params      AIC      BIC 
##   "-605"      "4"   "1218"   "1232" 
## 
## non spatial model:
##    log.L n.params      AIC      BIC 
## "-611.7"      "2"   "1227"   "1234" 
## 
## Call:
## likfit(geodata = geosolo, ini.cov.pars = c(0.05, 400), nugget = 0.05, 
##     cov.model = "gaussian")

Podemos notar pelo AIC que o melhor modelo é o 2.

Plotando o semivariograma para diferentes distâncias

plot(variog(geosolo, max.dist=300000))
## variog: computing omnidirectional variogram
lines.variomodel(ml1,col="red")
lines.variomodel(ml2,col="blue")
lines.variomodel(ml3,col="green")

plot(variog(geosolo, max.dist=350000,kappa=0.5))
## variog: computing omnidirectional variogram
lines.variomodel(ml1,col="red")
lines.variomodel(ml2,col="blue")
lines.variomodel(ml3,col="green")

plot(variog(geosolo, max.dist=400000,kappa=0.5))
## variog: computing omnidirectional variogram
lines.variomodel(ml1,col="red")
lines.variomodel(ml2,col="blue")
lines.variomodel(ml3,col="green")

plot(variog(geosolo, max.dist=450000, kappa=0.5))
## variog: computing omnidirectional variogram
lines.variomodel(ml1,col="red")
lines.variomodel(ml2,col="blue")
lines.variomodel(ml3,col="green")

Definindo um grid para predição espacial

GR <- pred_grid(geosolo$borders, by=6500)
points(geosolo)
points(GR, pch=".",cex=1.5)

Predição espacial: krigagem kappa=0.5

geosolo.kr <- krige.conv(geosolo, loc=GR, bor=geosolo$bor, krige=krige.control(obj.model=ml2))
## krige.conv: results will be returned only for prediction locations inside the borders
## krige.conv: model with constant mean
## krige.conv: Kriging performed using global neighbourhood
image(geosolo.kr,col=terrain.colors(15),x.leg=c(607500, 607800), y.leg=c(7250150, 7250200))

É possível notar uma concentração de calcio no leste e no centro do estado.

median(solo$Ca)
## [1] 0.402
gr = pred_grid(geosolo$borders, by=12500)

s.out = output.control(n.predictive = 1000, n.post=1000, quant=0.95,  thres=0.402)

geosolo.kc = krige.conv(geosolo, loc=gr, krige=krige.control(obj=ml2), output = s.out)
## krige.conv: results will be returned only for prediction locations inside the borders
## krige.conv: model with constant mean
## krige.conv: sampling from the predictive distribution (conditional simulations)
## krige.conv: Kriging performed using global neighbourhood
image(geosolo.kc, border=geosolo$borders, loc=gr, 
      main="Mapa dos quantis",
      col=terrain.colors(200),val=geosolo.kc$quan, 
      x.leg=c(1000000, 1250000), y.leg=c(8800000, 8890000))

image(geosolo.kc, col = terrain.colors(200), main="Mapa da probabilidade", val=(1-geosolo.kc$probabilities.simulations),
      x.leg=c(1000000, 1250000), y.leg=c(8800000, 8890000))

Podemos notar que há uma probabilidade média de que nossos dados tenham concentração maior que 0.7 na região oeste do estado, o que é justificado no mapa dos quantis, onde é possível notar uma concentração maior(ainda que baixa) nessa região.