Curso: Regresión aplicada a las ciencia ambientales

Responsable del curso: Dr. Jorge Mendez González

Objetivo: Aplicar regresión dummy para generar un modelo de biomasa aérea de Prosopis glandulosa Torr.

Datos: Los datos utilizados corresponde a datos de biomasa aerea de Prosopis grandulosa, donde:Tipo = procedencia de las muestras, natural y plantación, Db = diámetro a la base (cm), Dn = diámetro normal (cm), DC1 y DC2 = diámetros de copa (m), H = altura total del árbol (m), B = biomasa aérea (kg).

LIBRERIAS DE TRABAJO

library(tidyr)
library(pastecs)
library(correlation)
library(boot)
library(corrplot)
library(qgraph)

Base de datos

Datos <- read.csv("Base prosopis.csv")
Datos

attach(Datos)

names(Datos)

## [1] "Sitio" "Tipo"  "Db"    "Dn"    "DC1"   "DC2"   "H"     "B"

Modelo a ajustar

# lnB = β0 + (β1*lnDn) + ε
# lnB = β0 + (β1*lnH) + ε
# lnB = β0 + (β1*lnDnH) + ε
# lnB = β0 + (β1*lnDn2H) + ε
# B = β0 + (β1*lnDn2H) + ε
# B = β0 + (β1*Db) + ε

Generación de variables

Datos$DnH<-Dn*H; attach(Datos)
Datos$Dn2<-Dn*Dn; attach(Datos)
Datos$Dn2H<-Dn2*H; attach(Datos)
Datos$lnB<-log(B); attach(Datos)
Datos$lnDn<-log(Dn); attach(Datos)
Datos$lnH<-log(H); attach(Datos)
Datos$lnDnH<-log(DnH); attach(Datos)
Datos$lnDn2H<-log(Dn2H); attach(Datos)

View(Datos)
attach(Datos)

library(DT)
datatable(Datos, extensions = "Buttons", 
          options = list(dom = "Bfrtip",
                         buttons = c("copy", "csv", "excel", "pdf")))

Correlación

library(psych)
library(colorRamps)
library(corrr)
res.cor <- correlate(Datos); res.cor

corPlot(Datos[, 3:16],
        gr = colorRampPalette(heat.colors(40)))

Ajuste de los modelos

mod_1 <- lm(lnB ~ lnDn, data=Datos); summary(mod_1)

## 
## Call:
## lm(formula = lnB ~ lnDn, data = Datos)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.33277 -0.26083 -0.00794  0.26499  1.57242 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.43556    0.20799  -6.902 1.86e-10 ***
## lnDn         2.19709    0.08653  25.391  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4291 on 134 degrees of freedom
## Multiple R-squared:  0.8279, Adjusted R-squared:  0.8266 
## F-statistic: 644.7 on 1 and 134 DF,  p-value: < 2.2e-16

mod_2 <- lm(lnB ~ lnH, data=Datos); summary(mod_2)

## 
## Call:
## lm(formula = lnB ~ lnH, data = Datos)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.91262 -0.41642 -0.00232  0.48153  1.54759 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -0.5066     0.3368  -1.504    0.135    
## lnH           2.5164     0.1954  12.875   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6916 on 134 degrees of freedom
## Multiple R-squared:  0.553,  Adjusted R-squared:  0.5497 
## F-statistic: 165.8 on 1 and 134 DF,  p-value: < 2.2e-16

mod_3 <- lm(lnB ~ lnDnH, data=Datos); summary(mod_3)

## 
## Call:
## lm(formula = lnB ~ lnDnH, data = Datos)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.40929 -0.29333 -0.06071  0.27605  1.23295 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.90089    0.22306  -8.522 2.87e-14 ***
## lnDnH        1.39412    0.05418  25.733  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4244 on 134 degrees of freedom
## Multiple R-squared:  0.8317, Adjusted R-squared:  0.8304 
## F-statistic: 662.2 on 1 and 134 DF,  p-value: < 2.2e-16

mod_4 <- lm(lnB ~ lnDn2H, data=Datos); summary(mod_4)

## 
## Call:
## lm(formula = lnB ~ lnDn2H, data = Datos)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.34613 -0.27591 -0.04486  0.26654  1.21860 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.86683    0.20517  -9.099  1.1e-15 ***
## lnDn2H       0.87572    0.03148  27.822  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3974 on 134 degrees of freedom
## Multiple R-squared:  0.8524, Adjusted R-squared:  0.8513 
## F-statistic:   774 on 1 and 134 DF,  p-value: < 2.2e-16

mod_5 <- lm(B ~ lnDn2H, data=Datos); summary(mod_5)

## 
## Call:
## lm(formula = B ~ lnDn2H, data = Datos)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -58.812 -27.451  -9.208  10.999 151.311 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -267.864     21.223  -12.62   <2e-16 ***
## lnDn2H        52.422      3.256   16.10   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 41.1 on 134 degrees of freedom
## Multiple R-squared:  0.6592, Adjusted R-squared:  0.6567 
## F-statistic: 259.2 on 1 and 134 DF,  p-value: < 2.2e-16

mod_6 <- lm(B ~ Db, data=Datos); summary(mod_6)

## 
## Call:
## lm(formula = B ~ Db, data = Datos)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -52.711 -15.642  -3.823   9.966 103.477 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -80.6310     5.4130  -14.90   <2e-16 ***
## Db           10.9861     0.3641   30.17   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 25.22 on 134 degrees of freedom
## Multiple R-squared:  0.8717, Adjusted R-squared:  0.8707 
## F-statistic: 910.4 on 1 and 134 DF,  p-value: < 2.2e-16

Seleccionando el mejor modelo

library(performance)
Seleccion<-compare_performance(mod_1, mod_2, mod_3, mod_4, mod_5, mod_6,
                               rank = TRUE, 
                               verbose = TRUE, metrics = "all");
Seleccion

library(DT)
datatable(Seleccion, extensions = "Buttons", 
          options = list(dom = "Bfrtip",
                         buttons = c("copy", "csv", "excel", "pdf")))

Visualizando todos los modelos

library(see)
plot(compare_performance(mod_1, mod_2, mod_3, mod_4, mod_5, mod_6, rank = TRUE))

Verificando el cumplimiento o no de los supuestos

library(patchwork)
library(performance) 
library(car)

modelo_Sup1<-mod_1            

check_model(modelo_Sup1)

outlierTest(modelo_Sup1, cutoff=Inf, n.max=5)  # prueba de bonferroni para detectar outliers

##      rstudent unadjusted p-value Bonferroni p
## 1    3.991297         0.00010806     0.014696
## 3   -3.250004         0.00146180     0.198800
## 14  -3.067857         0.00261300     0.355370
## 121 -2.077962         0.03963600           NA
## 126 -2.065112         0.04085500           NA

check_heteroskedasticity (modelo_Sup1)         # los errores deben mostrar homogeneidad

## Warning: Heteroscedasticity (non-constant error variance) detected (p < .001).

check_autocorrelation (modelo_Sup1)            # residualaes no correlacionados

## OK: Residuals appear to be independent and not autocorrelated (p = 0.136).

check_collinearity (modelo_Sup1)               # variables (x) independienets no correlacioandas

## NULL

check_normality (modelo_Sup1)                  # normalidad de los residuales (errores)

## OK: residuals appear as normally distributed (p = 0.065).

modelo_Sup3<-mod_3            

check_model(modelo_Sup3)

outlierTest(modelo_Sup3, cutoff=Inf, n.max=5)

##      rstudent unadjusted p-value Bonferroni p
## 3   -3.491705         0.00065187     0.088654
## 134  3.005025         0.00317470     0.431760
## 1    2.191353         0.03016800           NA
## 127  2.071453         0.04024900           NA
## 14  -1.973062         0.05056200           NA

check_heteroskedasticity (modelo_Sup3)

## Warning: Heteroscedasticity (non-constant error variance) detected (p < .001).

check_autocorrelation (modelo_Sup3)

## Warning: Autocorrelated residuals detected (p = 0.006).

check_collinearity (modelo_Sup3)

## NULL

check_normality (modelo_Sup3)

## OK: residuals appear as normally distributed (p = 0.126).

modelo_Sup4<-mod_4            

check_model(modelo_Sup4)

outlierTest(modelo_Sup4, cutoff=Inf, n.max=5)

##      rstudent unadjusted p-value Bonferroni p
## 3   -3.570384         0.00049675     0.067558
## 1    3.252948         0.00144780     0.196910
## 134  2.799761         0.00587610     0.799150
## 14  -2.514396         0.01311600           NA
## 93   2.025415         0.04482600           NA

check_heteroskedasticity (modelo_Sup4)

## Warning: Heteroscedasticity (non-constant error variance) detected (p < .001).

check_autocorrelation (modelo_Sup4)

## OK: Residuals appear to be independent and not autocorrelated (p = 0.082).

check_collinearity (modelo_Sup4)

## NULL

check_normality (modelo_Sup4)

## OK: residuals appear as normally distributed (p = 0.095).

modelo_Sup6<-mod_6            

check_model(modelo_Sup6)

outlierTest(modelo_Sup6, cutoff=Inf, n.max=5)

##     rstudent unadjusted p-value Bonferroni p
## 88  4.402278         2.1806e-05    0.0029656
## 93  3.529768         5.7187e-04    0.0777740
## 100 2.800252         5.8677e-03    0.7980000
## 116 2.326671         2.1494e-02           NA
## 102 2.317569         2.2000e-02           NA

check_heteroskedasticity (modelo_Sup6)

## Warning: Heteroscedasticity (non-constant error variance) detected (p < .001).

check_autocorrelation (modelo_Sup6)

## Warning: Autocorrelated residuals detected (p = 0.018).

check_collinearity (modelo_Sup6)

## NULL

check_normality (modelo_Sup6)

## Warning: Non-normality of residuals detected (p < .001).

Modelo con mejor desempeño

library("effects")
plot(predictorEffects(mod_4))

Mod4_MDes<-lm(lnB ~ lnDn2H, data =Datos); summary(Mod4_MDes)

## 
## Call:
## lm(formula = lnB ~ lnDn2H, data = Datos)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.34613 -0.27591 -0.04486  0.26654  1.21860 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.86683    0.20517  -9.099  1.1e-15 ***
## lnDn2H       0.87572    0.03148  27.822  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3974 on 134 degrees of freedom
## Multiple R-squared:  0.8524, Adjusted R-squared:  0.8513 
## F-statistic:   774 on 1 and 134 DF,  p-value: < 2.2e-16

plot(Mod4_MDes$fitted.values, lnB)

library(car)
Anova<-Anova(Mod4_MDes)
Anova

str(Mod4_MDes)

## List of 12
##  $ coefficients : Named num [1:2] -1.867 0.876
##   ..- attr(*, "names")= chr [1:2] "(Intercept)" "lnDn2H"
##  $ residuals    : Named num [1:136] 1.219 0.492 -1.346 -0.308 0.623 ...
##   ..- attr(*, "names")= chr [1:136] "1" "2" "3" "4" ...
##  $ effects      : Named num [1:136] -43.874 11.056 -1.479 -0.431 0.497 ...
##   ..- attr(*, "names")= chr [1:136] "(Intercept)" "lnDn2H" "" "" ...
##  $ rank         : int 2
##  $ fitted.values: Named num [1:136] 1.54 2.51 2.48 2.76 2.67 ...
##   ..- attr(*, "names")= chr [1:136] "1" "2" "3" "4" ...
##  $ assign       : int [1:2] 0 1
##  $ qr           :List of 5
##   ..$ qr   : num [1:136, 1:2] -11.6619 0.0857 0.0857 0.0857 0.0857 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:136] "1" "2" "3" "4" ...
##   .. .. ..$ : chr [1:2] "(Intercept)" "lnDn2H"
##   .. ..- attr(*, "assign")= int [1:2] 0 1
##   ..$ qraux: num [1:2] 1.09 1.1
##   ..$ pivot: int [1:2] 1 2
##   ..$ tol  : num 1e-07
##   ..$ rank : int 2
##   ..- attr(*, "class")= chr "qr"
##  $ df.residual  : int 134
##  $ xlevels      : Named list()
##  $ call         : language lm(formula = lnB ~ lnDn2H, data = Datos)
##  $ terms        :Classes 'terms', 'formula'  language lnB ~ lnDn2H
##   .. ..- attr(*, "variables")= language list(lnB, lnDn2H)
##   .. ..- attr(*, "factors")= int [1:2, 1] 0 1
##   .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. ..$ : chr [1:2] "lnB" "lnDn2H"
##   .. .. .. ..$ : chr "lnDn2H"
##   .. ..- attr(*, "term.labels")= chr "lnDn2H"
##   .. ..- attr(*, "order")= int 1
##   .. ..- attr(*, "intercept")= int 1
##   .. ..- attr(*, "response")= int 1
##   .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
##   .. ..- attr(*, "predvars")= language list(lnB, lnDn2H)
##   .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
##   .. .. ..- attr(*, "names")= chr [1:2] "lnB" "lnDn2H"
##  $ model        :'data.frame':   136 obs. of  2 variables:
##   ..$ lnB   : num [1:136] 2.76 3 1.13 2.45 3.29 ...
##   ..$ lnDn2H: num [1:136] 3.89 5 4.96 5.28 5.18 ...
##   ..- attr(*, "terms")=Classes 'terms', 'formula'  language lnB ~ lnDn2H
##   .. .. ..- attr(*, "variables")= language list(lnB, lnDn2H)
##   .. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
##   .. .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. .. ..$ : chr [1:2] "lnB" "lnDn2H"
##   .. .. .. .. ..$ : chr "lnDn2H"
##   .. .. ..- attr(*, "term.labels")= chr "lnDn2H"
##   .. .. ..- attr(*, "order")= int 1
##   .. .. ..- attr(*, "intercept")= int 1
##   .. .. ..- attr(*, "response")= int 1
##   .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
##   .. .. ..- attr(*, "predvars")= language list(lnB, lnDn2H)
##   .. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
##   .. .. .. ..- attr(*, "names")= chr [1:2] "lnB" "lnDn2H"
##  - attr(*, "class")= chr "lm"

Mod4_MDes$ residuals        # Ver los residuales

##            1            2            3            4            5            6 
##  1.218601599  0.491787841 -1.346125849 -0.307972707  0.622752731 -0.395295541 
##            7            8            9           10           11           12 
## -0.434036271 -0.052939043  0.371940640 -0.420046559 -0.130530699 -0.097988410 
##           13           14           15           16           17           18 
##  0.521487525 -0.973203422  0.156536871 -0.101306813  0.070986905 -0.186445534 
##           19           20           21           22           23           24 
## -0.509194315 -0.098107447 -0.415151211 -0.390475902 -0.528377987 -0.020848034 
##           25           26           27           28           29           30 
##  0.300080669  0.069079271  0.071770134  0.081254404  0.291906115 -0.308079827 
##           31           32           33           34           35           36 
## -0.374571112 -0.404781030 -0.628787588 -0.195413890 -0.113834747  0.002400712 
##           37           38           39           40           41           42 
## -0.275540972 -0.554459081 -0.169690561 -0.417015630 -0.422396972  0.381793732 
##           43           44           45           46           47           48 
## -0.514383376  0.146695460 -0.220776925  0.657138853 -0.336566419 -0.497117253 
##           49           50           51           52           53           54 
##  0.120586446  0.231596655 -0.293163724  0.174507013 -0.090702890  0.155158510 
##           55           56           57           58           59           60 
## -0.173731804  0.283351142  0.095738223 -0.295952451 -0.066426394  0.498781198 
##           61           62           63           64           65           66 
## -0.201436364 -0.103914857 -0.226640829  0.148751048 -0.063010636  0.228693002 
##           67           68           69           70           71           72 
##  0.398705575 -0.486226603  0.462757386 -0.036781971 -0.019731041 -0.421595329 
##           73           74           75           76           77           78 
## -0.667369303  0.006727043 -0.036318227 -0.012009052  0.317431199 -0.120217291 
##           79           80           81           82           83           84 
## -0.348729650 -0.254941360 -0.176877017 -0.076681053  0.005087463  0.370916534 
##           85           86           87           88           89           90 
##  0.131675530  0.486772254 -0.224595023  0.634077728  0.461873236  0.132670630 
##           91           92           93           94           95           96 
##  0.403848927 -0.250328434  0.790449732 -0.176923538 -0.125356055 -0.063903365 
##           97           98           99          100          101          102 
##  0.054856671 -0.532992692 -0.126725360  0.014278360  0.570859222  0.198048182 
##          103          104          105          106          107          108 
## -0.243540553  0.260941002 -0.162370342 -0.260898270 -0.385782160 -0.033803123 
##          109          110          111          112          113          114 
##  0.645831348  0.156590862  0.389999502  0.471296168  0.190227571  0.108115467 
##          115          116          117          118          119          120 
##  0.216958156  0.443926714  0.480462008  0.415108697 -0.277031762 -0.211177554 
##          121          122          123          124          125          126 
## -0.354438257  0.455478219  0.628673677  0.742006415  0.139106707 -0.331181167 
##          127          128          129          130          131          132 
##  0.555553981 -0.327733182 -0.460773200 -0.584763482 -0.162857069  0.198025321 
##          133          134          135          136 
##  0.438978481  1.080760526 -0.097877143  0.556517514

Mod4_MDes$ fitted.values    # Ver los valores estimados

##          1          2          3          4          5          6          7 
## 1.54140834 2.50942936 2.47752796 2.75552357 2.66825769 2.28840750 2.88331574 
##          8          9         10         11         12         13         14 
## 2.46348128 3.11482263 2.80789150 2.89936237 2.96588731 2.88833918 2.89012603 
##         15         16         17         18         19         20         21 
## 2.82813446 2.97656493 2.96588731 2.98919967 3.34181925 3.00007834 2.72867624 
##         22         23         24         25         26         27         28 
## 3.22662611 2.97765746 3.44673803 3.07031380 2.68075246 3.02832215 3.02333227 
##         29         30         31         32         33         34         35 
## 3.31440662 3.21334020 3.52087624 3.23563866 3.54005071 3.07517399 3.51835992 
##         36         37         38         39         40         41         42 
## 3.27888637 3.61557192 3.23821659 3.38736567 2.63839067 3.16065301 3.41436939 
##         43         44         45         46         47         48         49 
## 3.20155037 3.26643049 3.44125147 3.19852542 3.67624294 3.74244025 3.66451187 
##         50         51         52         53         54         55         56 
## 3.82262012 3.70760633 3.81113813 3.99529838 3.76978072 3.94878895 3.86532376 
##         57         58         59         60         61         62         63 
## 3.74457393 4.02260927 4.35111580 3.87571717 3.98266708 3.97553245 3.75623143 
##         64         65         66         67         68         69         70 
## 4.22006369 4.29144893 4.27787158 4.15105760 4.21408636 4.08414789 4.21577401 
##         71         72         73         74         75         76         77 
## 4.28942849 3.94294331 4.11576829 4.01163600 4.29871637 4.38348017 4.18182276 
##         78         79         80         81         82         83         84 
## 4.27706706 4.43420163 4.40802642 4.42437176 4.65149512 4.33115698 4.66869487 
##         85         86         87         88         89         90         91 
## 4.61698876 4.48359645 4.64174746 4.78685542 4.70957620 4.73185931 4.65481434 
##         92         93         94         95         96         97         98 
## 5.01036333 4.60094806 5.19618816 4.95566780 5.10913330 4.82491035 5.32505730 
##         99        100        101        102        103        104        105 
## 5.39916066 5.55299687 4.79455589 5.36796536 5.48348701 5.16757095 5.65056399 
##        106        107        108        109        110        111        112 
## 3.67632723 4.08337355 3.40797183 2.97930926 3.88838837 3.50059556 3.83627628 
##        113        114        115        116        117        118        119 
## 4.03777338 4.06550228 4.20788848 4.82491035 4.67473528 5.17815182 6.01037304 
##        120        121        122        123        124        125        126 
## 6.02587509 0.09307349 0.98672377 1.60990609 2.04485497 2.88521302 2.85129407 
##        127        128        129        130        131        132        133 
## 2.85129407 3.50370151 4.02295500 3.82226477 4.02295500 3.50820277 3.38158691 
##        134        135        136 
## 3.44871582 4.28647072 4.59266159

confint(Mod4_MDes)          # ver los intervalos de confianza de los coeficienets

##                  2.5 %     97.5 %
## (Intercept) -2.2726301 -1.4610346
## lnDn2H       0.8134672  0.9379762

Corrección de los supuestos

verificación de datos atípicos

library(car)
library(performance)

Residuales<-cbind(ei=residuals(Mod4_MDes, type='working'),  
                  pi=residuals(Mod4_MDes, type='deviance'), 
                  pe=residuals(Mod4_MDes, type='pearson'),
                  pa=residuals(Mod4_MDes, type='partial'),
                  di=rstandard(Mod4_MDes),
                  ri=rstudent(Mod4_MDes)); Residuales

##               ei           pi           pe      lnDn2H           di
## 1    1.218601599  1.218601599  1.218601599 -1.00216455  3.142533961
## 2    0.491787841  0.491787841  0.491787841 -0.76095729  1.250293269
## 3   -1.346125849 -1.346125849 -1.346125849 -2.63077238 -3.423470540
## 4   -0.307972707 -0.307972707 -0.307972707 -1.31462363 -0.781160841
## 5    0.622752731  0.622752731  0.622752731 -0.47116407  1.580792882
## 6   -0.395295541 -0.395295541 -0.395295541 -1.86906253 -1.007514891
## 7   -0.434036271 -0.434036271 -0.434036271 -1.31289502 -1.099815421
## 8   -0.052939043 -0.052939043 -0.052939043 -1.35163226 -0.134655121
## 9    0.371940640  0.371940640  0.371940640 -0.27541122  0.941091631
## 10  -0.420046559 -0.420046559 -0.420046559 -1.37432955 -1.064977427
## 11  -0.130530699 -0.130530699 -0.130530699 -0.99334282 -0.330716686
## 12  -0.097988410 -0.097988410 -0.097988410 -0.89427559 -0.248152975
## 13   0.521487525  0.521487525  0.521487525 -0.35234778  1.321362100
## 14  -0.973203422 -0.973203422 -0.973203422 -1.84525188 -2.465902669
## 15   0.156536871  0.156536871  0.156536871 -0.77750317  0.396817391
## 16  -0.101306813 -0.101306813 -0.101306813 -0.88691637 -0.256538789
## 17   0.070986905  0.070986905  0.070986905 -0.72530027  0.179772399
## 18  -0.186445534 -0.186445534 -0.186445534 -0.95942036 -0.472096685
## 19  -0.509194315 -0.509194315 -0.509194315 -0.92954956 -1.287084241
## 20  -0.098107447 -0.098107447 -0.098107447 -0.86020360 -0.248399608
## 21  -0.415151211 -0.415151211 -0.415151211 -1.44864946 -1.053254740
## 22  -0.390475902 -0.390475902 -0.390475902 -0.92602429 -0.987450014
## 23  -0.528377987 -0.528377987 -0.528377987 -1.31289502 -1.337999704
## 24  -0.020848034 -0.020848034 -0.020848034 -0.33628450 -0.052680539
## 25   0.300080669  0.300080669  0.300080669 -0.39178002  0.759457327
## 26   0.069079271  0.069079271  0.069079271 -1.01234276  0.175330694
## 27   0.071770134  0.071770134  0.071770134 -0.66208220  0.181684014
## 28   0.081254404  0.081254404  0.081254404 -0.65758781  0.205699427
## 29   0.291906115  0.291906115  0.291906115 -0.15586176  0.737919996
## 30  -0.308079827 -0.308079827 -0.308079827 -0.85691412 -0.779130120
## 31  -0.374571112 -0.374571112 -0.374571112 -0.61586936 -0.946336211
## 32  -0.404781030 -0.404781030 -0.404781030 -0.93131686 -1.023584882
## 33  -0.628787588 -0.628787588 -0.628787588 -0.85091137 -1.588543925
## 34  -0.195413890 -0.195413890 -0.195413890 -0.88241439 -0.494548337
## 35  -0.113834747 -0.113834747 -0.113834747 -0.35764932 -0.287599555
## 36   0.002400712  0.002400712  0.002400712 -0.48088741  0.006069676
## 37  -0.275540972 -0.275540972 -0.275540972 -0.42214355 -0.696035835
## 38  -0.554459081 -0.554459081 -0.554459081 -1.07841698 -1.402065665
## 39  -0.169690561 -0.169690561 -0.169690561 -0.54449939 -0.428860658
## 40  -0.417015630 -0.417015630 -0.417015630 -1.54079945 -1.058842650
## 41  -0.422396972 -0.422396972 -0.422396972 -1.02391845 -1.068504368
## 42   0.381793732  0.381793732  0.381793732  0.03398863  0.964833453
## 43  -0.514383376 -0.514383376 -0.514383376 -1.07500750 -1.300939580
## 44   0.146695460  0.146695460  0.146695460 -0.34904854  0.370906055
## 45  -0.220776925 -0.220776925 -0.220776925 -0.54169995 -0.557885450
## 46   0.657138853  0.657138853  0.657138853  0.09348978  1.662009303
## 47  -0.336566419 -0.336566419 -0.336566419 -0.42249797 -0.850140960
## 48  -0.497117253 -0.497117253 -0.497117253 -0.51685150 -1.255643858
## 49   0.120586446  0.120586446  0.120586446  0.02292382  0.304594818
## 50   0.231596655  0.231596655  0.231596655  0.29204228  0.584986395
## 51  -0.293163724 -0.293163724 -0.293163724 -0.34773188 -0.740495635
## 52   0.174507013  0.174507013  0.174507013  0.22347065  0.440782272
## 53  -0.090702890 -0.090702890 -0.090702890  0.14242100 -0.229152900
## 54   0.155158510  0.155158510  0.155158510  0.16276474  0.391906662
## 55  -0.173731804 -0.173731804 -0.173731804  0.01288266 -0.438882850
## 56   0.283351142  0.283351142  0.283351142  0.38650042  0.715732857
## 57   0.095738223  0.095738223  0.095738223  0.07813767  0.241820359
## 58  -0.295952451 -0.295952451 -0.295952451 -0.03551767 -0.747739481
## 59  -0.066426394 -0.066426394 -0.066426394  0.52251491 -0.168023219
## 60   0.498781198  0.498781198  0.498781198  0.61232388  1.259911652
## 61  -0.201436364 -0.201436364 -0.201436364  0.01905622 -0.508899293
## 62  -0.103914857 -0.103914857 -0.103914857  0.10944310 -0.262522226
## 63  -0.226640829 -0.226640829 -0.226640829 -0.23258390 -0.572460014
## 64   0.148751048  0.148751048  0.148751048  0.60664025  0.376047369
## 65  -0.063010636 -0.063010636 -0.063010636  0.46626381 -0.159339246
## 66   0.228693002  0.228693002  0.228693002  0.74439009  0.578277538
## 67   0.398705575  0.398705575  0.398705575  0.78758869  1.007697289
## 68  -0.486226603 -0.486226603 -0.486226603 -0.03431474 -1.229168682
## 69   0.462757386  0.462757386  0.462757386  0.78473079  1.169353779
## 70  -0.036781971 -0.036781971 -0.036781971  0.41681754 -0.092984490
## 71  -0.019731041 -0.019731041 -0.019731041  0.50752296 -0.049894775
## 72  -0.421595329 -0.421595329 -0.421595329 -0.24082651 -1.065028493
## 73  -0.667369303 -0.667369303 -0.667369303 -0.31377551 -1.686541608
## 74   0.006727043  0.006727043  0.006727043  0.25618855  0.016995837
## 75  -0.036318227 -0.036318227 -0.036318227  0.50022366 -0.091843289
## 76  -0.012009052 -0.012009052 -0.012009052  0.60929663 -0.030381393
## 77   0.317431199  0.317431199  0.317431199  0.73707947  0.802364991
## 78  -0.120217291 -0.120217291 -0.120217291  0.39467528 -0.303982721
## 79  -0.348729650 -0.348729650 -0.348729650  0.32329749 -0.882481539
## 80  -0.254941360 -0.254941360 -0.254941360  0.39091057 -0.645052545
## 81  -0.176877017 -0.176877017 -0.176877017  0.48532025 -0.447573743
## 82  -0.076681053 -0.076681053 -0.076681053  0.81263957 -0.194318957
## 83   0.005087463  0.005087463  0.005087463  0.57406995  0.012867330
## 84   0.370916534  0.370916534  0.370916534  1.27743691  0.940067308
## 85   0.131675530  0.131675530  0.131675530  0.98648979  0.333598264
## 86   0.486772254  0.486772254  0.486772254  1.20819421  1.232157588
## 87  -0.224595023 -0.224595023 -0.224595023  0.65497795 -0.569109940
## 88   0.634077728  0.634077728  0.634077728  1.65875866  1.608558446
## 89   0.461873236  0.461873236  0.461873236  1.40927495  1.170960300
## 90   0.132670630  0.132670630  0.132670630  1.10235545  0.336411789
## 91   0.403848927  0.403848927  0.403848927  1.29648877  1.023426583
## 92  -0.250328434 -0.250328434 -0.250328434  0.99786040 -0.636390411
## 93   0.790449732  0.790449732  0.790449732  1.62922330  2.002368725
## 94  -0.176923538 -0.176923538 -0.176923538  1.25709013 -0.450717702
## 95  -0.125356055 -0.125356055 -0.125356055  1.06813725 -0.318505373
## 96  -0.063903365 -0.063903365 -0.063903365  1.28305545 -0.162630652
## 97   0.054856671  0.054856671  0.054856671  1.11759253  0.139208983
## 98  -0.532992692 -0.532992692 -0.532992692  1.02989011 -1.360017644
## 99  -0.126725360 -0.126725360 -0.126725360  1.51026081 -0.323683357
## 100  0.014278360  0.014278360  0.014278360  1.80510074  0.036551266
## 101  0.570859222  0.570859222  0.570859222  1.60324062  1.448278115
## 102  0.198048182  0.198048182  0.198048182  1.80383905  0.505641409
## 103 -0.243540553 -0.243540553 -0.243540553  1.47777196 -0.622797768
## 104  0.260941002  0.260941002  0.260941002  1.66633746  0.664528234
## 105 -0.162370342 -0.162370342 -0.162370342  1.72601916 -0.416285927
## 106 -0.260898270 -0.260898270 -0.260898270 -0.34674553 -0.659008980
## 107 -0.385782160 -0.385782160 -0.385782160 -0.06458310 -0.974841057
## 108 -0.033803123 -0.033803123 -0.033803123 -0.38800578 -0.085425677
## 109  0.645831348  0.645831348  0.645831348 -0.13703388  1.635406674
## 110  0.156590862  0.156590862  0.156590862  0.28280474  0.395550439
## 111  0.389999502  0.389999502  0.389999502  0.12842057  0.985356744
## 112  0.471296168  0.471296168  0.471296168  0.54539796  1.190448686
## 113  0.190227571  0.190227571  0.190227571  0.46582646  0.480636106
## 114  0.108115467  0.108115467  0.108115467  0.41144325  0.273186672
## 115  0.216958156  0.216958156  0.216958156  0.66267214  0.548452191
## 116  0.443926714  0.443926714  0.443926714  1.50666257  1.126546413
## 117  0.480462008  0.480462008  0.480462008  1.39302280  1.217759733
## 118  0.415108697  0.415108697  0.415108697  1.83108602  1.057273302
## 119 -0.277031762 -0.277031762 -0.277031762  1.97116679 -0.714786985
## 120 -0.211177554 -0.211177554 -0.211177554  2.05252304 -0.545036382
## 121 -0.354438257 -0.354438257 -0.354438257 -4.02353926 -0.949482310
## 122  0.455478219  0.455478219  0.455478219 -2.31997250  1.188826951
## 123  0.628673677  0.628673677  0.628673677 -1.52359473  1.619143706
## 124  0.742006415  0.742006415  0.742006415 -0.97531311  1.897397092
## 125  0.139106707  0.139106707  0.139106707 -0.73785476  0.352481111
## 126 -0.331181167 -0.331181167 -0.331181167 -1.24206159 -0.839387829
## 127  0.555553981  0.555553981  0.555553981 -0.35532644  1.408066933
## 128 -0.327733182 -0.327733182 -0.327733182 -0.58620617 -0.828031700
## 129 -0.460773200 -0.460773200 -0.460773200 -0.19999269 -1.164168667
## 130 -0.584763482 -0.584763482 -0.584763482 -0.52467320 -1.477044736
## 131 -0.162857069 -0.162857069 -0.162857069  0.09792344 -0.411467283
## 132  0.198025321  0.198025321  0.198025321 -0.05594640  0.500314563
## 133  0.438978481  0.438978481  0.438978481  0.05839090  1.109454701
## 134  1.080760526  1.080760526  1.080760526  0.76730185  2.730941181
## 135 -0.097877143 -0.097877143 -0.097877143  0.42641908 -0.247503175
## 136  0.556517514  0.556517514  0.556517514  1.38700461  1.409690372
##               ri
## 1    3.252947683
## 2    1.252949101
## 3   -3.570384330
## 4   -0.780018663
## 5    1.589776633
## 6   -1.007572037
## 7   -1.100683043
## 8   -0.134160812
## 9    0.940687341
## 10  -1.065515035
## 11  -0.329614904
## 12  -0.247282123
## 13   1.325083527
## 14  -2.514395978
## 15   0.395566440
## 16  -0.255642548
## 17   0.179121952
## 18  -0.470723459
## 19  -1.290273050
## 20  -0.247528004
## 21  -1.053687972
## 22  -0.987357436
## 23  -1.341992487
## 24  -0.052484146
## 25   0.758251854
## 26   0.174695290
## 27   0.181027118
## 28   0.204962817
## 29   0.736659685
## 30  -0.777981672
## 31  -0.945964840
## 32  -1.023768584
## 33  -1.597720968
## 34  -0.493149808
## 35  -0.286612885
## 36   0.006046987
## 37  -0.694690759
## 38  -1.407184153
## 39  -0.427550953
## 40  -1.059325222
## 41  -1.069074029
## 42   0.964582925
## 43  -1.304339431
## 44   0.369709315
## 45  -0.556446483
## 46   1.673130916
## 47  -0.849256214
## 48  -1.258374758
## 49   0.303561249
## 50   0.583545127
## 51  -0.739241471
## 52   0.439453183
## 53  -0.228340996
## 54   0.390665540
## 55  -0.437556757
## 56   0.714424118
## 57   0.240968939
## 58  -0.746503202
## 59  -0.167412729
## 60   1.262703020
## 61  -0.507487501
## 62  -0.261608114
## 63  -0.571018638
## 64   0.374839418
## 65  -0.158758624
## 66   0.576835959
## 67   1.007755838
## 68  -1.231536135
## 69   1.170972196
## 70  -0.092639872
## 71  -0.049708714
## 72  -1.065566563
## 73  -1.698359006
## 74   0.016932319
## 75  -0.091502828
## 76  -0.030267921
## 77   0.801292667
## 78  -0.302950807
## 79  -0.881748513
## 80  -0.643641210
## 81  -0.446234236
## 82  -0.193619811
## 83   0.012819235
## 84   0.939656657
## 85   0.332489259
## 86   1.234565074
## 87  -0.567668879
## 88   1.618244954
## 89   1.172597560
## 90   0.335295791
## 91   1.023609009
## 92  -0.634971648
## 93   2.025415057
## 94  -0.449373528
## 95  -0.317434876
## 96  -0.162038677
## 97   0.138698604
## 98  -1.364382691
## 99  -0.322599462
## 100  0.036414807
## 101  1.454290914
## 102  0.504232424
## 103 -0.621369506
## 104  0.663137596
## 105 -0.414998145
## 106 -0.657611908
## 107 -0.974659022
## 108 -0.085108645
## 109  1.645800333
## 110  0.394302004
## 111  0.985249068
## 112  1.192320071
## 113  0.479252614
## 114  0.272241233
## 115  0.547016205
## 116  1.127687850
## 117  1.219976688
## 118  1.057741941
## 119 -0.713476359
## 120 -0.543601745
## 121 -0.949130970
## 122  1.190678461
## 123  1.629105747
## 124  1.916220351
## 125  0.351326331
## 126 -0.838457129
## 127  1.413297592
## 128 -0.827054849
## 129 -1.165726736
## 130 -1.483650313
## 131 -0.410188295
## 132  0.498910427
## 133  1.110418970
## 134  2.799760538
## 135 -0.246634307
## 136  1.414951575

boxplot(Residuales, col=terrain.colors(4))        # todos los residuales

ri=rstudent(Mod4_MDes)    # residuales estudentizados 
boxplot(ri)                # Graficar residuales estudentizados

outlierTest(Mod4_MDes, cutoff=Inf, n.max=5)  # prueba de bonferroni para detectar outliers

##      rstudent unadjusted p-value Bonferroni p
## 3   -3.570384         0.00049675     0.067558
## 1    3.252948         0.00144780     0.196910
## 134  2.799761         0.00587610     0.799150
## 14  -2.514396         0.01311600           NA
## 93   2.025415         0.04482600           NA

#eliminar datos atípicos
Datos1<-Datos[-c(3, 1, 134),]
str(Datos1)

## 'data.frame':    133 obs. of  16 variables:
##  $ Sitio : chr  "Ana" "Ana" "Ana" "Ana" ...
##  $ Tipo  : chr  "Natural" "Natural" "Natural" "Natural" ...
##  $ Db    : num  6.3 6.9 7 7.1 7.2 7.4 8.2 8.2 8.6 8.7 ...
##  $ Dn    : num  5.8 6.2 6.5 5 7.1 5.3 7.2 7.5 7.6 6.5 ...
##  $ DC1   : num  3.8 2.8 4.7 1.1 3 2.6 4.6 2.3 2.2 4.1 ...
##  $ DC2   : num  3.2 2.2 3.8 1.3 2.4 1 4.2 2.2 3.8 1.7 ...
##  $ H     : num  4.4 5.1 4.2 4.6 4.5 5 5.7 3.7 4 5.9 ...
##  $ B     : num  20.11 11.56 26.87 6.64 11.58 ...
##  $ DnH   : num  25.5 31.6 27.3 23 31.9 ...
##  $ Dn2   : num  33.6 38.4 42.2 25 50.4 ...
##  $ Dn2H  : num  148 196 177 115 227 ...
##  $ lnB   : num  3 2.45 3.29 1.89 2.45 ...
##  $ lnDn  : num  1.76 1.82 1.87 1.61 1.96 ...
##  $ lnH   : num  1.48 1.63 1.44 1.53 1.5 ...
##  $ lnDnH : num  3.24 3.45 3.31 3.14 3.46 ...
##  $ lnDn2H: num  5 5.28 5.18 4.74 5.42 ...

attach(Datos1)

# Correr el modelo sin datos atípicos o outliers
Mod4_MDes1 <- lm(lnB ~ lnDn2H, data=Datos1)    # el modelo
outlierTest(Mod4_MDes1, cutoff=Inf, n.max=5)  # prueba de bonferroni para detectar outliers

##      rstudent unadjusted p-value Bonferroni p
## 14  -2.765920          0.0065033      0.86494
## 93   2.258715          0.0255670           NA
## 124  2.227489          0.0276340           NA
## 123  1.919754          0.0570800           NA
## 46   1.910527          0.0582680           NA

#veamos cuantos datos tenemos
Datos2<-Datos1[-14,]
str(Datos2)

## 'data.frame':    132 obs. of  16 variables:
##  $ Sitio : chr  "Ana" "Ana" "Ana" "Ana" ...
##  $ Tipo  : chr  "Natural" "Natural" "Natural" "Natural" ...
##  $ Db    : num  6.3 6.9 7 7.1 7.2 7.4 8.2 8.2 8.6 8.7 ...
##  $ Dn    : num  5.8 6.2 6.5 5 7.1 5.3 7.2 7.5 7.6 6.5 ...
##  $ DC1   : num  3.8 2.8 4.7 1.1 3 2.6 4.6 2.3 2.2 4.1 ...
##  $ DC2   : num  3.2 2.2 3.8 1.3 2.4 1 4.2 2.2 3.8 1.7 ...
##  $ H     : num  4.4 5.1 4.2 4.6 4.5 5 5.7 3.7 4 5.9 ...
##  $ B     : num  20.11 11.56 26.87 6.64 11.58 ...
##  $ DnH   : num  25.5 31.6 27.3 23 31.9 ...
##  $ Dn2   : num  33.6 38.4 42.2 25 50.4 ...
##  $ Dn2H  : num  148 196 177 115 227 ...
##  $ lnB   : num  3 2.45 3.29 1.89 2.45 ...
##  $ lnDn  : num  1.76 1.82 1.87 1.61 1.96 ...
##  $ lnH   : num  1.48 1.63 1.44 1.53 1.5 ...
##  $ lnDnH : num  3.24 3.45 3.31 3.14 3.46 ...
##  $ lnDn2H: num  5 5.28 5.18 4.74 5.42 ...

attach(Datos2)


#veamos si ya pasan lo supuestos
plot(lnDn2H,lnB, col="deepskyblue1",data=Datos2)  # la plot

Mod4_MDes2 <- lm(lnB ~ lnDn2H, data=Datos2);  summary(Mod4_MDes2 )   # el modelo

## 
## Call:
## lm(formula = lnB ~ lnDn2H, data = Datos2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.95675 -0.27174 -0.03517  0.24140  0.78804 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.93576    0.19184  -10.09   <2e-16 ***
## lnDn2H       0.88538    0.02927   30.25   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.358 on 130 degrees of freedom
## Multiple R-squared:  0.8756, Adjusted R-squared:  0.8747 
## F-statistic: 915.2 on 1 and 130 DF,  p-value: < 2.2e-16

abline(Mod4_MDes2)

#Revisar nuevamente los supuestos
check_heteroskedasticity (Mod4_MDes2)

## Warning: Heteroscedasticity (non-constant error variance) detected (p = 0.043).

check_autocorrelation (Mod4_MDes2)

## OK: Residuals appear to be independent and not autocorrelated (p = 0.236).

check_collinearity (Mod4_MDes2)

## NULL

check_normality (Mod4_MDes2)

## OK: residuals appear as normally distributed (p = 0.071).

Transformación Box Cox variable B

attach(Datos2)
Lambda<-car::powerTransform(B, family="bcPower")  #Buscamos Lambda
str(Lambda)

## List of 14
##  $ value      : num 500
##  $ counts     : Named int [1:2] 3 3
##   ..- attr(*, "names")= chr [1:2] "function" "gradient"
##  $ convergence: int 0
##  $ message    : chr "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
##  $ hessian    : num [1, 1] 218
##  $ start      : num 0.0947
##  $ lambda     : Named num 0.0947
##   ..- attr(*, "names")= chr "B"
##  $ roundlam   : Named num 0
##   ..- attr(*, "names")= chr "B"
##  $ invHess    : num [1, 1] 0.0046
##  $ llik       : num 500
##  $ family     : chr "bcPower"
##  $ xqr        :List of 4
##   ..$ qr   : num [1:132, 1] -11.489 0.087 0.087 0.087 0.087 ...
##   ..$ rank : int 1
##   ..$ qraux: num 1.09
##   ..$ pivot: int 1
##   ..- attr(*, "class")= chr "qr"
##  $ y          : num [1:132, 1] 20.11 11.56 26.87 6.64 11.58 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr "B"
##  $ x          : num [1:132, 1] 1 1 1 1 1 1 1 1 1 1 ...
##  - attr(*, "class")= chr "powerTransform"

Lambda$ start

## [1] 0.09474429

Datos2$B_Box<-B^Lambda$ start
View(Datos2)
attach(Datos2)

shapiro.test(B_Box) #  Prueba de normalidad

## 
##  Shapiro-Wilk normality test
## 
## data:  B_Box
## W = 0.98299, p-value = 0.09826

hist(B, col = 2)

hist(B_Box, col = 3)

Mod4_MDes3 <- lm(B_Box ~ lnDn2H, data=Datos2)   # La regresion con Biomasa normalizado
summary(Mod4_MDes3)

## 
## Call:
## lm(formula = B_Box ~ lnDn2H, data = Datos2)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.116360 -0.036558 -0.007177  0.034728  0.118994 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.670101   0.026394   25.39   <2e-16 ***
## lnDn2H      0.118816   0.004027   29.51   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04926 on 130 degrees of freedom
## Multiple R-squared:  0.8701, Adjusted R-squared:  0.8691 
## F-statistic: 870.6 on 1 and 130 DF,  p-value: < 2.2e-16

# revisar los supuestos

check_heteroskedasticity (Mod4_MDes3)

## OK: Error variance appears to be homoscedastic (p = 0.134).

check_autocorrelation (Mod4_MDes3)

## OK: Residuals appear to be independent and not autocorrelated (p = 0.062).

check_collinearity (Mod4_MDes3)

## NULL

check_normality (Mod4_MDes3)

## OK: residuals appear as normally distributed (p = 0.088).

# Modelo final

plot(lnDn2H, B_Box, col="deepskyblue1")      
abline(lm( B_Box~lnDn2H))

# Predecir Biomasa con el el modelo final

estimados<-(Mod4_MDes3$ fitted.values); estimados # estimados por el modelo

##         2         4         5         6         7         8         9        10 
## 1.2638636 1.2972531 1.2854131 1.2338757 1.3145917 1.2576294 1.3460021 1.3043583 
##        11        12        13        14        15        17        18        19 
## 1.3167689 1.3257949 1.3152733 1.3155157 1.3071049 1.3257949 1.3289579 1.3768006 
##        20        21        22        23        24        25        26        27 
## 1.3304339 1.2936106 1.3611714 1.3273918 1.3910358 1.3399633 1.2871083 1.3342659 
##        28        29        30        31        32        33        34        35 
## 1.3335889 1.3730813 1.3593688 1.4010947 1.3623942 1.4036963 1.3406227 1.4007533 
##        36        37        38        39        40        41        42        43 
## 1.3682620 1.4139428 1.3627440 1.3829803 1.2813608 1.3522203 1.3866441 1.3577692 
##        44        45        46        47        48        49        50        51 
## 1.3665720 1.3902914 1.3573588 1.4221746 1.4311561 1.4205829 1.4420347 1.4264299 
##        52        53        54        55        56        57        58        59 
## 1.4404769 1.4654634 1.4348656 1.4591531 1.4478287 1.4314456 1.4691689 1.5137400 
##        60        61        62        63        64        65        66        67 
## 1.4492388 1.4637496 1.4627816 1.4330272 1.4959591 1.5056445 1.5038024 1.4865965 
##        68        69        70        71        72        73        74        75 
## 1.4951481 1.4775183 1.4953771 1.5053704 1.4583599 1.4818085 1.4676800 1.5066305 
##        76        77        78        79        80        81        82        83 
## 1.5181311 1.4907706 1.5036932 1.5250129 1.5214615 1.5236792 1.5544949 1.5110320 
##        84        85        86        87        88        89        90        91 
## 1.5568285 1.5498131 1.5317147 1.5531723 1.5728603 1.5623752 1.5653985 1.5549452 
##        92        93        94        95        96        97        98        99 
## 1.6031854 1.5476368 1.6283978 1.5957644 1.6165863 1.5780235 1.6458825 1.6559367 
##       100       101       102       103       104       105       106       107 
## 1.6768089 1.5739051 1.6517042 1.6673779 1.6245150 1.6900466 1.4221860 1.4774132 
##       108       109       110       111       112       113       114       115 
## 1.3857761 1.3276160 1.4509580 1.3983431 1.4438876 1.4712263 1.4749885 1.4943072 
##       116       117       118       119       120       121       122       123 
## 1.5780235 1.5576481 1.6259506 1.7388648 1.7409681 0.9360171 1.0572658 1.1418180 
##       124       125       126       127       128       129       130       131 
## 1.2008310 1.3148492 1.3102471 1.3102471 1.3987645 1.4692158 1.4419865 1.4692158 
##       132       133       135       136 
## 1.3993752 1.3821962 1.5049691 1.5465125

regreso_unidades_originales<-(estimados^(1/Lambda$ start)); regreso_unidades_originales

##           2           4           5           6           7           8 
##  11.8417978  15.5935519  14.1551980   9.1906216  17.9392942  11.2396116 
##           9          10          11          12          13          14 
##  23.0166709  16.5189628  18.2553717  19.6202338  18.0377063  18.0728293 
##          15          17          18          19          20          21 
##  16.8898052  19.6202338  20.1199528  29.2244142  20.3570649  15.1375573 
##          22          23          24          25          26          27 
##  25.9067602  19.8711142  32.5759016  21.9498037  14.3534859  20.9845244 
##          28          29          30          31          32          33 
##  20.8724127  28.4018224  25.5469263  35.1499092  26.1534604  35.8449210 
##          34          35          36          37          38          39 
##  22.0640832  35.0596119  27.3671280  38.7049865  26.2244159  30.6389607 
##          40          41          42          43          44          45 
##  13.6912273  24.1640638  31.5066045  25.2314074  27.0124536  32.3923726 
##          46          47          48          49          50          51 
##  25.1510246  41.1505800  43.9778081  40.6670797  47.6370780  42.4688949 
##          52          53          54          55          56          57 
##  47.0966939  56.4703247  45.1959373  53.9559691  49.6964870  44.0717920 
##          58          59          60          61          62          63 
##  57.9957468  79.5045403  50.2097556  55.7771784  55.3890762  44.5884953 
##          64          65          66          67          68          69 
##  70.1826340  75.1297006  74.1651527  65.6826951  69.7820905  61.5705385 
##          70          71          72          73          74          75 
##  69.8949704  74.9854511  53.6472240  63.4838973  57.3784194  75.6506391 
##          76          77          78          79          80          81 
##  81.9728016  67.6556030  74.1083521  85.9808554  83.8908442  85.1905062 
##          82          83          84          85          86          87 
## 105.2379684  78.0161273 106.9174633 101.9403612  90.0537426 104.2967822 
##          88          89          90          91          92          93 
## 119.1271044 111.0071844 113.2955034 105.5602062 145.7294394 100.4395192 
##          94          95          96          97          98          99 
## 171.8201033 138.7649589 159.1124660 123.3199517 192.3225539 205.0909284 
##         100         101         102         103         104         105 
## 234.0790010 119.9649666 199.6251554 220.5507021 167.5449100 254.3361207 
##         106         107         108         109         110         111 
##  41.1540728  61.5243451  31.2990612  19.9065538  50.8420002  34.4280958 
##         112         113         114         115         116         117 
##  48.2870911  58.8587357  60.4669078  69.3689546 123.3199517 107.5130209 
##         118         119         120         121         122         123 
## 169.1142596 343.5102926 347.9212275   0.4976319   1.7999381   4.0543319 
##         124         125         126         127         128         129 
##   6.9009615  17.9764058  17.3233115  17.3233115  34.5377627  58.0152941 
##         130         131         132         133         135         136 
##  47.6202703  58.0152941  34.6972576  30.4561200  74.7747328  99.6720607

df<-data.frame(estimados, regreso_unidades_originales); df

Modelo final con mejor desempeño

library(report)
report(Mod4_MDes3)

## We fitted a linear model (estimated using OLS) to predict B_Box with lnDn2H
## (formula: B_Box ~ lnDn2H). The model explains a statistically significant and
## substantial proportion of variance (R2 = 0.87, F(1, 130) = 870.65, p < .001,
## adj. R2 = 0.87). The model's intercept, corresponding to lnDn2H = 0, is at 0.67
## (95% CI [0.62, 0.72], t(130) = 25.39, p < .001). Within this model:
## 
##   - The effect of lnDn2H is statistically significant and positive (beta = 0.12,
## 95% CI [0.11, 0.13], t(130) = 29.51, p < .001; Std. beta = 0.93, 95% CI [0.87,
## 1.00])
## 
## Standardized parameters were obtained by fitting the model on a standardized
## version of the dataset. 95% Confidence Intervals (CIs) and p-values were
## computed using a Wald t-distribution approximation.

tabla<-report_table(Mod4_MDes3)

library(DT)
datatable(tabla)

datatable(tabla, extensions = "Buttons", 
          options = list(dom = "Bfrtip",
                         buttons = c("copy", "csv", "excel", "pdf")))

library(equatiomatic)
extract_eq(Mod4_MDes3)                                      # El modelo final

\[ \operatorname{B\_Box} = \alpha + \beta_{1}(\operatorname{lnDn2H}) + \epsilon \]

extract_eq(Mod4_MDes3, wrap = TRUE, use_coefs = TRUE)       # El modelo final y coefficientes

\[ \begin{aligned} \operatorname{\widehat{B\_Box}} &= 0.67 + 0.12(\operatorname{lnDn2H}) \end{aligned} \]

Conclusión: El mejor modelo resulto el mod_4 con R^2= 0.8513 y RMSE=0.3944, pero no se cumplen con los supuestos, se utilizón la transformación de Box-Cox para transformar la variable B (Biomasa aérea), de modo que con la transformación Box Cox pasaron los supuestos de regresión.

Regresión Dummy

librerias Trabajo regresión Dummy

library(car)            
library(tidyverse)      
library(knitr)
library(ggplot2)
library(performance)
library(DT)
library(effects)
library(equatiomatic) 
library(vembedr)

Conversión de variable Tipo a Factor 0 y 1

Datos2$TipoD <- factor(Datos2$Tipo, labels=c("0", "1"))  
attach(Datos2)
view(Datos2)

Opción 1: Una Sola regresión

color<-as.factor(Tipo)
plot(lnDn2H, B_Box, col=color, cex=1.5, pch=16)

#Regresión sin procedencia (Tipo)
Fit0 <- lm(B_Box ~ lnDn2H)             # sin considear tipo
summary(Fit0)                         # los coeficicientes de regresion

## 
## Call:
## lm(formula = B_Box ~ lnDn2H)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.116360 -0.036558 -0.007177  0.034728  0.118994 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.670101   0.026394   25.39   <2e-16 ***
## lnDn2H      0.118816   0.004027   29.51   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04926 on 130 degrees of freedom
## Multiple R-squared:  0.8701, Adjusted R-squared:  0.8691 
## F-statistic: 870.6 on 1 and 130 DF,  p-value: < 2.2e-16

extract_eq(Fit0)                      # El modelo final

\[ \operatorname{B\_Box} = \alpha + \beta_{1}(\operatorname{lnDn2H}) + \epsilon \]

extract_eq(Fit0, wrap = TRUE, use_coefs = TRUE) # modelo final y coefficientes

\[ \begin{aligned} \operatorname{\widehat{B\_Box}} &= 0.67 + 0.12(\operatorname{lnDn2H}) \end{aligned} \]

Opción 2: Regresión Dummy (procedencia).

# Método 1
Fit1 <- lm(B_Box ~ Tipo)          # opcion 1: con la variable original
summary(Fit1)

## 
## Call:
## lm(formula = B_Box ~ Tipo)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.36904 -0.09392 -0.00619  0.08346  0.28424 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     1.45062    0.01224  118.52  < 2e-16 ***
## TipoPlantacion -0.10604    0.03631   -2.92  0.00412 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1324 on 130 degrees of freedom
## Multiple R-squared:  0.06157,    Adjusted R-squared:  0.05435 
## F-statistic: 8.529 on 1 and 130 DF,  p-value: 0.004123

plot(predictorEffects(Fit1))

extract_eq(Fit1)                      # extraer el modelo

\[ \operatorname{B\_Box} = \alpha + \beta_{1}(\operatorname{Tipo}_{\operatorname{Plantacion}}) + \epsilon \]

extract_eq(Fit1, wrap = TRUE, use_coefs = TRUE) # modelo final y coefficientes

\[ \begin{aligned} \operatorname{\widehat{B\_Box}} &= 1.45 - 0.11(\operatorname{Tipo}_{\operatorname{Plantacion}}) \end{aligned} \]

# Método 2

Fit1_a <- lm(B_Box ~ TipoD)     # opcion 2: con la variable dummy
summary(Fit1_a)

## 
## Call:
## lm(formula = B_Box ~ TipoD)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.36904 -0.09392 -0.00619  0.08346  0.28424 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.45062    0.01224  118.52  < 2e-16 ***
## TipoD1      -0.10604    0.03631   -2.92  0.00412 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1324 on 130 degrees of freedom
## Multiple R-squared:  0.06157,    Adjusted R-squared:  0.05435 
## F-statistic: 8.529 on 1 and 130 DF,  p-value: 0.004123

# Esto equivale

Plantacion   <- Datos2 %>% filter(Tipo == "Plantacion") %>% pull(B)
Natural<- Datos2 %>% filter(Tipo == "Natural") %>% pull(B)

t.test(
  x           = Plantacion,
  y           = Natural,
  alternative = "two.sided",
  mu          = 0,
  var.equal   = TRUE,
  conf.level  = 0.95
)

## 
##  Two Sample t-test
## 
## data:  Plantacion and Natural
## t = -1.9728, df = 130, p-value = 0.05064
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -75.7227010   0.1067352
## sample estimates:
## mean of x mean of y 
##  36.69800  74.50598

ggplot(Datos2, aes(x = Tipo, y = B, colour = Tipo)) + 
  geom_boxplot(outlier.shape = NA) +
  geom_jitter()

Opción 3: Intercept dummy (regresión con la misma pendiente).

# Método 1
Fit2_a <- lm(B_Box ~ factor(Tipo) + lnDn2H)   # con la variable original
summary(Fit2_a)

## 
## Call:
## lm(formula = B_Box ~ factor(Tipo) + lnDn2H)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.111286 -0.037356 -0.005348  0.034459  0.119619 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             0.65265    0.02817  23.165   <2e-16 ***
## factor(Tipo)Plantacion  0.02389    0.01416   1.688   0.0939 .  
## lnDn2H                  0.12109    0.00422  28.695   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04891 on 129 degrees of freedom
## Multiple R-squared:  0.8729, Adjusted R-squared:  0.8709 
## F-statistic: 442.9 on 2 and 129 DF,  p-value: < 2.2e-16

# Método 2
Fit2 <- lm(B_Box ~ TipoD + lnDn2H)          # con la variable dummy
summary(Fit2)

## 
## Call:
## lm(formula = B_Box ~ TipoD + lnDn2H)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.111286 -0.037356 -0.005348  0.034459  0.119619 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.65265    0.02817  23.165   <2e-16 ***
## TipoD1       0.02389    0.01416   1.688   0.0939 .  
## lnDn2H       0.12109    0.00422  28.695   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04891 on 129 degrees of freedom
## Multiple R-squared:  0.8729, Adjusted R-squared:  0.8709 
## F-statistic: 442.9 on 2 and 129 DF,  p-value: < 2.2e-16

extract_eq(Fit2)                      # extraer el modelo

\[ \operatorname{B\_Box} = \alpha + \beta_{1}(\operatorname{TipoD}_{\operatorname{1}}) + \beta_{2}(\operatorname{lnDn2H}) + \epsilon \]

extract_eq(Fit2, wrap = TRUE, use_coefs = TRUE) # modelo final y coefficientes

\[ \begin{aligned} \operatorname{\widehat{B\_Box}} &= 0.65 + 0.02(\operatorname{TipoD}_{\operatorname{1}}) + 0.12(\operatorname{lnDn2H}) \end{aligned} \]

color<-as.factor(Tipo)
plot(lnDn2H, B_Box, col=color, cex=1.5, pch=16)

points(lnDn2H, fitted.values(Fit2), lwd=3, col="blue")

Opción 4: Slope dummy (regresión con diferente intercepta y pendiente).

# Método 1

Fit3 <- lm(B_Box ~ TipoD + lnDn2H+ TipoD*lnDn2H)   # dummy
summary(Fit3)

## 
## Call:
## lm(formula = B_Box ~ TipoD + lnDn2H + TipoD * lnDn2H)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.103320 -0.035525 -0.004569  0.030738  0.114140 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.607305   0.030831  19.698  < 2e-16 ***
## TipoD1         0.201308   0.058057   3.467 0.000716 ***
## lnDn2H         0.127976   0.004631  27.632  < 2e-16 ***
## TipoD1:lnDn2H -0.030821   0.009801  -3.145 0.002068 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04731 on 128 degrees of freedom
## Multiple R-squared:  0.882,  Adjusted R-squared:  0.8792 
## F-statistic: 318.9 on 3 and 128 DF,  p-value: < 2.2e-16

plot(lnDn2H, B_Box, col=color, cex=1.5, pch=16)
points(lnDn2H, fitted.values(Fit3), lwd=3, col="#7FFF00")  # Agregar los estimados a la plot

extract_eq(Fit3)                      # extraeer el modelo

\[ \operatorname{B\_Box} = \alpha + \beta_{1}(\operatorname{TipoD}_{\operatorname{1}}) + \beta_{2}(\operatorname{lnDn2H}) + \beta_{3}(\operatorname{TipoD}_{\operatorname{1}} \times \operatorname{lnDn2H}) + \epsilon \]

extract_eq(Fit3, wrap = TRUE, use_coefs = TRUE) # modelo final y coefficientes

\[ \begin{aligned} \operatorname{\widehat{B\_Box}} &= 0.61 + 0.2(\operatorname{TipoD}_{\operatorname{1}}) + 0.13(\operatorname{lnDn2H}) - 0.03(\operatorname{TipoD}_{\operatorname{1}} \times \operatorname{lnDn2H}) \end{aligned} \]

Conclusión: La ecuación quedo lm(formula = B_Box ~ TipoD + lnDn2H + TipoD * lnDn2H), donde todos los parámetros son estadísticamente significativos, por lo tanto, se requiere una ecuación diferente para cada tipo de procedencia natural y plantación, para predecir Biomasa.

Cumplimiento de supuestos Dummy

check_heteroskedasticity (Fit3)

## OK: Error variance appears to be homoscedastic (p = 0.858).

check_autocorrelation (Fit3)

## OK: Residuals appear to be independent and not autocorrelated (p = 0.340).

check_collinearity (Fit3)

check_normality (Fit3)

## OK: residuals appear as normally distributed (p = 0.126).

Validación Bootstrap

Librerias

library(caret)        
library(DT)
library(tidyr)
library(performance)
library(car)
library(effects)
library(equatiomatic)
library(bootstrap)

Extracción de variables del resumen del modelo (Error del modelo, R2 ajustada y el coeficiente de variación)

# Modelo de regresión

FitBoot <- lm(B_Box ~ TipoD + lnDn2H+ TipoD*lnDn2H)   
summary(FitBoot)

## 
## Call:
## lm(formula = B_Box ~ TipoD + lnDn2H + TipoD * lnDn2H)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.103320 -0.035525 -0.004569  0.030738  0.114140 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.607305   0.030831  19.698  < 2e-16 ***
## TipoD1         0.201308   0.058057   3.467 0.000716 ***
## lnDn2H         0.127976   0.004631  27.632  < 2e-16 ***
## TipoD1:lnDn2H -0.030821   0.009801  -3.145 0.002068 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04731 on 128 degrees of freedom
## Multiple R-squared:  0.882,  Adjusted R-squared:  0.8792 
## F-statistic: 318.9 on 3 and 128 DF,  p-value: < 2.2e-16

# Error del modelo
sqrt((sum(FitBoot$residuals^2))/(length(FitBoot$residuals)-2))

## [1] 0.04694561

# Coeficiente de variación

(((sqrt((sum(FitBoot$residuals^2))/(length(FitBoot$residuals)-2))/(mean(mpg))*100)))

## [1] NA

# R2 ajustada 
summary(FitBoot)$adj.r.squared

## [1] 0.8792403

Creación de la función Bootstrap para el Error del modelo

## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = Datos2, statistic = variable_boot1, R = 500, formula = FitBoot)
## 
## 
## Bootstrap Statistics :
##       original        bias    std. error
## t1* 0.04694561 -0.0008714507  0.00256407

## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 500 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = output1, type = c("norm", "basic", "perc", 
##     "bca"))
## 
## Intervals : 
## Level      Normal              Basic         
## 95%   ( 0.0428,  0.0528 )   ( 0.0427,  0.0530 )  
## 
## Level     Percentile            BCa          
## 95%   ( 0.0409,  0.0512 )   ( 0.0431,  0.0535 )  
## Calculations and Intervals on Original Scale
## Some BCa intervals may be unstable

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  output1$t
## D = 0.028639, p-value = 0.4089

## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = Datos2, statistic = variable_boot, R = 500, formula = FitBoot)
## 
## 
## Bootstrap Statistics :
##      original      bias    std. error
## t1* 0.8792403 0.002771915  0.01948477

## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 500 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = output, type = c("norm", "basic", "perc", 
##     "bca"))
## 
## Intervals : 
## Level      Normal              Basic         
## 95%   ( 0.8383,  0.9147 )   ( 0.8395,  0.9217 )  
## 
## Level     Percentile            BCa          
## 95%   ( 0.8367,  0.9190 )   ( 0.8280,  0.9116 )  
## Calculations and Intervals on Original Scale
## Some BCa intervals may be unstable

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  output$t
## D = 0.049944, p-value = 0.004632

Conclusiones generales: El modelo final para predecir la biomasa Aérea de Prosopis grandulosa, de acuerdo a las procedencias del material bosque natural y plantación se requieren dos ecuaciones cada una con diferente intercepta y pendiente quedando de la siguiente manera:

Ecuación para procedencia natural (0):

B_Box = β0 + β1 (lnDn2H) + ε

Ecuación para procedencia de plantación (1):

B_Box = β0 + β1(TipoD1) + β2 (lnDn2H) + β3 (TipoD1 x lnDn2H) + ε

La regresión Dummy, es práctico cuando se cuenta con variables categoricas o clase como en el ejemplo desarrollado, convirtiendo a variables dummy la procedencia nos permitio desarrollo si es necesario utilizar una ecuación o separar en ecuaciones diferentes para estimar biomasa de acuerdo al tipo de origen o procedencia de un bosque natural o de una plantación. Por lo tanto, los datos pertecenecen a dos categorias o clases de las observaciones (Natural y Plantación).

Regresión Dummy

Juan Ramírez Méndez

2023-11-20

LIBRERIAS DE TRABAJO

Base de datos

Modelo a ajustar

Generación de variables

Correlación

Ajuste de los modelos

Seleccionando el mejor modelo

Visualizando todos los modelos

Verificando el cumplimiento o no de los supuestos

Modelo con mejor desempeño

Corrección de los supuestos

verificación de datos atípicos

Transformación Box Cox variable B

Modelo final con mejor desempeño

Regresión Dummy

librerias Trabajo regresión Dummy

Conversión de variable Tipo a Factor 0 y 1

Opción 1: Una Sola regresión

Opción 2: Regresión Dummy (procedencia).

Opción 3: Intercept dummy (regresión con la misma pendiente).

Opción 4: Slope dummy (regresión con diferente intercepta y pendiente).

Cumplimiento de supuestos Dummy

Validación Bootstrap

Librerias

Extracción de variables del resumen del modelo (Error del modelo, R2 ajustada y el coeficiente de variación)

Creación de la función Bootstrap para el Error del modelo