IMPORTANT: in this video I explain everything in this script

https://www.youtube.com/watch?v=5cDtzvCDzy0

The data

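library(tidyr)                  # provides drop_na() and the %>% pipe
setwd("C:/CURSO REG JMG")       # set the working directory (adjust to your machine)
data("airquality")              # built-in data set
Datos <- airquality             # working copy
Datos <- Datos %>% drop_na()    # drop rows that contain NA
attach(Datos)                   # crucial: later calls refer to Temp and Ozone directly
str(Datos)                      # structure of the data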
## 'data.frame':    111 obs. of  6 variables:
##  $ Ozone  : int  41 36 12 18 23 19 8 16 11 14 ...
##  $ Solar.R: int  190 118 149 313 299 99 19 256 290 274 ...
##  $ Wind   : num  7.4 8 12.6 11.5 8.6 13.8 20.1 9.7 9.2 10.9 ...
##  $ Temp   : int  67 72 74 62 65 59 61 69 66 68 ...
##  $ Month  : int  5 5 5 5 5 5 5 5 5 5 ...
##  $ Day    : int  1 2 3 4 7 8 9 12 13 14 ...

The initial model

plot(Temp, Ozone, col="deepskyblue1")  # scatterplot
mod <- lm(Ozone ~ Temp, data=Datos)    # simple linear model
abline(mod, col="red")                 # fitted (predicted) line

summary(mod)                           # model summary
## 
## Call:
## lm(formula = Ozone ~ Temp, data = Datos)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -40.922 -17.459  -0.874  10.444 118.078 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -147.6461    18.7553  -7.872 2.76e-12 ***
## Temp           2.4391     0.2393  10.192  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 23.92 on 109 degrees of freedom
## Multiple R-squared:  0.488,  Adjusted R-squared:  0.4833 
## F-statistic: 103.9 on 1 and 109 DF,  p-value: < 2.2e-16
library(gvlma)                         # package for checking the assumptions
gvlma(mod)                             # one way to check the assumptions
## 
## Call:
## lm(formula = Ozone ~ Temp, data = Datos)
## 
## Coefficients:
## (Intercept)         Temp  
##    -147.646        2.439  
## 
## 
## ASSESSMENT OF THE LINEAR MODEL ASSUMPTIONS
## USING THE GLOBAL TEST ON 4 DEGREES-OF-FREEDOM:
## Level of Significance =  0.05 
## 
## Call:
##  gvlma(x = mod) 
## 
##                        Value   p-value                   Decision
## Global Stat        1.924e+02 0.000e+00 Assumptions NOT satisfied!
## Skewness           4.941e+01 2.076e-12 Assumptions NOT satisfied!
## Kurtosis           1.312e+02 0.000e+00 Assumptions NOT satisfied!
## Link Function      1.185e+01 5.764e-04 Assumptions NOT satisfied!
## Heteroscedasticity 5.076e-03 9.432e-01    Assumptions acceptable.

Easier interpretation

library(performance)
## Warning: package 'performance' was built under R version 4.1.3
#library(easystats)
library(car)
#library(graphics)

# check the assumptions
check_heteroskedasticity (mod)  # errors should have constant variance
## Warning: Heteroscedasticity (non-constant error variance) detected (p = 0.019).
check_autocorrelation (mod)     # residuals should be uncorrelated
## OK: Residuals appear to be independent and not autocorrelated (p = 0.408).
check_collinearity (mod)        # independent variables (x) should not be correlated
## Warning: Not enough model terms in the conditional part of the model to check for
##   multicollinearity.
## NULL
check_normality (mod)           # normality of the residuals (errors)
## Warning: Non-normality of residuals detected (p < .001).
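
As a rough cross-check of the messages above, the same residuals can be inspected with base R only. This is a minimal sketch, assuming `na.omit()` yields the same 111 complete rows as `drop_na()`; it is not how `performance` computes its checks internally.

```r
# Base-R cross-check of the residual assumptions (no extra packages needed).
Datos <- na.omit(airquality)              # complete rows only
mod   <- lm(Ozone ~ Temp, data = Datos)

res <- residuals(mod)
sw  <- shapiro.test(res)                  # Shapiro-Wilk test on the residuals
print(sw)

# Diagnostic plots: residuals vs fitted (spread) and normal Q-Q (shape)
op <- par(mfrow = c(1, 2))
plot(mod, which = 1)
plot(mod, which = 2)
par(op)
```

A funnel shape in the first panel points to non-constant variance; points leaving the line in the second panel point to non-normal residuals.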

Correcting the regression assumptions

1. Outlier detection

The concepts of outlier and influential point are different and are defined as follows. Outlier: an observation that is numerically distant from the rest of the data. Influential point: a point that has an impact on the model estimates.
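
The distinction can be made concrete with three base-R diagnostics. The sketch below is illustrative only: the cutoffs (|ri| > 3, hi > 2p/n, Di > 4/n) are common rules of thumb assumed here, not values taken from this script, and `flags` is a hypothetical helper table.

```r
Datos <- na.omit(airquality)
rownames(Datos) <- NULL                   # labels 1..111, as in the output below
mod <- lm(Ozone ~ Temp, data = Datos)

n  <- nrow(Datos)
p  <- length(coef(mod))
ri <- rstudent(mod)          # large |ri| -> outlying response (outlier)
hi <- hatvalues(mod)         # large hi   -> unusual predictor value (leverage)
Di <- cooks.distance(mod)    # large Di   -> strong effect on the estimates

# Assumed rules of thumb, for illustration only
flags <- data.frame(ri, hi, Di,
                    outlier     = abs(ri) > 3,
                    leverage    = hi > 2 * p / n,
                    influential = Di > 4 / n)
flags[flags$outlier | flags$influential, ]
```

A point can be an outlier without being influential, and the other way round, which is why both columns are kept.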

library(car)
library(performance)

# Look for outliers using ri (studentized residuals), as we have traditionally done
Residuales<-cbind(ei=residuals(mod, type='working'),  # ordinary residuals
                  pi=residuals(mod, type='deviance'), 
                  pe=residuals(mod, type='pearson'),
                  pa=residuals(mod, type='partial'),   # partial; the column prints as "Temp"
                  di=rstandard(mod),                   # standardized
                  ri=rstudent(mod)); Residuales        # studentized
##               ei           pi           pe       Temp           di          ri
## 1    25.22570871  25.22570871  25.22570871  -1.099099  1.065645822  1.06631545
## 2     8.03015918   8.03015918   8.03015918  -6.099099  0.337800946  0.33642397
## 3   -20.84806063 -20.84806063 -20.84806063 -30.099099 -0.876154823 -0.87521386
## 4    14.42125824  14.42125824  14.42125824 -24.099099  0.613399170  0.61163550
## 5    12.10392852  12.10392852  12.10392852 -19.099099  0.512560745  0.51082011
## 6    22.73858795  22.73858795  22.73858795 -23.099099  0.972412941  0.97216807
## 7     6.86036814   6.86036814   6.86036814 -34.099099  0.292295333  0.29106553
## 8    -4.65251110  -4.65251110  -4.65251110 -26.099099 -0.196150433 -0.19528306
## 9    -2.33518138  -2.33518138  -2.33518138 -31.099099 -0.098762481 -0.09831280
## 10   -4.21340120  -4.21340120  -4.21340120 -28.099099 -0.177805796 -0.17701397
## 11   24.17769786  24.17769786  24.17769786 -24.099099  1.036052084  1.03640435
## 12    5.54303843   5.54303843   5.54303843 -28.099099  0.235050192  0.23402881
## 13   20.66481862  20.66481862  20.66481862  -8.099099  0.873982964  0.87302900
## 14   14.61680777  14.61680777  14.61680777 -36.099099  0.627694067  0.62594041
## 15   11.78659880  11.78659880  11.78659880 -12.099099  0.497395212  0.49567117
## 16    7.42125824   7.42125824   7.42125824 -31.099099  0.315658562  0.31435096
## 17    4.73858795   4.73858795   4.73858795 -41.099099  0.202645136  0.20175144
## 18  -19.40895072 -19.40895072 -19.40895072 -31.099099 -0.816029615 -0.81477035
## 19    2.86036814   2.86036814   2.86036814 -38.099099  0.121869882  0.12131782
## 20   30.86036814  30.86036814  30.86036814 -10.099099  1.314848035  1.31930717
## 21    7.22570871   7.22570871   7.22570871 -19.099099  0.305245985  0.30397250
## 22   -4.92182997  -4.92182997  -4.92182997   2.900901 -0.206800594 -0.20589018
## 23   69.95638984  69.95638984  69.95638984  72.900901  2.938047423  3.04770768
## 24   -0.72628044  -0.72628044  -0.72628044  -5.099099 -0.030505230 -0.03036510
## 25  -23.36093987 -23.36093987 -23.36093987 -13.099099 -0.981925053 -0.98176224
## 26   -0.87381912  -0.87381912  -0.87381912  28.900901 -0.036975519 -0.03680575
## 27  -25.55648940 -25.55648940 -25.55648940  -3.099099 -1.077874042 -1.07868242
## 28  -29.36093987 -29.36093987 -29.36093987 -19.099099 -1.234121684 -1.23712108
## 29  -19.16539035 -19.16539035 -19.16539035 -21.099099 -0.804879563 -0.80357048
## 30    9.03015918   9.03015918   9.03015918  -5.099099  0.379867478  0.37837149
## 31    9.10392852   9.10392852   9.10392852 -22.099099  0.385520815  0.38401019
## 32  -18.40895072 -18.40895072 -18.40895072 -30.099099 -0.773985631 -0.77255292
## 33  -24.72628044 -24.72628044 -24.72628044 -29.099099 -1.038553189 -1.03893128
## 34   77.76084032  77.76084032  77.76084032  92.900901  3.271953522  3.42968941
## 35  -10.67826959 -10.67826959 -10.67826959   6.900901 -0.449616912 -0.44796530
## 36  -17.92182997 -17.92182997 -17.92182997 -10.099099 -0.753021762 -0.75151690
## 37    9.19995022   9.19995022   9.19995022  21.900901  0.386883730  0.38536963
## 38  -14.80004978 -14.80004978 -14.80004978  -2.099099 -0.622383636 -0.62062585
## 39   10.00440069  10.00440069  10.00440069  34.900901  0.422364905  0.42076745
## 40   20.24796107  20.24796107  20.24796107  54.900901  0.859119679  0.85807983
## 41   20.24796107  20.24796107  20.24796107  54.900901  0.859119679  0.85807983
## 42   15.56529079  15.56529079  15.56529079  42.900901  0.657853508  0.65613272
## 43  -20.40895072 -20.40895072 -20.40895072 -32.099099 -0.858073599 -0.85702791
## 44  -22.92182997 -22.92182997 -22.92182997 -15.099099 -0.963106827 -0.96278406
## 45  -40.48272006 -40.48272006 -40.48272006 -35.099099 -1.700497555 -1.71558877
## 46   -1.92182997  -1.92182997  -1.92182997   5.900901 -0.080749555 -0.08038069
## 47  -17.36093987 -17.36093987 -17.36093987  -7.099099 -0.729728423 -0.72815415
## 48    3.76084032   3.76084032   3.76084032  18.900901  0.158245393  0.15753592
## 49   14.44351060  14.44351060  14.44351060  36.900901  0.609171506  0.60740554
## 50    3.32173041   3.32173041   3.32173041  20.900901  0.139864063  0.13923350
## 51  -16.84806063 -16.84806063 -16.84806063 -26.099099 -0.708051931 -0.70642294
## 52   17.88262051  17.88262051  17.88262051  37.900901  0.753551710  0.75204856
## 53   48.32173041  48.32173041  48.32173041  65.900901  2.034624338  2.06485974
## 54  -32.36093987 -32.36093987 -32.36093987 -22.099099 -1.360219999 -1.36560579
## 55  -10.11737949 -10.11737949 -10.11737949   9.900901 -0.426333971 -0.42472808
## 56   15.00440069  15.00440069  15.00440069  39.900901  0.633454463  0.63170584
## 57  -12.11737949 -12.11737949 -12.11737949   7.900901 -0.510611520 -0.50887284
## 58    9.19995022   9.19995022   9.19995022  21.900901  0.386883730  0.38536963
## 59    9.07817003   9.07817003   9.07817003  16.900901  0.381437588  0.37993750
## 60  -10.92182997 -10.92182997 -10.92182997  -3.099099 -0.458902671 -0.45723467
## 61  -40.92182997 -40.92182997 -40.92182997 -33.099099 -1.719413060 -1.73520116
## 62  -36.36093987 -36.36093987 -36.36093987 -26.099099 -1.528351086 -1.53789181
## 63   52.56529079  52.56529079  52.56529079  79.900901  2.221626399  2.26324653
## 64   17.12618088  17.12618088  17.12618088  46.900901  0.724691666  0.72310385
## 65   38.12618088  38.12618088  38.12618088  67.900901  1.613303383  1.62540921
## 66  -18.11737949 -18.11737949 -18.11737949   1.900901 -0.763444166 -0.76197401
## 67  -24.36093987 -24.36093987 -24.36093987 -14.099099 -1.023957825 -1.02418777
## 68   17.51727994  17.51727994  17.51727994  22.900901  0.735822387  0.73426520
## 69  -18.16539035 -18.16539035 -18.16539035 -20.099099 -0.762883050 -0.76141097
## 70   13.95638984  13.95638984  13.95638984  16.900901  0.586144244  0.58437100
## 71  -14.72628044 -14.72628044 -14.72628044 -19.099099 -0.618533206 -0.61677273
## 72  -11.60450025 -11.60450025 -11.60450025 -11.099099 -0.487334131 -0.48562284
## 73    1.39549975   1.39549975   1.39549975   1.900901  0.058604390  0.05833586
## 74  -19.16539035 -19.16539035 -19.16539035 -21.099099 -0.804879563 -0.80357048
## 75  -18.96984082 -18.96984082 -18.96984082 -33.099099 -0.797995411 -0.79665696
## 76   -0.04361016  -0.04361016  -0.04361016   2.900901 -0.001831551 -0.00182313
## 77  118.07817003 118.07817003 118.07817003 125.900901  4.961292001  5.61270921
## 78   10.88262051  10.88262051  10.88262051  30.900901  0.458580290  0.45691284
## 79  -12.94758846 -12.94758846 -12.94758846  33.900901 -0.554159269 -0.55239009
## 80   36.36974126  36.36974126  36.36974126  75.900901  1.548028066  1.55813382
## 81   -2.50847855  -2.50847855  -2.50847855  41.900901 -0.107153258 -0.10666622
## 82    3.36974126   3.36974126   3.36974126  42.900901  0.143428407  0.14278244
## 83   21.68707098  21.68707098  21.68707098  53.900901  0.918883233  0.91822178
## 84    1.24796107   1.24796107   1.24796107  35.900901  0.052950908  0.05270813
## 85   -6.19114883  -6.19114883  -6.19114883  30.900901 -0.263089312 -0.26196289
## 86   11.80885117  11.80885117  11.80885117  48.900901  0.501810344  0.50008114
## 87  -17.55648940 -17.55648940 -17.55648940   4.900901 -0.740464932 -0.73892127
## 88  -25.23915968 -25.23915968 -25.23915968 -10.099099 -1.061991577 -1.06262061
## 89  -27.48272006 -27.48272006 -27.48272006 -22.099099 -1.154425844 -1.15620809
## 90  -19.60450025 -19.60450025 -19.60450025 -19.099099 -0.823296298 -0.82207102
## 91  -14.28717053 -14.28717053 -14.28717053 -21.099099 -0.600228781 -0.59845895
## 92   -6.40895072  -6.40895072  -6.40895072 -18.099099 -0.269457822 -0.26830831
## 93   -5.92182997  -5.92182997  -5.92182997   1.900901 -0.248817607 -0.24774398
## 94  -16.72628044 -16.72628044 -16.72628044 -21.099099 -0.702537203 -0.70089578
## 95  -12.16539035 -12.16539035 -12.16539035 -14.099099 -0.510903973 -0.50916499
## 96  -16.53073091 -16.53073091 -16.53073091 -33.099099 -0.695834449 -0.69417870
## 97  -12.53073091 -12.53073091 -12.53073091 -29.099099 -0.527460902 -0.52570713
## 98    3.39549975   3.39549975   3.39549975   3.900901  0.142594932  0.14195256
## 99    2.22570871   2.22570871   2.22570871 -24.099099  0.094023808  0.09359531
## 100 -24.72628044 -24.72628044 -24.72628044 -29.099099 -1.038553189 -1.03893128
## 101   5.78659880   5.78659880   5.78659880 -18.099099  0.244194834  0.24313861
## 102 -36.36093987 -36.36093987 -36.36093987 -26.099099 -1.528351086 -1.53789181
## 103   4.54303843   4.54303843   4.54303843 -29.099099  0.192645616  0.19179254
## 104  -2.53073091  -2.53073091  -2.53073091 -19.099099 -0.106527035 -0.10604277
## 105 -13.92182997 -13.92182997 -13.92182997  -6.099099 -0.584953710 -0.58318033
## 106 -13.65251110 -13.65251110 -13.65251110 -35.099099 -0.575591527 -0.57381784
## 107   7.98214833   7.98214833   7.98214833 -28.099099  0.338978946  0.33759841
## 108   6.90837899   6.90837899   6.90837899 -12.099099  0.291012520  0.28978712
## 109 -21.28717053 -21.28717053 -21.28717053 -28.099099 -0.894310906 -0.89348313
## 110 -19.72628044 -19.72628044 -19.72628044 -24.099099 -0.828543198 -0.82734321
## 111   1.78659880   1.78659880   1.78659880 -22.099099  0.075394582  0.07504990
boxplot(Residuales, col=terrain.colors(4))        # all residual types

ri=rstudent(mod)           # extract the studentized residuals (the important ones)
boxplot(ri)                # boxplot of the studentized residuals

outlierTest(mod, cutoff=Inf, n.max=5)  # Bonferroni test for outliers
##    rstudent unadjusted p-value Bonferroni p
## 77 5.612709         1.5566e-07   1.7278e-05
## 34 3.429689         8.5674e-04   9.5098e-02
## 23 3.047708         2.8994e-03   3.2183e-01
## 63 2.263247         2.5620e-02           NA
## 53 2.064860         4.1331e-02           NA
str(Datos)
## 'data.frame':    111 obs. of  6 variables:
##  $ Ozone  : int  41 36 12 18 23 19 8 16 11 14 ...
##  $ Solar.R: int  190 118 149 313 299 99 19 256 290 274 ...
##  $ Wind   : num  7.4 8 12.6 11.5 8.6 13.8 20.1 9.7 9.2 10.9 ...
##  $ Temp   : int  67 72 74 62 65 59 61 69 66 68 ...
##  $ Month  : int  5 5 5 5 5 5 5 5 5 5 ...
##  $ Day    : int  1 2 3 4 7 8 9 12 13 14 ...
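
The Bonferroni column printed by `outlierTest()` can be reproduced by hand from the studentized residuals. This is a sketch of the usual calculation for a linear model, assuming a t reference distribution with one degree of freedom less than the residual degrees of freedom; products above 1 are shown as `NA`, as in the table above.

```r
Datos <- na.omit(airquality)
rownames(Datos) <- NULL
mod <- lm(Ozone ~ Temp, data = Datos)

ri  <- rstudent(mod)
n   <- length(ri)                          # 111 observations
dfr <- df.residual(mod) - 1                # 109 - 1 = 108
p_unadj <- 2 * pt(abs(ri), df = dfr, lower.tail = FALSE)   # two-sided p-value
p_bonf  <- n * p_unadj                     # Bonferroni: multiply by number of tests
p_bonf[p_bonf > 1] <- NA                   # products above 1 are not reported

top <- order(abs(ri), decreasing = TRUE)[1:5]
data.frame(rstudent = ri[top], p_unadj = p_unadj[top], p_bonf = p_bonf[top])
```

For observation 77 this gives 111 times 1.5566e-07, which is about 1.7278e-05, the value in the table.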

Removing the outliers

Datos1<-Datos[-c(23, 34, 77),]
str(Datos1)
## 'data.frame':    108 obs. of  6 variables:
##  $ Ozone  : int  41 36 12 18 23 19 8 16 11 14 ...
##  $ Solar.R: int  190 118 149 313 299 99 19 256 290 274 ...
##  $ Wind   : num  7.4 8 12.6 11.5 8.6 13.8 20.1 9.7 9.2 10.9 ...
##  $ Temp   : int  67 72 74 62 65 59 61 69 66 68 ...
##  $ Month  : int  5 5 5 5 5 5 5 5 5 5 ...
##  $ Day    : int  1 2 3 4 7 8 9 12 13 14 ...

Are there no outliers left? Let's fit the model again

mod1 <- lm(Ozone ~ Temp, data=Datos1)    # the model
outlierTest(mod1, cutoff=Inf, n.max=5)  # Bonferroni test for outliers
##     rstudent unadjusted p-value Bonferroni p
## 63  3.177276          0.0019528      0.21090
## 53  2.881109          0.0048053      0.51897
## 65  2.319358          0.0223110           NA
## 80  2.253883          0.0262810           NA
## 61 -2.091863          0.0388630           NA

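Let's see how many observations we have

# Note: -63 is a positional index. After the earlier removals, position 63 holds
# the row labelled "65", so this drops row 65 (see the fitted values further down).
Datos_2<-Datos1[-63,]
str(Datos_2)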
## 'data.frame':    107 obs. of  6 variables:
##  $ Ozone  : int  41 36 12 18 23 19 8 16 11 14 ...
##  $ Solar.R: int  190 118 149 313 299 99 19 256 290 274 ...
##  $ Wind   : num  7.4 8 12.6 11.5 8.6 13.8 20.1 9.7 9.2 10.9 ...
##  $ Temp   : int  67 72 74 62 65 59 61 69 66 68 ...
##  $ Month  : int  5 5 5 5 5 5 5 5 5 5 ...
##  $ Day    : int  1 2 3 4 7 8 9 12 13 14 ...
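
Because `outlierTest()` reports row labels while a negative index removes by position, the two can disagree once rows have already been dropped. The sketch below illustrates the difference; `by_position` and `by_label` are hypothetical names introduced only for this comparison.

```r
Datos <- na.omit(airquality)
rownames(Datos) <- NULL
Datos1 <- Datos[-c(23, 34, 77), ]          # labels now skip 23, 34 and 77

rownames(Datos1)[63]                       # label sitting at position 63

by_position <- Datos1[-63, ]                         # removes the 63rd row
by_label    <- Datos1[rownames(Datos1) != "63", ]    # removes the row labelled 63

setdiff(rownames(Datos1), rownames(by_position))     # label actually dropped
setdiff(rownames(Datos1), rownames(by_label))
```

If the intent is to remove the observation flagged by the test, filtering on the label is the safer form.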

Let's see whether the assumptions are met now

plot(Ozone ~ Temp, data=Datos_2, col="deepskyblue1")     # scatterplot of the data used by mod2
mod2 <- lm(Ozone ~ Temp, data=Datos_2);  summary(mod2)   # the model
## 
## Call:
## lm(formula = Ozone ~ Temp, data = Datos_2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -37.565 -14.557   1.435  12.646  57.128 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -138.7847    14.4150  -9.628  4.2e-16 ***
## Temp           2.2883     0.1844  12.408  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 18.24 on 105 degrees of freedom
## Multiple R-squared:  0.5945, Adjusted R-squared:  0.5907 
## F-statistic:   154 on 1 and 105 DF,  p-value: < 2.2e-16
abline(mod2)

# check the assumptions
check_heteroskedasticity (mod2)  # errors should have constant variance
## Warning: Heteroscedasticity (non-constant error variance) detected (p = 0.015).
check_autocorrelation (mod2)     # residuals should be uncorrelated
## OK: Residuals appear to be independent and not autocorrelated (p = 0.100).
check_collinearity (mod2)        # independent variables (x) should not be correlated
## Warning: Not enough model terms in the conditional part of the model to check for
##   multicollinearity.
## NULL
check_normality (mod2)           # normality of the residuals (errors)
## OK: residuals appear as normally distributed (p = 0.077).

Option 2. Weighted regression

mod3 <- lm(Ozone ~ Temp, data=Datos_2, weights = 1/(Ozone))  # weights; common choices: 1/x, 1/x^2, sqrt(x)
summary(mod3)    
## 
## Call:
## lm(formula = Ozone ~ Temp, data = Datos_2, weights = 1/(Ozone))
## 
## Weighted Residuals:
##    Min     1Q Median     3Q    Max 
## -9.633 -1.210  1.314  2.995  6.912 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -84.5220     9.8233  -8.604 8.13e-14 ***
## Temp          1.4626     0.1382  10.583  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.159 on 105 degrees of freedom
## Multiple R-squared:  0.5161, Adjusted R-squared:  0.5115 
## F-statistic:   112 on 1 and 105 DF,  p-value: < 2.2e-16
# check the assumptions
check_heteroskedasticity (mod3)  # errors should have constant variance
## Warning: Heteroscedasticity (non-constant error variance) detected (p = 0.024).
check_autocorrelation (mod3)     # residuals should be uncorrelated
## Warning: Autocorrelated residuals detected (p < .001).
check_collinearity (mod3)        # independent variables (x) should not be correlated
## Warning: Not enough model terms in the conditional part of the model to check for
##   multicollinearity.
## NULL
check_normality (mod3)           # normality of the residuals (errors)
## Warning: Non-normality of residuals detected (p = 0.005).

Option 3. Natural-log transformation of the dependent variable (y); the model below also logs Temp

mod5 <- lm(log(Ozone) ~ log(Temp), data=Datos_2)    # the log-log model
summary(mod5)      
## 
## Call:
## lm(formula = log(Ozone) ~ log(Temp), data = Datos_2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.05667 -0.31045  0.05166  0.38289  1.24546 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -17.9536     1.8435  -9.739 2.36e-16 ***
## log(Temp)     4.9075     0.4243  11.567  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5552 on 105 degrees of freedom
## Multiple R-squared:  0.5603, Adjusted R-squared:  0.5561 
## F-statistic: 133.8 on 1 and 105 DF,  p-value: < 2.2e-16
# check the assumptions
check_heteroskedasticity (mod5)  # errors should have constant variance
## Warning: Heteroscedasticity (non-constant error variance) detected (p < .001).
check_autocorrelation (mod5)     # residuals should be uncorrelated
## OK: Residuals appear to be independent and not autocorrelated (p = 0.268).
check_collinearity (mod5)        # independent variables (x) should not be correlated
## Warning: Not enough model terms in the conditional part of the model to check for
##   multicollinearity.
## NULL
check_normality (mod5)           # normality of the residuals (errors)
## Warning: Non-normality of residuals detected (p = 0.005).

Option 4. Box-Cox transformation of the Ozone variable

# View(Datos_2)
attach(Datos_2)
Lambda<-car::powerTransform(Ozone, family="bcPower")  # search for lambda
# str(Lambda)
Lambda$start   # optimizer starting value, used here as lambda; coef(Lambda) returns the final estimate
## [1] 0.238367

Has the variable been normalized?

Datos_2$Ozone_Box<-Ozone^Lambda$start   # plain power transform, y^lambda
View(Datos_2)
attach(Datos_2)
shapiro.test(Ozone_Box) # normality test on a vector (not a data frame)
## 
##  Shapiro-Wilk normality test
## 
## data:  Ozone_Box
## W = 0.98179, p-value = 0.1505
hist(Ozone_Box, col = 3)
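
The script raises Ozone to the power lambda, whereas the Box-Cox family is usually written as (y^lambda - 1)/lambda, with log(y) when lambda is 0. The sketch below writes both out in base R; `box_cox` and `inv_box_cox` are hypothetical helpers, and the sample readings are made up for illustration.

```r
# Box-Cox power family and its inverse, written out in base R.
box_cox <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}
inv_box_cox <- function(z, lambda) {
  if (abs(lambda) < 1e-8) exp(z) else (lambda * z + 1)^(1 / lambda)
}

lambda <- 0.238367                 # value printed above
y <- c(8, 23, 41, 97)              # illustrative ozone readings (made up)
z_bc  <- box_cox(y, lambda)        # scaled Box-Cox transform
z_pow <- y^lambda                  # plain power, as used in this script

# The two versions differ only by a linear rescaling
all.equal(z_bc, (z_pow - 1) / lambda)
# The inverse recovers the original values
all.equal(inv_box_cox(z_bc, lambda), y)
```

Since the two forms are linearly related, the Shapiro-Wilk statistic and the R-squared of a regression on either version are the same; only the scale of the coefficients changes.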

Regression with normalized ozone

mod4 <- lm(Ozone_Box ~ Temp, data=Datos_2)    # the model
summary(mod4)                    
## 
## Call:
## lm(formula = Ozone_Box ~ Temp, data = Datos_2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.76478 -0.18871  0.00496  0.20944  0.59107 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.430691   0.213938  -2.013   0.0467 *  
## Temp         0.034821   0.002737  12.722   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2708 on 105 degrees of freedom
## Multiple R-squared:  0.6065, Adjusted R-squared:  0.6028 
## F-statistic: 161.9 on 1 and 105 DF,  p-value: < 2.2e-16
# check the assumptions
check_heteroskedasticity (mod4)  # errors should have constant variance
## OK: Error variance appears to be homoscedastic (p = 0.096).
check_autocorrelation (mod4)     # residuals should be uncorrelated
## OK: Residuals appear to be independent and not autocorrelated (p = 0.344).
check_collinearity (mod4)        # independent variables (x) should not be correlated
## Warning: Not enough model terms in the conditional part of the model to check for
##   multicollinearity.
## NULL
check_normality (mod4)           # normality of the residuals (errors)
## OK: residuals appear as normally distributed (p = 0.254).

And how did the final model turn out?

plot(Temp, Ozone, col="deepskyblue1")      # plot of the model that fails the assumptions
abline(lm(Ozone ~Temp))

plot(Temp, Ozone_Box, col="deepskyblue1")  # plot of the model that meets the assumptions
abline(mod4)

# predict ozone with the final model
estimados<-(mod4$fitted.values); estimados            # values fitted by the model
##        1        2        3        4        5        6        7        8 
## 1.902285 2.076388 2.146029 1.728182 1.832644 1.623721 1.693362 1.971926 
##        9       10       11       12       13       14       15       16 
## 1.867464 1.937106 1.588900 1.797823 1.867464 1.554080 1.937106 1.728182 
##       17       18       19       20       21       22       24       25 
## 1.623721 2.111208 1.693362 1.693362 1.902285 2.389773 2.215670 2.424593 
##       26       27       28       29       30       31       32       33 
## 2.703157 2.598696 2.424593 2.250490 2.076388 1.832644 2.111208 2.215670 
##       35       36       37       38       39       40       41       42 
## 2.529055 2.389773 2.459414 2.459414 2.633516 2.772799 2.772799 2.668337 
##       43       44       45       46       47       48       49       50 
## 2.111208 2.389773 2.354952 2.389773 2.424593 2.494234 2.598696 2.529055 
##       51       52       53       54       55       56       57       58 
## 2.146029 2.563875 2.529055 2.424593 2.563875 2.633516 2.563875 2.459414 
##       59       60       61       62       63       64       66       67 
## 2.389773 2.389773 2.389773 2.424593 2.668337 2.703157 2.563875 2.424593 
##       68       69       70       71       72       73       74       75 
## 2.354952 2.250490 2.320131 2.215670 2.285311 2.285311 2.250490 2.076388 
##       76       78       79       80       81       82       83       84 
## 2.320131 2.563875 2.946901 2.842440 2.912081 2.842440 2.737978 2.772799 
##       85       86       87       88       89       90       91       92 
## 2.807619 2.807619 2.598696 2.494234 2.354952 2.285311 2.180849 2.111208 
##       93       94       95       96       97       98       99      100 
## 2.389773 2.215670 2.250490 2.041567 2.041567 2.285311 1.902285 2.215670 
##      101      102      103      104      105      106      107      108 
## 1.937106 2.424593 1.797823 2.041567 2.389773 1.971926 1.763003 2.006747 
##      109      110      111 
## 2.180849 2.215670 1.937106
regreso_unidades_originales<-(estimados^(1/Lambda$start)); regreso_unidades_originales  # back to the original units
##         1         2         3         4         5         6         7         8 
## 14.846358 21.437584 24.619602  9.925177 12.696033  7.640809  9.112833 17.263481 
##         9        10        11        12        13        14        15        16 
## 13.739178 16.020223  6.976581 11.714330 13.739178  6.357270 16.020223  9.925177 
##        17        18        19        20        21        22        24        25 
##  7.640809 22.986681  9.112833  9.112833 14.846358 38.662114 28.149231 41.081013 
##        26        27        28        29        30        31        32        33 
## 64.832286 54.952386 41.081013 30.052249 21.437584 12.696033 22.986681 28.149231 
##        35        36        37        38        39        40        41        42 
## 49.033707 38.662114 43.613494 43.613494 58.108180 72.133286 72.133286 61.400152 
##        43        44        45        46        47        48        49        50 
## 22.986681 38.662114 36.353246 38.662114 41.081013 46.263168 54.952386 49.033707 
##        51        52        53        54        55        56        57        58 
## 24.619602 51.928848 49.033707 41.081013 51.928848 58.108180 51.928848 43.613494 
##        59        60        61        62        63        64        66        67 
## 38.662114 38.662114 38.662114 41.081013 61.400152 64.832286 51.928848 41.081013 
##        68        69        70        71        72        73        74        75 
## 36.353246 30.052249 34.150920 28.149231 32.051711 32.051711 30.052249 21.437584 
##        76        78        79        80        81        82        83        84 
## 34.150920 51.928848 93.129565 80.044305 88.599477 80.044305 68.408628 72.133286 
##        85        86        87        88        89        90        91        92 
## 76.010433 76.010433 54.952386 46.263168 36.353246 32.051711 26.339410 22.986681 
##        93        94        95        96        97        98        99       100 
## 38.662114 28.149231 30.052249 19.969305 19.969305 32.051711 14.846358 28.149231 
##       101       102       103       104       105       106       107       108 
## 16.020223 41.081013 11.714330 19.969305 38.662114 17.263481 10.791537 18.578899 
##       109       110       111 
## 26.339410 28.149231 16.020223
df<-data.frame(estimados, regreso_unidades_originales); df
##     estimados regreso_unidades_originales
## 1    1.902285                   14.846358
## 2    2.076388                   21.437584
## 3    2.146029                   24.619602
## 4    1.728182                    9.925177
## 5    1.832644                   12.696033
## 6    1.623721                    7.640809
## 7    1.693362                    9.112833
## 8    1.971926                   17.263481
## 9    1.867464                   13.739178
## 10   1.937106                   16.020223
## 11   1.588900                    6.976581
## 12   1.797823                   11.714330
## 13   1.867464                   13.739178
## 14   1.554080                    6.357270
## 15   1.937106                   16.020223
## 16   1.728182                    9.925177
## 17   1.623721                    7.640809
## 18   2.111208                   22.986681
## 19   1.693362                    9.112833
## 20   1.693362                    9.112833
## 21   1.902285                   14.846358
## 22   2.389773                   38.662114
## 24   2.215670                   28.149231
## 25   2.424593                   41.081013
## 26   2.703157                   64.832286
## 27   2.598696                   54.952386
## 28   2.424593                   41.081013
## 29   2.250490                   30.052249
## 30   2.076388                   21.437584
## 31   1.832644                   12.696033
## 32   2.111208                   22.986681
## 33   2.215670                   28.149231
## 35   2.529055                   49.033707
## 36   2.389773                   38.662114
## 37   2.459414                   43.613494
## 38   2.459414                   43.613494
## 39   2.633516                   58.108180
## 40   2.772799                   72.133286
## 41   2.772799                   72.133286
## 42   2.668337                   61.400152
## 43   2.111208                   22.986681
## 44   2.389773                   38.662114
## 45   2.354952                   36.353246
## 46   2.389773                   38.662114
## 47   2.424593                   41.081013
## 48   2.494234                   46.263168
## 49   2.598696                   54.952386
## 50   2.529055                   49.033707
## 51   2.146029                   24.619602
## 52   2.563875                   51.928848
## 53   2.529055                   49.033707
## 54   2.424593                   41.081013
## 55   2.563875                   51.928848
## 56   2.633516                   58.108180
## 57   2.563875                   51.928848
## 58   2.459414                   43.613494
## 59   2.389773                   38.662114
## 60   2.389773                   38.662114
## 61   2.389773                   38.662114
## 62   2.424593                   41.081013
## 63   2.668337                   61.400152
## 64   2.703157                   64.832286
## 66   2.563875                   51.928848
## 67   2.424593                   41.081013
## 68   2.354952                   36.353246
## 69   2.250490                   30.052249
## 70   2.320131                   34.150920
## 71   2.215670                   28.149231
## 72   2.285311                   32.051711
## 73   2.285311                   32.051711
## 74   2.250490                   30.052249
## 75   2.076388                   21.437584
## 76   2.320131                   34.150920
## 78   2.563875                   51.928848
## 79   2.946901                   93.129565
## 80   2.842440                   80.044305
## 81   2.912081                   88.599477
## 82   2.842440                   80.044305
## 83   2.737978                   68.408628
## 84   2.772799                   72.133286
## 85   2.807619                   76.010433
## 86   2.807619                   76.010433
## 87   2.598696                   54.952386
## 88   2.494234                   46.263168
## 89   2.354952                   36.353246
## 90   2.285311                   32.051711
## 91   2.180849                   26.339410
## 92   2.111208                   22.986681
## 93   2.389773                   38.662114
## 94   2.215670                   28.149231
## 95   2.250490                   30.052249
## 96   2.041567                   19.969305
## 97   2.041567                   19.969305
## 98   2.285311                   32.051711
## 99   1.902285                   14.846358
## 100  2.215670                   28.149231
## 101  1.937106                   16.020223
## 102  2.424593                   41.081013
## 103  1.797823                   11.714330
## 104  2.041567                   19.969305
## 105  2.389773                   38.662114
## 106  1.971926                   17.263481
## 107  1.763003                   10.791537
## 108  2.006747                   18.578899
## 109  2.180849                   26.339410
## 110  2.215670                   28.149231
## 111  1.937106                   16.020223
predicion_uno<-(-0.430691+0.034821*80); predicion_uno # prediction for a single value (Temp = 80) with the final model
## [1] 2.354989
de_regreso<-(predicion_uno^(1/Lambda$start)); de_regreso  # back to the original units
## [1] 36.35564
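
The hand calculation above can also be done with `predict()`, which avoids retyping rounded coefficients and gives interval limits as well. This is a sketch under stated assumptions: the data are rebuilt with `na.omit()`, the same positional row removals as the script are applied, lambda is fixed at the printed value 0.238367, and `nuevo` is a hypothetical set of temperatures.

```r
Datos <- na.omit(airquality)
rownames(Datos) <- NULL
Datos_2 <- Datos[-c(23, 34, 77), ][-63, ]    # same row removals as the script
lambda  <- 0.238367                           # value printed earlier
Datos_2$Ozone_Box <- Datos_2$Ozone^lambda
mod4 <- lm(Ozone_Box ~ Temp, data = Datos_2)

nuevo <- data.frame(Temp = c(70, 80, 90))     # hypothetical temperatures
pred  <- predict(mod4, newdata = nuevo, interval = "prediction")
back  <- pred^(1 / lambda)                    # back-transform fit and limits
cbind(nuevo, round(back, 2))
```

The back-transformed fit is on the ozone scale but is not the conditional mean of ozone, because the power transform is nonlinear; it is better read as a typical (median-like) value.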

Model report

library(report)
## Warning: package 'report' was built under R version 4.1.3
report(mod4)
## We fitted a linear model (estimated using OLS) to predict Ozone_Box with Temp
## (formula: Ozone_Box ~ Temp). The model explains a statistically significant and
## substantial proportion of variance (R2 = 0.61, F(1, 105) = 161.86, p < .001,
## adj. R2 = 0.60). The model's intercept, corresponding to Temp = 0, is at -0.43
## (95% CI [-0.85, -6.49e-03], t(105) = -2.01, p = 0.047). Within this model:
## 
##   - The effect of Temp is statistically significant and positive (beta = 0.03,
## 95% CI [0.03, 0.04], t(105) = 12.72, p < .001; Std. beta = 0.78, 95% CI [0.66,
## 0.90])
## 
## Standardized parameters were obtained by fitting the model on a standardized
## version of the dataset. 95% Confidence Intervals (CIs) and p-values were
## computed using a Wald t-distribution approximation.
report_model(mod4)
## linear model (estimated using OLS) to predict Ozone_Box with Temp (formula: Ozone_Box ~ Temp)
report_table(mod4)
## Parameter   | Coefficient |         95% CI | t(105) |      p | Std. Coef. | Std. Coef. 95% CI |   Fit
## -----------------------------------------------------------------------------------------------------
## (Intercept) |       -0.43 | [-0.85, -0.01] |  -2.01 | 0.047  |  -3.11e-16 |     [-0.12, 0.12] |      
## Temp        |        0.03 | [ 0.03,  0.04] |  12.72 | < .001 |       0.78 |     [ 0.66, 0.90] |      
##             |             |                |        |        |            |                   |      
## AIC         |             |                |        |        |            |                   | 28.03
## BIC         |             |                |        |        |            |                   | 36.05
## R2          |             |                |        |        |            |                   |  0.61
## R2 (adj.)   |             |                |        |        |            |                   |  0.60
## Sigma       |             |                |        |        |            |                   |  0.27