Generalized Linear Models

Derek Corcoran (derek.corcoran.barrios@gmail.com)
01/08, 2018

Assumptions

GLM

  • Independence of each data point
  • Correct distribution of the residuals
  • Correct specification of the variance structure

LM

  • Independence of each data point
  • Normally distributed residuals
  • Homoscedasticity

Continuous and discrete variables

Plant  Type    Treatment   conc  uptake
Qn1    Quebec  nonchilled  95    16.0
Qn1    Quebec  nonchilled  175   30.4
Qn1    Quebec  nonchilled  250   34.8
Qn1    Quebec  nonchilled  350   37.2
Qn1    Quebec  nonchilled  500   35.3
Qn1    Quebec  nonchilled  675   39.2

Continuous and discrete variables

term         estimate    std.error  statistic  p.value
(Intercept)  19.5002898  1.8530800  10.523177  0.0e+00
conc         0.0177306   0.0035289  5.024461   2.9e-06

null.deviance  df.null  logLik    AIC      BIC       deviance  df.residual
9706.976       83       -307.409  620.818  628.1104  7421.982  82

\( R^2 \): 0.2355164
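As a small illustration of how to read the coefficient table above, the following Python sketch plugs the reported intercept and `conc` slope into the straight-line equation. The function name `predict_uptake` is hypothetical and the slides themselves appear to be produced with R; only the two coefficients come from the table.

```python
# Coefficients copied from the coefficient table above (uptake modelled on conc).
INTERCEPT = 19.5002898
SLOPE_CONC = 0.0177306


def predict_uptake(conc: float) -> float:
    """Predicted uptake under the straight-line fit (hypothetical helper)."""
    return INTERCEPT + SLOPE_CONC * conc


# Predictions at a few of the concentrations listed in the data table.
for conc in (95, 350, 675):
    print(conc, round(predict_uptake(conc), 2))
```

Each extra unit of `conc` adds the slope to the prediction, which is why the fitted line cannot follow a response that levels off at high concentrations.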

Continuous and discrete variables

term          estimate    std.error  statistic  p.value
(Intercept)   -22.157173  7.47540    -2.964012  0.0039742
I(log(conc))  8.483878    1.27403    6.659086   0.0000000

null.deviance  df.null  logLik     AIC       BIC      deviance  df.residual
9706.976       83       -300.5258  607.0516  614.344  6300.066  82

\( R^2 \): 0.3511533
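The AIC values reported for the `conc` and `log(conc)` fits can be checked by hand from the log-likelihoods. The sketch below uses the standard definition AIC = 2k - 2 logLik; taking k = 3 (intercept, slope, and residual variance) is an assumption that happens to reproduce the reported numbers.

```python
def aic(log_lik: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_lik


# logLik values copied from the two summary tables; k = 3 is an assumption.
aic_linear = aic(-307.409, 3)      # model with conc
aic_log = aic(-300.5258, 3)        # model with log(conc)

print(round(aic_linear, 4), round(aic_log, 4))
print("difference:", round(aic_linear - aic_log, 4))
```

The model with the lower AIC, here the one using `log(conc)`, is the preferred one under this criterion.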

Continuous and discrete variables

term                              estimate    std.error  statistic  p.value
(Intercept)                       -14.036935  4.0767611  -3.443159  0.0009225
I(log(conc))                      8.483878    0.6783465  12.506701  0.0000000
TypeMississippi                   -9.380952   1.4402665  -6.513345  0.0000000
Treatmentchilled                  -3.580952   1.4402665  -2.486312  0.0150168
TypeMississippi:Treatmentchilled  -6.557143   2.0368444  -3.219266  0.0018656

null.deviance  df.null  logLik     AIC       BIC       deviance  df.residual
9706.976       83       -246.0167  504.0333  518.6182  1720.688  79

\( R^2 \): 0.823154
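To see how the discrete terms and their interaction enter the prediction, the sketch below codes `Type` and `Treatment` as 0/1 indicators, with Quebec and nonchilled as the reference levels (as the coefficient names suggest). The coefficients are copied from the table above; the function and dictionary names are hypothetical.

```python
import math

# Coefficients copied from the coefficient table above.
COEF = {
    "intercept": -14.036935,
    "log_conc": 8.483878,
    "mississippi": -9.380952,
    "chilled": -3.580952,
    "mississippi_chilled": -6.557143,
}


def predict_uptake(conc: float, plant_type: str, treatment: str) -> float:
    """Predicted uptake; Quebec and nonchilled are the reference levels."""
    is_mississippi = 1.0 if plant_type == "Mississippi" else 0.0
    is_chilled = 1.0 if treatment == "chilled" else 0.0
    return (
        COEF["intercept"]
        + COEF["log_conc"] * math.log(conc)
        + COEF["mississippi"] * is_mississippi
        + COEF["chilled"] * is_chilled
        # The interaction term only applies when both indicators are 1.
        + COEF["mississippi_chilled"] * is_mississippi * is_chilled
    )


for plant_type in ("Quebec", "Mississippi"):
    for treatment in ("nonchilled", "chilled"):
        print(plant_type, treatment, round(predict_uptake(500, plant_type, treatment), 2))
```

Because of the interaction, chilling lowers the predicted uptake more for Mississippi plants than for Quebec plants at any given concentration.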

Distributions

[Figure: distributions]

Error structure

  • Set through the family = argument:
  • gaussian (continuous dependent variable)
  • binomial (dependent variable is 0 or 1)
  • poisson (dependent variable is a count: 0, 1, 2, 3, ...)
  • Gamma (continuous, strictly positive dependent variable)
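Choosing a family also fixes how the variance is assumed to change with the mean, which is what the "variance structure" assumption refers to. The sketch below lists the standard variance functions for the four families named above; the dictionary name is hypothetical, and dispersion parameters are left out for brevity.

```python
# Standard variance functions V(mu) for each family (dispersion omitted).
VARIANCE_FUNCTIONS = {
    "gaussian": lambda mu: 1.0,              # constant variance
    "binomial": lambda mu: mu * (1.0 - mu),  # largest at mu = 0.5
    "poisson": lambda mu: mu,                # variance equals the mean
    "Gamma": lambda mu: mu ** 2,             # variance grows with mean squared
}

for family, variance in VARIANCE_FUNCTIONS.items():
    print(family, variance(0.5))
```

Only the gaussian family keeps the variance constant across the range of the mean, which matches the homoscedasticity assumption of the ordinary linear model.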

Generalized linear model (family: binomial)

Survived  Pclass  Sex     Age  Fare     Cabin  Embarked
0         3       male    22   7.2500   NA     S
1         3       female  26   7.9250   NA     S
1         1       female  35   53.1000  C123   S
0         3       male    35   8.0500   NA     S
0         1       male    54   51.8625  E46    S
0         3       male    2    21.0750  NA     S

Generalized linear model (family: binomial)

term         estimate    std.error  statistic   p.value
(Intercept)  0.6165752   0.0333236  18.502645   0.00e+00
Fare         0.0018864   0.0004542  4.152808    3.73e-05
Sexmale      -0.4829290  0.0350604  -13.774197  0.00e+00

null.deviance  df.null  logLik     AIC       BIC       deviance  df.residual
143.8804       643      -327.3111  662.6221  680.4929  104.2004  641

\( R^2 \): 0.3737256

Generalized linear model (family: binomial)

term          estimate    std.error  statistic   p.value
(Intercept)   0.3418628   0.2187141  1.5630579   0.1180390
Fare          0.0138686   0.0055200  2.5124456   0.0119898
Sexmale       -2.1310117  0.2700674  -7.8906661  0.0000000
Fare:Sexmale  -0.0040442  0.0066630  -0.6069597  0.5438776

null.deviance  df.null  logLik    AIC       BIC       deviance  df.residual
823.027        643      -321.812  651.6239  669.4947  643.6239  640

\( R^2 \): 0.3370364
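Assuming the table with the `Fare:Sexmale` interaction comes from a binomial fit with the logit link, its coefficients are on the log-odds scale and must be passed through the inverse logit to obtain survival probabilities. The sketch below does that with the reported estimates; the function names are hypothetical.

```python
import math

# Coefficients copied from the table with the Fare:Sexmale interaction.
COEF = {
    "intercept": 0.3418628,
    "fare": 0.0138686,
    "male": -2.1310117,
    "fare_male": -0.0040442,
}


def inv_logit(eta: float) -> float:
    """Map a log-odds value back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def survival_probability(fare: float, sex: str) -> float:
    """Predicted probability of survival; female is the reference level."""
    is_male = 1.0 if sex == "male" else 0.0
    eta = (
        COEF["intercept"]
        + COEF["fare"] * fare
        + COEF["male"] * is_male
        + COEF["fare_male"] * fare * is_male
    )
    return inv_logit(eta)


for sex in ("female", "male"):
    for fare in (0.0, 50.0):
        print(sex, fare, round(survival_probability(fare, sex), 3))
```

Unlike a straight-line fit on the 0/1 response, these predictions always stay between 0 and 1, whatever the fare.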

Link function

  • Acts on the expected value of \( Y \)
  • family gaussian, link = identity
  • family Gamma, link = inverse
  • family poisson, link = log
  • family binomial, link = logit

\[ \mathrm{logit}(p) = \log\left(\frac{p}{1-p}\right) \]

Link function

Value  Identity  Inverse     Log         Logit
-1.0   -1.0      -1.0000000  NaN         NaN
-0.8   -0.8      -1.2500000  NaN         NaN
0.1    0.1       10.0000000  -2.3025851  -2.197225
0.2    0.2       5.0000000   -1.6094379  -1.386294
0.5    0.5       2.0000000   -0.6931472  0.000000
0.8    0.8       1.2500000   -0.2231436  1.386294
1.0    1.0       1.0000000   0.0000000   Inf
2.0    2.0       0.5000000   0.6931472   NaN
2.3    2.3       0.4347826   0.8329091   NaN
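The table above can be reproduced with a few lines of Python, which also makes explicit where each link is undefined. This is a minimal sketch with hypothetical function names; values outside a link's domain return NaN to mirror the table.

```python
import math


def identity(x: float) -> float:
    return x


def inverse(x: float) -> float:
    # Undefined at x = 0; the table above does not include that value.
    return 1.0 / x


def log_link(x: float) -> float:
    if x < 0:
        return math.nan
    if x == 0:
        return -math.inf
    return math.log(x)


def logit(p: float) -> float:
    # Only defined for proportions between 0 and 1.
    if p < 0 or p > 1:
        return math.nan
    if p == 0:
        return -math.inf
    if p == 1:
        return math.inf
    return math.log(p / (1.0 - p))


for value in (-1.0, -0.8, 0.1, 0.2, 0.5, 0.8, 1.0, 2.0, 2.3):
    print(value, identity(value), inverse(value), log_link(value), logit(value))
```

The NaN entries show why each link suits its family: the log needs positive values such as counts, and the logit needs values between 0 and 1 such as probabilities.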

Fit

Pseudo \( R^2 \)
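The slides do not state which pseudo \( R^2 \) they use. One common deviance-based definition, taken here as an assumption, is 1 - deviance / null deviance. The sketch below applies it to the deviances reported for the first model of uptake on `conc`; the function name is hypothetical.

```python
def deviance_pseudo_r2(deviance: float, null_deviance: float) -> float:
    """Share of the null deviance explained by the model (one common definition)."""
    return 1.0 - deviance / null_deviance


# Deviances copied from the summary of the first model (uptake on conc).
pseudo_r2 = deviance_pseudo_r2(7421.982, 9706.976)
print(round(pseudo_r2, 4))
```

A value near 0 means the model explains little more than an intercept-only model, and a value near 1 means it explains almost all of the null deviance.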