Generalized Linear Mixed Models (GLMM)

Author

Fatkhurokhman Fauzi

Data Preparation

The data are the seed germination data set from Crowder (1978), available in the hglm.data package:

library(hglm.data)

Warning: package 'hglm.data' was built under R version 4.3.3

Loading required package: Matrix

Loading required package: MASS

Loading required package: sp

data("seeds")

Tidy the data using the tidyverse package

library(tidyverse)
# Recode the two factors as 0/1 indicator variables
expanded_data <- seeds %>%
  mutate(
    seed = ifelse(seed == "O75", 0, 1),        # 0 if O75, 1 if O73
    extract = ifelse(extract == "Bean", 0, 1)  # 0 if Bean, 1 if Cucumber
  )

# Display the result
print(expanded_data)

   plate seed extract  r  n
1      1    0       0 10 39
2      2    0       0 23 62
3      3    0       0 23 81
4      4    0       0 26 51
5      5    0       0 17 39
6      6    1       0  8 16
7      7    1       0 10 30
8      8    1       0  8 28
9      9    1       0 23 45
10    10    1       0  0  4
11    11    0       1  5  6
12    12    0       1 53 74
13    13    0       1 55 72
14    14    0       1 32 51
15    15    0       1 46 79
16    16    0       1 10 13
17    17    1       1  3 12
18    18    1       1 22 41
19    19    1       1 15 30
20    20    1       1 32 51
21    21    1       1  3  7
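
To get a feel for the data before modelling, the counts can be pooled within each seed-by-extract group. The sketch below retypes the counts from the printed table so that it runs without hglm.data; the object names `seeds_tab` and `group_totals` are illustrative and not part of the original analysis.

```r
# Counts copied from the printed table above (r = germinated, n = seeds on the plate)
seeds_tab <- data.frame(
  seed    = rep(c(0, 1, 0, 1), times = c(5, 5, 6, 5)),  # 0 = O75, 1 = O73
  extract = rep(c(0, 1), times = c(10, 11)),            # 0 = Bean, 1 = Cucumber
  r = c(10, 23, 23, 26, 17, 8, 10, 8, 23, 0,
        5, 53, 55, 32, 46, 10, 3, 22, 15, 32, 3),
  n = c(39, 62, 81, 51, 39, 16, 30, 28, 45, 4,
        6, 74, 72, 51, 79, 13, 12, 41, 30, 51, 7)
)

# Sum germinated seeds and plate sizes within each seed x extract group
group_totals <- aggregate(cbind(r, n) ~ seed + extract, data = seeds_tab, FUN = sum)

# Pooled germination proportion per group
group_totals$prop <- group_totals$r / group_totals$n

print(group_totals)
```

Pooling ignores plate-to-plate variation, so these proportions are only a descriptive summary.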

Logistic Regression

The logistic regression model follows Example 17.6 in Pawitan (2001), page 461.
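
Written out, the model fitted in the code that follows can be stated as below. The notation is assumed here for illustration and is not taken from Pawitan: $r_i$ is the number of germinated seeds out of $n_i$ on plate $i$, and $p_i$ is the germination probability on that plate.

$$
r_i \sim \text{Binomial}(n_i, p_i), \qquad
\log\frac{p_i}{1 - p_i} = \beta_0 + \beta_1\,\text{extract}_i + \beta_2\,\text{seed}_i + \beta_3\,(\text{extract}_i \times \text{seed}_i)
$$

With the 0/1 coding from the data preparation step, $\beta_0$ is the log-odds of germination for O75 seeds with Bean extract, and the remaining coefficients are differences in log-odds relative to that baseline.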

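# rr = number of seeds that did not germinate (n - r)
data <- data.frame(rr = seeds$n - seeds$r, expanded_data)
# Binomial logistic regression on (germinated, not germinated) counts
reglog_n <- glm(cbind(r, rr) ~ extract * seed, family = binomial(link = "logit"), data = data)
summary(reglog_n)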


Call:
glm(formula = cbind(r, rr) ~ extract * seed, family = binomial(link = "logit"), 
    data = data)

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -0.5582     0.1260  -4.429 9.46e-06 ***
extract        1.3182     0.1775   7.428 1.10e-13 ***
seed           0.1459     0.2232   0.654   0.5132    
extract:seed  -0.7781     0.3064  -2.539   0.0111 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 98.719  on 20  degrees of freedom
Residual deviance: 33.278  on 17  degrees of freedom
AIC: 117.87

Number of Fisher Scoring iterations: 4
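
A quick check on the output above: the residual deviance (33.278) is nearly twice its degrees of freedom (17), which hints at more plate-to-plate variation than the binomial model allows. The sketch below redoes that arithmetic using only the two numbers printed in the summary. Treating the residual deviance as approximately chi-squared is an assumption and may be rough for plates with few seeds.

```r
# Values copied from the summary(reglog_n) output above
resid_dev <- 33.278
resid_df  <- 17

# A ratio near 1 is expected when the binomial model fits adequately
dispersion_ratio <- resid_dev / resid_df

# Approximate goodness-of-fit p-value under a chi-squared approximation
gof_p <- pchisq(resid_dev, df = resid_df, lower.tail = FALSE)

print(c(dispersion_ratio = dispersion_ratio, gof_p = gof_p))
```

A ratio well above 1 is one common motivation for adding a plate-level random effect.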

Generalized Linear Mixed Models (GLMM)
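
The mixed model extends the logistic regression of the previous section with a random intercept for each plate. A sketch of the model, in notation assumed here for illustration ($r_i$ germinated seeds out of $n_i$ on plate $i$, germination probability $p_i$, random intercept $b_i$ with variance $\sigma^2$):

$$
r_i \mid b_i \sim \text{Binomial}(n_i, p_i), \qquad
\log\frac{p_i}{1 - p_i} = \beta_0 + \beta_1\,\text{seed}_i + \beta_2\,\text{extract}_i + \beta_3\,(\text{seed}_i \times \text{extract}_i) + b_i, \qquad
b_i \sim N(0, \sigma^2)
$$

Because every plate contributes exactly one row of data, $b_i$ acts as an observation-level random effect that absorbs extra-binomial variation between plates.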

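library(lme4)
# Convert plate to a factor in the data frame that is actually passed to glmer()
data$plate <- as.factor(data$plate)
# Logistic model with a random intercept for each plate
glmm <- glmer(cbind(r, rr) ~ seed * extract + (1 | plate), family = binomial(link = "logit"), data = data)
summary(glmm)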

Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: cbind(r, rr) ~ seed * extract + (1 | plate)
   Data: data

     AIC      BIC   logLik deviance df.resid 
   117.5    122.8    -53.8    107.5       16 

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.60042 -0.78762  0.04326  0.72641  1.24275 

Random effects:
 Groups Name        Variance Std.Dev.
 plate  (Intercept) 0.05503  0.2346  
Number of obs: 21, groups:  plate, 21

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -0.54848    0.16608  -3.302 0.000958 ***
seed          0.09743    0.27736   0.351 0.725390    
extract       1.33681    0.23618   5.660 1.51e-08 ***
seed:extract -0.81004    0.38417  -2.109 0.034986 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) seed   extrct
seed        -0.600              
extract     -0.702  0.412       
seed:extrct  0.431 -0.705 -0.617
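
The fixed effects above are on the log-odds scale. As a minimal sketch, the estimates and standard errors printed in the summary can be converted to odds ratios with approximate 95% Wald intervals. The numbers are retyped from the output, and the object names `est`, `se`, and `or_table` are illustrative.

```r
# Fixed-effect estimates and standard errors copied from summary(glmm) above
est <- c("(Intercept)" = -0.54848, "seed" = 0.09743,
         "extract" = 1.33681, "seed:extract" = -0.81004)
se  <- c(0.16608, 0.27736, 0.23618, 0.38417)

# Normal quantile for a two-sided 95% Wald interval
z <- qnorm(0.975)

# Exponentiate to move from log-odds to the odds scale
or_table <- data.frame(
  odds_ratio = exp(est),
  lower      = exp(est - z * se),
  upper      = exp(est + z * se),
  row.names  = names(est)
)

print(round(or_table, 3))
```

With the fitted object available, `exp(fixef(glmm))` should give the same point estimates. These odds ratios are conditional on the plate effect, and Wald intervals are only an approximation with 21 plates.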

References

Crowder, M. J. (1978). Beta-binomial ANOVA for proportions. Journal of the Royal Statistical Society, Series C (Applied Statistics), 27(1), 34–37.

Pawitan, Y. (2001). In All Likelihood: Statistical Modelling and Inference Using Likelihood. Oxford University Press, Oxford.

Bayu, S., Notodiputro, K. A., & Sartono, B. (2023, December). GLMM and GLMMTree for Modelling Poverty in Indonesia. In Proceedings of The International Conference on Data Science and Official Statistics (Vol. 2023, No. 1, pp. 121–131).

McCulloch, C. E., & Searle, S. R. (2001). Generalized, Linear, and Mixed Models. Wiley Series in Probability and Statistics.