Les données

Le modèle

Simulation des trois types des données

> set.seed(1)
> t=c(0,3,6,12,18,24,30)
> n=10
> a=4.5
> b=.5
> sig=2
> epsilon=matrix(rnorm(n*length(t),0,sig),length(t),n)
> t.mat=matrix(rep(t,n),length(t),n)
> Y=a+b*t.mat+epsilon
> Y=t(Y)
> colnames(Y)=paste("t",t,sep="")

Représentation graphique des données. D’abord on transforme les données

> ## On ajoute une colonne avec l'id des enfants
> Y=as.data.frame(Y)
> Y$id=paste("enf",1:n,sep="")
> library(reshape2)
> Ydt=melt(Y,id=8)
> colnames(Ydt)=c("Enfant","Temps","Y")
> Ydt$Enfant=factor(Ydt$Enfant,levels=Y$id)
> Ydt$t=sort(rep(t,n))
> head(Ydt)
  Enfant Temps        Y t
1   enf1    t0 3.247092 0
2   enf2    t0 5.976649 0
3   enf3    t0 6.749862 0
4   enf4    t0 6.064273 0
5   enf5    t0 3.543700 0
6   enf6    t0 3.670011 0

Graphique utilisant ggplot2

> library(ggplot2)
> gr<-ggplot(data=Ydt,aes(x=t,y=Y,colour=Enfant,group=Enfant))+geom_line()+geom_point()
> gr<-gr+theme_classic()
> gr+theme(legend.position="none")

> mu=2.5
> a1=rnorm(n,0,mu)
> a1.mat=matrix(rep(a1,length(t)),length(t),n,byrow = T)
> Yf1=(a+a1.mat)+b*t.mat+epsilon
> Yf1=t(Yf1)
> colnames(Yf1)=paste("t",t,sep="")

Représentation graphique des données. D’abord on transforme les données

> ## On ajoute une colonne avec l'id des enfants
> Yf1=as.data.frame(Yf1)
> Yf1$id=paste("enf",1:n,sep="")
> Yf1dt=melt(Yf1,id=8)
> colnames(Yf1dt)=c("Enfant","Temps","Y")
> Yf1dt$Enfant=factor(Yf1dt$Enfant,levels=Yf1$id)
> Yf1dt$t=sort(rep(t,n))
> head(Yf1dt)
  Enfant Temps         Y t
1   enf1    t0 4.4358662 0
2   enf2    t0 4.2017833 0
3   enf3    t0 8.2766777 0
4   enf4    t0 3.7290285 0
5   enf5    t0 0.4096164 0
6   enf6    t0 4.3986265 0

Graphique utilisant ggplot2

> library(ggplot2)
> gr<-ggplot(data=Yf1dt,aes(x=t,y=Y,colour=Enfant,group=Enfant))+geom_line()+geom_point()
> gr<-gr+theme_classic()
> gr+theme(legend.position="none")

> nu=1.5
> b1=rnorm(n,0,nu)
> b1.mat=matrix(rep(b1,length(t)),length(t),n,byrow = T)
> Yf2=a+(b+b1.mat)*t.mat+epsilon
> Yf2=t(Yf2)
> colnames(Yf2)=paste("t",t,sep="")

Représentation graphique des données. D’abord on transforme les données

> ## On ajoute une colonne avec l'id des enfants
> Yf2=as.data.frame(Yf2)
> Yf2$id=paste("enf",1:n,sep="")
> Yf2dt=melt(Yf2,id=8)
> colnames(Yf2dt)=c("Enfant","Temps","Y")
> Yf2dt$Enfant=factor(Yf2dt$Enfant,levels=Yf2$id)
> Yf2dt$t=sort(rep(t,n))
> head(Yf2dt)
  Enfant Temps        Y t
1   enf1    t0 3.247092 0
2   enf2    t0 5.976649 0
3   enf3    t0 6.749862 0
4   enf4    t0 6.064273 0
5   enf5    t0 3.543700 0
6   enf6    t0 3.670011 0

Graphique utilisant ggplot2

> gr<-ggplot(data=Yf2dt,aes(x=t,y=Y,colour=Enfant,group=Enfant))+geom_line()+geom_point()
> gr<-gr+theme_classic()
> gr+theme(legend.position="none")

Modélisation fréquentiste

Modèles à effets fixes.

  • Les données Ydt avec effets fixes
> m0=lm(Y~t,data=Ydt)
> summary(m0)

Call:
lm(formula = Y ~ t, data = Ydt)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.8953 -0.9993  0.0301  1.1589  4.4468 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.69221    0.36163   12.97   <2e-16 ***
t            0.50912    0.02145   23.73   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.862 on 68 degrees of freedom
Multiple R-squared:  0.8923,    Adjusted R-squared:  0.8907 
F-statistic: 563.2 on 1 and 68 DF,  p-value: < 2.2e-16
  • Les données Yf1dt avec effets mixtes, mais avec un intercept aléatoire.
> m1=lm(Y~t,data=Yf1dt)
> summary(m1)

Call:
lm(formula = Y ~ t, data = Yf1dt)

Residuals:
    Min      1Q  Median      3Q     Max 
-6.0508 -1.8704  0.1812  1.8076  5.2520 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.07287    0.50284    8.10 1.44e-11 ***
t            0.50912    0.02983   17.07  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.589 on 68 degrees of freedom
Multiple R-squared:  0.8107,    Adjusted R-squared:  0.808 
F-statistic: 291.3 on 1 and 68 DF,  p-value: < 2.2e-16
  • Les données Yf2dt avec effets mixtes, mais avec une pente aléatoire.
> m2=lm(Y~t,data=Yf2dt)
> summary(m2)

Call:
lm(formula = Y ~ t, data = Yf2dt)

Residuals:
    Min      1Q  Median      3Q     Max 
-77.699  -6.338   1.328   7.618  48.655 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)   4.6922     3.8733   1.211  0.22992   
t             0.7002     0.2298   3.047  0.00329 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 19.94 on 68 degrees of freedom
Multiple R-squared:  0.1201,    Adjusted R-squared:  0.1072 
F-statistic: 9.285 on 1 and 68 DF,  p-value: 0.003288
  • Comparaison des AIC, BIC et log-vraisemblance
> mod0=list(m0,m1,m2)
> lapply(mod0,logLik)
[[1]]
'log Lik.' -141.8329 (df=3)

[[2]]
'log Lik.' -164.9088 (df=3)

[[3]]
'log Lik.' -307.8193 (df=3)
> lapply(mod0,function(x)AIC(logLik(x)))
[[1]]
[1] 289.6658

[[2]]
[1] 335.8176

[[3]]
[1] 621.6386
> lapply(mod0,function(x)BIC(logLik(x)))
[[1]]
[1] 296.4113

[[2]]
[1] 342.5631

[[3]]
[1] 628.384
  • Rappelons que l’AIC se calcule \[\mbox{AIC}=-2*log-likelihood + k*\mbox{npar}\] où par défaut \(k=2\) et npar et le nombre de paramètres dans le modèle. Quand \(k=\log(n)\) on obtient le BIC.
> lapply(mod0,logLik)[[1]][1]
[1] -141.8329
> npar=length(coef(m0))+1
> -2*lapply(mod0,logLik)[[1]][1]+2*npar  ## AIC
[1] 289.6658
> -2*lapply(mod0,logLik)[[1]][1]+log(length(t)*n)*npar  ## BIC 
[1] 296.4113

On préfèrera le modèle avec un AIC/BIC minimal.

Modèles à effets mixtes

  • Pour les modèles fixes on a besoin du package lme4
> library(lme4)
Loading required package: Matrix
  • Syntaxe : lmer(somme des covariables fixes+(somme des covariables aléatoires|Identification))

intercept aléatoire.

  • Les données Ydt avec effets fixes
> m0=lmer(Y~t+(1|Enfant),data=Ydt)
> summary(m0)
Linear mixed model fit by REML ['lmerMod']
Formula: Y ~ t + (1 | Enfant)
   Data: Ydt

REML criterion at convergence: 290.7

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.62881 -0.53661  0.01616  0.62236  2.38799 

Random effects:
 Groups   Name        Variance Std.Dev.
 Enfant   (Intercept) 0.000    0.000   
 Residual             3.468    1.862   
Number of obs: 70, groups:  Enfant, 10

Fixed effects:
            Estimate Std. Error t value
(Intercept)  4.69221    0.36163   12.97
t            0.50912    0.02145   23.73

Correlation of Fixed Effects:
  (Intr)
t -0.788
> fixef(m0)
(Intercept)           t 
   4.692210    0.509122 
> ranef(m0)
$Enfant
      (Intercept)
enf1            0
enf2            0
enf3            0
enf4            0
enf5            0
enf6            0
enf7            0
enf8            0
enf9            0
enf10           0
  • Les données Yf1dt avec effets mixtes, mais avec un intercept aléatoire.
> m1=lmer(Y~t+(1|Enfant),data=Yf1dt)
> summary(m1)
Linear mixed model fit by REML ['lmerMod']
Formula: Y ~ t + (1 | Enfant)
   Data: Yf1dt

REML criterion at convergence: 312.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.49722 -0.41422  0.01308  0.55757  2.24427 

Random effects:
 Groups   Name        Variance Std.Dev.
 Enfant   (Intercept) 3.238    1.799   
 Residual             3.705    1.925   
Number of obs: 70, groups:  Enfant, 10

Fixed effects:
            Estimate Std. Error t value
(Intercept)  4.07287    0.68082   5.982
t            0.50912    0.02217  22.960

Correlation of Fixed Effects:
  (Intr)
t -0.433
> fixef(m1)
(Intercept)           t 
   4.072869    0.509122 
> ranef(m1)
$Enfant
      (Intercept)
enf1    1.3616738
enf2   -1.2442992
enf3    2.6415163
enf4   -2.2833895
enf5   -2.3933947
enf6    1.0308462
enf7   -0.4739326
enf8    1.0738631
enf9    0.9323798
enf10  -0.6452631
> a1
 [1]  1.188773822 -1.774866077  1.526815884 -2.335244079 -3.134083501
 [6]  0.728615589 -1.108229683  0.002763379  0.185853310 -1.473802365
  • Les données Yf2dt avec effets mixtes, mais avec une pente aléatoire.
> m2=lmer(Y~t+(1|Enfant),data=Yf2dt)
> summary(m2)
Linear mixed model fit by REML ['lmerMod']
Formula: Y ~ t + (1 | Enfant)
   Data: Yf2dt

REML criterion at convergence: 578.5

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.5598 -0.4404  0.0197  0.4680  2.4251 

Random effects:
 Groups   Name        Variance Std.Dev.
 Enfant   (Intercept) 241.0    15.52   
 Residual             174.5    13.21   
Number of obs: 70, groups:  Enfant, 10

Fixed effects:
            Estimate Std. Error t value
(Intercept)   4.6922     5.5389   0.847
t             0.7002     0.1522   4.600

Correlation of Fixed Effects:
  (Intr)
t -0.365
> fixef(m2)
(Intercept)           t 
  4.6922098   0.7001625 
> ranef(m2)
$Enfant
      (Intercept)
enf1   -12.773028
enf2    -5.006146
enf3    19.816061
enf4   -30.667872
enf5     8.181913
enf6     3.578225
enf7    16.842616
enf8    -7.225146
enf9     4.635721
enf10    2.617656
  • Comparaison des AIC, BIC et log-vraisemblance
> mod1=list(m0,m1,m2)
> lapply(mod1,logLik)
[[1]]
'log Lik.' -145.354 (df=4)

[[2]]
'log Lik.' -156.4336 (df=4)

[[3]]
'log Lik.' -289.241 (df=4)
> lapply(mod1,function(x)AIC(logLik(x)))
[[1]]
[1] 298.7079

[[2]]
[1] 320.8671

[[3]]
[1] 586.482
> lapply(mod1,function(x)BIC(logLik(x)))
[[1]]
[1] 307.7019

[[2]]
[1] 329.8611

[[3]]
[1] 595.4759

Pente aléatoire.

  • Les données Ydt avec effets fixes
> m0=lmer(Y~1+(0+t|Enfant),data=Ydt)
> summary(m0)
Linear mixed model fit by REML ['lmerMod']
Formula: Y ~ 1 + (0 + t | Enfant)
   Data: Ydt

REML criterion at convergence: 330.2

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.55517 -0.62903  0.02345  0.64373  2.06706 

Random effects:
 Groups   Name Variance Std.Dev.
 Enfant   t    0.255    0.5049  
 Residual      3.206    1.7907  
Number of obs: 70, groups:  Enfant, 10

Fixed effects:
            Estimate Std. Error t value
(Intercept)    4.803      0.346   13.88
> fixef(m0)
(Intercept) 
   4.803267 
> ranef(m0)
$Enfant
              t
enf1  0.4982985
enf2  0.4265146
enf3  0.5504132
enf4  0.4296629
enf5  0.4553365
enf6  0.4966059
enf7  0.4934435
enf8  0.5578000
enf9  0.5443637
enf10 0.5551902
  • Les données Yf1dt avec effets mixtes, mais avec un intercept aléatoire.
> m1=lmer(Y~1+(0+t|Enfant),data=Yf1dt)
> summary(m1)
Linear mixed model fit by REML ['lmerMod']
Formula: Y ~ 1 + (0 + t | Enfant)
   Data: Yf1dt

REML criterion at convergence: 341.8

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.59366 -0.56497  0.03127  0.55614  2.06821 

Random effects:
 Groups   Name Variance Std.Dev.
 Enfant   t    0.2629   0.5127  
 Residual      3.8799   1.9697  
Number of obs: 70, groups:  Enfant, 10

Fixed effects:
            Estimate Std. Error t value
(Intercept)   4.2028     0.3802   11.05
> fixef(m1)
(Intercept) 
    4.20284 
> ranef(m1)
$Enfant
              t
enf1  0.5807971
enf2  0.3715408
enf3  0.6485444
enf4  0.3486769
enf5  0.3372463
enf6  0.5577491
enf7  0.4693371
enf8  0.5851877
enf9  0.5802638
enf10 0.5140494
> a1
 [1]  1.188773822 -1.774866077  1.526815884 -2.335244079 -3.134083501
 [6]  0.728615589 -1.108229683  0.002763379  0.185853310 -1.473802365
  • Les données Yf2dt avec effets mixtes, mais avec une pente aléatoire.
> m2=lmer(Y~1+(0+t|Enfant),data=Yf2dt)
> summary(m2)
Linear mixed model fit by REML ['lmerMod']
Formula: Y ~ 1 + (0 + t | Enfant)
   Data: Yf2dt

REML criterion at convergence: 349.8

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.55302 -0.60525  0.04667  0.65375  2.05168 

Random effects:
 Groups   Name Variance Std.Dev.
 Enfant   t    1.838    1.356   
 Residual      3.202    1.789   
Number of obs: 70, groups:  Enfant, 10

Fixed effects:
            Estimate Std. Error t value
(Intercept)   4.7137     0.3472   13.57
> fixef(m2)
(Intercept) 
   4.713667 
> ranef(m2)
$Enfant
               t
enf1  -0.3470603
enf2   0.2304313
enf3   2.3231789
enf4  -1.8471634
enf5   1.3521402
enf6   1.0024830
enf7   2.0935693
enf8   0.1091450
enf9   1.1060548
enf10  0.9626953
  • Comparaison des AIC, BIC et log-vraisemblance
> mod2=list(m0,m1,m2)
> lapply(mod2,logLik)
[[1]]
'log Lik.' -165.0986 (df=3)

[[2]]
'log Lik.' -170.8811 (df=3)

[[3]]
'log Lik.' -174.899 (df=3)
> lapply(mod2,function(x)AIC(logLik(x)))
[[1]]
[1] 336.1971

[[2]]
[1] 347.7621

[[3]]
[1] 355.798
> lapply(mod2,function(x)BIC(logLik(x)))
[[1]]
[1] 342.9426

[[2]]
[1] 354.5076

[[3]]
[1] 362.5435
  • Pour la base Ydt
> mo0=list(mod0[[1]],mod1[[1]],mod2[[1]])
> lapply(mo0,logLik)
[[1]]
'log Lik.' -141.8329 (df=3)

[[2]]
'log Lik.' -145.354 (df=4)

[[3]]
'log Lik.' -165.0986 (df=3)
> lapply(mo0,function(x)AIC(logLik(x)))
[[1]]
[1] 289.6658

[[2]]
[1] 298.7079

[[3]]
[1] 336.1971
> lapply(mo0,function(x)BIC(logLik(x)))
[[1]]
[1] 296.4113

[[2]]
[1] 307.7019

[[3]]
[1] 342.9426
  • Pour la base Yf1dt
> mo1=list(mod0[[2]],mod1[[2]],mod2[[2]])
> lapply(mo1,logLik)
[[1]]
'log Lik.' -164.9088 (df=3)

[[2]]
'log Lik.' -156.4336 (df=4)

[[3]]
'log Lik.' -170.8811 (df=3)
> lapply(mo1,function(x)AIC(logLik(x)))
[[1]]
[1] 335.8176

[[2]]
[1] 320.8671

[[3]]
[1] 347.7621
> lapply(mo1,function(x)BIC(logLik(x)))
[[1]]
[1] 342.5631

[[2]]
[1] 329.8611

[[3]]
[1] 354.5076
  • Pour la base Yf2dt
> mo2=list(mod0[[3]],mod1[[3]],mod2[[3]])
> lapply(mo2,logLik)
[[1]]
'log Lik.' -307.8193 (df=3)

[[2]]
'log Lik.' -289.241 (df=4)

[[3]]
'log Lik.' -174.899 (df=3)
> lapply(mo2,function(x)AIC(logLik(x)))
[[1]]
[1] 621.6386

[[2]]
[1] 586.482

[[3]]
[1] 355.798
> lapply(mo2,function(x)BIC(logLik(x)))
[[1]]
[1] 628.384

[[2]]
[1] 595.4759

[[3]]
[1] 362.5435

Méthode Bayésienne

Modèle à effets aléatoires avec intercept aléatoire.

  • le code R pour ~OpenBUGS`
> sink('exemple_Effet_mixte1.txt')
> cat("
+     model{
+     for(i in 1:n){
+        a.alpha[i]~dnorm(0,tau.mu)
+       for(k in 1:K){
+         y[i,k]~dnorm(mu[i,k],tau)
+         mu[i,k]<-alpha+a.alpha[i]+beta*tt[k]
+       }
+     }
+     alpha~dnorm(0,.001)
+     beta~dnorm(0,.001)
+     tau~dgamma(.01,.01)
+     tau.mu~dgamma(.01,.01)
+     }
+     ",fill=T)
> sink()
> ##
> filename<-"exemple_Effet_mixte1.txt"
> 
> ## data
> donnees0=list(y=as.matrix(Y[,1:(ncol(Y)-1)]),tt=c(0,3,6,12,18,24,30),n=nrow(Y),K=ncol(Y)-1)
> donnees1=list(y=as.matrix(Yf1[,1:(ncol(Y)-1)]),tt=c(0,3,6,12,18,24,30),n=nrow(Yf1),K=ncol(Yf1)-1)
> donnees2=list(y=as.matrix(Yf2[,1:(ncol(Y)-1)]),tt=c(0,3,6,12,18,24,30),n=nrow(Yf2),K=ncol(Yf2)-1)
> 
> ###
> inits<-function(){
+   inits=list(alpha=rnorm(1,0,3),beta=rnorm(1,0,4),tau=rgamma(1,1),tau.mu=rgamma(1,1))
+ }
> 
> ### parametres
> params <- c('alpha','a.alpha','beta','tau')
> 
> ## BUGS
> library(R2OpenBUGS)
> 
> out0 <-bugs(donnees0,inits,params,filename,codaPkg=F,n.thin =1,
+                      n.iter=10000,debug=F,n.chains = 3,
+                      working.directory=getwd())
> 
> out1 <-bugs(donnees1,inits,params,filename,codaPkg=F,n.thin =1,
+            n.iter=10000,debug=F,n.chains = 3,
+            working.directory=getwd())
> out2 <-bugs(donnees2,inits,params,filename,codaPkg=F,n.thin =1,
+             n.iter=10000,debug=F,n.chains = 3,
+             working.directory=getwd())
  • Résumé des résultats
> library(R2OpenBUGS)
> out0
Inference for Bugs model at "exemple_Effet_mixte1.txt", 
Current: 3 chains, each with 10000 iterations (first 5000 discarded)
Cumulative: n.sims = 15000 iterations saved
             mean  sd  2.5%   25%   50%   75% 97.5% Rhat n.eff
alpha         4.7 0.4   3.9   4.4   4.7   4.9   5.4    1  7600
a.alpha[1]    0.0 0.3  -0.7  -0.2   0.0   0.1   0.5    1 15000
a.alpha[2]    0.0 0.3  -0.7  -0.2   0.0   0.1   0.5    1 15000
a.alpha[3]    0.1 0.3  -0.4   0.0   0.1   0.3   0.9    1  4400
a.alpha[4]   -0.1 0.3  -0.9  -0.3  -0.1   0.0   0.4    1  8300
a.alpha[5]    0.0 0.3  -0.7  -0.2   0.0   0.1   0.5    1 12000
a.alpha[6]    0.0 0.3  -0.6  -0.2   0.0   0.1   0.5    1 13000
a.alpha[7]    0.0 0.3  -0.6  -0.1   0.0   0.1   0.5    1  6100
a.alpha[8]    0.1 0.3  -0.4  -0.1   0.1   0.2   0.8    1 15000
a.alpha[9]    0.0 0.3  -0.5  -0.1   0.0   0.2   0.7    1 15000
a.alpha[10]   0.0 0.3  -0.5  -0.1   0.0   0.2   0.6    1 14000
beta          0.5 0.0   0.5   0.5   0.5   0.5   0.6    1  4500
tau           0.3 0.1   0.2   0.3   0.3   0.3   0.4    1 15000
deviance    286.8 3.0 282.4 284.7 286.2 288.3 294.0    1 12000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = Dbar-Dhat)
pD = 4.5 and DIC = 291.2
DIC is an estimate of expected predictive error (lower deviance is better).
> out1
Inference for Bugs model at "exemple_Effet_mixte1.txt", 
Current: 3 chains, each with 10000 iterations (first 5000 discarded)
Cumulative: n.sims = 15000 iterations saved
             mean  sd  2.5%   25%   50%   75% 97.5% Rhat n.eff
alpha         4.1 0.7   2.6   3.6   4.1   4.5   5.5    1  2800
a.alpha[1]    1.3 0.9  -0.4   0.7   1.3   1.9   3.2    1  2600
a.alpha[2]   -1.2 0.9  -3.1  -1.8  -1.2  -0.6   0.5    1  6100
a.alpha[3]    2.6 0.9   0.8   2.0   2.6   3.2   4.5    1  4300
a.alpha[4]   -2.2 0.9  -4.2  -2.8  -2.2  -1.6  -0.4    1  1900
a.alpha[5]   -2.3 0.9  -4.2  -3.0  -2.3  -1.7  -0.5    1  3400
a.alpha[6]    1.0 0.9  -0.7   0.4   1.0   1.6   2.9    1  2000
a.alpha[7]   -0.5 0.9  -2.3  -1.1  -0.5   0.1   1.3    1  1700
a.alpha[8]    1.1 0.9  -0.7   0.5   1.0   1.6   2.9    1  2500
a.alpha[9]    0.9 0.9  -0.8   0.3   0.9   1.5   2.8    1  2300
a.alpha[10]  -0.6 0.9  -2.5  -1.2  -0.6   0.0   1.2    1  2600
beta          0.5 0.0   0.5   0.5   0.5   0.5   0.6    1  4600
tau           0.3 0.1   0.2   0.2   0.3   0.3   0.4    1  9600
deviance    292.1 5.8 283.4 287.9 291.2 295.2 305.4    1  5100

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = Dbar-Dhat)
pD = 11.2 and DIC = 303.2
DIC is an estimate of expected predictive error (lower deviance is better).
> out2
Inference for Bugs model at "exemple_Effet_mixte1.txt", 
Current: 3 chains, each with 10000 iterations (first 5000 discarded)
Cumulative: n.sims = 15000 iterations saved
             mean  sd  2.5%   25%   50%   75% 97.5% Rhat n.eff
alpha         4.4 5.9  -7.5   0.7   4.6   8.3  15.9    1   920
a.alpha[1]  -12.5 7.1 -26.7 -17.0 -12.4  -7.9   1.3    1  1700
a.alpha[2]   -4.7 7.1 -18.8  -9.2  -4.7  -0.1   9.1    1  2400
a.alpha[3]   19.8 7.1   6.3  15.2  19.6  24.3  34.4    1  2600
a.alpha[4]  -30.2 7.3 -45.2 -34.9 -30.1 -25.4 -16.4    1  1200
a.alpha[5]    8.4 7.1  -5.4   3.7   8.3  13.0  22.7    1  2100
a.alpha[6]    3.7 7.1 -10.1  -1.0   3.7   8.2  17.9    1  1400
a.alpha[7]   16.9 7.2   3.1  12.1  16.9  21.4  31.4    1  1100
a.alpha[8]   -6.9 7.1 -21.0 -11.5  -7.0  -2.3   7.0    1  1700
a.alpha[9]    4.9 7.1  -9.1   0.3   4.9   9.5  19.1    1  1500
a.alpha[10]   2.8 7.1 -11.1  -1.9   2.9   7.4  16.8    1  1500
beta          0.7 0.2   0.4   0.6   0.7   0.8   1.0    1  5000
tau           0.0 0.0   0.0   0.0   0.0   0.0   0.0    1 15000
deviance    561.5 5.5 553.0 557.4 560.7 564.6 574.2    1 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = Dbar-Dhat)
pD = 11.5 and DIC = 573.0
DIC is an estimate of expected predictive error (lower deviance is better).
  • Comparons les pD, et DIC
> res1=rbind(c(out0$DIC,out1$DIC,out2$DIC),
+           c(out0$pD,out1$pD,out2$pD))
> colnames(res1)=c("out0","out1","out2")
> rownames(res1)=c("DIC","pD")
> res1
       out0   out1   out2
DIC 291.200 303.20 573.00
pD    4.451  11.15  11.49
  • Le choix va se faire sur le données Yf2dt car c’est le modèle avec un pD proche du nombre réel des paramètres avec un DIC le moins élévé

Modèle à effets aléatoires avec pente aléatoire.

> sink('exemple_Effet_mixte2.txt')
> cat("
+     model{
+     for(i in 1:n){
+        b.beta[i]~dnorm(0,tau.mu)
+       for(k in 1:K){
+         y[i,k]~dnorm(mu[i,k],tau)
+         mu[i,k]<-alpha+(beta+b.beta[i])*tt[k]
+       }
+     }
+     alpha~dnorm(0,.001)
+     beta~dnorm(0,.001)
+     tau~dgamma(.01,.01)
+     tau.mu~dgamma(.01,.01)
+     }
+     ",fill=T)
> sink()
> ##
> filename<-"exemple_Effet_mixte2.txt"
> 
> ### parametres
> params <- c('alpha','b.beta','beta','tau')
> 
> ## BUGS
> library(R2OpenBUGS)
> 
> out01 <-bugs(donnees0,inits,params,filename,codaPkg=F,n.thin =1,
+                      n.iter=10000,debug=F,n.chains = 3,
+                      working.directory=getwd())
> 
> out11 <-bugs(donnees1,inits,params,filename,codaPkg=F,n.thin =1,
+            n.iter=10000,debug=F,n.chains = 3,
+            working.directory=getwd())
> out21 <-bugs(donnees2,inits,params,filename,codaPkg=F,n.thin =1,
+             n.iter=10000,debug=F,n.chains = 3,
+             working.directory=getwd())
  • Résumé des résultats
> library(R2OpenBUGS)
> out01
Inference for Bugs model at "exemple_Effet_mixte2.txt", 
Current: 3 chains, each with 10000 iterations (first 5000 discarded)
Cumulative: n.sims = 15000 iterations saved
            mean  sd  2.5%   25%   50%   75% 97.5% Rhat n.eff
alpha        4.7 0.3   4.0   4.5   4.7   4.9   5.4    1  4500
b.beta[1]    0.0 0.0  -0.1   0.0   0.0   0.0   0.1    1  5100
b.beta[2]   -0.1 0.0  -0.1  -0.1  -0.1   0.0   0.0    1 15000
b.beta[3]    0.0 0.0   0.0   0.0   0.0   0.1   0.1    1 15000
b.beta[4]   -0.1 0.0  -0.1  -0.1  -0.1   0.0   0.0    1  2500
b.beta[5]    0.0 0.0  -0.1  -0.1   0.0   0.0   0.0    1  5900
b.beta[6]    0.0 0.0  -0.1   0.0   0.0   0.0   0.1    1  3600
b.beta[7]    0.0 0.0  -0.1   0.0   0.0   0.0   0.1    1  2200
b.beta[8]    0.0 0.0   0.0   0.0   0.0   0.1   0.1    1  4400
b.beta[9]    0.0 0.0   0.0   0.0   0.0   0.1   0.1    1  3900
b.beta[10]   0.0 0.0   0.0   0.0   0.0   0.1   0.1    1  3700
beta         0.5 0.0   0.4   0.5   0.5   0.5   0.6    1  1800
tau          0.3 0.1   0.2   0.3   0.3   0.4   0.4    1 15000
deviance   279.8 4.7 272.5 276.4 279.2 282.5 290.5    1 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = Dbar-Dhat)
pD = 10.1 and DIC = 289.9
DIC is an estimate of expected predictive error (lower deviance is better).
> out11
Inference for Bugs model at "exemple_Effet_mixte2.txt", 
Current: 3 chains, each with 10000 iterations (first 5000 discarded)
Cumulative: n.sims = 15000 iterations saved
            mean  sd  2.5%   25%   50%   75% 97.5% Rhat n.eff
alpha        4.1 0.4   3.3   3.8   4.1   4.3   4.8  1.0  4600
b.beta[1]    0.1 0.1   0.0   0.0   0.1   0.1   0.7  1.3    42
b.beta[2]   -0.1 0.1  -0.2  -0.1  -0.1  -0.1   0.4  1.3    43
b.beta[3]    0.2 0.1   0.0   0.1   0.1   0.2   0.7  1.3    43
b.beta[4]   -0.1 0.1  -0.2  -0.2  -0.1  -0.1   0.4  1.3    41
b.beta[5]   -0.1 0.1  -0.3  -0.2  -0.1  -0.1   0.4  1.3    42
b.beta[6]    0.1 0.1  -0.1   0.0   0.1   0.1   0.6  1.3    42
b.beta[7]    0.0 0.1  -0.1  -0.1   0.0   0.0   0.5  1.3    41
b.beta[8]    0.1 0.1   0.0   0.0   0.1   0.1   0.7  1.3    42
b.beta[9]    0.1 0.1   0.0   0.0   0.1   0.1   0.6  1.3    41
b.beta[10]   0.0 0.1  -0.1   0.0   0.0   0.1   0.6  1.3    42
beta         0.5 0.1  -0.1   0.5   0.5   0.5   0.6  1.3    39
tau          0.3 0.0   0.2   0.2   0.3   0.3   0.4  1.0 14000
deviance   294.6 5.4 286.4 290.7 293.9 297.7 307.1  1.0 10000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = Dbar-Dhat)
pD = 11.2 and DIC = 305.9
DIC is an estimate of expected predictive error (lower deviance is better).
> out21
Inference for Bugs model at "exemple_Effet_mixte2.txt", 
Current: 3 chains, each with 10000 iterations (first 5000 discarded)
Cumulative: n.sims = 15000 iterations saved
            mean  sd  2.5%   25%   50%   75% 97.5% Rhat n.eff
alpha        4.7 0.4   4.0   4.4   4.7   4.9   5.4  1.0  4500
b.beta[1]    2.4 4.8  -1.9  -1.0  -0.7   9.0   9.5 23.6     3
b.beta[2]    3.0 4.8  -1.4  -0.5  -0.1   9.5  10.1 23.5     3
b.beta[3]    5.1 4.8   0.7   1.6   2.0  11.6  12.2  7.3     3
b.beta[4]    0.9 4.8  -3.4  -2.5  -2.2   7.4   8.0 23.6     3
b.beta[5]    4.1 4.8  -0.2   0.7   1.0  10.7  11.2 23.6     3
b.beta[6]    3.7 4.8  -0.6   0.3   0.6  10.3  10.9 23.5     3
b.beta[7]    4.8 4.8   0.5   1.4   1.7  11.4  12.0  6.3     3
b.beta[8]    2.8 4.8  -1.5  -0.6  -0.3   9.4  10.0 23.6     3
b.beta[9]    3.8 4.8  -0.5   0.4   0.7  10.4  11.0 23.6     3
b.beta[10]   3.7 4.8  -0.6   0.3   0.6  10.3  10.8 23.5     3
beta        -2.7 4.8  -9.9  -9.3   0.4   0.7   1.6 23.7     3
tau          0.3 0.1   0.2   0.3   0.3   0.4   0.4  1.0 14000
deviance   281.3 5.4 272.9 277.4 280.6 284.4 293.7  1.0 11000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = Dbar-Dhat)
pD = 12.2 and DIC = 293.5
DIC is an estimate of expected predictive error (lower deviance is better).
  • On doit ici augmenter le nombre d’itérations pour les CM des cas out11 et out21. On fera 100,000 itérations
> out11_p <-bugs(donnees1,inits,params,filename,codaPkg=F,n.thin =1,
+              n.iter=100000,debug=F,n.chains = 3,
+              working.directory=getwd())
> out21_p <-bugs(donnees2,inits,params,filename,codaPkg=F,n.thin =1,
+              n.iter=100000,debug=F,n.chains = 3,
+              working.directory=getwd())
> out11_p
Inference for Bugs model at "exemple_Effet_mixte2.txt", 
Current: 3 chains, each with 1e+05 iterations (first 50000 discarded)
Cumulative: n.sims = 150000 iterations saved
            mean  sd  2.5%   25%   50%   75% 97.5% Rhat n.eff
alpha        4.1 0.4   3.3   3.8   4.1   4.3   4.8    1 15000
b.beta[1]    0.1 0.1   0.0   0.0   0.1   0.1   0.2    1 13000
b.beta[2]   -0.1 0.1  -0.2  -0.1  -0.1  -0.1   0.0    1 18000
b.beta[3]    0.1 0.1   0.0   0.1   0.1   0.2   0.2    1 13000
b.beta[4]   -0.1 0.1  -0.3  -0.2  -0.1  -0.1   0.0    1 15000
b.beta[5]   -0.1 0.1  -0.3  -0.2  -0.1  -0.1   0.0    1 11000
b.beta[6]    0.1 0.1  -0.1   0.0   0.1   0.1   0.2    1 12000
b.beta[7]    0.0 0.1  -0.1  -0.1   0.0   0.0   0.1    1  9900
b.beta[8]    0.1 0.1   0.0   0.0   0.1   0.1   0.2    1  9500
b.beta[9]    0.1 0.1   0.0   0.0   0.1   0.1   0.2    1 14000
b.beta[10]   0.0 0.1  -0.1   0.0   0.0   0.0   0.1    1 11000
beta         0.5 0.0   0.4   0.5   0.5   0.5   0.6    1  5400
tau          0.3 0.0   0.2   0.2   0.3   0.3   0.4    1 34000
deviance   294.6 5.4 286.3 290.7 293.9 297.7 307.0    1 55000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = Dbar-Dhat)
pD = 11.2 and DIC = 305.8
DIC is an estimate of expected predictive error (lower deviance is better).
> out21_p
Inference for Bugs model at "exemple_Effet_mixte2.txt", 
Current: 3 chains, each with 1e+05 iterations (first 50000 discarded)
Cumulative: n.sims = 150000 iterations saved
            mean  sd  2.5%   25%   50%   75% 97.5% Rhat n.eff
alpha        4.7 0.4   4.0   4.5   4.7   4.9   5.4    1 15000
b.beta[1]   -1.1 0.4  -2.0  -1.3  -1.1  -0.8  -0.3    1   130
b.beta[2]   -0.5 0.4  -1.4  -0.7  -0.5  -0.3   0.2    1   130
b.beta[3]    1.6 0.4   0.7   1.3   1.6   1.8   2.3    1   190
b.beta[4]   -2.6 0.4  -3.5  -2.8  -2.6  -2.3  -1.8    1   130
b.beta[5]    0.6 0.4  -0.3   0.4   0.6   0.9   1.4    1   130
b.beta[6]    0.3 0.4  -0.6   0.0   0.3   0.5   1.0    1   130
b.beta[7]    1.3 0.4   0.5   1.1   1.4   1.6   2.1    1   130
b.beta[8]   -0.6 0.4  -1.5  -0.9  -0.6  -0.4   0.1    1   130
b.beta[9]    0.4 0.4  -0.5   0.1   0.4   0.6   1.1    1   130
b.beta[10]   0.2 0.4  -0.7   0.0   0.2   0.5   1.0    1   130
beta         0.8 0.4   0.0   0.5   0.7   1.0   1.6    1   120
tau          0.3 0.1   0.2   0.3   0.3   0.3   0.4    1 33000
deviance   281.3 5.4 272.9 277.4 280.6 284.4 293.7    1 49000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = Dbar-Dhat)
pD = 12.2 and DIC = 293.5
DIC is an estimate of expected predictive error (lower deviance is better).
  • Remarquons que la taille de n.eff pour out21_p reste encore faible. Ce qui nous ramène à faire tourner un peu plus longtemps la CM.
> out21_pp <-bugs(donnees2,inits,params,filename,codaPkg=F,n.thin =10,
+                n.iter=500000,debug=F,n.chains = 3,
+                working.directory=getwd())
> out21_pp
Inference for Bugs model at "exemple_Effet_mixte2.txt", 
Current: 3 chains, each with 5e+05 iterations (first 250000 discarded), n.thin = 10
Cumulative: n.sims = 750000 iterations saved
            mean  sd  2.5%   25%   50%   75% 97.5% Rhat  n.eff
alpha        4.7 0.4   4.0   4.5   4.7   4.9   5.4    1 570000
b.beta[1]   -1.1 0.4  -1.9  -1.3  -1.1  -0.8  -0.2    1 180000
b.beta[2]   -0.5 0.4  -1.4  -0.8  -0.5  -0.2   0.4    1 180000
b.beta[3]    1.6 0.4   0.7   1.3   1.6   1.9   2.5    1 210000
b.beta[4]   -2.6 0.4  -3.4  -2.8  -2.6  -2.3  -1.7    1 220000
b.beta[5]    0.6 0.4  -0.2   0.4   0.6   0.9   1.5    1 190000
b.beta[6]    0.3 0.4  -0.6   0.0   0.3   0.6   1.2    1 200000
b.beta[7]    1.4 0.4   0.5   1.1   1.4   1.7   2.3    1 170000
b.beta[8]   -0.6 0.4  -1.5  -0.9  -0.6  -0.3   0.3    1 200000
b.beta[9]    0.4 0.4  -0.5   0.1   0.4   0.7   1.3    1 190000
b.beta[10]   0.3 0.4  -0.6   0.0   0.3   0.5   1.1    1 210000
beta         0.7 0.4  -0.2   0.4   0.7   1.0   1.6    1 200000
tau          0.3 0.1   0.2   0.3   0.3   0.3   0.4    1 750000
deviance   281.3 5.4 273.0 277.4 280.6 284.4 293.8    1 300000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = Dbar-Dhat)
pD = 12.2 and DIC = 293.5
DIC is an estimate of expected predictive error (lower deviance is better).
  • Comparons les pD, et DIC
> res2=rbind(c(out01$DIC,out11$DIC,out21$DIC),
+           c(out01$pD,out11_p$pD,out21_pp$pD))
> colnames(res2)=c("out01","out11","out21")
> rownames(res2)=c("DIC","pD")
> res2
     out01  out11  out21
DIC 289.90 305.90 293.50
pD   10.08  11.17  12.19
  • Comparons les deux modèles.
> cbind(res1,res2)
       out0   out1   out2  out01  out11  out21
DIC 291.200 303.20 573.00 289.90 305.90 293.50
pD    4.451  11.15  11.49  10.08  11.17  12.19

Prédiction.

Valeurs ajustées.

  • Estimation des valeurs ajustées par le modèle à effets mixtes et prédiction de \(y\) pour \(t=36\) pour les 10 “Enfants”.
> y2=as.matrix(Yf2[,1:(ncol(Y)-1)])
> y2=rbind(y2,rep(NA,7))
> y2=cbind(y2,rep(NA,11))
> donnees2b=list(y=y2,tt=c(0,3,6,12,18,24,30,36),n=nrow(y2),K=ncol(y2))
>                
> donnees2b               
$y
            t0         t3          t6        t12        t18        t24
 [1,] 3.247092  3.8082774   0.7107242   3.454524  -1.195040  -5.613011
 [2,] 5.976649  6.5432589   5.6726157  11.090347  10.629864  10.391089
 [3,] 6.749862 11.2115243  18.0704024  33.593238  46.950791  60.098935
 [4,] 6.064273 -0.7069206 -10.1908046 -15.684551 -27.748561 -38.659996
 [5,] 3.543700  9.5086410  15.5628748  20.985456  30.311890  37.774453
 [6,] 3.670011  6.7096968  10.3779265  18.693157  24.016012  28.157166
 [7,] 5.893927 11.8972757  15.6903871  28.220807  42.932860  56.308660
 [8,] 6.262215  5.4273841   3.5382919   5.706929   3.028308   8.415426
 [9,] 3.765557  5.5768154  11.9696085  16.890229  28.293743  29.742197
[10,] 4.556004  5.7153981  10.2814737  11.697861  23.642777  26.422063
[11,]       NA         NA          NA         NA         NA         NA
             t30   
 [1,]  -5.115235 NA
 [2,]   8.987563 NA
 [3,]  74.351870 NA
 [4,] -52.002011 NA
 [5,]  43.473459 NA
 [6,]  33.976043 NA
 [7,]  67.114800 NA
 [8,]   9.772523 NA
 [9,]  37.530325 NA
[10,]  35.864669 NA
[11,]         NA NA

$tt
[1]  0  3  6 12 18 24 30 36

$n
[1] 11

$K
[1] 8
> params <- c('y')
> out21_pred <-bugs(donnees2b,inits,params,filename,codaPkg=F,n.thin =10,
+                 n.iter=100000,debug=F,n.chains = 3,
+                 working.directory=getwd())
> out21_pred
Inference for Bugs model at "exemple_Effet_mixte2.txt", 
Current: 3 chains, each with 1e+05 iterations (first 50000 discarded), n.thin = 10
Cumulative: n.sims = 150000 iterations saved
          mean   sd  2.5%   25%   50%   75% 97.5% Rhat  n.eff
y[1,8]    -7.7  2.4 -12.3  -9.3  -7.7  -6.2  -3.1    1 150000
y[2,8]    13.0  2.4   8.4  11.5  13.0  14.6  17.7    1 150000
y[3,8]    88.4  2.3  83.7  86.8  88.4  89.9  93.0    1 150000
y[4,8]   -61.7  2.4 -66.4 -63.3 -61.7 -60.1 -57.1    1  93000
y[5,8]    53.4  2.3  48.8  51.9  53.4  55.0  58.1    1 150000
y[6,8]    40.8  2.4  36.2  39.3  40.8  42.4  45.5    1 150000
y[7,8]    80.1  2.3  75.5  78.5  80.1  81.7  84.7    1 150000
y[8,8]     8.7  2.4   4.1   7.1   8.7  10.3  13.3    1 150000
y[9,8]    44.6  2.4  40.0  43.0  44.6  46.1  49.2    1 150000
y[10,8]   39.4  2.3  34.8  37.8  39.4  41.0  44.0    1 150000
y[11,1]    4.7  1.9   1.0   3.5   4.7   5.9   8.3    1 150000
y[11,2]    6.8  4.8  -2.7   3.8   6.8   9.8  16.2    1 130000
y[11,3]    8.8  9.0  -9.1   3.2   8.9  14.4  26.6    1  78000
y[11,4]   12.9 17.7 -22.3   2.0  13.0  23.9  48.0    1  87000
y[11,5]   17.1 26.5 -35.6   0.6  17.2  33.5  69.6    1  98000
y[11,6]   21.2 35.2 -49.0  -0.6  21.3  43.1  91.2    1  86000
y[11,7]   25.3 44.0 -62.5  -2.0  25.5  52.7 112.7    1  94000
y[11,8]   29.5 52.8 -75.9  -3.3  29.7  62.3 134.4    1  81000
deviance 281.3  5.4 273.0 277.4 280.6 284.4 293.8    1 150000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = Dbar-Dhat)
pD = 12.2 and DIC = 293.5
DIC is an estimate of expected predictive error (lower deviance is better).

Représentation graphique

  • Les valeurs ajustées
> colnames(out21_pred$sims.matrix)
 [1] "y[1,8]"   "y[2,8]"   "y[3,8]"   "y[4,8]"   "y[5,8]"   "y[6,8]"  
 [7] "y[7,8]"   "y[8,8]"   "y[9,8]"   "y[10,8]"  "y[11,1]"  "y[11,2]" 
[13] "y[11,3]"  "y[11,4]"  "y[11,5]"  "y[11,6]"  "y[11,7]"  "y[11,8]" 
[19] "deviance"
  • Nommer les colonnes autrement.
> colnames(out21_pred$sims.matrix)[1:18]=c(paste("enf",1:10,sep=""),paste("t",t,sep=""),"t36")
  • Quelques statistiques
> x1=apply(out21_pred$sims.matrix[,11:17],2,function(x)c(quantile(x,probs = c(.025,.975)),mean(x)))
> x1=data.frame(x1,c("q2.5%","q97.5%","mean"))
> colnames(x1)[8]="statistic"
> dx1=melt(x1,id="statistic")
> dx1$t=sort(rep(t,3))
> colnames(dx1)[3]="Y"
> dx2=data.frame(t(x1[1:2,1:7]))
> dx2$t=t
> colnames(dx2)=c("q1","q3","t")
  • Le graphe
> gr<-ggplot()+geom_ribbon(data=dx2,aes(x=t, ymin=q1, ymax=q3), fill="grey", alpha=.4)+
+   geom_line(data=dx1,aes(x=t,y=Y,colour=statistic,group=statistic),size=3) +
+   geom_point(data=dx1,aes(x=t,y=Y,colour=statistic,group=statistic),size=6)
> gr<-gr+geom_line(data=Yf2dt,aes(x=t,y=Y,group=Enfant))+
+   geom_point(data=Yf2dt,aes(x=t,y=Y,group=Enfant))
> gr<-gr+theme_classic()
> gr+theme(legend.position="none")