Estimación de parámetros de una población normal con LaplacesDemon
Introducción
LaplacesDemon is an open-source statistical package that is intended to provide a complete environment for Bayesian inference. LaplacesDemon has been used in numerous fields. The user writes their own model specification function and selects a numerical approximation algorithm to update their Bayesian model. Some numerical approximation families of algorithms include Laplace’s method (Laplace approximation), numerical integration (iterative quadrature), Markov chain Monte Carlo (MCMC), and Variational Bayes.
For more information please consult the webpage.
Problema
El objetivo es estimar la media y la desviación de una población normal usando una muestra aleatoria e información a priori de los parámetros.
Los datos
Vamos a simular unos datos de una población con parámetros conocidos. El parámetro \(\tau\) se llama parámetro precisión y se define como \(\tau=1/\sigma\).
true_mean <- 15
true_sd <- 7
true_tau <- 1 / true_sd
set.seed(666) # Semilla del diablo
y <- rnorm(n=100, mean=true_mean, sd=true_sd)
hist(y, freq=FALSE)Distribuciones a priori
Vamos a usar las siguientes dos a priori para \(\mu\) y \(\tau\) respectivamente.
# Hyperparameters for mu
mu0 <- 0
sd0 <- 6
curve(dnorm(x, mean=mu0, sd=sd0), from=-20, to=20, las=1, lwd=3,
main='Prior to mu', ylab='Density', xla='mu')# Hyperparameters for tau
df0 <- 1
scale0 <- 1
library(LaplacesDemon)
curve(dinvchisq(x, df=df0, scale=scale0), from=0.01, to=15, las=1, lwd=3,
xlim=c(-3, 15),
main='Prior to tau', ylab='Density', xlab='tau')Definiendo el modelo
# Model function
Model <- function(parm, Data) {
mu <- parm[1]
tau <- exp(parm[2])
loglik <- sum(dnorm(Data$y, mean=mu, sd=1/tau, log=T))
postlik <- loglik + dnorm(mu, mean=mu0, sd=sd0, log = T)
postlik <- postlik + dinvchisq(tau, df=df0, scale=scale0, log=T)
Modelout<-list(LP=postlik,
Dev= -2 * loglik,
Monitor=c(postlik, loglik),
yhat=rnorm(n=length(Data$y), mean=mu, sd=1/tau),
parm=parm)
return(Modelout)
}
# Data object
MyData <- list(N = length(y),
mon.names = c("postlik", "loglik"),
parm.names = c("mu", "tau"),
y=y)
# Initial values for the chain
Initial.Values <- c(0, 1)Corriendo el modelo
library(LaplacesDemon)
N <- 10000
fit <- LaplacesDemon(Model=Model, Data = MyData,
Iterations = N, Status=1000,
Algorithm = "HARM", Thinning = 1,
Initial.Values=Initial.Values)##
## Laplace's Demon was called on Thu Mar 05 11:22:09 2020
##
## Performing initial checks...
## Algorithm: Hit-And-Run Metropolis
##
## Laplace's Demon is beginning to update...
## Iteration: 1000Iteration: 1000, Proposal: Multivariate, LP: -347
## Iteration: 2000Iteration: 2000, Proposal: Multivariate, LP: -347.2
## Iteration: 3000Iteration: 3000, Proposal: Multivariate, LP: -346.1
## Iteration: 4000Iteration: 4000, Proposal: Multivariate, LP: -346.6
## Iteration: 5000Iteration: 5000, Proposal: Multivariate, LP: -348.1
## Iteration: 6000Iteration: 6000, Proposal: Multivariate, LP: -346.6
## Iteration: 7000Iteration: 7000, Proposal: Multivariate, LP: -347.5
## Iteration: 8000Iteration: 8000, Proposal: Multivariate, LP: -346.1
## Iteration: 9000Iteration: 9000, Proposal: Multivariate, LP: -346.1
## Iteration: 10000Iteration: 10000, Proposal: Multivariate, LP: -348.3
##
## Assessing Stationarity
## Assessing Thinning and ESS
## Creating Summaries
## Estimating Log of the Marginal Likelihood
## Creating Output
##
## Laplace's Demon has finished.
Para ver un resumen del modelo usamos la siguiente instrucción. En esa salida aparecerá información valiosa.
##
## #############################################################
## # Consort with Laplace's Demon #
## #############################################################
## Call:
## LaplacesDemon(Model = Model, Data = MyData, Initial.Values = Initial.Values,
## Iterations = N, Status = 1000, Thinning = 1, Algorithm = "HARM")
##
## Acceptance Rate: 0.2399
## Algorithm: Hit-And-Run Metropolis
## Covariance Matrix: (NOT SHOWN HERE; diagonal shown instead)
## mu tau
## 3.37584682 0.02120749
##
## Covariance (Diagonal) History: (NOT SHOWN HERE)
## Deviance Information Criterion (DIC):
## All Stationary
## Dbar 824.4 824.4
## pD 10976408.5 10976408.5
## DIC 10977232.9 10977232.9
## Initial Values:
## [1] 0 1
##
## Iterations: 10000
## Log(Marginal Likelihood): NA
## Minutes of run-time: 0.01
## Model: (NOT SHOWN HERE)
## Monitor: (NOT SHOWN HERE)
## Parameters (Number of): 2
## Posterior1: (NOT SHOWN HERE)
## Posterior2: (NOT SHOWN HERE)
## Recommended Burn-In of Thinned Samples: 0
## Recommended Burn-In of Un-thinned Samples: 0
## Recommended Thinning: 40
## Specs: (NOT SHOWN HERE)
## Status is displayed every 1000 iterations
## Summary1: (SHOWN BELOW)
## Summary2: (SHOWN BELOW)
## Thinned Samples: 10000
## Thinning: 1
##
##
## Summary of All Samples
## Mean SD MCSE ESS LB
## mu 14.020045 1.8320830 0.2147966 35.77964 11.122464
## tau -1.975999 0.1425621 0.0097089 406.18349 -2.155096
## Deviance 824.400336 4685.3833422 195.9012293 1126.65157 677.808876
## postlik -419.280386 2342.6295956 97.9511875 1126.84331 -355.977930
## loglik -412.200168 2342.6916711 97.9506147 1126.65157 -349.919435
## Median UB
## mu 14.249027 15.746808
## tau -1.967926 -1.834977
## Deviance 679.262259 699.838869
## postlik -346.738514 -346.045123
## loglik -339.631129 -338.904438
##
##
## Summary of Stationary Samples
## Mean SD MCSE ESS LB
## mu 14.020045 1.8320830 0.2147966 35.77964 11.122464
## tau -1.975999 0.1425621 0.0097089 406.18349 -2.155096
## Deviance 824.400336 4685.3833422 195.9012293 1126.65157 677.808876
## postlik -419.280386 2342.6295956 97.9511875 1126.84331 -355.977930
## loglik -412.200168 2342.6916711 97.9506147 1126.65157 -349.919435
## Median UB
## mu 14.249027 15.746808
## tau -1.967926 -1.834977
## Deviance 679.262259 699.838869
## postlik -346.738514 -346.045123
## loglik -339.631129 -338.904438
##
## Demonic Suggestion
##
## Due to the combination of the following conditions,
##
## 1. Hit-And-Run Metropolis
## 2. The acceptance rate (0.2399) is within the interval [0.15,0.5].
## 3. At least one target MCSE is >= 6.27% of its marginal posterior
## standard deviation.
## 4. At least one target distribution has an effective sample size
## (ESS) less than 100. The worst mixing chain is: mu (ESS=35.77964).
## 5. Each target distribution became stationary by
## 1 iteration.
##
## Laplace's Demon has not been appeased, and suggests
## copy/pasting the following R code into the R console,
## and running it.
##
## Initial.Values <- as.initial.values(fit)
## fit <- LaplacesDemon(Model, Data=MyData, Initial.Values,
## Covar=NULL, Iterations=4e+05, Status=632, Thinning=40,
## Algorithm="HARM", Specs=list(alpha.star=NA, B=NULL)
##
## Laplace's Demon is finished consorting.
Como se obtuvo el mensaje Laplace’s Demon has not been appeased, vamos a usar la __Demonic Suggestion_ anterior y vamos a correr el modelo nuevamente con las condiciones sugeridas. El valor de Status se cambió a 100000 para no llenar la pantalla de tantos resultados. En este nuevo ajuste se usó Thinning=40 para hacer un adelgazamiento de la cadena, como Iterations=4e+05, la longitud de la cadena resultante será 4e+05/40=10000.
Initial.Values <- as.initial.values(fit)
fit <- LaplacesDemon(Model, Data=MyData, Initial.Values,
Covar=NULL, Iterations=4e+05, Status=100000, Thinning=40,
Algorithm="HARM")##
## Laplace's Demon was called on Thu Mar 05 11:22:10 2020
##
## Performing initial checks...
## Algorithm: Hit-And-Run Metropolis
##
## Laplace's Demon is beginning to update...
## Iteration: 100000Iteration: 100000, Proposal: Multivariate, LP: -346.4
## Iteration: 200000Iteration: 200000, Proposal: Multivariate, LP: -346.2
## Iteration: 300000Iteration: 300000, Proposal: Multivariate, LP: -346.3
## Iteration: 400000Iteration: 400000, Proposal: Multivariate, LP: -348.7
##
## Assessing Stationarity
## Assessing Thinning and ESS
## Creating Summaries
## Estimating Log of the Marginal Likelihood
## Creating Output
##
## Laplace's Demon has finished.
Para ver un resumen del modelo usamos la siguiente instrucción. En esa salida aparecerá información valiosa.
##
## #############################################################
## # Consort with Laplace's Demon #
## #############################################################
## Call:
## LaplacesDemon(Model = Model, Data = MyData, Initial.Values = Initial.Values,
## Covar = NULL, Iterations = 4e+05, Status = 1e+05, Thinning = 40,
## Algorithm = "HARM")
##
## Acceptance Rate: 0.23782
## Algorithm: Hit-And-Run Metropolis
## Covariance Matrix: (NOT SHOWN HERE; diagonal shown instead)
## mu tau
## 0.510509490 0.004836395
##
## Covariance (Diagonal) History: (NOT SHOWN HERE)
## Deviance Information Criterion (DIC):
## All Stationary
## Dbar 679.792 679.792
## pD 2.029 2.029
## DIC 681.820 681.820
## Initial Values:
## [1] 12.929896 -1.925251
##
## Iterations: 4e+05
## Log(Marginal Likelihood): -340.0374
## Minutes of run-time: 0.32
## Model: (NOT SHOWN HERE)
## Monitor: (NOT SHOWN HERE)
## Parameters (Number of): 2
## Posterior1: (NOT SHOWN HERE)
## Posterior2: (NOT SHOWN HERE)
## Recommended Burn-In of Thinned Samples: 0
## Recommended Burn-In of Un-thinned Samples: 0
## Recommended Thinning: 120
## Specs: (NOT SHOWN HERE)
## Status is displayed every 1e+05 iterations
## Summary1: (SHOWN BELOW)
## Summary2: (SHOWN BELOW)
## Thinned Samples: 10000
## Thinning: 40
##
##
## Summary of All Samples
## Mean SD MCSE ESS LB Median
## mu 14.327347 0.71439853 0.013566785 4659.500 12.957601 14.326115
## tau -1.968282 0.06954634 0.000685856 10000.000 -2.109866 -1.966135
## Deviance 679.791613 2.01430963 0.026078445 7547.505 677.806805 679.174719
## postlik -347.019001 0.99640045 0.012435297 7812.426 -349.624913 -346.707303
## loglik -339.895806 1.00715481 0.013039223 7547.505 -342.580604 -339.587360
## UB
## mu 15.751521
## tau -1.838438
## Deviance 685.161208
## postlik -346.046106
## loglik -338.903402
##
##
## Summary of Stationary Samples
## Mean SD MCSE ESS LB Median
## mu 14.327347 0.71439853 0.013566785 4659.500 12.957601 14.326115
## tau -1.968282 0.06954634 0.000685856 10000.000 -2.109866 -1.966135
## Deviance 679.791613 2.01430963 0.026078445 7547.505 677.806805 679.174719
## postlik -347.019001 0.99640045 0.012435297 7812.426 -349.624913 -346.707303
## loglik -339.895806 1.00715481 0.013039223 7547.505 -342.580604 -339.587360
## UB
## mu 15.751521
## tau -1.838438
## Deviance 685.161208
## postlik -346.046106
## loglik -338.903402
##
## Demonic Suggestion
##
## Due to the combination of the following conditions,
##
## 1. Hit-And-Run Metropolis
## 2. The acceptance rate (0.237825) is within the interval [0.15,0.5].
## 3. Each target MCSE is < 6.27% of its marginal posterior
## standard deviation.
## 4. Each target distribution has an effective sample size (ESS)
## of at least 100.
## 5. Each target distribution became stationary by
## 1 iteration.
##
## Laplace's Demon has been appeased, and suggests
## the marginal posterior samples should be plotted
## and subjected to any other MCMC diagnostic deemed
## fit before using these samples for inference.
##
## Laplace's Demon is finished consorting.
Como se obtuvo el mensaje Laplace’s Demon has been appeased podemos continuar.
Medias de las distribuciones a posteriori
Aquí se comparan los resultados de las medias de las distribuciones a posteriori con los valores reales.
res <- fit$Summary1
post_mean_mu <- res[1, 1]
post_mean_tau <- res[2, 1]
post_mean_sigma <- 1/exp(post_mean_tau)
w <- cbind(c(true_mean, true_sd, true_tau),
c(post_mean_mu, post_mean_sigma, post_mean_tau))
colnames(w) <- c('True', 'Posterior mean')
rownames(w) <- c('media', 'desviacion', 'tau')
w## True Posterior mean
## media 15.0000000 14.327347
## desviacion 7.0000000 7.158369
## tau 0.1428571 -1.968282
Algunos gráficos
Vamos a dibujar la evolución de los parámetros.
De la figura anterior no se observa ninguna tendencia, los kernel se ven normales y los valores de ACF son bajo, todo esto significa que tenemos distribuciones a posterior estables.
Ahora los gráficos de oruga para los cuantiles (2.5%, 50%, and 97.5%) de los parámetros.
Para obtener la distancia de Hellinger entre las cadenas se usa el código de abajo. Distancias mayores a 0.5 indican que la cadena es no estacionaria y que el proceso aún no converge. En la figura de abajo predominan los colores negros lo que indica que hay poca distancia entre las cadenas.