The AMLE algorithm, introduced by [1], consists of using Approximate Bayesian Computation (ABC) methods to approximate the likelihood function in cases where the likelihood is intractable and/or too computationally expensive to evaluate, and maximising this approximated likelihood function. The basic ABC algorithm is shown below:
Simulate \({\bf\theta}^{\prime}\) from the prior distribution \(\pi(\cdot)\).
Generate \({\bf y}\) from the model \(f(\cdot\vert{\bf\theta}^{\prime})\).
Accept \({\bf\theta}^{\prime}\) with probability \(\propto K_{\varepsilon}({\bf x} \vert {\bf y})\) otherwise return to step 1.
where \(K_{\varepsilon}({\bf x}\vert {\bf y})\) is a normalised Markov kernel. The ABC algorithm produces a sample from the following approximation to the posterior distribution: \[ \widehat{\pi}_{\varepsilon}({\bf\theta}\vert{\bf x}) = \dfrac{\widehat{f}_{\varepsilon}({\bf x}\vert{\bf\theta})\pi({\bf\theta})}{\int_{\Theta}\widehat{f}_{\varepsilon}({\bf x}\vert{\bf t})\pi({\bf t})d{\bf t}}, \] where \[ \widehat{f}_{\varepsilon}({\bf x}\vert{\bf\theta}) = \int_{{\mathbb R}^{q\times n}}K_{\varepsilon}({\bf x}\vert {\bf y}) f({\bf y}\vert{\bf\theta}) d{\bf{y}} , \] is an approximation of the likelihood function. Therefore, if we use a uniform prior on a compact set that contains the Maximum Likelihood Estimator (MLE), and the Maximum a Posteriori (MAP, the maximum of the posterior distribution) coincide. Thus, it is possible to approximate the MLE by using ABC methods to generate an approximate sample from the posterior and then approximating the MAP using this sample. This can be done by using the ABC algorithm in the first step of the AMLE algorithm, as shown below:
The AMLE Algorithm
Obtain a sample \({\bf \theta}^*_{\varepsilon}=({\bf \theta}^*_{\varepsilon,1},...,{\bf \theta}^*_{\varepsilon,k})\) from \(\widehat{\pi}_{\varepsilon}({\bf \theta}\vert{\bf x})\).
Using the sample \({\bf \theta}^*_{\varepsilon}\) construct a nonparametric estimator \(\widehat{\pi}_{k,\varepsilon}({\bf \theta}\vert{\bf x})\) of the density \(\widehat{\pi}_{\varepsilon}({\bf \theta}\vert{\bf x})\).
Calculate the maximum of \(\widehat{\pi}_{k,\varepsilon}({\bf \theta}\vert{\bf x})\), \(\tilde{{\bf \theta}}_{\varepsilon}\). This is an approximation of the MLE \(\widehat{{\bf \theta}}\).
Asymptotic properties on the AMLE are studied in [1]. The following R code shows two toy examples (binomial and normal models) that illustrate the calculation of the AMLE. These examples correspond to those presented in Sections 4.1 and 4.2 of [1].
References.
Example 1: Binomial model
#########################################################################################################################################
#### Calculates the Approximate Maximum Likelihood Estimator (AMLE) of the parameter $p$ of a Binomial distribution.
#### The summary statistic is the sample mean.
#### The metric employed is the Euclidean metric.
#########################################################################################################################################
rm(list=ls())
# Simulated data
data = c(6,6,6,8,3,6,5,7,4,7,7,7,4,6,6,6,5,5,7,3,5,4,4,5,6,5,6,4,7,6)
######################################################################################################
# Input variables
######################################################################################################
#--------------------------------------------------------------------------------------------------------
# dat : Vector of observations.
# N : ABC sample size to be used for the kernel density estimation
# tol : Tolerance for the ABC algorithm
# a : Left-end point of the support of the uniform prior distribution
# b : Right-end point of the support of the uniform prior distribution
#--------------------------------------------------------------------------------------------------------
AMLE = function(dat,N,tol,a,b){
pp = rep(0,N)
j=1
while(j<N+1){
p = runif(1,a,b)
datABC = rbinom(30,10,p)
if(abs(mean(datABC)-mean(dat))<tol) {
pp[j] = p
j = j+1
}
}
pB = density(pp)
return(pB$x[which(pB$y==max(pB$y))])
}
# AMLE of the simulated data
AMLE(data,10000,0.1,0.4,0.6)
## [1] 0.5562335
Example 2: Normal model
#########################################################################################################################################
#### Calculates the Approximate Maximum Likelihood Estimator (AMLE) of the parameters $(\mu,\sigma)$ of a Normal distribution.
#### The summary statistic is the sample mean and sample variance.
#### The metric employed is the Euclidean metric in two dimensions.
#########################################################################################################################################
rm(list=ls())
# Required packages
library(KernSmooth)
## KernSmooth 2.23 loaded
## Copyright M. P. Wand 1997-2009
library(mvtnorm)
library(misc3d)
library(ks)
## Warning: package 'ks' was built under R version 3.3.2
## Warning in rgl.init(initValue, onlyNULL): RGL: unable to open X11 display
## Warning: 'rgl_init' failed, running with rgl.useNULL = TRUE
# Simulated data
data = c(-0.14858096, 0.07707150, 0.96922964, -1.43173858, -0.20649986, -0.66617827,
1.08589636, 1.08440788, 0.34425089, 0.57356652, -0.40346415, -0.96862983,
-0.80275031, 0.26647296, 2.49183268, -1.10162050, -0.78695620, -1.43644030,
0.08592752, 2.00935874, 0.93525153, 0.68942103, 1.46106992, 0.98110375,
1.25898725, -0.71190799, 0.23366910, -0.10465786, -1.08424077, 0.46294997,
0.99890695, 0.70747561, -2.08774711, -0.95524464, -1.78411916, 1.11029231,
-0.90350745, 1.11962500, 1.26074829, 1.52763765, 0.07144053, 0.71504412,
-0.73598439, -0.15623768, -0.32990114, -1.23751435, 0.99929185, 0.25147602,
0.74972926, 0.50649467, 1.29911930, -1.41630256, -0.51830105, -0.24920388,
-0.14841598, 1.42094654, 1.27497412, -0.20233274, 0.26353656, 0.48123977,
-1.11826275, -0.37355127, -0.02052075, -0.66050695, 0.63988185, -0.29413887,
0.14280533, -0.12139647, 0.35824963, -0.84682013, 0.59795021, -0.11048339,
-1.27087873, -1.92605219, -0.46258777, -0.45827509, 2.98644450, -1.24410514,
1.95508486, -1.92738850, -0.16304102, -0.19533231, 0.29192288, -1.06087870,
-1.54370706, 0.29211985, -0.08889496, 0.62322394, -1.57392493, 0.85927803,
-0.10532303, -0.50732824, -0.34994640, -1.51477602, -0.58488975, 0.64886611,
0.01561432, -0.69437539, 0.65501940, -0.61570078)
######################################################################################################
# Input variables
######################################################################################################
#--------------------------------------------------------------------------------------------------------
# dat : Vector of observations.
# N : ABC sample size to be used for the kernel density estimation
# tol : Tolerance for the ABC algorithm
# a1 : Left-end point of the support of the uniform prior distribution on $\mu$
# b1 : Right-end point of the support of the uniform prior distribution on $\mu$
# a2 : Left-end point of the support of the uniform prior distribution on $\sigma$
# b2 : Right-end point of the support of the uniform prior distribution on $\sigma$
#--------------------------------------------------------------------------------------------------------
# Metric
dist = function(x,y) return( sqrt( mean(x-y)^2 + (sqrt(var(x))-sqrt(var(y)))^2 ) )
# AMLE function
AMLE = function(dat,N,tol,a1,b1,a2,b2){
pp = qq = rep(0,N)
j=1
while(j<N+1){
p = runif(1,a1,b1)
q = runif(1,a2,b2)
datABC = rnorm(100,p,q)
if(dist(datABC,dat)<tol) {
pp[j] = p
qq[j] = q
j = j+1
}
}
pp1 = cbind(pp,qq)
H.scv=Hscv(pp1)
den = kde(pp1, H=H.scv)
ind=which(den$estimate==max(den$estimate), arr.ind=TRUE)
return(c(den$eval.points[[1]][ind[1]],den$eval.points[[2]][ind[2]]))
}
# AMLE of the simulated data (~5 minutes)
AMLE(data,5000,0.01,-0.1,0.1,0.9,1.1)
## [1] -0.0192377 1.0078863