\[E_f\Big[h(X)\Big] = \int_{x} h(x)f(x)dx\]
Using a sample \(X_1, X_2, ..., X_m\) from \(f(x)\), one can approximate this by the empirical average:
\[\bar{h} = \frac{1}{m} \sum_{j=1}^{m} h(x_j)\]
\(\bar{h}\) converges almost surely to \(E_f[h(X)]\) by the Strong Law of Large Numbers.
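For instance, a quick sanity check in R (the choices \(h(x)=\sin(x)\) and \(f\) the standard normal density are arbitrary, purely for illustration):

m <- 100000
x <- rnorm(m) #sample from f, here the standard normal
h_bar <- mean(sin(x)) #Monte Carlo estimate of E[sin(X)]
h_bar #E[sin(X)] = 0 by symmetry, so this should be near 0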
Problem 1: to generate samples \(x_r, r = 1,...,R\) from a given probability distribution \(P(x)\)
Problem 2: to estimate expectations of functions under this distribution, e.g.
\[\Phi = \int\phi(x)P(x)d^{N}x\]
where \(N\) refers to the dimension of the distribution.
\(X_1,X_2,...\) is a sequence of random variables, where \(X_i\) is exponentially distributed with mean \(\mu = X_{i-1}\).
We assume that the mean of \(X_1\) is a known constant, say \(x_0\).
We are interested in the distribution of \(S=\sum_{i=1}^N X_i\), where \(N\) is itself a random variable, say, a negative binomial, with parameters \(p\) and \(r\).
\[f(k;r,p)\equiv Pr(X=k)=\Big(\frac {(k+r-1)!}{k!(r-1)!} \Big)p^k(1-p)^r\]
for \(k=0,1,2,...\) where \(k\) is the number of successes, \(r\) is the number of failures and \(p\) is the probability of success.
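As a quick check of this parametrization, the pmf above matches R's built-in dnbinom with size = r and prob = 1 - p (R labels k as the count of "failures" before the size-th "success", so the roles of p and 1 - p are swapped relative to the formula above):

r <- 3; p <- 0.4; k <- 0:5 #arbitrary illustrative values
manual <- choose(k+r-1, k)*p^k*(1-p)^r #the pmf as written above
builtin <- dnbinom(k, size=r, prob=1-p) #R's parametrization
all.equal(manual, builtin) #TRUE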
SSQ = 0
Sum = 0
{BEGIN FOR-LOOP 1 (repeat BIG times)
  Mean = x_0 (given)
  S = 0
  generate N from the negative binomial with parameters r and p
  {BEGIN FOR-LOOP 2 (i = 1, ..., N)
    generate X ~ Exp(mean = Mean)
    S = S + X
    Mean = X
  END FOR-LOOP 2}
  SSQ = SSQ + (S - E[S]) * (S - E[S])
  {BEGIN IF-STATEMENT (the event of interest occurred)
    Sum = Sum + 1
  END IF-STATEMENT}
END FOR-LOOP 1}
V_Estimate = SSQ / BIG
P_Estimate = Sum / BIG
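A minimal R sketch of this pseudocode follows. Two details are assumptions, since the pseudocode leaves them open: the IF-statement is taken to count the hypothetical event \(\{S > s_0\}\), and \(E[S]\) is replaced by the sample mean of the replicates:

BIG <- 10000; x0 <- 1; r <- 3; p <- 0.4; s0 <- 2 #s0 is a hypothetical threshold
S <- replicate(BIG, {
  Mean <- x0 #given
  s <- 0
  N <- rnbinom(1, size=r, prob=1-p) #number of terms in the sum
  for(i in seq_len(N)){
    X <- rexp(1, rate=1/Mean) #X_i ~ Exp(mean = Mean)
    s <- s + X
    Mean <- X #the next mean is the current draw
  }
  s
})
V_Estimate <- mean((S - mean(S))^2) #SSQ/BIG, with E[S] estimated by mean(S)
P_Estimate <- mean(S > s0) #Sum/BIG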
i.e. \(x = F^{-1}(u)\), where \(F^{-1}\) is the inverse of the CDF \(F\).
The probability that \(X\) is less than or equal to any real value \(x\) is precisely equal to the probability that the random number \(u\) is less than or equal to \(F(x)\)
\(Pr[U \leq a]=a\) for any \(0\leq a \leq 1\); thus:
\[Pr[X \leq x] = Pr[U \leq F(x)] = F(x)\]
Let’s assume \(X\) takes on non-negative integer values only, \(Pr[X=k]=p_{k}\) for any non-negative integer \(k\).
So \(F(0)=p_{0}\), while for \(k>0\), \(F(k)=F(k-1)+p_{k}\); for any non-integer \(x\) such that \(k<x<k+1\), \(F(x)=F(k)\). The algorithm to generate \(X\) is then:
Generate \(u\) from \(u \sim U(0,1)\), set \(k=0\) (integer) and \(F=p_{0}\) (real).
If \(u \leq F\), EXIT, returning \(X=k\); otherwise, set \(k=k+1\), \(F=F+p_k\) and repeat this step.
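A direct R implementation of these two steps (the Poisson(3) pmf is an arbitrary example; any pmf on the non-negative integers works):

discrete_inv <- function(pk){ #pk(k) returns Pr[X = k]
  u <- runif(1) #step 1: generate u ~ U(0,1)
  k <- 0
  F <- pk(0) #F accumulates the CDF
  while(u > F){ #step 2: increment k until u <= F
    k <- k + 1
    F <- F + pk(k)
  }
  k
}
x <- replicate(10000, discrete_inv(function(k) dpois(k, lambda=3)))
mean(x) #should be close to 3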
For continuous distributions, we explicitly have to solve \(F(x)=u\).
X has a probability density function defined by:
\[f(x)=\frac{1}{\mu}e^{\frac{-x}{\mu}}\] The distribution function is then:
\[F(x)=1-e^{\frac{-x}{\mu}}\]
And the solution to \(F(x)=u\) is then: \(x=-\mu \ln(1-u)\)
There is no closed-form expression for the CDF of the Normal distribution, so the inverse-transform method does not apply directly. Instead, consider a pair of independent standard normal deviates \(X\) and \(Y\); their joint pdf is:
\[f(x,y)=\frac{1}{2\pi} \exp\Big[-\frac{1}{2}(x^2+y^2)\Big]\]
Now, transform to two random variables, \(V\) and \(\Phi\), defined such that \(X=\sqrt{2V}\cos(\Phi)\) and \(Y=\sqrt{2V}\sin(\Phi)\), so that the joint pdf of \(V\) and \(\Phi\) is given by:
\[f(v,\phi)=\frac{1}{2\pi} e^{-v}\]
over the region defined by \(0 < v < \infty\) and \(0<\phi<2\pi\)
\(V\) and \(\Phi\) are independent random variables, \(V\sim Exp(\lambda =1)\) and \(\Phi \sim U(0,2\pi)\)
We generate realizations of \(V\) and \(\Phi\) and transform them to the desired \(X\) and \(Y\).
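A minimal R sketch of this construction:

n <- 10000
V <- rexp(n, rate=1) #V ~ Exp(1)
Phi <- runif(n, 0, 2*pi) #Phi ~ U(0, 2*pi)
X <- sqrt(2*V)*cos(Phi)
Y <- sqrt(2*V)*sin(Phi)
c(mean(X), sd(X)) #should be close to 0 and 1
qqnorm(X) #visual check of normality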
To generate a random variable \(X\) from a distribution with pdf \(f(x)\) whose CDF cannot be inverted, we need another distribution \(h(x)\), from which we can generate random numbers, with the property that \(h(x)=0\) only where \(f(x)=0\).
The density \(h(x)\) should be “similar” to \(f(x)\) in some known sense, so that when we generate values from \(h(x)\), we can reject a certain proportion of the values, to compensate for the differences.
First, let’s define two quantities:
\[C=\max_{\{x\,:\,h(x)>0\}}\frac{f(x)}{h(x)}\]
and
\[g(x)=\frac{f(x)}{C\,h(x)}\]
These quantities satisfy:
\[h(x)>0 \text{ wherever } f(x)>0;\quad C \geq 1;\quad 0 \leq g(x) \leq 1\]
The algorithm: generate \(Y\) from \(h\) and, independently, \(U\) from \(U(0,1)\); let \(y\) and \(u\) be the values generated. If \(u \leq g(y)\), accept \(y\); otherwise, generate a new pair. This produces a random variable \(Y\) with the conditional distribution of \(Y\) given the event \(\{U \leq g(Y)\}\).
The conditional density follows from Bayes’ theorem: \(h(y|U \leq g(Y))=\frac{h(y)Pr[U \leq g(Y)|Y=y]}{Pr[U \leq g(Y)]}=\frac{h(y)g(y)}{1/C}\)
This simplifies to:
\[h(y|U \leq g(Y))=C\,h(y)\,g(y)=f(y)\]
which is the desired density.
The probability of acceptance is \(\frac{1}{C}\), so the number of iterations required until the first acceptance has a geometric distribution with parameter \(1/C\) and expected value \(C\).
It is important to ensure that the tails of \(h(\cdot)\) are at least as heavy as the tails of \(f(\cdot)\); otherwise the ratio \(f/h\) is unbounded and no finite \(C\) exists.
We want to generate values from a gamma distribution with pdf:
\[f(x)=\frac{x^{\alpha-1}e^{-x}}{\Gamma(\alpha)}\]
For integer values of \(\alpha\), \(x\) can be generated as the sum of \(\alpha\) independent \(Exp(1)\) random variables.
For half-integer values of \(\alpha\), \(x\) can be generated as half the sum of squares of \(2\alpha\) independent standard normal random variables.
This becomes computationally expensive for moderately large \(\alpha\), and only works if \(\alpha\) is an integer or half-integer.
For the density \(h(x)\), we would use the exponential distribution with the same mean as \(f(x)\), i.e. \(\alpha\), thus
\[h(x)=\frac{1}{\alpha}e^{-x/\alpha}\]
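A sketch of the resulting rejection sampler, assuming \(\alpha > 1\) (for \(\alpha \leq 1\) the ratio \(f/h\) is unbounded, so this envelope fails). The ratio \(f(x)/h(x)\) is maximized at \(x=\alpha\), which gives \(C=\alpha^{\alpha}e^{-(\alpha-1)}/\Gamma(\alpha)\) and \(g(x)=(x/\alpha)^{\alpha-1}e^{(\alpha-1)(1-x/\alpha)}\); the function name rgamma_reject is ours:

rgamma_reject <- function(n, alpha){ #requires alpha > 1
  g <- function(x) (x/alpha)^(alpha-1)*exp((alpha-1)*(1-x/alpha)) #f/(C*h)
  out <- numeric(0)
  while(length(out) < n){
    y <- rexp(n, rate=1/alpha) #candidates from h(x), mean alpha
    u <- runif(n)
    out <- c(out, y[u <= g(y)]) #accept y with probability g(y)
  }
  out[1:n]
}
x <- rgamma_reject(10000, alpha=2.5)
c(mean(x), var(x)) #Gamma(alpha,1) has mean alpha and variance alpha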
nsamp<-1000 #number of samples drawn
lambda <- 5
x <- rexp(nsamp,lambda) #sample from exponential distribution with lambda=5
hist(x,prob=T) #plot the random draw of 1000 values from Exp(lambda=5)
mean(x)#mean of samples
## [1] 0.1992264
#should be roughly equal to 1/lambda
1/lambda
## [1] 0.2
Calculate \(E(X^2)\), i.e. \(\int x^2 f(x)\,dx\), where \(X \sim Exp(\lambda = 5)\)
nsamp=10000
lambda=5
x.square<-rexp(nsamp,lambda)^2 #NOTE the square!
hist(x.square,prob=T,sub=paste0("Mean of samples = ",mean(x.square)))
hx<-function(x){ (x^2)*5*exp(-5*x) } #define the integrand x^2 times the Exp(lambda=5) density, whose integral is E[X^2]
integrate(hx, lower=0, upper=Inf) #integrate from 0 to infinity
## 0.08 with absolute error < 8.2e-06
Draw 5 observations from \(Exp(\lambda = 5)\), i.e. a sample with n=5
mean(rexp(5,5))
## [1] 0.178771
1/5
## [1] 0.2
#increase sample size
mean(rexp(10,5))
## [1] 0.1729375
1/5
## [1] 0.2
#hundred
mean(rexp(100,5))
## [1] 0.1743087
1/5
## [1] 0.2
#thousand observations:
mean(rexp(1000,5))
## [1] 0.2082827
1/5
## [1] 0.2
#100 thousand observations:
mean(rexp(100000,5))
## [1] 0.2001176
1/5
## [1] 0.2
With a larger sample size, we are more likely to get close to the expected value of \(X\):
n_sizes<-seq(from=5, by=15, length.out=1000)
draw_exp <- function(size){
  sample <- rexp(size, 5)
  mean(sample)
}
means <- sapply(n_sizes, draw_exp,simplify="vector")
hist(means,breaks=1000)
plot(means~n_sizes,type="l")
abline(h=1/lambda,col="red")
As one would expect: by the Central Limit Theorem the means are approximately normally distributed, and, in line with the Law of Large Numbers, they converge on the expected value of 0.2 as the sample size increases.
n_sizes <- rep(5,5000)
meanOf5samples <- sapply(n_sizes, draw_exp,simplify="vector")
hist(meanOf5samples)
nsize<-c(5, 10, 300, 1500)
for(i in nsize){
  n_sizes <- rep(i, 5000)
  means <- sapply(n_sizes, draw_exp, simplify="vector")
  hist(means, main=paste0("Sample size = ", i), breaks=1000, border="blue")
  abline(v=1/lambda, col="red", lwd=5)
}
Assume \(X\) is a random variable:
\[X \sim Exp(\lambda = 5)\]
Use Monte Carlo simulation methods to obtain its variance, its mean, and the probability that \(X\) falls in a given interval:
lambda=5
#define the pdf of Exp(lambda = 5)
f.x <- function(x, lambda=5){
  beta <- 1/lambda
  return((1/beta)*exp(-x/beta))
}
#define the CDF for the same distribution:
F.x <- function(lambda=5, x){
  return(1-exp(-lambda*x))
}
#define the inverse CDF
F.inv <- function(lambda, p){
  return(-log(1-p)/lambda)
}
#so if F(x) = U,
#we get the x value associated with a generated U using this function
#(since 1-U is also U(0,1), -mu*log(U) is equivalent to -mu*log(1-U))
get.x <- function(U, lambda=5){
  mu <- 1/lambda
  return(-mu*log(U))
}
#Var(X) = E[X^2] - (E[X])^2; estimate both moments from each sample
sigma.est <- function(x){
  return(mean(x^2) - mean(x)^2)
}
count <- 0
sigma <- c()
while(count < 10000){
  #probability integral transform:
  #generate 10000 realizations from the uniform distribution
  U <- runif(10000)
  #get x by applying the inverse CDF to U
  x <- get.x(U=U)
  #store the estimated sigma squared for this sample
  sigma <- c(sigma, sigma.est(x=x))
  #increment loop variable
  count <- count + 1
}
#plot the sequence of sigma squared estimates
plot(sigma)
abline(h=mean(sigma),col="red",lwd=3)
abline(h=1/(lambda^2),col="blue",lwd=3)
#deterministic check: approximate the moments by a Riemann sum over a grid
#(truncated at x = 3, where F(3) is essentially 1 for lambda = 5)
x <- seq(0, 3, length.out = 1000000)
dx <- x[2] - x[1]
plot(x=x, y=f.x(x=x), type="l") #the pdf
plot(x=x, y=F.x(x=x), type="l") #the CDF
#E[X] = integral of x*f(x)dx; should be close to 1/lambda = 0.2
mu <- sum(x*f.x(x=x))*dx
mu
#Var(X) = integral of (x-mu)^2*f(x)dx; should be close to 1/lambda^2 = 0.04
sigma <- sum((x - mu)^2*f.x(x=x))*dx
sigma
U <- runif(1000000)
x <- get.x(U=U)
hist(x)
x.in.range <- x[x>0.1&x<0.3]
prob.x.in.range <- length(x.in.range)/length(x)
prob.x.in.range
## [1] 0.384018
#Test:
F.x(x=0.3)-F.x(x=0.1)
## [1] 0.3834005
Suppose \(X \sim Exp(\lambda)\)
\[f(x)=\lambda e^{-\lambda x}\]
\[F(x)=1-e^{-\lambda x}\] \[F^{-1}(u)=-\frac{1}{\lambda}\ln(1-u)\]
F. <- function(x,lambda){
  return(1-exp(-lambda*x))
}
F.plot.vals <- F.(x=seq(1,30,length.out = 1000),lambda=2)
par(mfrow=c(3,3))
curve(F.(lambda=1,x),from=1,to=30)
curve(F.(lambda=2,x),from=1,to=30)
curve(F.(lambda=5,x),from=1,to=30)
curve(F.(lambda=10,x),from=1,to=30)
curve(F.(lambda=20,x),from=1,to=30)
curve(F.(lambda=0.9,x),from=1,to=30)
curve(F.(lambda=0.5,x),from=1,to=30)
curve(F.(lambda=0.2,x),from=1,to=30)
curve(F.(lambda=0.1,x),from=1,to=30)
Sample \(u\) from \(U(0,1)\).
Plug \(u\) into the inverse CDF, \(F^{-1}(u)\), to get a sample from the pdf \(f(x)\):
U <- runif(10000)
F.inv <- function(x,lambda){
  return(-1/lambda*log(1-x)) #natural log
}
X <- F.inv(x=U,lambda=5)
f. <- function(x,lambda){
  return(lambda*exp(-lambda*x))
}
Y <- f.(X,5)
x <- seq(0,1,length.out = 10000)
density.x <- dexp(x=x, rate=5)
par(mfrow=c(1,1))
plot(x=x, y=density.x, type="l", lwd=5)
points(x=X,y=Y,col="red",cex=0.1)
We want to sample from the following distribution:
\[f(x) \propto e^{-x^2/2}\sin^2(2x)\]
We cannot use the probability integral transform, since the inverse CDF is not available in closed form.
So:
#define target distribution f(x)
f. <- function(x){
  return(exp((-x^2)/2)*sin(2*x)^2)
}
#define candidate distribution h(x): the standard normal pdf
h. <- function(x){
  return((1/sqrt(2*pi))*exp((-x^2)/2))
}
#define f(x)/h(x), which simplifies to sqrt(2*pi)*sin(2x)^2:
f.x_h.x <- function(x){
  return(sqrt(2*pi)*(sin(2*x))^2)
}
C <- max(f.x_h.x(x=-100:100))
plot(x=-100:100,y=f.x_h.x(-100:100),type="l")
abline(h=C,col="red",lwd=5)
abline(h=sqrt(2*pi),col="yellow",lwd=2)
#this should be roughly equal to the sqrt(2*pi) as can be seen from the plot
#So we set C equal to the ideal value
C <- sqrt(2*pi)
#define the ratio f(x)/h(x); dividing by C below gives g(x) = f(x)/(C*h(x))
g. <- function(x){
  return(f.(x)/h.(x))
}
#visualize f(x), g(x), and h(x) over a grid
x <- seq(-5,5,length.out = 100000)
fx <- f.(x)
plot(x,fx)
integrated.fx <- integrate(f.,-Inf,Inf) #normalizing constant of f(x), used to scale the overlay below
gx <- g.(x)
plot(x,gx)
hx <- h.(x)
plot(x,hx)
#########################################################################
#sample size
N <- 1000000
#scaling factor
C <- max(gx)
#get a sample of Y values from h(x)
Y <- rnorm(N)
gy <- g.(Y)/C #g(y) = f(y)/(C*h(y))
U <- runif(N)
index <- which(U <= gy) #accept Y[i] whenever U[i] <= g(Y[i])
keep_y <- Y[index]
length(keep_y)/N #observed acceptance rate
## [1] 0.500035
hist(keep_y,breaks=1000,prob=T)
lines(x,f.(x)/integrated.fx$value,col="red",lwd=5)