Box-Müller transformation

Theory: suppose X and Y are independent, standard normal. Think of the point (X, Y) as being a random point in the plane. Suppose it has polar coordinates R, Theta; i.e., X = R cos(Theta), Y = R sin(Theta) where R >= 0 and Theta lies between 0 and 2 pi. It turns out that R and Theta are independent; Theta is uniformly distributed and R has the so-called Rayleigh distribution. Equivalently, R^2 is exponential with rate parameter 1/2 (mean value 2); therefore (R^2)/2 is standard exponential.

Now we can transform between standard exponential and uniform(0, 1) as follows: if V ~ Unif(0, 1), then - log(V) ~ Exp(1). Conversely if X ~ Exp(1) then exp(-X) ~ Unif(0, 1).

Thus going in the converse direction we can start with two independent uniforms on the interval (0, 1), U and V say.

Define Theta = 2 pi U ~ Unif(0, 2 pi)

Define R2 = -2 log(V) ~ Exp(1/2)

Define R = sqrt(R) ~ Rayleigh

Define X = R cos(Theta), Y = R sin(Theta) … then X and Y are independent N(0,1).

Let’s first go in the direction from two independent normals to two independent uniforms. The joint probability density of two standard normals is (1/2 pi) exp(-(x^2 + y^2)/2). It is constant on circles around the origin. The joint density is therefore invariant under rotations around the origin. Hence the probability distribution of the random point (X, Y) is invariant under rotation about the origin. The radial distance R and the polar angle Theta must therefore be independent of one another, and Theta must be uniform. We can “see” this in the scatter plot below.

rm(list = ls()) # empty workspace
library(MASS)   # in order to use truehist() instead of hist()
set.seed(12345)  # to make results reproducible
N <- 1000
x <- rnorm(N)
y <- rnorm(N)
plot(x, y, asp = 1, pty = "s", xlim = c(-4, 4), ylim = c(-4, 4))

plot of chunk unnamed-chunk-1

I want to determine the polar coordinates of the each point. In other words, for given x and y, solve the equations x = r sin(theta), y = r cos(theta) in unknowns r and theta. r is nonnegative; theta should lie between 0 and 2 pi. One sees that tan(theta) = y/x hence we might guess theta = arc tan (y/x). However this only tells us the value of theta modulo pi. Moreover, R’s function atan gives us a solution between -pi/2 and +pi/2. That is a correct answer if x is positive. If x is negative we should add pi. Now we have an answer between - pi /2 and + 3 pi /2. So if what we now have is negative we should add 2 pi to get a final answer between 0 and 2 pi.

theta <- atan(y/x)
theta <- ifelse(x > 0, theta, theta + pi)
theta <- ifelse(theta > 0, theta, theta + 2 * pi)

There is a simpler solution. The R function atan2(y, x) is the angle between the positive x-axis and the line joining the origin and the point (x, y) and lies between -pi and pi. So if the result is negative we should add 2 pi; the final answer lies now between 0 and 2 pi, as desired. We check here that the two solutions coincide.

thetaAlt <- atan2(y, x)
thetaAlt <- ifelse(thetaAlt > 0, thetaAlt, thetaAlt + 2 * pi)
sum(abs(theta - thetaAlt))

## [1] 6.308e-14

We will continue with theta. First I will scale it to lie between 0 and 1 and then draw a histogram - we should see a uniform distribution on (0, 1)

u <- theta / (2 * pi) 

truehist(u) # this should be a uniform distibution on the interval (0, 1)

plot of chunk unnamed-chunk-4

Next we calculate the radial distance r. It should have the so-called Rayleigh distribution. But its square has a nice distribution too: r^2 should be chi-square with two degrees of freedom. Now a chi-square (1) is Gamma(shape = 1/2, rate = 1/2), so a chi-square (2) is Gamma(shape = 1, rate = 1/2) or Exponential(rate = 1/2); rate is 1/2 implies mean is 2. Dividing by 2 gives us an Exponential(1) and if E ~ Exp(1) then exp(- E) ~ Unif(0, 1). Therefore we expect that exp(- r^2 / 2) should be uniform(0, 1).

r2 <- x^2 + y^2
r <- sqrt(r2)
v <- exp(- r2 / 2)
truehist(v)

plot of chunk unnamed-chunk-5

The radial distance and the radial angle should be independent, therefore having transformed both to uniform(0, 1) we should easily be able to “see” the independence in a scatter plot:

plot(u, v, asp = 1)

plot of chunk unnamed-chunk-6

Finally I check that I can get back from u and v to x and y:

theta <- 2 * pi * u
r2 <- - 2 * log(v)
r <- sqrt(r2)
sum(abs(x - r * cos(theta)))

## [1] 1.868e-13

sum(abs(y - r * sin(theta)))

## [1] 1.652e-13