1.

  1. What are the two transformations that take two independent random values to two correlated/dependent normal random values?
  2. What are the two rotational transformations that take the two correlated normal random values to two independent normal random values?
  3. Run the code provided in the RBVNsim3.R handout. Verify the transformations perform as described.

Solutions

  1. The two transformations that take the two independent random values to two dependent ones is: \[\begin{bmatrix} \sigma_1 &0 \\ \rho\sigma_1 &\sigma_2\sqrt{1-\rho^2}\end{bmatrix}\begin{bmatrix}X_1\\X_2\end{bmatrix} \] Which then results in the two desired linear equations.
  2. The two transformations that take the correlated values to independent ones, is \[\mathbf{Y} = \begin{bmatrix}\cos\theta &\sin\theta \\ -\sin\theta &\cos\theta\end{bmatrix}\mathbf{X}\] Where we define \(\theta\) to be \[\theta = \frac{1}{2}\tan^{-1}\Big(\frac{2\rho\sigma_{X_!}\sigma_{X_2}}{\sigma_{X_1}^2-\sigma_{X_2}^2}\Big)\]
  3. Here we do a run of the code provided. In the first part we define the Bivariate Normal with the uncorrelated normal values. Here is 3d plot of this provided.

Next we simulate the Independent normal values and show some plots that depict the independence of these:

# simulate independent random normals - Box-Muller Method

x1 = sqrt(-2*log(u1))*cos(2*pi*u2)
x2 = sqrt(-2*log(u1))*sin(2*pi*u2)

par(mfrow=c(2,2))
hist(x1);hist(x2);plot(x1,x2)

xmean = c(mean(x1),mean(x2))
xsd = c(sd(x1),sd(x2))
xcor = cor(x1,x2);xcor
## [1] 0.05633456

So clearly these are independent. Now we apply the transformation to make these correlated:

# trasnsform to correlated normals BVN(0,0,sigmasqw1,sigmassqw2,rho)

c11 = sigmaw1
c21 = rhow1w2*sigmaw2
c22 = sigmaw2*sqrt(1-rhow1w2^2)

w1 = c11*x1
w2 = c21*x1 + c22*x2

par(mfrow=c(2,2))
hist(w1);hist(w2);plot(w1,w2)

wmean = c(mean(w1),mean(w2));wmean
## [1] -0.07535926  0.06610320
wsd = c(sd(w1),sd(w2));wsd
## [1] 2.008271 2.896520
wcor = cor(w1,w2);wcor
## [1] -0.3634378

And now adding the means:

y1 = muw1 + w1
y2 = muw2 + w2

par(mfrow=c(2,2))
hist(y1);hist(y2);plot(y1,y2)

ymean = c(mean(y1),mean(y2));ymean
## [1]  9.924641 25.066103
ysd = c(sd(y1),sd(y2));ysd
## [1] 2.008271 2.896520
ycor = cor(y1,y2);ycor
## [1] -0.3634378

So as we can see above the distribution is that of a normal (we can see this from the histograms) and from the scatter plot we can see the negative trend that we have added. This is of course confirmed with a non-negligible value wcor. So indeed the simulation verifies what we had derived.

Now that we have these we work back to creating uncorrelated values.

theta = 0.5*atan((2*rhow1w2*sigmaw1*sigmaw2)/(sigmasqw1-sigmasqw2))
theta
## [1] 0.3824964
v1 = y1*cos(theta) + y2*sin(theta)
v2 = -y1*sin(theta) + y2*cos(theta)

par(mfrow=c(2,2))
hist(v1);hist(v2);plot(v1,v2)

vmean = c(mean(v1),mean(v2));vmean
## [1] 18.56306 19.55047
vsd = c(sd(v1),sd(v2));vsd
## [1] 1.782126 3.040890
vcor = cor(v1,v2);vcor
## [1] -0.003050197

And we are back uncorrelated values.

2.

Here we run the provided code to visualize the correlation of the simulated values above.

## [1] 10 25
##      [,1] [,2]
## [1,]  4.0 -2.4
## [2,] -2.4  9.0

## 
## ----------------------------
## 2-D Histogram Object
## ----------------------------
## 
## Call: hist2d(x = u1, y = u2, nbins = 50, col = c("white", heat.colors(16)))
## 
## Number of data points:  2000 
## Number of grid bins:  50 x 50 
## X range: ( 2.520834e-05 , 0.9988901 )
## Y range: ( 0.0005204277 , 0.9998806 )
## 
## ----------------------------
## 2-D Histogram Object
## ----------------------------
## 
## Call: hist2d(x = x1, y = x2, nbins = 50, col = c("white", heat.colors(16)))
## 
## Number of data points:  2000 
## Number of grid bins:  50 x 50 
## X range: ( -4.563378 , 2.801809 )
## Y range: ( -3.37672 , 3.86134 )
## 
## ----------------------------
## 2-D Histogram Object
## ----------------------------
## 
## Call: hist2d(x = w1, y = w2, nbins = 50, col = c("white", heat.colors(16)))
## 
## Number of data points:  2000 
## Number of grid bins:  50 x 50 
## X range: ( -9.126755 , 5.603618 )
## Y range: ( -9.107413 , 10.58508 )
## 
## ----------------------------
## 2-D Histogram Object
## ----------------------------
## 
## Call: hist2d(x = y1, y = y2, nbins = 50, col = c("white", heat.colors(16)))
## 
## Number of data points:  2000 
## Number of grid bins:  50 x 50 
## X range: ( 0.8732449 , 15.60362 )
## Y range: ( 15.89259 , 35.58508 )