Introduction

Now that we have learned about the joint probability distribution of two random variables, we can extend our investigation by learning how to quantify the extent to which two random variables \(X\) and \(Y\) are associated, or correlated. For example, suppose \(X\) denotes the number of cups of hot chocolate sold daily at a local café, and \(Y\) denotes the number of apple cinnamon muffins sold daily at the same café. The manager of the café might benefit from knowing whether \(X\) and \(Y\) are highly correlated. If the random variables are highly correlated, then the manager would know to make sure that both items are available on a given day. If the random variables are not highly correlated, then the manager would know that it is fine to have one of the items available without the other.

Covariance

The covariance between \(X\) and \(Y\) is denoted by \(\sigma_{xy}\) and is defined as \[\sigma_{xy} = E[(X-\mu_x)(Y-\mu_y)]\] An equivalent formula that is easier to compute is \[ \sigma_{xy} = E(XY) - E(X)E(Y)\] An optional proof of the equivalence of these two formulas is given at the end of this section.

If \(X\) and \(Y\) are discrete random variables then \[\sigma_{xy} = E(XY) - E(X)E(Y)=\sum_x \sum_y xyp(x,y) - \sum_x xp(x) \cdot \sum_y yp(y)\] where \(p(x,y)\) denotes the joint probability distribution, \(p(x)\) denotes the marginal distribution of \(X\) and \(p(y)\) denotes the marginal distribution of \(Y\).
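For readers who want to experiment, the formula above translates directly into code. The following is a minimal Python sketch (not part of the original notes): the dictionary representation of the joint pmf and the function name `covariance` are illustrative choices.

```python
# Illustrative sketch: covariance of a discrete joint pmf stored as a
# dictionary mapping (x, y) pairs to probabilities p(x, y).

def covariance(joint):
    ex = sum(x * p for (x, y), p in joint.items())       # E(X), summing over the joint pmf
    ey = sum(y * p for (x, y), p in joint.items())       # E(Y)
    exy = sum(x * y * p for (x, y), p in joint.items())  # E(XY)
    return exy - ex * ey                                 # E(XY) - E(X)E(Y)
```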

Suppose the joint probability distribution between \(X\) and \(Y\) is

\(x=1\) \(x=2\) \(x=3\)
\(y=1\) \(0.25\) \(0.25\) \(0\)
\(y=2\) \(0\) \(0.25\) \(0.25\)

To find the covariance between \(X\) and \(Y\) we must find \(E(XY)\), \(E(X)\) and \(E(Y)\).

\[E(XY) = \sum_{x=1}^{3} \sum_{y=1}^{2} x \cdot y \cdot p(x,y) \\= 1 \cdot 1 \cdot 0.25 + 1 \cdot 2 \cdot 0 + 2 \cdot 1 \cdot 0.25 + 2 \cdot 2 \cdot 0.25 + 3 \cdot 1 \cdot 0 + 3 \cdot 2 \cdot 0.25 \\= 0.25 + 0 + 0.50 + 1.00 + 0 + 1.50 = 3.25\]

The marginal distribution of \(X\) is

\(x\) \(1\) \(2\) \(3\)
\(p(x)\) \(0.25\) \(0.50\) \(0.25\)

So \(\mu_x = 1(0.25)+2(0.50)+3(0.25) = 2\).

The marginal distribution of \(Y\) is

\(y\) \(1\) \(2\)
\(p(y)\) \(0.50\) \(0.50\)

So \(\mu_y = 1(0.50) + 2(0.50) = 1.5\).

Putting this all together, we get the covariance between \(X\) and \(Y\) \[\sigma_{xy} = E(XY) - E(X) \cdot E(Y) = 3.25 - (2)(1.5) = 0.25\]
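As a quick check, the worked example above can be reproduced numerically with an illustrative Python sketch (not part of the original notes):

```python
# Joint pmf from the table above, stored as (x, y) -> p(x, y).
joint = {(1, 1): 0.25, (2, 1): 0.25, (3, 1): 0.0,
         (1, 2): 0.0,  (2, 2): 0.25, (3, 2): 0.25}

e_xy = sum(x * y * p for (x, y), p in joint.items())  # E(XY) = 3.25
mu_x = sum(x * p for (x, y), p in joint.items())      # E(X)  = 2.0
mu_y = sum(y * p for (x, y), p in joint.items())      # E(Y)  = 1.5
print(e_xy - mu_x * mu_y)                             # covariance = 0.25
```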

Correlation

The correlation between \(X\) and \(Y\) is denoted by \(\rho\) and is defined as \[\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}\]

Suppose the joint probability distribution between \(X\) and \(Y\) is

\(x=1\) \(x=2\) \(x=3\)
\(y=1\) \(0.25\) \(0.25\) \(0\)
\(y=2\) \(0\) \(0.25\) \(0.25\)

To find the correlation between \(X\) and \(Y\) we need to find \(\sigma_x\) and \(\sigma_y\). We already found \(\sigma_{xy} = 0.25\).

\(\sigma_x = \sqrt{\sum_{x=1}^{3} (x-\mu_x)^2 p(x)} = \sqrt{(1-2)^2(0.25) + (2-2)^2(0.50) + (3-2)^2(0.25)} = \sqrt{0.50}\)

\(\sigma_y = \sqrt{\sum_{y=1}^{2} (y-\mu_y)^2 p(y)} = \sqrt{(1-1.5)^2(0.50) + (2-1.5)^2(0.50)} = \sqrt{0.25}\)

So the correlation is \[\rho = \frac{0.25}{\sqrt{0.50}\sqrt{0.25}} \approx 0.7071\]
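The same numbers can be verified with an illustrative Python sketch (not part of the original notes) that recomputes \(\sigma_x\), \(\sigma_y\), and \(\rho\) from the joint table:

```python
import math

# Joint pmf from the table above, stored as (x, y) -> p(x, y).
joint = {(1, 1): 0.25, (2, 1): 0.25, (3, 1): 0.0,
         (1, 2): 0.0,  (2, 2): 0.25, (3, 2): 0.25}

mu_x = sum(x * p for (x, y), p in joint.items())
mu_y = sum(y * p for (x, y), p in joint.items())
cov = sum(x * y * p for (x, y), p in joint.items()) - mu_x * mu_y

# Standard deviations computed directly from the joint pmf.
sigma_x = math.sqrt(sum((x - mu_x) ** 2 * p for (x, y), p in joint.items()))
sigma_y = math.sqrt(sum((y - mu_y) ** 2 * p for (x, y), p in joint.items()))

print(cov / (sigma_x * sigma_y))  # approximately 0.7071
```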

The correlation coefficient measures the strength of the linear association between two random variables. Its properties are the same as those described when we considered the sample correlation coefficient.

Independence

Two random variables \(X\) and \(Y\) are independent if \[p(x,y) = p(x) \cdot p(y)\] for all \(x\) and \(y\); otherwise, \(X\) and \(Y\) are dependent.

Are \(X\) and \(Y\) independent if they have the following joint distribution?

\(x=0\) \(x=1\) \(x=2\)
\(y=3\) \(0.1\) \(0.2\) \(0.2\)
\(y=4\) \(0.1\) \(0.2\) \(0.2\)

We must check whether \(p(x,y) = p(x)p(y)\) for every pair \((x,y)\).

The marginal of X is

\(x\) \(0\) \(1\) \(2\)
\(p(x)\) \(0.2\) \(0.4\) \(0.4\)

The marginal of Y is

\(y\) \(3\) \(4\)
\(p(y)\) \(0.5\) \(0.5\)

Now check

\(x=0\), \(y=3\): \(p(0,3) = 0.1 = 0.2 \cdot 0.5 = p(0)p(3)\)
\(x=0\), \(y=4\): \(p(0,4) = 0.1 = 0.2 \cdot 0.5 = p(0)p(4)\)
\(x=1\), \(y=3\): \(p(1,3) = 0.2 = 0.4 \cdot 0.5 = p(1)p(3)\)
\(x=1\), \(y=4\): \(p(1,4) = 0.2 = 0.4 \cdot 0.5 = p(1)p(4)\)
\(x=2\), \(y=3\): \(p(2,3) = 0.2 = 0.4 \cdot 0.5 = p(2)p(3)\)
\(x=2\), \(y=4\): \(p(2,4) = 0.2 = 0.4 \cdot 0.5 = p(2)p(4)\)

Since all six equalities hold, \(X\) and \(Y\) are independent random variables.
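The same six comparisons can be automated. Below is an illustrative Python sketch (not part of the original notes); the small tolerance is only there to absorb floating-point rounding.

```python
from itertools import product

# Joint pmf from the table above, stored as (x, y) -> p(x, y).
joint = {(0, 3): 0.1, (1, 3): 0.2, (2, 3): 0.2,
         (0, 4): 0.1, (1, 4): 0.2, (2, 4): 0.2}

# Marginals, obtained by summing the joint pmf over the other variable.
px = {x: sum(p for (xx, y), p in joint.items() if xx == x) for x in (0, 1, 2)}
py = {y: sum(p for (x, yy), p in joint.items() if yy == y) for y in (3, 4)}

independent = all(abs(joint[(x, y)] - px[x] * py[y]) < 1e-12
                  for x, y in product((0, 1, 2), (3, 4)))
print(independent)  # True
```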

Independent and Uncorrelated

If \(X\) and \(Y\) are independent, then \(\sigma_{xy} = 0\) and \(\rho = 0\). However, the converse is not necessarily true: if \(X\) and \(Y\) have zero correlation, they can still be dependent, for example through a nonlinear relationship.

For example, consider the following joint distribution between \(X\) and \(Y\).

\(x=-1\) \(x=0\) \(x=1\)
\(y=-1\) \(0\) \(0.5\) \(0\)
\(y=1\) \(0.25\) \(0\) \(0.25\)

The marginal of \(X\) is

\(x\) \(-1\) \(0\) \(1\)
\(p(x)\) \(0.25\) \(0.50\) \(0.25\)

The marginal of \(Y\) is

\(y\) \(-1\) \(1\)
\(p(y)\) \(0.50\) \(0.50\)

Here \(\mu_x = (-1)(0.25) + 0(0.50) + 1(0.25) = 0\) and \(E(XY) = (-1)(1)(0.25) + (1)(1)(0.25) = 0\), so the covariance (and hence the correlation) equals 0 and the two variables are uncorrelated. However, \(p(0,-1) = 0.5 \ne 0.50 \cdot 0.50 = 0.25 = p(0) \cdot p(-1)\), so they are not independent. In fact \(Y = 2X^2 - 1\), a deterministic but nonlinear relationship.
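Both claims can be confirmed numerically with an illustrative Python sketch (not part of the original notes):

```python
# Joint pmf from the table above, stored as (x, y) -> p(x, y).
joint = {(-1, -1): 0.0,  (0, -1): 0.5, (1, -1): 0.0,
         (-1, 1): 0.25, (0, 1): 0.0,  (1, 1): 0.25}

mu_x = sum(x * p for (x, y), p in joint.items())      # 0.0
mu_y = sum(y * p for (x, y), p in joint.items())      # 0.0
cov = sum(x * y * p for (x, y), p in joint.items()) - mu_x * mu_y
print(cov)                                            # 0.0, so uncorrelated

p_joint = joint[(0, -1)]                                  # p(0,-1) = 0.5
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)    # p(0)  = 0.50
p_ym1 = sum(p for (x, y), p in joint.items() if y == -1)  # p(-1) = 0.50
print(p_joint, p_x0 * p_ym1)                              # 0.5 vs 0.25, so dependent
```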

Continuous Case

Most of the concepts and formulas above carry over to the continuous case, with integrals replacing the sums. A principal difference is in how the joint distribution is defined, since \(P(X=x, Y=y) = 0\) for continuous random variables. For this reason we define the joint cumulative distribution function of two continuous random variables \(X\) and \(Y\) to be \[F(x,y) = P(X \le x \cap Y \le y)\] and we say that two continuous random variables are independent if \[F(x,y) = F(x)F(y)\] for all \(x\) and \(y\).
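As a quick illustration (this example is not from the notes above), take the joint density \(f(x,y) = 4xy\) on the unit square \(0 \le x \le 1\), \(0 \le y \le 1\). Then \[F(x,y) = \int_0^x \int_0^y 4uv \, dv \, du = x^2 y^2 = F(x) \cdot F(y)\] where \(F(x) = x^2\) and \(F(y) = y^2\) are the marginal cumulative distribution functions, so \(X\) and \(Y\) are independent.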

Optional Proof

\[\sigma_{xy} = E[(X-\mu_x)(Y-\mu_y)] = E[XY - \mu_x Y - \mu_y X + \mu_x \mu_y]\\=E(XY) - \mu_x E(Y) - \mu_y E(X) + \mu_x \mu_y\\=E(XY) - \mu_x \mu_y - \mu_y \mu_x + \mu_x \mu_y\\=E(XY)-\mu_x \mu_y \\= E(XY) - E(X)E(Y)\]