5.5 - Expectations of functions

Definition

  • The expectation of a function of two or more random variables is a direct analog of the univariate case.
  • If \(X\) and \(Y\) are jointly discrete random variables with joint pmf \(p_{X,Y}(x,y)\), then:

\[E(g(X,Y)) = \sum_{x}\sum_y g(x,y)\,p_{X,Y}(x,y)\]

  • If \(X\) and \(Y\) are jointly continuous random variables with joint pdf \(f_{X,Y}(x,y)\), then:

\[E(g(X,Y)) = \int_{-\infty}^\infty\int_{-\infty}^\infty g(x,y) f_{X,Y}(x,y)\, dx\,dy\]

  • Common choices of \(g(X,Y)\) include:
    • \(g(X,Y) = aX+bY\)
    • \(g(X,Y) = XY\)
    • \(g(X,Y) = X/Y\)

Properties 1-2

  1. Constants get pulled out:

\[E(cg(X,Y)) = cE(g(X,Y))\]

  2. Bivariate linearity analog:

\[E\left(g_1(X,Y) + g_2(X,Y) + \cdots + g_k(X,Y)\right) = E(g_1(X,Y)) + E(g_2(X,Y)) + \cdots + E(g_k(X,Y))\]

Property 3

  3. Option to work with marginals.

For the bivariate continuous case (the same holds in the bivariate discrete case, and with \(X\) in place of \(Y\)):

\[E(g(Y)) = \int_{-\infty}^\infty \int_{-\infty}^\infty g(y) f_{X,Y}(x,y)\, dx\,dy\]

\[= \int_{-\infty}^\infty g(y) \left(\int_{-\infty}^\infty f_{X,Y}(x,y)\, dx\right)\,dy\]

\[= \int_{-\infty}^\infty g(y) f_Y(y)\,dy\]

In other words, if we need to find the mean of a function of just one of the random variables, we have the option of working with the marginal distribution of that random variable.

Property 4

  4. Independence simplification: if \(X\) and \(Y\) are independent, and \(h(X)\) and \(g(Y)\) are functions of \(X\) and \(Y\) only, then:

\[E(h(X)g(Y)) = E(h(X)) E(g(Y))\]

Proof for jointly continuous case:

\[ E(h(X)g(Y))= \int_{-\infty}^\infty\int_{-\infty}^\infty h(x)g(y) f_{X,Y}(x,y)\, dx\,dy\]

\[ = \int_{-\infty}^\infty\int_{-\infty}^\infty h(x)g(y) f_{X}(x)f_Y(y)\, dx\,dy\]

\[ = \int_{-\infty}^\infty g(y) f_{Y}(y)\left(\int_{-\infty}^\infty h(x) f_X(x)\,dx\,\right) dy= \int_{-\infty}^\infty g(y) f_{Y}(y) E(h(X))\, dy\]

\[ = E(h(X))\int_{-\infty}^\infty g(y) f_{Y}(y)\, dy= E(h(X)) E(g(Y))\]
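
Property 4 is easy to check by simulation. A minimal sketch, assuming independent standard normals and the made-up choices \(h(x) = x^2\), \(g(y) = e^y\):

set.seed(1)
n <- 1e6
x <- rnorm(n)            # X and Y independent standard normals
y <- rnorm(n)
h <- function(x) x^2
g <- function(y) exp(y)
mean(h(x) * g(y))        # approximates E(h(X)g(Y))
mean(h(x)) * mean(g(y))  # approximates E(h(X))E(g(Y)); should agree closely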

Example: joint discrete product

Suppose \(X\) and \(Y\) are jointly discrete with joint pmf:

| \(y\) | \(x = 0\) | \(x = 1\) | \(x = 2\) |
|-------|-----------|-----------|-----------|
| 0     | 1/9       | 2/9       | 1/9       |
| 1     | 2/9       | 2/9       | 0         |
| 2     | 1/9       | 0         | 0         |

Find \(E(XY)\).

\[E(XY) = \sum_{x=0}^2 \sum_{y=0}^2 xy\cdot p(x,y)\]

\[ \begin{align} &= 0^2\cdot \frac{1}{9} + 0\cdot1\cdot \frac{2}{9} + 0\cdot2\cdot \frac{1}{9}\\ &+ 1\cdot 0\cdot \frac{2}{9} + 1^2\cdot \frac{2}{9} + 1\cdot 2\cdot 0\\ &+ 2\cdot 0 \cdot\frac{1}{9} + 2\cdot 1 \cdot 0 + 2\cdot 2\cdot 0 \end{align}\]

\[ = \frac{2}{9}\]
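
To check the arithmetic, we can lay the pmf out as a matrix in R and sum \(xy \cdot p(x,y)\) over the support:

p <- matrix(c(1, 2, 1,
              2, 2, 0,
              1, 0, 0) / 9,
            nrow = 3, byrow = TRUE)  # rows: y = 0,1,2; columns: x = 0,1,2
xs <- 0:2; ys <- 0:2
sum(outer(ys, xs, function(y, x) x * y) * p)  # E(XY) = 2/9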

Example: joint discrete difference (1)

Find \(E(X-Y)\) using the joint pmf.

| \(y\) | \(x = 0\) | \(x = 1\) | \(x = 2\) |
|-------|-----------|-----------|-----------|
| 0     | 1/9       | 2/9       | 1/9       |
| 1     | 2/9       | 2/9       | 0         |
| 2     | 1/9       | 0         | 0         |

\[E(X-Y) = \sum_{x=0}^2 \sum_{y=0}^2 (x-y)\cdot p(x,y)\]

\[ \begin{align} &= (0-0)\cdot \frac{1}{9} + (0-1)\cdot \frac{2}{9} + (0 - 2)\cdot \frac{1}{9}\\ &+ (1- 0)\cdot \frac{2}{9} + (1-1)\cdot \frac{2}{9} + (1- 2)\cdot 0\\ &+ (2- 0) \cdot\frac{1}{9} + (2- 1) \cdot 0 + (2-2)\cdot 0 = 0 \end{align} \]

Example: joint discrete difference (2)

Find \(E(X-Y)\) using marginals and linearity.

| \(y\) | \(x = 0\) | \(x = 1\) | \(x = 2\) |
|-------|-----------|-----------|-----------|
| 0     | 1/9       | 2/9       | 1/9       |
| 1     | 2/9       | 2/9       | 0         |
| 2     | 1/9       | 0         | 0         |

Marginal pmf of \(Y\):

| \(y\) | \(p_Y(y)\) |
|-------|------------|
| 0     | 4/9        |
| 1     | 4/9        |
| 2     | 1/9        |

\(E(Y) = 0\cdot 4/9 + 1\cdot 4/9 + 2 \cdot 1/9 = 6/9\)

Marginal pmf of \(X\):

| \(x\)      | 0   | 1   | 2   |
|------------|-----|-----|-----|
| \(p_X(x)\) | 4/9 | 4/9 | 1/9 |

\(E(X) = 0\cdot 4/9 + 1\cdot 4/9 + 2\cdot 1/9 = 6/9\)

\[E(X-Y) = E(X)-E(Y) = 6/9-6/9 = 0\]
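
Both routes are easy to verify numerically; the sketch below recomputes \(E(X-Y)\) from the joint pmf and from the marginals (Property 3 plus linearity):

p <- matrix(c(1, 2, 1,
              2, 2, 0,
              1, 0, 0) / 9,
            nrow = 3, byrow = TRUE)  # rows: y = 0,1,2; columns: x = 0,1,2
xs <- 0:2; ys <- 0:2
sum(outer(ys, xs, function(y, x) x - y) * p)  # joint pmf route: E(X-Y) = 0
px <- colSums(p); py <- rowSums(p)            # marginal pmfs of X and Y
sum(xs * px) - sum(ys * py)                   # E(X) - E(Y) = 0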

Example: joint continuous

Suppose \(X\) and \(Y\) are jointly continuous with joint pdf:

\[f(x,y) = \begin{cases}3x & 0 < y < x < 1 \\ 0 & \text{otherwise} \end{cases}\]

Joint support: lower triangular region \(0 < y < x < 1\) of the unit square.

Find \(E(XY)\).

\[E(XY) = \int_0^1\int_0^x xy \cdot 3x \, dy\,dx\]

\[= \int_0^1\int_0^x 3x^2 y \, dy\,dx = \int_0^1\left(\frac{3}{2}x^2y^2\big|_{y=0}^x\right)\,dx\]

\[= \int_0^1\frac{3}{2}x^4\,dx = \frac{3}{10}\]
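
A quick numeric check of this double integral in base R, integrating over \(y\) first and then \(x\):

inner <- function(x) sapply(x, function(xx)
  integrate(function(y) xx * y * 3 * xx, 0, xx)$value)  # inner integral over y
integrate(inner, 0, 1)$value                            # approx. 0.3 = 3/10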

Example: Independent uniforms

  • Suppose \(X\) and \(Y\) are independent \(UNIF(1,2)\) random variables.
  • Find \(E(X/Y)\).

Approach 1, directly using joint:

\[f(x,y) = f_X(x)f_Y(y) = \begin{cases}1 & 1 \le x \le 2,\ 1 \le y \le 2 \\ 0 & \text{otherwise} \end{cases}\]

\[E\left(\frac{X}{Y}\right) = \int_1^2 \int_1^2 \frac{x}{y} \cdot 1 \, dx\,dy\]

\[= \int_1^2 \left(\frac{x^2}{2y}\big|_{x=1}^2\right)\,dy = \int_1^2\frac{3}{2y}\,dy \]

\[= \frac{3}{2}\ln(y)\big|_{y=1}^2 = \frac{3}{2}\ln(2)\]

Approach 2, use Properties 3-4:

\[E\left(\frac{X}{Y}\right) = E(X)E\left(\frac{1}{Y}\right)\]

\[E\left(\frac{1}{Y}\right) = \int_1^2 \frac{1}{y} \cdot 1 \,dy = \ln(y)\big|_1^2 = \ln(2)\]

\[E(X) = \frac{1+2}{2} = \frac{3}{2},\] using the mean of a uniform distribution.

\(\Rightarrow E\left(\frac{X}{Y}\right) = \frac{3}{2}\ln(2)\)
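
A quick Monte Carlo sketch confirming both approaches:

set.seed(1)
n <- 1e6
x <- runif(n, 1, 2)   # X ~ UNIF(1,2)
y <- runif(n, 1, 2)   # Y ~ UNIF(1,2), independent of X
mean(x / y)           # approx. (3/2)*log(2) = 1.0397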

Covariance

Covariance measures how much \(X\) and \(Y\) vary together, and is defined as an expectation of a function.

\[Cov(X,Y) = E\left[(X-E(X))(Y-E(Y))\right] = E\left[(X-\mu_X)(Y-\mu_Y)\right],\]

where \(E(X) = \mu_X\), \(E(Y) = \mu_Y\)

Covariance visualized

Visual of positive, negative, and zero covariance.

Covariance shortcut

\[Cov(X,Y) = E(XY)-E(X)E(Y) = E(XY)-\mu_X\mu_Y\]

Proof:

\[Cov(X,Y) = E\left[(X-\mu_X)(Y-\mu_Y)\right] = E\left[XY-X\mu_Y-Y\mu_X + \mu_X \mu_Y\right]\]

\[ = E(XY)-E(X\mu_Y)-E(Y\mu_X) + E(\mu_X \mu_Y)\]

\[ = E(XY)-\mu_YE(X)-\mu_XE(Y) + \mu_X \mu_Y\]

\[ = E(XY)-\mu_Y\mu_X-\mu_X\mu_Y + \mu_X \mu_Y = E(XY)-\mu_X\mu_Y\]
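
The shortcut also mirrors how a plug-in covariance is computed from data. A small simulation sketch (R's built-in cov divides by \(n-1\), while the plug-in shortcut divides by \(n\), so we rescale for an exact match):

set.seed(1)
n <- 1000
x <- rnorm(n)
y <- x + rnorm(n)
mean(x * y) - mean(x) * mean(y)  # plug-in shortcut estimate
cov(x, y) * (n - 1) / n          # matches exactly after rescaling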

Independence \(\Rightarrow Cov(X,Y) = 0\)

  • If \(X\) and \(Y\) are independent, then \(Cov(X,Y) = 0\).

Proof:

\[Cov(X,Y) = E(XY)-E(X)E(Y) = E(X)E(Y)-E(X)E(Y) = 0\]

  • Important: the converse is not necessarily true: \(Cov(X,Y)=0 \nRightarrow\) \(X\) and \(Y\) are independent. For example, if \(X \sim N(0,1)\) and \(Y = X^2\), then \(Cov(X,Y) = E(X^3) - E(X)E(X^2) = 0\), yet \(Y\) is completely determined by \(X\).

Visual of uncorrelated but not independent data
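
A simulation sketch of the classic counterexample, \(Y = X^2\) with \(X\) symmetric about zero:

set.seed(1)
x <- rnorm(1e5)
y <- x^2            # Y is a deterministic function of X
cov(x, y)           # near 0: uncorrelated...
mean(y[x > 1] > 1)  # ...but exactly 1: knowing X pins down Y, so not independent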

Motivating correlation

  • Consider the following covariances computed from two data sets, df1 and df2.
head(df1)
          X         Y
1  1.660539  11.66731
2  2.191474 -71.87120
3  8.046897  71.85071
4 17.885137 -24.60673
5 -1.271061 -33.89970
6 -5.033030 -22.13571
with(df1, cov(X,Y))
[1] 11.72315
head(df2)
           X           Y
1  0.4069442  0.40370078
2 -0.1702315  0.04259686
3  0.1514382  0.33582950
4  0.5435059  0.28636860
5 -0.2041371 -0.08561012
6  0.8290332  0.72405766
with(df2, cov(X,Y))
[1] 0.1098948
  • Which data set exhibits the stronger relationship between \(X\) and \(Y\)?

Plotting the two data sets

Two scatterplots, one showing a weak relationship and one showing a strong relationship.
  • Which data set shows the stronger relationship?
  • What differences do you notice between these two data sets?

Correlation defined

  • Correlation, \(\rho\), is defined as:

\[\rho = Cor(X,Y) = \frac{Cov(X,Y)}{\sqrt{Var(X)Var(Y)}} = \frac{Cov(X,Y)}{\sigma_X\sigma_Y}\]

  • Given data pairs \((x_1,y_1),\ldots,(x_n,y_n)\), the correlation is estimated as:

\[\hat{\rho} = \frac{\frac{1}{n-1}\sum_{i=1}^n(x_i-\bar x)(y_i - \bar y)}{\sqrt{\frac{1}{n-1}\sum_{i=1}^n(x_i-\bar x)^2\frac{1}{n-1}\sum_{i=1}^n(y_i-\bar y)^2}}\]

  • Facts:
    • \(-1 \le \rho \le 1\)
    • \(|\rho| = 1\) if and only if \(Y = aX + b\) for some constants \(a \ne 0\) and \(b\), with the sign of \(\rho\) matching the sign of \(a\) (see the sketch below)
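
A minimal sketch of that second fact: exact linear relationships give \(\rho = \pm 1\), with the sign set by the slope.

set.seed(1)
x <- runif(100)
cor(x,  2 * x + 3)   #  1: exact increasing linear relationship
cor(x, -2 * x + 3)   # -1: exact decreasing linear relationship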

Two data sets revisited

with(df1, cor(X,Y))
[1] 0.02487709
with(df2, cor(X,Y))
[1] 0.9232244

Finding covariance example

\(X\) and \(Y\) are jointly continuous random variables with joint pdf:

\[f(x, y) = \begin{cases} 6(1-y), & 0 \le x \le y \le1 \\ 0, & \text{otherwise} \end{cases}\]

Marginals:

  • \(X\sim BETA(1,3)\)
  • \(Y\sim BETA(2,2)\)

Joint support: upper triangular region \(0 \le x \le y \le 1\) of the unit square.

Find \(Cov(X,Y)\).

  • \(E(XY) = \int_0^1 \int_0^y xy\cdot 6(1-y) \, dx\, dy = \frac{3}{20}\)
  • \(E(X) = \frac{1}{4}\)
  • \(E(Y) = \frac{2}{4} = \frac{1}{2}\)
  • \(\Rightarrow Cov(X,Y) = \frac{3}{20}-\frac{1}{4}\cdot\frac{1}{2} = 0.025\)
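
As a numeric sanity check, nested integrate calls reproduce \(E(XY)\) over the triangular support, and the shortcut then gives the covariance:

f <- function(x, y) 6 * (1 - y)  # joint pdf on 0 <= x <= y <= 1
Exy <- integrate(function(y) sapply(y, function(yy)
  integrate(function(x) x * yy * f(x, yy), 0, yy)$value), 0, 1)$value
Exy                              # approx. 0.15 = 3/20
Exy - 1/4 * 1/2                  # Cov(X,Y): approx. 0.025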

Finding correlation example

\(X\) and \(Y\) are jointly continuous random variables with joint pdf:

\[f(x, y) = \begin{cases} 6(1-y), & 0 \le x \le y \le1 \\ 0, & \text{otherwise} \end{cases}\]

Marginals:

  • \(X\sim BETA(1,3)\)
  • \(Y\sim BETA(2,2)\)

Joint support: upper triangular region \(0 \le x \le y \le 1\) of the unit square.

Find \(Cor(X,Y)\).

  • \(Cov(X,Y) = 0.025\)
  • \(\sigma_X = \sqrt{\frac{1\cdot 3}{(1+3)^2(1+3+1)}} = \sqrt{\frac{3}{80}} = 0.194\)
  • \(\sigma_Y = \sqrt{\frac{2\cdot 2}{(2+2)^2(2+2+1)}} = \sqrt{\frac{4}{80}} = 0.224\)
  • \(\rho = \frac{0.025}{\sqrt{3/80}\,\sqrt{4/80}} = \frac{1}{\sqrt{3}} \approx 0.577\)
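
Carrying the exact values through in R avoids rounding the standard deviations prematurely:

cov_xy <- 3/20 - 1/4 * 1/2                        # 0.025, from the previous slide
sd_x <- sqrt(1 * 3 / ((1 + 3)^2 * (1 + 3 + 1)))   # Beta(1,3) sd = sqrt(3/80)
sd_y <- sqrt(2 * 2 / ((2 + 2)^2 * (2 + 2 + 1)))   # Beta(2,2) sd = sqrt(4/80)
cov_xy / (sd_x * sd_y)                            # 0.5774 = 1/sqrt(3)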