5.6 - Variance of linear combinations and conditional expectation

Linear combinations

  • A linear combination of random variables \(Y_1\), \(Y_2\),…,\(Y_n\) is simply a weighted sum:

\[U = w_1Y_1 + w_2Y_2 + \cdots + w_n Y_n = \sum_{i=1}^n w_i Y_i\]

  • Especially important when we move into sampling-distribution thinking (see STAT 460)
  • If \(n=2\), we can simplify notation by writing \(X = Y_1\) and \(Y = Y_2\):

\[U = w_1X + w_2 Y\]

Mean and variance of linear combinations

  • Suppose \(Y_1\), \(Y_2\),…,\(Y_n\) are random variables with \(E(Y_i) = \mu_i\) and \(Var(Y_i) = \sigma^2_i\).
  • If \(U = \sum_{i=1}^n w_i Y_i\), then:

\[E(U) = \sum_{i=1}^n w_i \mu_i\]

\[Var(U) = \sum_{i=1}^n w_i^2 \sigma^2_i + 2 \sum_{1\le i < j \le n} w_iw_j Cov(Y_i, Y_j)\]

Proof of \(E(U)\)

\[E(U) = E\left(\sum_{i=1}^n w_i Y_i\right) =\sum_{i=1}^n E\left(w_i Y_i\right)\]

\[=\sum_{i=1}^n w_iE\left(Y_i\right)=\sum_{i=1}^n w_i\mu_i\]

Proof of \(Var(U)\)

\[Var(U) = E\left[(U - E(U))^2\right]\]

\[ = E\left[\left(\sum_{i=1}^n w_i Y_i-\sum_{i=1}^n w_i \mu_i\right)^2\right] = E\left[\left(\sum_{i=1}^n w_i (Y_i-\mu_i)\right)^2\right]\]

\[= E\left[\sum_{i=1}^n w_i^2 (Y_i-\mu_i)^2 + \sum_{i=1}^n \sum_{j=1, j\ne i}^nw_iw_j(Y_i-\mu_i)(Y_j-\mu_j)\right]\]

\[= \sum_{i=1}^n w_i^2E[ (Y_i-\mu_i)^2] + \sum_{i=1}^n \sum_{j=1, j\ne i}^nw_iw_jE[(Y_i-\mu_i)(Y_j-\mu_j)]\]

\[= \sum_{i=1}^n w_i^2\sigma^2_i + 2\sum_{1\le i<j\le n} w_iw_j Cov(Y_i, Y_j)\]

Simple example

  • Suppose \(X\) and \(Y\) are random variables with:
    • \(\sigma^2_X = 4\)
    • \(\sigma^2_Y = 6\)
    • \(Cov(X,Y) = 1\)
  • Find \(Var(3X-4Y)\).

\[Var(3X - 4Y) = 3^2 \sigma^2_X + (-4)^2 \sigma^2_Y + 2(3)(-4)Cov(X,Y)\]

\[=9\cdot 4 +16\cdot 6 - 24\cdot 1 = 108\]
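A quick numerical check, not in the original notes: the given moments do not pin down a full joint distribution, so as an assumption we simulate from a bivariate normal with the stated variances and covariance (any joint distribution with these second moments gives the same answer) and confirm the sample variance of \(3X-4Y\) is near 108.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed joint distribution: bivariate normal with Var(X)=4, Var(Y)=6, Cov(X,Y)=1
cov = np.array([[4.0, 1.0],
                [1.0, 6.0]])
xy = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=1_000_000)

u = 3 * xy[:, 0] - 4 * xy[:, 1]
print(u.var())  # ~108 = 9*4 + 16*6 - 24*1
```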

Example: independent exponentials

Suppose \(X\) and \(Y\) are independent \(EXP(\lambda)\) random variables. Define \(U = \frac{X+Y}{2}\) as the mean of the two random variables. Find \(E(U)\) and \(Var(U)\).

\[E(U) = E\left(\frac{1}{2}X + \frac{1}{2}Y\right)=\frac{1}{2}E(X) + \frac{1}{2}E(Y) = \frac{1}{2}\frac{1}{\lambda}+\frac{1}{2}\frac{1}{\lambda} = \frac{1}{\lambda}\]

\[Var(U) = Var\left(\frac{1}{2}X + \frac{1}{2}Y\right)=\frac{1}{4}Var(X) + \frac{1}{4}Var(Y) + 2\cdot \frac{1}{2}\frac{1}{2}Cov(X,Y)\]

\[= \frac{1}{4}\frac{1}{\lambda^2}+\frac{1}{4}\frac{1}{\lambda^2} +\frac{1}{2}\cdot 0 = \frac{1}{2\lambda^2}\]
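As a sanity check (a minimal sketch, with \(\lambda = 2\) chosen arbitrarily), we can simulate independent exponentials and compare the sample mean and variance of \(U\) to \(1/\lambda\) and \(1/(2\lambda^2)\):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0  # arbitrary rate; numpy's exponential takes scale = 1/lambda

x = rng.exponential(scale=1 / lam, size=1_000_000)
y = rng.exponential(scale=1 / lam, size=1_000_000)
u = (x + y) / 2

print(u.mean())  # ~1/lambda = 0.5
print(u.var())   # ~1/(2*lambda^2) = 0.125
```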

Actuary example

If:

  • \(\rho(X,Y) = \frac{1}{3}\)
  • \(\sigma^2_X = a\)
  • \(\sigma^2_Y = 4a\)
  • \(Z = 3X-4Y\), and \(\sigma^2_Z = 114\)

Find \(a\).

\[114=\sigma^2_Z = Var(3X-4Y) = 3^2\sigma^2_X+(-4)^2 \sigma^2_Y + 2(3)(-4)Cov(X,Y)\] \[= 9a+64a-24Cov(X,Y) = 73a - 24Cov(X,Y)\]

\[\frac{1}{3}= \rho(X,Y)= \frac{Cov(X,Y)}{\sigma_X\sigma_Y} = \frac{Cov(X,Y)}{2a} \implies \frac{2a}{3} = Cov(X,Y)\]

\[114 = 73a - 24\cdot\frac{2a}{3} = 73a - 16a = 57a \implies a = 2\]
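A minimal arithmetic check of the solution, plugging \(a = 2\) back into the decomposition:

```python
a = 2.0
cov_xy = 2 * a / 3                       # rho * sd_X * sd_Y = (1/3) * sqrt(a) * 2*sqrt(a)
var_z = 9 * a + 16 * (4 * a) - 24 * cov_xy
print(var_z)                             # 114.0, matching sigma^2_Z
```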

Conditional expectation

  • Fundamental concept in regression: model mean of \(Y\) conditional on \(X\).
  • Formal definition:

\[E(g(Y)|X = x) = \int_{-\infty}^\infty g(y) f_{Y|X}(y|x) dy\]

if \(X\) and \(Y\) are jointly continuous, and

\[E(g(Y)|X = x) = \sum_{\text{all } y} g(y) p_{Y|X}(y|x)\]

if \(X\) and \(Y\) are jointly discrete.

Conditional variance

  • Conditional variance, in a regression setting, measures the variability of \(Y\) about the regression line, i.e., the variability of the residuals
  • Formally, just set \(g(Y) = Y^2\):

\[E(Y^2|X = x) = \int_{-\infty}^\infty y^2f_{Y|X}(y|x) dy\]

if \(X\) and \(Y\) are jointly continuous, and

\[E(Y^2|X = x) = \sum_{\text{all } y} y^2 p_{Y|X}(y|x)\]

if \(X\) and \(Y\) are jointly discrete.

  • Then \(Var(Y|X=x) = E(Y^2|X=x) - E(Y|X=x)^2\).

Simple conditional expectation example

\(X\) and \(Y\) are jointly continuous with joint pdf given by

\[f(x,y) = \begin{cases} 1/2 & 0 \le y \le x \le 2 \\ 0 & otherwise \\ \end{cases}\]

Find \(E(Y|X=1.5)\).

\[f_{X}(x) = \int_0^x \frac{1}{2} dy = \frac{x}{2}, 0 \le x \le 2\]

\[f_{Y|X}(y|x) = \frac{f(x,y)}{f_X(x)} = \frac{1/2}{x/2} = \frac{1}{x}, 0 \le y \le x\]

\[\Rightarrow Y|X=x \sim UNIF(0,x)\]

\[E(Y|X=1.5) = \int_0^{1.5}y\cdot \frac{1}{1.5} dy = 0.75\]

Or use uniform properties,

\[E(UNIF(0,1.5)) = \frac{1.5+0}{2} = 0.75\]
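The same answer also falls out of direct numerical integration of the conditional density (a sketch using scipy's quad):

```python
from scipy.integrate import quad

x = 1.5
# f_{Y|X}(y|x) = 1/x on [0, x], so E(Y | X = x) = integral of y/x over [0, x]
e_y, _ = quad(lambda y: y / x, 0, x)
print(e_y)  # 0.75
```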

Simple conditional variance example

\(X\) and \(Y\) are jointly continuous with joint pdf given by

\[f(x,y) = \begin{cases} 1/2 & 0 \le y \le x \le 2 \\ 0 & otherwise \\ \end{cases}\]

Find \(Var(Y|X=1.5)\).

\[f_{Y|X}(y|x) = \frac{f(x,y)}{f_X(x)} = \frac{1/2}{x/2} = \frac{1}{x}, 0 \le y \le x\]

\[\Rightarrow Y|X=x \sim UNIF(0,x)\]

\[E(Y^2|X=1.5) = \int_0^{1.5}y^2\cdot \frac{1}{1.5} dy = 0.75\]

\[ \Rightarrow Var(Y|X=1.5) = E(Y^2|X=1.5) - E(Y|X=1.5)^2 = 0.75 - 0.75^2 = 0.1875\]

Or use uniform properties,

\[Var(UNIF(0,1.5)) = \frac{(1.5-0)^2}{12} = 0.1875\]
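Or check by simulation: since \(Y|X=1.5 \sim UNIF(0,1.5)\), the sample variance of uniform draws should be near 0.1875 (a minimal sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.uniform(0, 1.5, size=1_000_000)  # Y | X = 1.5 ~ UNIF(0, 1.5)
print(y.var())                           # ~0.1875 = 1.5^2 / 12
```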

Marginal mean through iteration

  • Iteration provides a very useful way of finding marginal means by averaging conditional means.
  • Specifically:

\[E(g(Y)) = E_X\left[E_{Y|X}(g(Y)|X)\right]\]

  • The subscripts indicate which density the expectation is with respect to.
  • If \(g(Y) = Y\), this simplifies to:

\[E(Y) = E_X\left[E_{Y|X}(Y|X)\right]\]

  • Can save you a lot of work; always be on the lookout for it!

Proof: Marginal mean through iteration

  • We prove the jointly continuous case; for the jointly discrete case, replace \(\int\) with \(\sum\).

\[E(g(Y)) = E_X\left[E_{Y|X}(g(Y)|X)\right]\]

Proof:

\[E_X\left[E_{Y|X}(g(Y)|X)\right] = E_X\left[\int_{-\infty}^\infty g(y)\cdot f_{Y|X}(y|x)\,dy\right] \]

\[=\int_{-\infty}^\infty\left[\int_{-\infty}^\infty g(y)\cdot f_{Y|X}(y|x)\,dy\right]f_X(x)\,dx\]

\[=\int_{-\infty}^\infty\int_{-\infty}^\infty g(y)\cdot f_{Y|X}(y|x)f_X(x)\,dy\,dx\]

\[=\int_{-\infty}^\infty\int_{-\infty}^\infty g(y)\cdot f(x,y)\,dy\,dx = E(g(Y))\]

Example: Normal given uniform

  • Suppose \(X\sim UNIF(0,10)\), and \(Y|X=x\sim N(x,1)\):
[Figure: conditional and marginal distributions of \(Y\)]
  • Find \(E(Y)\) and \(Cov(X,Y)\).

\(E(Y)\): Two approaches

  • Brute force:

\[E(Y) = \int_{-\infty}^\infty \int_{-\infty}^\infty y\cdot f_{Y|X}(y|x)\cdot f_X(x) \,dy\,dx\]

\[=\int_0^{10}\int_{-\infty}^\infty y\cdot \frac{1}{\sqrt{2\pi}}e^{-(y-x)^2/2}\cdot \frac{1}{10} \,dy\,dx\]

  • Iteration:

\[E(Y) = E_X(E_{Y|X}(Y|X)) = E_X(X) = \frac{10}{2} = 5\]

\(Cov(X,Y)\): Two approaches

  • Brute force:

\[E(XY) = \int_{-\infty}^\infty \int_{-\infty}^\infty xy\cdot f_{Y|X}(y|x)\cdot f_X(x) \,dy\,dx\]

\[=\int_0^{10}\int_{-\infty}^\infty xy\cdot \frac{1}{\sqrt{2\pi}}e^{-(y-x)^2/2}\cdot \frac{1}{10} \,dy\,dx\]

  • Iteration:

\[E(XY) = E_X(E_{Y|X}(XY|X)) = E_X(X\cdot E_{Y|X}(Y|X))=E_X(X^2) \]

\[= Var(X) + E_X(X)^2 = \frac{(10-0)^2}{12}+5^2 \approx 33.33\]

\[\Rightarrow Cov(X,Y) = E(XY)-E(X)E(Y) \approx 33.33 - 5\cdot 5 = 8.33\]
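Both answers are easy to check by simulation (a sketch; the sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.uniform(0, 10, size=n)      # X ~ UNIF(0, 10)
y = rng.normal(loc=x, scale=1.0)    # Y | X = x ~ N(x, 1)

print(y.mean())            # ~5
print(np.cov(x, y)[0, 1])  # ~8.33
```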

ANOVA-like example

  • Suppose \(X \in\{1,2,3\}\)
    • \(p_X(1) = 0.6\)
    • \(p_X(2) = 0.2\)
    • \(p_X(3) =0.2\)
  • \(Y|X=x \sim N(x^2, 1)\)
  • Find \(E(Y)\).
  • Brute force?
    • \(X\) is discrete, \(Y\) is continuous - there is no joint pmf or pdf!
    • Marginal of \(Y\) is multimodal - no well-known form there!

[Figure: conditional and marginal distributions of \(Y\)]

Solving ANOVA-like example

  • Iteration:
\(x\)           \(1\)     \(2\)     \(3\)
\(E(Y|X=x)\)    \(1^2\)   \(2^2\)   \(3^2\)
\(p_X(x)\)      0.6     0.2     0.2

\[E(Y) = E_X(E_{Y|X}(Y|X)) = E_X(X^2) = 1^2\cdot 0.6 + 2^2\cdot 0.2 + 3^2 \cdot 0.2 = 3.2\]
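A quick simulation of this discrete-continuous mixture (a sketch) confirms the iterated mean:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.choice([1, 2, 3], size=1_000_000, p=[0.6, 0.2, 0.2])
y = rng.normal(loc=x**2, scale=1.0)  # Y | X = x ~ N(x^2, 1)
print(y.mean())                      # ~3.2
```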

Marginal variance through iteration

  • We can also find marginal variance using conditional means and variances in a similar way:

\[Var(Y) = E_X(Var_{Y|X}(Y|X)) + Var_X(E_{Y|X}(Y|X))\] \[ \mbox{Total variance} = \mbox{Average conditional variance + Variability of conditional means}\]

Proof:

\[Var(Y) = E(Y^2) - E(Y)^2 = E_X(E_{Y|X}(Y^2|X)) - \left[E_X(E_{Y|X}(Y|X))\right]^2\]

\[ = E_X\left(Var_{Y|X}(Y|X) + E_{Y|X}(Y|X)^2\right) - \left[E_X(E_{Y|X}(Y|X))\right]^2\]

\[ = E_X\left(Var_{Y|X}(Y|X) \right)+ E_X\left(E_{Y|X}(Y|X)^2\right) - \left[E_X(E_{Y|X}(Y|X))\right]^2\]

\[ = E_X(Var_{Y|X}(Y|X)) + Var_X\left(E_{Y|X}(Y|X)\right)\]

Revisiting ANOVA-like example

  • Suppose \(X \in\{1,2,3\}\)
    • \(p_X(1) = 0.6\)
    • \(p_X(2) = 0.2\)
    • \(p_X(3) =0.2\)
  • \(Y|X=x \sim N(x^2, 1)\)

\[Var(Y) = E_X(Var_{Y|X}(Y|X)) + Var_X(E_{Y|X}(Y|X))\]

\[= E_X(1) + Var_X(X^2)\]

\[Var_X(X^2) = E(X^4) - E(X^2)^2\]

\[\small = (1^4 \cdot 0.6 + 2^4 \cdot 0.2 + 3^4 \cdot 0.2)- (1^2 \cdot 0.6 + 2^2 \cdot 0.2 + 3^2 \cdot 0.2)^2 = 9.76\]

\[\Rightarrow Var(Y) = 10.76\]

[Figure: conditional and marginal distributions of \(Y\)]

\[SSTotal = SSWithin + SSBetween\]
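The decomposition itself can be verified empirically: group the simulated \(Y\) values by \(X\) and compare the weighted within-group variance plus the variability of the group means to the overall variance (a sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.choice([1, 2, 3], size=1_000_000, p=[0.6, 0.2, 0.2])
y = rng.normal(loc=x**2, scale=1.0)  # Y | X = x ~ N(x^2, 1)

# E_X(Var(Y|X)): within-group variances, weighted by p_X
within = sum((x == k).mean() * y[x == k].var() for k in (1, 2, 3))
# Var_X(E(Y|X)): variability of the group means about the overall mean
between = sum((x == k).mean() * (y[x == k].mean() - y.mean()) ** 2 for k in (1, 2, 3))

print(within + between, y.var())  # both ~10.76
```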

Revisiting regression-like example

  • \(X \sim UNIF(0,10)\)
  • \(Y|X=x \sim N(x, 1)\)

\[Var(Y) = E_X(Var_{Y|X}(Y|X)) + Var_X(E_{Y|X}(Y|X))\]

\[ \small= \mbox{Average squared residual + variance of fitted values}\]

\[= E_X(1) + Var_X(X)\]

\[Var_X(X) = \frac{(10-0)^2}{12}= 8.33\]

\[\Rightarrow Var(Y) = 9.33\]

[Figure: conditional and marginal distributions of \(Y\)]

\[SSTotal = SSError + SSModel\]
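The same simulation check works here (a sketch): the overall variance should land near \(1 + 100/12\).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=1_000_000)
y = rng.normal(loc=x, scale=1.0)  # Y | X = x ~ N(x, 1)
print(y.var())                    # ~9.33 = 1 + 100/12
```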

Regression with constant variance violated

  • \(X \sim UNIF(0,10)\)
  • \(Y|X=x \sim N(x, x^2)\)

\[Var(Y) = E_X(Var_{Y|X}(Y|X)) + Var_X(E_{Y|X}(Y|X))\]

\[= E_X(X^2) + Var_X(X)\]

\[ = Var_X(X) + E_X(X)^2 + Var_X(X)\]

\[= 8.33 + 5^2 + 8.33 \approx 41.67\]

  • Note the biggest contribution to total variance now comes from the large average squared residual part of the decomposition

[Figure: conditional and marginal distributions of \(Y\)]

\[SSTotal = SSError + SSModel\]
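And a simulation sketch for the heteroskedastic case (note that numpy's normal takes the standard deviation, so scale = x here):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=1_000_000)
y = rng.normal(loc=x, scale=x)  # Y | X = x ~ N(x, x^2), i.e., sd = x
print(y.var())                  # ~41.67 = E(X^2) + Var(X)
```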