6.3 - The MGF method

The MGF method

The MGF method for finding the distribution of \(U\) relies on moment generating functions
Approach: find \(M_U(t) = E(e^{tU})\), compare it to well-known MGFs
Especially useful for (but not limited to) finding distributions of sums of independent random variables
Limitation: only really applicable for moving between well-known forms

Example: adding two gammas

Suppose \(X\) and \(Y\) are independent \(GAM(\alpha, \lambda)\) random variables.
Use the MGF method to find distribution of \(U = X+Y\).
\(GAM(\alpha,\lambda)\) MGF:

\[M_X(t) = M_Y(t) = \left(\frac{\lambda}{\lambda-t}\right)^\alpha\]

\[M_U(t) = E(e^{tU}) = E(e^{t(X+Y)})\]

\[= E(e^{tX}e^{tY}) = E(e^{tX})E(e^{tY}) \mbox{ (by independence of X and Y)}\]

\[= M_X(t) M_Y(t) = \left(\frac{\lambda}{\lambda-t}\right)^\alpha \left(\frac{\lambda}{\lambda-t}\right)^\alpha =\left(\frac{\lambda}{\lambda-t}\right)^{2\alpha},\] which is the MGF of a \(GAM(2\alpha,\lambda)\) random variable! \(\therefore U \sim GAM(2\alpha,\lambda)\)

Verifying via simulation

library(tidyverse)
library(purrrfect)

N <- 10000


# Simulating over a grid:

(gam_sum_sim <- parameters(~alpha, ~lambda, 
           c(1,2,3), c(1, 1.5, 2)
           )  
  %>% add_trials(N)
  %>% mutate(Y = pmap_dbl(list(alpha,lambda), .f = \(a,l) rgamma(1, shape = a, rate =l)),
             X = pmap_dbl(list(alpha,lambda), .f = \(a,l) rgamma(1, shape = a, rate =l)))
  %>% mutate(U = X+Y)
  %>% mutate(fU = dgamma(U, shape = 2*alpha, rate = lambda))
) %>% head

# A tibble: 6 × 7
  alpha lambda .trial       Y       X       U      fU
  <dbl>  <dbl>  <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1     1      1      1 1.44    4.42    5.86    0.0167 
2     1      1      2 2.36    0.360   2.72    0.179  
3     1      1      3 0.123   0.912   1.03    0.368  
4     1      1      4 0.0585  0.323   0.381   0.260  
5     1      1      5 0.00153 0.00166 0.00319 0.00318
6     1      1      6 2.50    1.02    3.52    0.104

Plotting results

ggplot(data = gam_sum_sim) + 
    geom_histogram(aes(x = U,y=after_stat(density)), 
                   fill = 'goldenrod', bins = 40)+
    geom_line(aes(x = U, y = fU), col='cornflowerblue', size = 1) + 
    theme_classic(base_size = 16) + 
    facet_grid(alpha~lambda, scales = 'free_y', labeller = label_both)

simulation study results for the sum of two gammas

Example: adding two normals

Suppose \(X\) and \(Y\) are independent \(N(\mu, \sigma^2)\) random variables.
Use the MGF method to find distribution of \(U = X+Y\).
\(N(\mu,\sigma^2)\) MGF:

\[M_X(t) = M_Y(t) = e^{\mu t + \sigma^2 t^2 /2}\]

\[M_U(t) = E(e^{tU}) = E(e^{t(X+Y)})\]

\[= E(e^{tX}e^{tY}) = E(e^{tX})E(e^{tY}) \mbox{ (by independence of X and Y)}\]

\[= M_X(t) M_Y(t) = e^{\mu t + \sigma^2 t^2 /2}e^{\mu t + \sigma^2 t^2 /2} =e^{(2\mu) t + (2\sigma^2) t^2 /2},\]

which is the MGF of a \(N(2\mu, 2\sigma^2)\) random variable!

\[\therefore U \sim N(2\mu, 2\sigma^2)\]

Verifying via simulation

N <- 10000


# Simulating over a grid:

(norm_sum_sim <- parameters(~mu, ~sigma, 
           c(-1,0,1), c(.2, .4,.6)
           )  
  %>% add_trials(N)
  %>% mutate(Y = pmap_dbl(list(mu,sigma), .f = \(m,s) rnorm(1, mean = m, sd = s)),
             X = pmap_dbl(list(mu,sigma), .f = \(m,s) rnorm(1, mean = m, sd = s)))
  %>% mutate(U = X+Y)
  %>% mutate(fU = dnorm(U, mean = 2*mu, sd = sqrt(2)*sigma))
) %>% head

# A tibble: 6 × 7
     mu sigma .trial      Y      X     U    fU
  <dbl> <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl>
1    -1   0.2      1 -0.755 -0.656 -1.41 0.161
2    -1   0.2      2 -1.05  -0.926 -1.98 1.41 
3    -1   0.2      3 -0.767 -1.21  -1.98 1.41 
4    -1   0.2      4 -0.813 -0.862 -1.67 0.726
5    -1   0.2      5 -1.02  -1.09  -2.11 1.30 
6    -1   0.2      6 -1.08  -0.938 -2.02 1.41

Plotting results

ggplot(data = norm_sum_sim) + 
    geom_histogram(aes(x = U,y=after_stat(density)), 
                   fill = 'goldenrod', bins = 40)+
    geom_line(aes(x = U, y = fU), col='cornflowerblue', size = 1) + 
    theme_classic(base_size = 16) + 
    facet_grid(mu~sigma, scales = 'free_y', labeller = label_both)

Example: adding two uniforms

If \(X\) and \(Y\) are independent \(UNIF(a,b)\) random variables, is \(U=X+Y\) also uniform?
\(UNIF(a,b)\) MGF:

\[M_X(t) = M_Y(t) = \frac{e^{tb}-e^{ta}}{t(b-a)}\]

\[M_U(t) = E(e^{tU}) = E(e^{t(X+Y)})= E(e^{tX}e^{tY}) = E(e^{tX})E(e^{tY}) =M_X(t) M_Y(t)\]

\[=\frac{e^{tb}-e^{ta}}{t(b-a)}\frac{e^{tb}-e^{ta}}{t(b-a)} = \left(\frac{e^{tb}-e^{ta}}{t(b-a)}\right)^2\]

\(\therefore U \not \sim UNIF!\) (or any well-known form.) Need to use CDF method to find its distribution.

Verifying via simulation

library(patchwork)
sim_df <- data.frame(Y = runif(10000, 0, 1),
                     X = runif(10000, 0,1)) %>% 
          mutate(U = X+Y)

ggplot(data = sim_df) + 
    geom_point(aes(x = X, y = Y), alpha = 0.5)+
    theme_classic() + 


ggplot(data = sim_df) + 
    geom_histogram(aes(x = U,y=after_stat(density)), 
                   fill = 'cornflowerblue',
                   color ='black',
                   center = 0.02,  binwidth = 0.04)+
    geom_function(fun = \(u) case_when( 0 < u & u <= 1 ~ u,
                                        1 < u & u < 2 ~ 2-u,
                                        .default = 0),
                  linewidth = 1) +
    xlim(c(0,2)) + ylim(c(0,2.5))  + 
    labs(x = 'u', y = expression(f[U](u))) + 
    theme_classic()

Simulated densities of Y and U with analytic densities superimposed

Example: squaring a standard normal

Suppose \(Y\sim N(0,1)\)
Let \(U = Y^2\). Find the distribution of \(U\) using the MGF method.

\[M_U(t)= E(e^{tU}) = E\left(e^{tY^2}\right)\]

\[ = \int_{-\infty}^\infty e^{ty^2} \cdot \frac{1}{\sqrt{2\pi}}e^{-y^2/2}\, dy = \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi}}e^{-y^2(1-2t)/2}\, dy\]

\[ = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-y^2/\left(2\cdot \frac{1}{1-2t}\right)}\, dy = \frac{\sqrt{2\pi \cdot \frac{1}{1-2t}}}{\sqrt{2\pi}}\]

\[= \left(\frac{1}{1-2t}\right)^{1/2}=\left(\frac{1/2}{1/2-t}\right)^{1/2},\]

MGF of a \(GAM(\alpha = 1/2, \lambda = 1/2) \equiv \chi^2_1\)! Thus \(U\sim \chi^2_1\)

Verifying via simulation study

sim_df <- data.frame(Y = rnorm(10000, 0, 1)) %>% 
          mutate(U =Y^2)

ggplot(data = sim_df) + 
    geom_histogram(aes(x = Y,y=after_stat(density)), 
                   fill = 'goldenrod',
                   color ='black')+
    geom_function(fun = \(y) dnorm(y, 0, 1), 
                  linewidth = 1) + 
    labs(x = 'y', y = expression(f[Y](y))) + 
    theme_classic(base_size = 14)  +


ggplot(data = sim_df) + 
    geom_histogram(aes(x = U,y=after_stat(density)), 
                   fill = 'cornflowerblue',
                   color ='black')+
    geom_function(fun = \(u) dchisq(u, df = 1),
                  linewidth = 1) +
   xlim(c(0, 5)) + 
    labs(x = 'u', y = expression(f[U](u))) + 
    theme_classic(base_size = 14)