The name already seems familiar, recalling the Moment Generating
Functions from the probability course. We define the Probability
Generating Function, or PGF, in a similar way. For a discrete random
variable
\(X\) with support \(0,1,2, \ldots\), we have the PGF as: \[
G_X(s) = E(s^X).
We use \(G\) instead of \(P\) because, well, we already use \(P\) for probabilities! Also note that we
have a book-keeping variable \(s\). We
can expand this expectation using the ‘Law of the
Unconscious Statistician’ (LOTUS) (see Lecture 1): \[
G_X(s) = E(s^X) =\sum_{x=0}^\infty P(X = x)s^x
= P(X = 0)s^0 + P(X = 1)s^1 + P(X = 2)s^2 + \ldots
\] This expansion reveals where the name of the PGF comes from;
specifically, the coefficient on \(s^n\) is just \(P(X=n)\) (since each term in this expansion
is \(P(X=n)s^n\) for \(n = 0,1,2, \ldots\)). This is the
‘Probability’ part in Probability Generating Function: using the PGF, we can
easily generate the probabilities that \(X\) takes on
different values.
First, if we just plug 0 into our PGF, we can easily get the probability that \(X\) takes on the value 0: \[ G_X(0) = \sum_{x=0}^\infty P(X=x)0^x = P(X = 0)0^0 + P(X = 1)0^1 + \ldots = P(X = 0), \] since all of the \(0^x\) terms are 0 except for \(0^0\), which is 1. Therefore, we have, as discussed: \[ G_X(0)=P(X=0). \] This can also be a nice check to make sure that we have a valid PGF: plug in 0, and you should get \(P(X=0)\).
How about generating the probability that \(X\) takes on the value 1? Returning to our expansion of the PGF \[ G_X(s) = P(X=0)s^0 + P(X = 1)s^1 + P(X = 2)s^2 + \ldots \] and taking the first derivative w.r.t. \(s\), we have \[ G'_X(s) = P(X=1) + 2s\cdot P(X = 2) + 3s^2\cdot P(X=3)+ \ldots \] If we plug in 0 for \(s\), we are left with \[ G'_X(0)=P(X=1). \] Continuing with this pattern, if we take another derivative of our PGF expansion, we get \[ G''_X(s) = 2P(X=2) + 6s\cdot P(X=3)+ \ldots \] Plugging in 0 for \(s\) here, we get \[ \frac{G''_X(0)}{2}=P(X=2). \] In general (differentiating further), we see that \[ \frac{G^{(n)}_X(0)}{n!}=P(X=n), \] where the \((n)\) in the superscript simply means the \(n^{th}\) derivative.
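To make this concrete, here is a minimal sketch (an addition of ours, not part of the original notes) using Python’s sympy library: the PGF below is made up purely for illustration, with \(P(X=0)=1/2\), \(P(X=1)=3/10\), and \(P(X=2)=1/5\), and we recover these probabilities with the formula above.

```python
# Minimal sketch: recovering probabilities from a PGF via derivatives.
# The PGF below is made up for illustration: P(X=0)=1/2, P(X=1)=3/10, P(X=2)=1/5.
import sympy as sp

s = sp.symbols('s')
G = sp.Rational(1, 2) + sp.Rational(3, 10) * s + sp.Rational(1, 5) * s**2

for n in range(3):
    # G^(n)(0) / n! should recover P(X = n)
    prob = sp.diff(G, s, n).subs(s, 0) / sp.factorial(n)
    print(n, prob)   # prints: 0 1/2, then 1 3/10, then 2 1/5
```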
Fantastic! We have an easy way to calculate the probabilities of a random variable using its PGF (and now understand its name!). One final useful property comes into play when we plug 1 into our PGF: \[ G_X(1)=P(X=0)1^0 + P(X=1)1^1 + P(X=2)1^2+ \ldots \\ = P(X = 0) + P(X = 1) + P(X = 2)+ \ldots =1, \]
because all of the \(s^x\) terms simply become 1 and the probabilities must sum to 1. Thus, we have that \(G_X(1)=1\) (if you plug in 1 for \(s\) and don’t get 1 back, it’s not a valid PGF!).
Now that we are familiar with some properties, let’s actually calculate a PGF to make things concrete. The simplest example is finding the PGF of \(X\sim Bern(p)\). We obtain \[ G_X(s)=E\left( s^X\right)=P(X=0)s^0+P(X=1)s^1\\ =(1-p)+p\cdot s. \]
That was easy! Let’s check that our two conditions, \(G_X(0)=P(X=0)\) and \(G_X(1)=1\), are satisfied. Indeed, \[ G_X(0)=(1-p)+p\cdot 0=(1-p)=P(X=0), \qquad G_X(1)=(1-p)+p=1. \]
The conditions check out! At this point you may want to watch the first video in this folder.
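If you like to double-check things symbolically, here is a short sketch (again our own addition, assuming the sympy library is available) that reproduces both checks and also recovers \(P(X=1)=p\) with the derivative trick from above.

```python
# Sketch: symbolic checks on the Bern(p) PGF, G(s) = (1 - p) + p*s.
import sympy as sp

s, p = sp.symbols('s p')
G = (1 - p) + p * s

print(G.subs(s, 0))                # 1 - p, i.e. P(X = 0)
print(sp.simplify(G.subs(s, 1)))   # 1, so the PGF is valid
print(sp.diff(G, s).subs(s, 0))    # p, i.e. P(X = 1) via G'(0)
```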
Let’s now try to calculate the PGF of a slightly more complicated distribution: \(X\sim Binom(n,p)\). We could do the expansion above and perform a brute force calculation, or we could take this opportunity to learn about another useful property of PGFs.
Let \(Y\) and \(Z\) be independent random variables. Now let \(W=Y+Z\). What is the PGF of \(W\)? Let’s start by writing out the PGF of \(W\) and plugging in \(Z+Y\): \[ G_W (s)=E(s^W )=E(s^{Z+Y})=E(s^Zs^Y). \]
Now, because \(Y\) and \(Z\) are independent, we can write: \[ E(s^Zs^Y)=E(s^Z)E(s^Y). \]
If you’re confused about this part, remember that we can expand \(E(s^Zs^Y)\) as follows: \[ E(s^Zs^Y ) = \sum_{z=0}^\infty\sum_{y=0}^\infty s^zs^y f(z, y), \] where \(f(z,y)\) is the joint mass function of \(Z\) and \(Y\). Since \(Z\) and \(Y\) are independent, the joint mass function is a product of the individual mass functions: \(f(z,y)=f(z)f(y)\). We can plug in and factor out: \[ = \sum_{z=0}^\infty\sum_{y=0}^\infty s^zs^yf(z)f(y)= \sum_{z=0}^\infty s^zf(z)\ \sum_{y=0}^\infty s^yf(y)\\ = E(s^Z)E(s^Y). \] Anyways, we are left with \(E(s^Zs^Y ) = E(s^Z)E(s^Y )\), which says that the PGF of a sum of independent random variables is the product of the individual PGFs. This is something we can use for \(X\sim Binom(n,p)\), which is just a sum of \(n\) i.i.d. \(Bern(p)\) random variables, a distribution we already have the PGF for! We thus have: \[ G_X(s)=(1-p+p\cdot s)^n. \] Checking our conditions: \[ G_X(0)=(1-p)^n=P(X=0)\\ G_X(1)=(1-p+p)^n=1^n=1. \] It checks out!
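As a quick sanity check (a sketch of ours, with \(n=4\) and \(p=3/10\) chosen arbitrarily), we can expand \((1-p+p\cdot s)^n\) with sympy and confirm that the coefficient on \(s^k\) is the familiar \(Binom(n,p)\) probability \(\binom{n}{k}p^k(1-p)^{n-k}\).

```python
# Sketch: the coefficients of (1 - p + p*s)^n match the Binom(n, p) pmf.
import sympy as sp

s = sp.symbols('s')
n, p = 4, sp.Rational(3, 10)       # arbitrary small example
G = ((1 - p) + p * s)**n

expanded = sp.expand(G)
for k in range(n + 1):
    coeff = expanded.coeff(s, k)                          # coefficient on s^k
    pmf = sp.binomial(n, k) * p**k * (1 - p)**(n - k)     # Binom(n, p) pmf
    print(k, coeff == pmf)                                # True for every k
```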
Exercise 5 Let \(X\) be uniformly distributed on \(\{0,1,2\}\). Find the PGF \(G_X(s)\) of \(X\).
Exercise 6 Suppose that the random variable \(X\) has a probability mass function given by \[ p_j=\frac{1}{2^{j+1}}, \qquad j=0,1,2,\ldots. \] Find the PGF \(G_X(s)\) of \(X\).
Exercise 7 Let \(X\) be a Poisson random variable, i.e., for \(\lambda>0\) \[ P(X=k)=\frac{\lambda^k}{k!}e^{-\lambda}, \qquad k=0,1,2,\ldots \]
Find the PGF of \(X\).
Assume that \(Y\) is another Poisson random variable with parameter \(\mu\). If \(X\) and \(Y\) are independent, find the PGF of \(X+Y\).
What is the distribution of \(X+Y\)?
Let’s turn to how the PGF can actually generate things other than probabilities. We will make use of derivatives. Specifically, we can calculate the first derivative w.r.t. \(s\): \[ G'_X(s)=E\left( X\cdot s^{X-1}\right). \] If we plug in 1 for \(s\), we get something familiar: \[ G'_X(1)=E(X). \] Not bad; so taking the first derivative of the PGF and plugging in 1 generates the expectation of the random variable!
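For example (a quick check of ours, not from the original notes), applying this to the Binomial PGF we derived above gives \[ G'_X(s)=np(1-p+p\cdot s)^{n-1}, \qquad G'_X(1)=np(1-p+p)^{n-1}=np, \] which agrees with the familiar mean of a \(Binom(n,p)\) random variable.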
Let’s try taking another derivative: \[ G''_X(s)=E(X(X-1)s^{X-2}). \] If we plug 1 in for \(s\) in this function, we are left with: \[ G''_X(1)=E(X^2-X)=E(X^2)-E(X). \] That is, the second moment minus the first moment! We thus have, combining these two results: \[ E(X^2)=G''_X(1)+G'_X(1). \] Finally, you can check out this type of example in Video 2.
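As a concrete check (again an addition of ours), take \(X\sim Bern(p)\) with \(G_X(s)=(1-p)+p\cdot s\): here \(G'_X(1)=p\) and \(G''_X(s)=0\), so \(E(X^2)=0+p=p\), which is correct since \(X^2=X\) for a Bernoulli random variable. Combining this with \(Var(X)=E(X^2)-E(X)^2\) recovers the familiar \(Var(X)=p-p^2=p(1-p)\).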