Sampling Distribution of Mean Estimator w/Exponential Input Samples

Statistical Inference

Johns Hopkins University

Author

David McCabe

Published

July 6, 2025

In this study I’ve drawn \(40\) input samples from an exponential distribution, \(X_1, X_2, \dots, X_{40} \sim \operatorname{exp}(\lambda=1)\), and put them through a mean estimator: \(\hat{\mu} = \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i\). This follows the same format as previous studies (McCabe, 2025a).

Simulation Results

The sum of \(\alpha\) independent exponential random variables follows a gamma distribution: \[ f_{\operatorname{sum}\,\alpha\,\operatorname{exp}(\lambda)}(x)=\frac{\lambda^\alpha}{\Gamma(\alpha)}x^{\alpha-1}e^{-\lambda x} \]

Similarly, the mean of \(n\) independent exponential random variables follows a scaled gamma distribution: if \(S_n\sim\operatorname{Gamma}(n,\lambda)\) then \(\bar{X}=S_n/n\) has density \(f_{\bar{X}}(x)=n\,f_{S_n}(nx)\) by the change-of-variables rule (see the appendix on the gamma/chi-squared correspondence), giving \[ \begin{align} f_{\operatorname{ave}\,n\,\operatorname{exp}(\lambda)}(x)=\frac{(n\lambda)^n}{\Gamma(n)}x^{n-1}e^{-n\lambda x} \qquad\text{...specifically...}\qquad f_{\operatorname{ave}\,n=40\,\operatorname{exp}(\lambda=1)}(x)=\frac{40^{40}}{39!}x^{39}e^{-40x} \end{align} \]
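As a quick sanity check of this density (a minimal sketch, not part of the study’s simulation below), the moments of simulated sample means can be compared against the analytic \(\operatorname{Gamma}(40, 40)\) values:

# Sketch: the mean of n = 40 iid exp(rate = 1) draws should follow
# Gamma(shape = 40, rate = 40), with mean 40/40 = 1 and variance 40/40^2 = 0.025
set.seed(1)
means <- replicate(1e5, mean(rexp(40, rate = 1)))
c(empirical = mean(means), analytic = 40/40)    # both ~ 1
c(empirical = var(means),  analytic = 40/40^2)  # both ~ 0.025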

Observations:

  • The sampling distribution closely aligns with the expected shape of the gamma distribution \(\Gamma_{\alpha=40,\lambda=40}\) (the sum’s \(\Gamma_{\alpha=40,\lambda=1}\) distribution rescaled by \(1/40\))
  • The distribution is right-skewed: it exhibits a long right tail, and the population mean lies to the right of the distribution’s peak. Consequently, the left endpoint of the confidence interval tends to be closer to the mean than the right endpoint.
  • Approximately \(95\%\) of the estimated confidence intervals appear to capture the true population mean (\(\mu\)), based on visual inspection.

Simulation Code

# Population parameters:
lambda <- 1                      # exponential distribution rate
mu <- 1 / lambda                 # population mean

# Simulation parameters:
N <- 40                          # samples per trial
nTrial <- 1000                   # number of trials 
alpha <- 0.05                    # significance level

# Trial result buffers:
vSampleMean <- numeric(nTrial)   # mean estimates
vLowerCI <- numeric(nTrial)      # lower CI estimates
vUpperCI <- numeric(nTrial)      # upper CI estimates
insideCI <- logical(nTrial)      # does true mean fall within CI

# Monte Carlo simulation:
for (i in 1:nTrial) {
  xVec <- rexp(N, rate = lambda) # generate set of samples
  vSampleMean[i] <- mean(xVec)   # estimate population mean
}

# Post processing:
deltaUpperCI <- qchisq(alpha/2, df = 2*N)       # lower quantile -> upper CI bound
deltaLowerCI <- qchisq(1 - alpha/2, df = 2*N)   # upper quantile -> lower CI bound
for (i in 1:nTrial) {
  vLowerCI[i] <- 2*N*vSampleMean[i]/deltaLowerCI
  vUpperCI[i] <- 2*N*vSampleMean[i]/deltaUpperCI
  insideCI[i] <- (mu>vLowerCI[i]) & (mu<vUpperCI[i])
}
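The coverage claim in the observations above can be made quantitative with a one-line summary (not part of the original listing):

# Empirical coverage: fraction of trials whose CI captured the true mean;
# should be close to 1 - alpha = 0.95
mean(insideCI)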

Appendices

Here’s the machinery used to find population parameters, derive closed-form expressions for the sampling distribution of the estimator, and construct the confidence interval using analytic methods. See McCabe (2025b) for the expanded version and Casella & Berger (2002) for the original proofs.

Appendix: Linearity of Expectation

For any random variables \(X\), \(Y\) and scalar coefficients \(a\), \(b\): \[ \mathbb{E}[aX + bY] = a\mathbb{E}[X] + b\mathbb{E}[Y] \]

Continuous case:

Assume \(X\) and \(Y\) are continuous random variables on a common sample space with joint probability density function \(\operatorname{pdf}(q)\), where \(x(q)\) and \(y(q)\) denote their values at outcome \(q\). Then:

\[ \begin{align*} \mathbb{E}[aX+bY] &= \int_{-\infty}^{\infty}\left(a\cdot x(q) + b\cdot y(q)\right)\operatorname{pdf}(q)\:dq\\ &= a\int_{-\infty}^{\infty} x(q)\operatorname{pdf}(q)\:dq + b\int_{-\infty}^{\infty} y(q) \operatorname{pdf}(q)\:dq\\ &= a\mathbb{E}[X]+b\mathbb{E}[Y] \end{align*} \]
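A quick Monte Carlo illustration of this identity (an illustrative sketch; the coefficients and rates are arbitrary):

# E[aX + bY] = a*E[X] + b*E[Y]: both sides agree up to Monte Carlo error
set.seed(2)
x <- rexp(1e6, rate = 1)   # E[X] = 1
y <- rexp(1e6, rate = 2)   # E[Y] = 0.5
a <- 3; b <- -2
c(lhs = mean(a*x + b*y), rhs = a*mean(x) + b*mean(y))  # both ~ 2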

Appendix: Exponential Distribution Population Mean

Mean (expected value): \(\boxed{\mathbb{E}[X] = \frac{1}{\lambda}}\) (McCabe, 2024a). The exponential distribution has infinite support with range \([0,\infty)\): \[ \begin{align} \mathbb{E}[X]&=\int_{-\infty}^\infty xf(x)\:dx =\int^\infty_0\lambda xe^{-\lambda x}\:dx\\ \\ &\boxed{\begin{aligned} &\text{integration by parts...}\\ &\:(uv)' = u'v + u v'\quad\therefore\quad\int u\cdot v'=u\cdot v-\int u'\cdot v\\\\ &\:\text{let } u=x\quad\text{and }v'=\lambda e^{-\lambda x}\\ &\;\therefore\quad u'=1\quad\text{and }v= -e^{-\lambda x}\\\\ \end{aligned}} \\ \mathbb{E}[X]&=\left[-xe^{-\lambda x}\right]_0^\infty -\int_0^\infty -e^{-\lambda x}\:dx\\ &=\left[0\right]+\left[\frac{e^{-\lambda x}}{-\lambda}\right]_0^\infty =\frac{1}{\lambda} \end{align} \]
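The boxed result can also be spot-checked numerically (an illustrative snippet; the rate \(\lambda=2\) is arbitrary):

# Check E[X] = 1/lambda by numerically integrating x*f(x) over the support
lambda <- 2
integrate(function(x) x * dexp(x, rate = lambda), lower = 0, upper = Inf)$value  # ~ 0.5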

Appendix: Sum of Exponential Random Variables

The sum of \(\alpha\) independent exponential random variables follows a gamma distribution.

The Gamma Distribution

The gamma distribution is a generalisation of the exponential distribution: it models the time until an event occurs \(\alpha\) times (McCabe, 2024a). Its pdf is expressed using the gamma function \(\Gamma(\alpha)\).

\[\boxed{f(x)=\frac{\lambda^\alpha}{\Gamma(\alpha)}x^{\alpha-1}e^{-\lambda x}}\]

Details

  • 1st, the “normalisation” constant \(\frac{\lambda^\alpha}{\Gamma(\alpha)}\) (written \(\frac{1}{\Gamma(\alpha)\theta^\alpha}\) in the scale parameterisation \(\theta=1/\lambda\))
  • 2nd, the “power” term \(x^{\alpha-1}\) accounts for more events requiring more time to accumulate
    • \(\alpha>1\): the function starts low and increases (rises at the beginning)
    • \(\alpha=1\): exponential distribution - the curve starts flat and just decays
    • \(\alpha<1\): the curve is asymptotic at \(x=0\), i.e. \(\lim_{x\to 0} f(x) = \infty\), and immediately drops
  • 3rd, the “decay” term \(e^{-\lambda x}\) (equivalently \(e^{-x/\theta}\)); \(\theta\) is the average time between events, and its role is to stretch the exponential decay in time

Let \(S_\alpha\) be the sum of \(\alpha\) independent exponential random variables (the PDF of \(S_\alpha\) is the gamma distribution): \[S_\alpha=X_1+X_2+\dots+X_\alpha\] The PDF of the sum of two independent non-negative random variables is the convolution of their individual PDFs: \[f_{S_2}(s)=\int_0^s f_X(x)f_X(s-x)\:dx\] For the sum of \(\alpha\) independent exponential random variables, we need to apply this convolution \(\alpha-1\) times.

So for two exponentials: \[ \begin{align} f_{S_2}(s)&=\int_0^s \lambda e^{-\lambda x}\cdot\lambda e^{-\lambda (s-x)}\:dx \\ &=\lambda^2e^{-\lambda s}\int_0^s\:dx \\ &=\lambda^2e^{-\lambda s}\cdot s \end{align}\] This is the PDF of the gamma distribution with shape parameter \(2\) and rate \(\lambda\).

For the general case of \(\alpha\) independent exponential random variables, the final result of performing the convolution \(\alpha-1\) times is:

\[f_{S_\alpha}(s)=\frac{\lambda^\alpha s^{\alpha-1}e^{-\lambda s}}{\left(\alpha-1\right)!}\]
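A simulation check of this closed form (a sketch; the shape and rate values are arbitrary):

# Sum of alpha = 5 iid exp(rate = 2) draws vs Gamma(shape = 5, rate = 2)
set.seed(3)
s <- replicate(1e5, sum(rexp(5, rate = 2)))
ks.test(s, pgamma, shape = 5, rate = 2)  # large p-value: consistent with Gamma(5, 2)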

Gamma Function

The Gamma Function \(\Gamma(z)\), where \(z\in\mathbb{C}\) and \(\text{Re}(z)>0\), is defined as: \[\boxed{\Gamma(z)=\int_0^\infty t^{z-1}e^{-t}\:dt}\]

The gamma function generalises the factorial function to non-integer and complex values:

  • It works for non-integer real and complex values.
  • It’s undefined at non-positive integers (it has poles).
  • It’s analytic (differentiable everywhere) except at its poles.

Properties

Considering the following three results, we can see by induction that for natural numbers (a quick spot-check in R follows the list): \[\boxed{\Gamma(n)=(n-1)!}\qquad\text{where}\quad n\in\mathbb{N}\]

  1. \(\boxed{0!=1}\) …definition of zero factorial \[0!\equiv 1\qquad\text{why not?}\]
  2. \(\boxed{\Gamma(z+1)=z\Gamma(z)}\) …incrementing the input scales the result by the previous input \[ \begin{align} \Gamma(z+1)&=\int_0^\infty t^ze^{-t}\:dt\qquad\dots\text{from the definition with }z\to z+1\\ &=\left[-t^ze^{-t}\right]^\infty_0+\int_0^\infty zt^{z-1}e^{-t}\:dt\qquad\dots\text{from integration by parts}\\ &=\displaystyle \lim_{t \to \infty}\left[-t^ze^{-t}\right]+z\int_0^\infty t^{z-1}e^{-t}\:dt=0+z\Gamma(z)=z\Gamma(z) \end{align} \]
  3. \(\boxed{\Gamma(1)=1}\) …anchor the gamma function to \(\Gamma(1)=0!\) \[ \begin{align} \Gamma(1)&=\int_0^\infty t^{1-1}e^{-t}\:dt \;=\int_0^\infty e^{-t}\:dt \;=1 \end{align} \]
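These properties are easy to spot-check in R, whose built-in gamma() implements \(\Gamma\):

# Gamma(n) = (n-1)! for natural numbers; Gamma(1/2) = sqrt(pi) off the integers
all.equal(gamma(6), factorial(5))  # TRUE: Gamma(6) = 5! = 120
all.equal(gamma(0.5), sqrt(pi))    # TRUE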

Appendix: Mean Estimator Confidence Intervals

Test Statistic & Gamma/Chi-Squared Correspondence

We use the following transformation here (sometimes called the gamma/chi-squared correspondence): \[\text{if}\quad X\sim\operatorname{Gamma}(\alpha,\lambda)\quad\text{then}\quad Y\sim\chi^2_{2\alpha}\qquad\text{where}\quad Y=2\lambda X\]

Standardising the gamma statistic for the sample mean yields a chi-squared statistic: \[\boxed{\frac{2n\bar{X}}{\mu} \sim\chi^2_{2n}}\]

Reminder: \(\chi^2\) is a particular instance of the \(\Gamma\) distribution

We need to recall that the chi-squared distribution is a special instance of a gamma distribution i.e. \(\chi^2_k \sim\Gamma\left(\frac{k}{2},\frac{1}{2}\right)\) as was shown in McCabe (2024a).

For \(X\sim\operatorname{Gamma}(\alpha,\lambda)\), the pdf of \(X\) is: \[ \boxed{f_{\operatorname{Gamma}}(x)=\frac{\lambda^\alpha}{\Gamma(\alpha)}x^{\alpha-1}e^{-\lambda x}\qquad\text{where}\quad\Gamma(\alpha)=\int_0^\infty t^{\alpha-1}e^{-t}\:dt} \]

For \(Y\sim\chi^2_k\), substituting \(\alpha=k/2\) and \(\lambda=1/2\) gives the pdf of \(Y\): \[ \boxed{f_{\chi^2_k}(y)=\frac{1}{2^{k/2}\Gamma({k/2})}y^{{k/2}-1}e^{-\frac{y}{2}}\qquad\text{where}\quad\Gamma({k/2})=\int_0^\infty t^{{k/2}-1}e^{-t}\:dt} \]

Scaling \(\Gamma\) appropriately turns it into a \(\chi^2\) distribution:

Generally, we want to express any gamma distribution in terms of chi-squared through a linear transformation - specifically the scaling factor \(a=2\lambda\): \[X\sim\operatorname{Gamma}(\alpha,\lambda)\quad\Rightarrow\quad Y=aX\sim\chi^2_{2\alpha}\]


Proposition: change of variables results in chi-squared distribution

Consider the gamma distribution PDF: a linear transformation \(y=ax\) stretches the domain, so the range must also be scaled accordingly: \[ \begin{aligned} f_Y(y) &= \frac{1}{a} f_X\left(\frac{y}{a}\right) \qquad \text{(valid for strictly monotonic and differentiable scaling, } a > 0\text{)} \\ f_{\chi^2_k}(y) &\overset{?}{=} \frac{1}{a} f_{\operatorname{Gamma}}\left(\frac{y}{a}\right) \end{aligned} \] proof:

Starting with the gamma distribution pdf, we include the range scaling factor:

\[\frac{1}{a}\cdot f_{\operatorname{Gamma}}(x)=\frac{1}{a}\cdot\frac{\lambda^\alpha}{\Gamma(\alpha)}x^{\alpha-1}e^{-\lambda x}\] We perform the change of variables \(x=y/a\) by substitution: \[ \frac{1}{a}\cdot f_{\operatorname{Gamma}}(y/a)=\frac{\lambda^\alpha}{a\Gamma(\alpha)}\left(\frac{y}{a}\right)^{\alpha-1}e^{-\frac{\lambda y}{a}} =\frac{\lambda^\alpha}{a^\alpha\Gamma(\alpha)}y^{\alpha-1}e^{-\frac{\lambda y}{a}} \]

Now let \(a=2\lambda\): \[ \begin{align} \frac{1}{2\lambda}\cdot f_{\operatorname{Gamma}}\left(\frac{y}{2\lambda}\right)&=\frac{\lambda^\alpha}{(2\lambda)^\alpha\Gamma(\alpha)}y^{\alpha-1}e^{-\frac{y}{2}}\\ &=\frac{1}{2^\alpha\Gamma(\alpha)}y^{\alpha-1}e^{-\frac{y}{2}}\qquad\text{(simplifying)} \end{align} \]

Compare this with the \(\chi^2_k\) PDF \(f_{\chi^2_k}(y)\):

\[ \begin{array}{rcccl} \frac{1}{2\lambda}\cdot f_{\operatorname{Gamma}}\left(\frac{y}{2\lambda}\right) & = &\frac{1}{2^\alpha\Gamma(\alpha)} &y^{\alpha-1} & e^{-\frac{y}{2}}\\ f_{\chi^2_k}(y)\qquad & = & \frac{1}{2^{k/2}\Gamma({k/2})} & y^{{k/2}-1} &e^{-\frac{y}{2}} \end{array} \]

We see that if \(\boxed{\alpha=k/2}\) then \[\frac{1}{2\lambda}\cdot f_{\operatorname{Gamma}}\left(\frac{y}{2\lambda}\right)=f_{\chi^2_k}(y)\] Therefore we conclude: \[\Rightarrow \qquad\boxed{f_{\operatorname{Gamma}}\left(\frac{y}{2\lambda}\right)=2\lambda f_{\chi^2_k}(y)}\qquad\text{Q.E.D.}\]
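The boxed identity can be verified numerically against R’s built-in densities (the parameter values below are arbitrary):

# Verify f_Gamma(y/(2*lambda))/(2*lambda) == f_chisq(y) when alpha = k/2
alpha <- 3.5; lambda <- 0.8; k <- 2*alpha
y <- seq(0.1, 20, by = 0.1)
max(abs(dgamma(y/(2*lambda), shape = alpha, rate = lambda)/(2*lambda) - dchisq(y, df = k)))  # ~ 0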

The standardised gamma statistic is given by \(2\lambda n\bar{X} \sim\chi^2_{2n}\), where \(\lambda\) is the reciprocal of the population mean; therefore (a simulation check follows the list of symbols below):

\[\boxed{\frac{2n\bar{X}}{\mu} \sim\chi^2_{2n}}\]

where

  • \(\bar{X}\) is the sample mean
  • \(n\) is the number of samples
  • \(\mu\) is the (unknown) population mean
  • \(\chi^2_{2n}\) is the chi-squared distribution with \(2n\) degrees of freedom
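A quick simulation check of this pivotal quantity (a sketch, not part of the original study):

# 2*n*Xbar/mu should behave like chi-squared with 2n degrees of freedom
set.seed(4)
n <- 40; mu <- 1
stat <- replicate(1e4, 2*n*mean(rexp(n, rate = 1/mu))/mu)
c(mean(stat), 2*n)     # chi-squared mean equals its dof: 80
c(var(stat), 2*2*n)    # chi-squared variance equals twice its dof: 160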

Converting Test Statistics to CI

The confidence interval for our chi-squared test statistic, and hence for the mean estimator, is: \[ \boxed{\mu \in \left[\frac{2n \bar{X}}{\chi^2_{2n,\, 1 - \alpha/2}},\;\frac{2n \bar{X}}{\chi^2_{2n,\, \alpha/2}}\right]} \]

Starting with the test statistic \(\frac{2n \bar{X}}{\mu} \sim \chi^2_{2n}\), we use algebraic pivoting (McCabe, 2025a) to find the confidence interval for the unknown parameter \(\mu\):

\[ \begin{align} 1-\alpha &= P\left(\chi^2_{2n, \alpha/2} < \frac{2n \bar{X}}{\mu} < \chi^2_{2n, 1 - \alpha/2}\right) = P\left(\frac{\chi^2_{2n, \alpha/2}}{2n\bar{X}} < \frac{1}{\mu} < \frac{\chi^2_{2n, 1 - \alpha/2}}{2n\bar{X}}\right)\\ &= P\left(\frac{2n\bar{X}}{\chi^2_{2n, \alpha/2}} > \mu > \frac{2n\bar{X}}{\chi^2_{2n, 1 - \alpha/2}}\right) = P\left(\frac{2n\bar{X}}{\chi^2_{2n, 1 - \alpha/2}} < \mu < \frac{2n\bar{X}}{\chi^2_{2n, \alpha/2}} \right)\\ \end{align} \]
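Putting the pivot to work, a worked example for a single hypothetical sample:

# 95% CI for mu from one hypothetical sample of n = 40 (true mu = 1)
set.seed(5)
n <- 40; alpha <- 0.05
xbar <- mean(rexp(n, rate = 1))
c(lower = 2*n*xbar / qchisq(1 - alpha/2, df = 2*n),
  upper = 2*n*xbar / qchisq(alpha/2,     df = 2*n))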

References

Casella, G., & Berger, R. L. (2002). Statistical inference (2nd ed.). Duxbury Press.
McCabe, D. (2024a). Catalogue of continuous probability distributions. RPubs. https://rpubs.com/mccabe08/1264465
McCabe, D. (2024b). Degrees of freedom in statistical modelling. RPubs. https://rpubs.com/mccabe08/1312256
McCabe, D. (2024c). Derivation - the central limit theorem. RPubs. https://rpubs.com/mccabe08/1265303
McCabe, D. (2025a). Inference I - parameter estimation. RPubs. https://rpubs.com/mccabe08/1285773
McCabe, D. (2025b). Statistical inference course project. RPubs. https://rpubs.com/mccabe08/1333213