Stat 6205 Lecture Notes

We now look at the mean and variance of the ecdf. We start with the simpler of the two, the mean. To find the mean, and to prove that the ecdf is an unbiased estimator of \(F(x)\), we simply apply the definition of expectation.

Mean of the ecdf

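\[E[F_n(x)] = E\Big[\frac{1}{n}\sum_{i = 1}^n \mathbf{I}_{(-\infty, x]}(X_i)\Big] = \frac{1}{n}\sum_{i = 1}^n E[\mathbf{I}_{(-\infty, x]}(X_i)] = \frac{1}{n}\sum_{i = 1}^n P(X_i \leq x) = \frac{1}{n}\big(nP(X \leq x)\big) = P(X \leq x) = F(x),\] where \(\mathbf{I}\) is the usual indicator function. The step \(\sum_{i = 1}^n P(X_i \leq x) = nP(X \leq x)\) uses the fact that the \(X_i\) are identically distributed with cdf \(F\).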
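
The unbiasedness result can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes Uniform(0, 1) data, so that \(F(x) = x\), and the helper names `ecdf` and `mean_of_ecdf`, the sample size, the evaluation point and the number of replications are all illustrative choices.

```python
import random


def ecdf(sample, x):
    """Empirical cdf F_n(x): the fraction of observations that are <= x."""
    return sum(1 for xi in sample if xi <= x) / len(sample)


def mean_of_ecdf(n, x, reps, seed=0):
    """Monte Carlo estimate of E[F_n(x)] for Uniform(0, 1) samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(reps):
        sample = [rng.random() for _ in range(n)]
        total += ecdf(sample, x)
    return total / reps


# For Uniform(0, 1), F(0.3) = 0.3, so the estimate should land near 0.3.
estimate = mean_of_ecdf(n=20, x=0.3, reps=20000)
print(estimate)
```

Averaging \(F_n(0.3)\) over many independent samples should give a value close to 0.3, whatever the sample size \(n\), which is what unbiasedness predicts.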

Variance of the ecdf

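We now find the variance of the ecdf. First note that \(E[\mathbf{I}_{(-\infty, x]}(X_i)] = F(x)\), and hence \(E[F_n(x)] = F(x)\). We then apply the definition of variance to the ecdf: \[Var(F_n(x)) = E[F_n^2(x)] - (E[F_n(x)])^2 = E\Big[\frac{1}{n}\sum_i\mathbf{I}_{(-\infty, x]}(X_i)\,\frac{1}{n}\sum_j\mathbf{I}_{(-\infty, x]}(X_j)\Big] - F^2(x).\] The second term follows from what we noted at the start. Now we look more closely at the first term. Since \(E[\mathbf{I}_{(-\infty, x]}(X_i)\,\mathbf{I}_{(-\infty, x]}(X_j)] = P(X_i \leq x, X_j \leq x)\) and \(F^2(x) = \frac{1}{n^2}\sum_i\sum_j F^2(x)\), the variance can be written as \[\frac{1}{n^2}\sum_i\sum_j[P(X_i \leq x, X_j \leq x) - F^2(x)].\] We consider the cases \(i = j\) and \(i \neq j\). When \(i = j\), \(P(X_i \leq x, X_j \leq x) = P(X_i \leq x) = F(x)\), so the term equals \(F(x) - F^2(x)\). When \(i \neq j\), recall that the \(X_i\) are independent, so the indicators are independent random variables and \(P(X_i \leq x, X_j \leq x) = F(x)F(x) = F^2(x)\); these terms vanish. Only the \(n\) terms with \(i = j\) remain, which gives \[Var(F_n(x)) = \frac{1}{n^2}\,n\,[F(x) - F^2(x)] = \frac{F(x)(1 - F(x))}{n}.\]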
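
Under independence, the double sum above reduces to the standard value \(F(x)(1 - F(x))/n\), and a simulation can compare it with the sample variance of \(F_n(x)\). This is only a sketch under stated assumptions: the data are taken to be Uniform(0, 1), so that \(F(x) = x\), and the names `ecdf` and `variance_of_ecdf` and all numeric settings are hypothetical choices made for illustration.

```python
import random


def ecdf(sample, x):
    """Empirical cdf F_n(x): the fraction of observations that are <= x."""
    return sum(1 for xi in sample if xi <= x) / len(sample)


def variance_of_ecdf(n, x, reps, seed=0):
    """Sample variance of F_n(x) over `reps` Uniform(0, 1) samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    values = []
    for _ in range(reps):
        sample = [rng.random() for _ in range(n)]
        values.append(ecdf(sample, x))
    mean = sum(values) / reps
    # Unbiased sample variance of the simulated F_n(x) values.
    return sum((v - mean) ** 2 for v in values) / (reps - 1)


n, x = 20, 0.3
simulated = variance_of_ecdf(n, x, reps=20000)
theoretical = x * (1 - x) / n  # F(x)(1 - F(x)) / n with F(x) = x
print(simulated, theoretical)
```

The two printed numbers should agree closely. Increasing `n` shrinks both at rate \(1/n\), which is why the ecdf concentrates around \(F(x)\) as the sample grows.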

Emanuel Rodriguez

January 6, 2016
