Statistics 2: Topic 2 (continue)
So far we have been introduced to the concepts of unbiasedness and efficiency.
Naturally, one would ask whether an (unbiased and) efficient estimator of \(\psi(\boldsymbol{\theta})\) is the best possible estimator of \(\psi(\boldsymbol{\theta})\). To answer this, we need to specify what we mean by the best estimator of \(\psi(\boldsymbol{\theta})\).
An estimator of \(\psi(\boldsymbol{\theta})\) is the best if it minimizes the error in estimating \(\psi(\boldsymbol{\theta})\) on average (for all \(\boldsymbol{\theta}\)).
Definition [Mean-Squared Error (MSE)]
Let \(T({\bf X})\) be an estimator of \(\psi(\boldsymbol{\theta})\). The MSE of \(T({\bf X})\) in estimating \(\psi(\boldsymbol{\theta})\) is given by \[ E_{\boldsymbol{\theta}} \left[ \left\{ T({\bf X}) - \psi(\boldsymbol{\theta}) \right\}^{2} \right] . \]
An estimator of \(\psi(\boldsymbol{\theta})\) is the best if it minimizes the MSE in estimating \(\psi(\boldsymbol{\theta})\) (for all \(\boldsymbol{\theta}\)).
An (unbiased and) efficient estimator can have a higher MSE than a biased estimator.
Example: Let \(X_{1}, \ldots, X_{n}\) be a random sample from \(\mathtt{Exp}(\lambda)\) distribution with mean \(1/\lambda\).
Verify that the sample mean \(\bar{X}_{n}\) is an unbiased estimator of \(1/\lambda\).
Show that the CRLB provides \[ \mathtt{var}_{\lambda} (\bar{X}_{n} ) \geq \frac{1}{n \lambda^{2}}. \]
Also verify that \(\mathtt{var}_{\lambda} (\bar{X}_{n}) = 1/ (n\lambda^{2})\), so the bound is attained. Therefore, \(\bar{X}_{n}\) is an efficient (unbiased) estimator of \(1/\lambda\).
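A sketch of these verifications, using the single-observation Fisher information: each \(X_{i}\) has mean \(1/\lambda\), so \(E_{\lambda}(\bar{X}_{n}) = 1/\lambda\). Since \(\log f_{\lambda}(x) = \log \lambda - \lambda x\) for \(x > 0\),
\[ I(\lambda) = E_{\lambda} \left[ \left( \frac{\partial}{\partial \lambda} \log f_{\lambda}(X) \right)^{2} \right] = E_{\lambda} \left[ \left( \frac{1}{\lambda} - X \right)^{2} \right] = \mathtt{var}_{\lambda}(X) = \frac{1}{\lambda^{2}}, \]
and for \(\psi(\lambda) = 1/\lambda\) the CRLB is \(\{\psi^{\prime}(\lambda)\}^{2}/\{ n I(\lambda)\} = (1/\lambda^{4})/(n/\lambda^{2}) = 1/(n\lambda^{2})\). As \(\mathtt{var}_{\lambda}(\bar{X}_{n}) = \mathtt{var}_{\lambda}(X_{1})/n = 1/(n\lambda^{2})\), the bound is attained.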
Next consider another estimator \(T ({\bf X}) = (n+1)^{-1} \sum_{i=1}^{n} X_{i}\).
Show that \(T({\bf X})\) has a negative bias of magnitude \(1/\{(n+1)\lambda\}\). Further, \(\mathtt{var}_{\lambda} (T({\bf X})) = n/\{ (n+1)^{2} \lambda^{2}\}\).
Hence, show that the MSEs of \(\bar{X}_{n}\) and \(T({\bf X})\) in estimating \(1/\lambda\) are \(1/(n\lambda^{2})\) and \(1/\{ (n+1) \lambda^{2}\}\), respectively.
Thus, \(T({\bf X})\) is biased, yet it has a smaller MSE than \(\bar{X}_{n}\), i.e., it is a better estimator of \(1/\lambda\) in the MSE sense.
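A sketch of the MSE comparison: \(E_{\lambda}\{T({\bf X})\} = n/\{(n+1)\lambda\}\), so the bias is \(-1/\{(n+1)\lambda\}\), and
\[ MSE_{T} (1/\lambda) = \frac{1}{(n+1)^{2} \lambda^{2}} + \frac{n}{(n+1)^{2} \lambda^{2}} = \frac{1}{(n+1) \lambda^{2}} < \frac{1}{n \lambda^{2}} = MSE_{\bar{X}_{n}} (1/\lambda). \]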
Unfortunately, an estimator with the lowest possible MSE among the class of all estimators of \(\psi(\boldsymbol{\theta})\) usually does not exist.
Therefore, we restrict our search to a subclass of the estimators and find the best estimator (estimator with lowest MSE) within that subclass.
The subclass usually considered is the class of unbiased estimators of \(\psi(\boldsymbol{\theta})\), say \(\mathcal{U}_{\psi(\boldsymbol{\theta})}\).
Within the class \(\mathcal{U}_{\psi(\boldsymbol{\theta})}\), we search for the estimator \(T^{\star}\) for which the MSE is minimum.
As the MSE of the estimator \(T_{n}\) of \(\psi(\boldsymbol{\theta})\) can be expressed as \[ MSE_{T_{n}} (\psi(\boldsymbol{\theta})) = \mathtt{Bias}_{T_{n}}^{2} (\psi(\boldsymbol{\theta})) + \mathtt{var}_{\boldsymbol{\theta}}(T_{n}),\]
it is easy to see that for an unbiased estimator \(T_{n}\), \(MSE_{T_{n}} (\psi(\boldsymbol{\theta})) = \mathtt{var}_{\boldsymbol{\theta}}(T_{n})\).
Therefore, for the class of unbiased estimators, the lowest MSE estimator is the estimator with minimum variance.
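The decomposition itself follows by adding and subtracting \(E_{\boldsymbol{\theta}}(T_{n})\) inside the square and noting that the cross term has expectation zero:
\[ E_{\boldsymbol{\theta}} \left[ \left\{ T_{n} - \psi(\boldsymbol{\theta}) \right\}^{2} \right] = E_{\boldsymbol{\theta}} \left[ \left\{ T_{n} - E_{\boldsymbol{\theta}}(T_{n}) \right\}^{2} \right] + \left\{ E_{\boldsymbol{\theta}}(T_{n}) - \psi(\boldsymbol{\theta}) \right\}^{2} = \mathtt{var}_{\boldsymbol{\theta}}(T_{n}) + \mathtt{Bias}_{T_{n}}^{2} (\psi(\boldsymbol{\theta})). \]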
Definition [UMVUE]
An estimator of \(\theta\), \(T({\bf X})\), is called a uniformly minimum variance unbiased estimator (UMVUE) if
\(E_{\theta}\{ T({\bf X})\}=\theta\) for all \(\theta\), and
for any estimator \(T^{\prime}({\bf X})\) with \(E_{\theta}\{ T^{\prime}({\bf X})\}=\theta\), \(\mathrm{var}_{\theta}\{T({\bf X})\}\leq \mathrm{var}_{\theta}\{T^{\prime}({\bf X})\}\) for all \(\theta\).
Note: The UMVUE, when it exists, is unique.
Note: If an (unbiased and) efficient estimator exists then it must be the UMVUE. However, the converse is not true.
How to find the UMVUE?
One way to find the UMVUE is through a complete-sufficient statistic (CSS).
Result: Let \(T({\bf X})\) be a complete-sufficient statistic, and suppose there exists a function \(g\) such that \(E_{\boldsymbol{\theta}} \left[ g(T({\bf X})) \right] = \psi(\boldsymbol{\theta})\) for all \(\boldsymbol{\theta}\), i.e., \(g(T({\bf X}))\) is an unbiased estimator of \(\psi(\boldsymbol{\theta})\). Then \(g(T({\bf X}))\) is the UMVUE of \(\psi(\boldsymbol{\theta})\).
Example: Let \(X_{1}, \ldots, X_{n}\) be a random sample from the \(\mathrm{uniform}(0, \theta)\) distribution. Suppose it is known that the largest order statistic \(X_{(n)} = \max\{ X_{1}, \ldots, X_{n}\}\) is a complete-sufficient statistic.
Verify that \(E_{\theta} (X_{(n)}) = n \theta/ (n+1)\), for all \(\theta\), which implies
\[ E_{\theta} \left[\frac{n+1}{n} X_{(n)} \right] = \theta.\] Therefore, by the above result, \(T({\bf X}) = (n+1) X_{(n)}/n\) is the UMVUE of \(\theta\).
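A sketch of the expectation calculation: the pdf of \(X_{(n)}\) is \(f_{X_{(n)}}(x) = n x^{n-1}/\theta^{n}\) for \(0 < x < \theta\), so
\[ E_{\theta} (X_{(n)}) = \int_{0}^{\theta} x \, \frac{n x^{n-1}}{\theta^{n}} \, dx = \frac{n}{\theta^{n}} \cdot \frac{\theta^{n+1}}{n+1} = \frac{n \theta}{n+1}. \]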
How to identify a complete-sufficient statistic?
Usually it is not easy to show directly that a statistic is complete-sufficient (the general theory is beyond the scope of this course).
However, when the class of distributions under consideration belongs to the exponential family, there is a simple way out.
Definition [Exponential family]:
A family of pmfs or pdfs is called a \(k\)-parameter exponential family if it can be expressed as
\[ f_{\boldsymbol{\theta}}(x) = \exp \left\{h(x) + c(\boldsymbol{\theta}) + \sum_{i=1}^{k} w_{i} (\boldsymbol{\theta}) T_{i}(x)\right\} . \] Here \(h, T_{1}, \ldots, T_{k}\) are real-valued functions of \(x\), not depending on \(\boldsymbol{\theta}\). Further, \(c(\boldsymbol{\theta}), w_{1} (\boldsymbol{\theta}), \ldots, w_{k} (\boldsymbol{\theta})\) are real-valued functions of \(\boldsymbol{\theta}\), not depending on \(x\).
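For instance, the \(\mathtt{Exp}(\lambda)\) pdf used earlier can be written in this form: for \(x > 0\),
\[ f_{\lambda}(x) = \lambda e^{-\lambda x} = \exp \left\{ \log \lambda - \lambda x \right\}, \]
so taking \(h(x) = 0\), \(c(\lambda) = \log \lambda\), \(w_{1}(\lambda) = -\lambda\) and \(T_{1}(x) = x\) shows that it is a one-parameter (\(k=1\)) exponential family.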
Result: Let \(X_{1},\ldots, X_{n}\) be a random sample from a distribution with pmf or pdf \(f_{\boldsymbol{\theta}}, ~ \boldsymbol{\theta} \in \boldsymbol{\Theta}\), which belongs to an exponential family as above with \(d\leq k\), where \(\boldsymbol{\Theta}\) contains an open subset of \(\mathbb{R}^{d}\). Then the statistic \({\bf T}({\bf X})\) is jointly complete-sufficient for \(\boldsymbol{\theta}\), where \[ {\bf T}({\bf X}) = \left(\sum_{i=1}^{n} T_{1} (X_{i}), \cdots, \sum_{i=1}^{n} T_{k} (X_{i}) \right). \]
Note: A one-one function of a complete-sufficient statistic is also a complete-sufficient statistic.
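For example, \((\bar{X}_{n}, n^{-1} \sum_{i=1}^{n} X_{i}^{2})\) is a one-one function of \((\sum_{i=1}^{n} X_{i}, \sum_{i=1}^{n} X_{i}^{2})\), so whenever the latter is complete-sufficient, the former is as well.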
Example: Let \(X_{1}, \ldots, X_{n}\) be a random sample from \(\mathtt{normal}(\mu, \sigma^{2})\) distribution.
Express the joint pdf of \(X_{1}, \ldots, X_{n}\) in the exponential family form.
Hence show that \((\sum_{i=1}^{n} X_{i} , \sum_{i=1}^{n} {X^{2}_{i}})\) is jointly complete-sufficient.
Using the above result, find the UMVUEs of \(\mu\) and \(\sigma^{2}\).
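A sketch of how this goes: the joint pdf can be written as
\[ f_{\mu, \sigma^{2}}({\bf x}) = (2\pi\sigma^{2})^{-n/2} \exp \left\{ -\frac{1}{2\sigma^{2}} \sum_{i=1}^{n} x_{i}^{2} + \frac{\mu}{\sigma^{2}} \sum_{i=1}^{n} x_{i} - \frac{n \mu^{2}}{2\sigma^{2}} \right\}, \]
which is of the exponential family form with \(T_{1}(x) = x\), \(T_{2}(x) = x^{2}\), \(w_{1}(\mu, \sigma^{2}) = \mu/\sigma^{2}\) and \(w_{2}(\mu, \sigma^{2}) = -1/(2\sigma^{2})\). Hence \((\sum_{i=1}^{n} X_{i}, \sum_{i=1}^{n} X_{i}^{2})\) is jointly complete-sufficient. Since \(\bar{X}_{n}\) and \(S^{2} = (n-1)^{-1} \sum_{i=1}^{n} (X_{i} - \bar{X}_{n})^{2}\) are unbiased for \(\mu\) and \(\sigma^{2}\), respectively, and both are functions of \((\sum_{i=1}^{n} X_{i}, \sum_{i=1}^{n} X_{i}^{2})\), they are the UMVUEs of \(\mu\) and \(\sigma^{2}\).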