Theory of Point Estimation

Introduction

Let \(X_1, X_2, \dots, X_n\) be a random sample from a normal distribution with mean \(\mu\)
and variance \(\sigma^2\), denoted \(X_i \sim N(\mu, \sigma^2)\).

The problem of point estimation is to pick a statistic \(T(X_1,\dots,X_n)\)
that “best” estimates the parameter \(\mu\).

An estimator is a statistic (random variable).
An estimate is its realized numerical value (constant).

Definition 1.
The set of all admissible values of the parameter of a distribution is called the parameter space \(\Omega\).

Examples

If \(X_i \sim \text{Poisson}(\lambda)\), then \(\Omega = (0,\infty)\).
Possible estimators of \(\lambda\) (the mean and the variance of a Poisson distribution both equal \(\lambda\)):
\[ T_1 = \frac{1}{n}\sum_{i=1}^n X_i, \quad T_2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2 \]
If \(X_i \sim \text{Bernoulli}(\theta)\), then \(\Omega = (0,1)\).
Possible estimators of \(\theta\):
\[ T_1 = \frac{1}{n}\sum_{i=1}^n X_i, \quad T_2 = X_1, \quad T_3 = \frac{X_1+X_2}{2} \]
If \(X_i \sim N(\mu, \sigma^2)\), then \(\Omega = \{(\mu,\sigma^2): -\infty < \mu < \infty, \sigma^2 > 0\}\).
Possible estimators of \(\mu\) and \(\sigma^2\), respectively:
\[ T_1 = \bar{X}, \quad T_2 = S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2 \]
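
The three Bernoulli estimators above can be compared by simulation. The following is a minimal sketch, not part of the original notes: the function names, the choice \(\theta = 0.3\), the sample size and the number of replications are all illustrative assumptions. It approximates the mean and variance of \(T_1\), \(T_2\) and \(T_3\) over repeated samples.

```python
import random
import statistics

def bernoulli_sample(theta, n, rng):
    """Draw n iid Bernoulli(theta) observations as 0/1 integers."""
    return [1 if rng.random() < theta else 0 for _ in range(n)]

def estimators(x):
    """Return (T1, T2, T3) from the Bernoulli example for one sample x."""
    t1 = sum(x) / len(x)      # sample mean, uses all n observations
    t2 = x[0]                 # first observation only
    t3 = (x[0] + x[1]) / 2    # average of the first two observations
    return t1, t2, t3

def simulate(theta=0.3, n=50, reps=2000, seed=0):
    """Approximate the mean and variance of each estimator by simulation."""
    rng = random.Random(seed)
    draws = [estimators(bernoulli_sample(theta, n, rng)) for _ in range(reps)]
    columns = list(zip(*draws))   # one column of simulated values per estimator
    return [(statistics.fmean(c), statistics.pvariance(c)) for c in columns]

if __name__ == "__main__":
    for name, (m, v) in zip(("T1", "T2", "T3"), simulate()):
        print(f"{name}: mean ~ {m:.3f}, variance ~ {v:.4f}")
```

All three estimators should average close to \(\theta\), but their simulated variances differ markedly, which is what motivates the properties studied next.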

Properties of Estimators and Statistics

Estimator properties:

Consistency
Unbiasedness
Minimum variance
Efficiency

Statistic properties:

Sufficiency
Completeness

Since every estimator is itself a statistic, the statistic properties (sufficiency, completeness) can be stated for estimators as well.

Consistency

Definition.
A sequence \(\{X_n\}\) converges in probability to \(b\) if
\[ \Pr(|X_n - b| > \epsilon) \to 0 \quad \text{as } n \to \infty, \; \forall \epsilon > 0. \]
Denoted \(X_n \xrightarrow{p} b\).

Definition.
An estimator sequence \(\{T_n\}\) is consistent for \(\theta\) if \(T_n \xrightarrow{p} \theta\).

Theorem 1: Weak Law of Large Numbers

If \(X_i\) are iid with \(E[X_i] = \mu\) and \(\text{Var}(X_i)=\sigma^2 < \infty\), then
\[\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i \xrightarrow{p} \mu.\]

Proof.
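Since the \(X_i\) are iid, \(E[\bar{X}_n] = \mu\) and \(\text{Var}(\bar{X}_n) = \sigma^2/n\). By Chebyshev’s inequality, for every \(\epsilon>0\):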
\[ \Pr(|\bar{X}_n - \mu| > \epsilon) \le \frac{\sigma^2}{n\epsilon^2} \to 0. \]
Thus \(\bar{X}_n \xrightarrow{p} \mu\).
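
The bound in the proof can be checked numerically. The sketch below is an illustration under assumed settings (normal data with \(\mu = 0\), \(\sigma = 1\), \(\epsilon = 0.5\); the function names are hypothetical). It compares the simulated frequency of \(|\bar{X}_n - \mu| > \epsilon\) with the Chebyshev bound \(\sigma^2/(n\epsilon^2\)).

```python
import random

def tail_frequency(n, eps, mu=0.0, sigma=1.0, reps=500, rng=None):
    """Fraction of simulated samples of size n with |mean - mu| > eps."""
    rng = rng or random.Random(0)
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if abs(xbar - mu) > eps:
            hits += 1
    return hits / reps

def chebyshev_bound(n, eps, sigma=1.0):
    """Upper bound sigma^2 / (n * eps^2) from the proof, capped at 1."""
    return min(1.0, sigma ** 2 / (n * eps ** 2))

if __name__ == "__main__":
    rng = random.Random(42)
    for n in (10, 100, 1000):
        freq = tail_frequency(n, eps=0.5, rng=rng)
        print(n, freq, chebyshev_bound(n, eps=0.5))
```

The simulated frequencies typically sit well below the bound, since Chebyshev’s inequality is usually far from tight; the proof only needs the bound to tend to zero.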

Theorem 2

If \(\lim_{n\to\infty} E[T_n] = \theta\) and \(\lim_{n\to\infty} \text{Var}(T_n) = 0\), then \(T_n\) is consistent for \(\theta\).

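Proof. By Chebyshev’s inequality applied to the second moment about \(\theta\), for every \(\epsilon>0\):
\[ \Pr(|T_n - \theta| > \epsilon) \le \frac{E[(T_n-\theta)^2]}{\epsilon^2} = \frac{\text{Var}(T_n) + (E[T_n]-\theta)^2}{\epsilon^2} \to 0, \]
since \(\text{Var}(T_n) \to 0\) and \(E[T_n] \to \theta\). Thus \(T_n \xrightarrow{p} \theta\).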
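
As an illustration of Theorem 2 (a worked example added here, assuming a normal sample \(X_i \sim N(\mu,\sigma^2)\)), consider the variance estimator with divisor \(n\), written \(\hat\sigma_n^2\) for this example only. Since \((n-1)S^2/\sigma^2 \sim \chi^2_{n-1}\) under normality:
\[ \hat\sigma_n^2 = \frac{1}{n}\sum_{i=1}^n (X_i-\bar{X})^2 = \frac{n-1}{n}S^2, \quad E[\hat\sigma_n^2] = \frac{n-1}{n}\sigma^2 \to \sigma^2, \quad \text{Var}(\hat\sigma_n^2) = \frac{2(n-1)}{n^2}\sigma^4 \to 0. \]
So \(\hat\sigma_n^2\) is biased for every finite \(n\), yet consistent for \(\sigma^2\) by Theorem 2.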

Sufficiency

Definition.
A statistic \(T(X)\) is sufficient for \(\theta\) if the conditional distribution of the sample given \(T\) does not depend on \(\theta\).

Factorization Theorem

\(T(X)\) is sufficient for \(\theta\) iff the joint density (or pmf) factors as
\[ f(x_1,\dots,x_n;\theta) = g(T(x);\theta)\,h(x), \]
where \(g\) depends on the sample only through \(T(x)\) and \(h(x)\) does not depend on \(\theta\).

Example 1: Binomial

If \(X_1,\dots,X_n\) are iid \(\text{Bernoulli}(p)\), then \(T=\sum_{i=1}^n X_i\), which is \(\text{Binomial}(n,p)\), is sufficient for \(p\).

Proof.
Joint pmf:
\[ f(x_1,\dots,x_n;p) = p^{\sum x_i}(1-p)^{n-\sum x_i}. \]
This has the form \(g(T;p)h(x)\) with \(g(T;p) = p^{T}(1-p)^{n-T}\) and \(h(x)=1\), so \(T\) is sufficient by the factorization theorem.
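
The definition of sufficiency can also be checked directly for this example. The sketch below (illustrative; the function name and the values \(n=4\), \(t=2\) are assumptions) computes the exact conditional distribution of the sample given \(T=t\) for two different values of \(p\), using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product
from math import comb

def conditional_given_sum(n, t, p):
    """Exact P(X = x | sum(X) = t) for iid Bernoulli(p), for every x with sum t."""
    p = Fraction(p)
    joint = {
        x: p ** sum(x) * (1 - p) ** (n - sum(x))
        for x in product((0, 1), repeat=n)
        if sum(x) == t
    }
    total = sum(joint.values())      # equals P(T = t)
    return {x: prob / total for x, prob in joint.items()}

if __name__ == "__main__":
    for p in (Fraction(1, 5), Fraction(7, 10)):
        cond = conditional_given_sum(n=4, t=2, p=p)
        # Every arrangement with sum t gets probability 1 / C(n, t), whatever p is.
        print(p, set(cond.values()), Fraction(1, comb(4, 2)))
```

Each sample point with sum \(t\) has conditional probability \(1/\binom{n}{t}\), which does not involve \(p\), in agreement with the definition.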

Example 2: Poisson

If \(X_1,X_2\) are independent \(\text{Poisson}(\lambda)\) random variables, then \(T=X_1+X_2\) is sufficient for \(\lambda\).
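
A short verification via the factorization theorem, assuming \(X_1\) and \(X_2\) are independent (the labels \(g\) and \(h\) follow the theorem above):
\[ f(x_1,x_2;\lambda) = \frac{e^{-\lambda}\lambda^{x_1}}{x_1!}\cdot\frac{e^{-\lambda}\lambda^{x_2}}{x_2!} = \underbrace{e^{-2\lambda}\lambda^{x_1+x_2}}_{g(T(x);\lambda)} \cdot \underbrace{\frac{1}{x_1!\,x_2!}}_{h(x)}. \]
The first factor depends on the data only through \(x_1+x_2\), and the second is free of \(\lambda\).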

Minimal Sufficiency

Definition (Lehmann-Scheffé).
A sufficient statistic is minimal sufficient if it is a function of every other sufficient statistic.

Construction (likelihood ratio method).
Two sample points \(x = (x_1,\dots,x_n)\) and \(y = (y_1,\dots,y_n)\) are equivalent if
\[ \frac{L(x;\theta)}{L(y;\theta)} \text{ does not depend on } \theta. \]
The partition induced corresponds to a minimal sufficient statistic.
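
As a worked illustration (added here, for an iid \(\text{Bernoulli}(p)\) sample with \(0<p<1\)), take two sample points \(x = (x_1,\dots,x_n)\) and \(y = (y_1,\dots,y_n)\). The likelihood ratio is
\[ \frac{L(x;p)}{L(y;p)} = \frac{p^{\sum x_i}(1-p)^{n-\sum x_i}}{p^{\sum y_i}(1-p)^{n-\sum y_i}} = \left(\frac{p}{1-p}\right)^{\sum x_i - \sum y_i}, \]
which is free of \(p\) exactly when \(\sum x_i = \sum y_i\). The equivalence classes are therefore the level sets of \(\sum x_i\), so \(T = \sum X_i\) is minimal sufficient.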

Completeness

Definition.
A family \(\{f(x;\theta): \theta \in \Omega\}\) is complete if:
\[ E[g(X)] = 0 \; \forall \theta \in \Omega \;\; \Rightarrow \;\; g(X)=0 \;\text{a.s.} \]

Example 1. The family \(\text{Binomial}(n,\theta)\) is complete.

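Example 2. The family \(N(\theta,\theta)\), \(\theta>0\), is not complete: for \(g(x) = x e^{-x}\), \(E_\theta[g(X)] = 0\) for every \(\theta\) (the integrand is an odd function of \(x\)), yet \(g(X) \neq 0\) a.s.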

Example 3. The family \(\text{Uniform}(0,\theta)\), \(\theta>0\), is complete.
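
A sketch of why Example 1 holds (an added derivation, with \(0<\theta<1\) and the shorthand \(r = \theta/(1-\theta)\) introduced here):
\[ E_\theta[g(X)] = \sum_{x=0}^{n} g(x)\binom{n}{x}\theta^{x}(1-\theta)^{n-x} = (1-\theta)^{n}\sum_{x=0}^{n} g(x)\binom{n}{x} r^{x}. \]
If this is zero for all \(\theta \in (0,1)\), the polynomial \(\sum_{x} g(x)\binom{n}{x} r^{x}\) vanishes for every \(r>0\), so all of its coefficients are zero. Hence \(g(x)=0\) for \(x = 0,1,\dots,n\).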

Exponential Families

Definition.
A distribution belongs to a (one-parameter) exponential family if its density or pmf can be written as:
\[ f(x;\theta) = \exp\{Q(\theta)T(x) + D(\theta) + S(x)\}. \]

Examples

Binomial: exponential family with \(T(x)=x\).
Normal with known mean: exponential family in \(\sigma^2\).
Gamma: exponential family in \(\theta\).
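
For the binomial case the decomposition can be written out and checked numerically. In the sketch below (illustrative; the function names are assumptions) the terms are \(Q(\theta)=\log\frac{\theta}{1-\theta}\), \(T(x)=x\), \(D(\theta)=n\log(1-\theta)\) and \(S(x)=\log\binom{n}{x}\), and the exponential-family form is compared with the usual pmf.

```python
from math import comb, exp, isclose, log

def binomial_pmf(x, n, theta):
    """Standard form: C(n, x) * theta^x * (1 - theta)^(n - x)."""
    return comb(n, x) * theta ** x * (1 - theta) ** (n - x)

def binomial_expfam(x, n, theta):
    """Same pmf written as exp{Q(theta) T(x) + D(theta) + S(x)}."""
    Q = log(theta / (1 - theta))   # natural parameter (log-odds)
    T = x                          # canonical statistic
    D = n * log(1 - theta)         # depends on theta only
    S = log(comb(n, x))            # depends on x only
    return exp(Q * T + D + S)

if __name__ == "__main__":
    n, theta = 10, 0.3
    for x in range(n + 1):
        assert isclose(binomial_pmf(x, n, theta), binomial_expfam(x, n, theta))
    print("both forms agree for x = 0..10")
```

The form requires \(0<\theta<1\), since \(Q(\theta)\) is undefined at the endpoints.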

Main Theorem

In an exponential family, the canonical statistic \(T\) (for a random sample, \(\sum_{i=1}^n T(X_i)\)) is sufficient.
If the range of \(Q(\theta)\) contains an open interval, then \(T\) is also complete.

References

Lecture notes: Statistical Inference Theory by Henry Athiany

Statistical Inference Theory

Henry Athiany | Herbert Imbogah
