Let \(X_1, X_2, \dots, X_n\) be a
random sample from a normal distribution with mean \(\mu\)
and variance \(\sigma^2\), denoted
\(X_i \sim N(\mu, \sigma^2)\).
The problem of point estimation is to pick a
statistic \(T(X_1,\dots,X_n)\)
that “best” estimates the parameter \(\mu\).
Definition 1.
The set of all admissible values of the parameter of a distribution is
called the parameter space \(\Omega\).
If \(X_i \sim
\text{Poisson}(\lambda)\), then \(\Omega = (0,\infty)\).
Possible estimators of \(\lambda\):
\[
T_1 = \frac{1}{n}\sum_{i=1}^n X_i, \quad
T_2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2
\]
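Both statistics are reasonable estimators of \(\lambda\) because a Poisson distribution has mean and variance both equal to \(\lambda\). The following is a minimal simulation sketch, not part of the original notes: the helper names `sample_poisson` and `poisson_estimates`, the seed, and the choice \(\lambda = 3\), \(n = 20000\) are assumptions made for illustration.

```python
import math
import random
import statistics

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    # Multiply uniforms until the running product drops below exp(-lam).
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def poisson_estimates(lam, n, seed=0):
    """Return (T1, T2): sample mean and sample variance of n simulated draws."""
    rng = random.Random(seed)
    xs = [sample_poisson(lam, rng) for _ in range(n)]
    t1 = statistics.fmean(xs)      # T1: sample mean
    t2 = statistics.variance(xs)   # T2: sample variance, divisor n - 1
    return t1, t2

t1, t2 = poisson_estimates(lam=3.0, n=20000)
print(f"T1 = {t1:.3f}, T2 = {t2:.3f}")  # both should land near lambda = 3
```

With a large sample both values should be close to \(\lambda\); the simulation says nothing about which estimator is better, only that both target the same parameter.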
If \(X_i \sim
\text{Bernoulli}(\theta)\), then \(\Omega = (0,1)\).
Possible estimators of \(\theta\):
\[
T_1 = \frac{1}{n}\sum_{i=1}^n X_i, \quad
T_2 = X_1, \quad
T_3 = \frac{X_1+X_2}{2}
\]
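To see how these three estimators differ, one can compute their exact mean and variance by summing over all \(2^n\) possible samples. This is a small sketch under assumed values \(n = 8\) and \(\theta = 0.3\); the function name `exact_mean_and_variance` is hypothetical.

```python
from itertools import product

def exact_mean_and_variance(estimator, n, theta):
    """Exact E[T] and Var(T) by summing over all 2**n Bernoulli samples."""
    mean = 0.0
    second_moment = 0.0
    for x in product((0, 1), repeat=n):
        ones = sum(x)
        prob = theta**ones * (1 - theta)**(n - ones)  # joint pmf of the sample
        t = estimator(x)
        mean += prob * t
        second_moment += prob * t * t
    return mean, second_moment - mean**2

estimators = {
    "T1 (sample mean)": lambda x: sum(x) / len(x),
    "T2 (first observation)": lambda x: x[0],
    "T3 (mean of first two)": lambda x: (x[0] + x[1]) / 2,
}

n, theta = 8, 0.3
for name, estimator in estimators.items():
    mean, var = exact_mean_and_variance(estimator, n, theta)
    print(f"{name}: E[T] = {mean:.4f}, Var(T) = {var:.4f}")
```

All three estimators are unbiased for \(\theta\), but their variances are \(\theta(1-\theta)/n\), \(\theta(1-\theta)\), and \(\theta(1-\theta)/2\) respectively, so only \(T_1\) improves as \(n\) grows.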
If \(X_i \sim N(\mu,
\sigma^2)\), then \(\Omega =
\{(\mu,\sigma^2): -\infty < \mu < \infty, \sigma^2 >
0\}\).
Possible estimators of \(\mu\) and \(\sigma^2\), respectively:
\[
T_1 = \bar{X}, \quad
T_2 = S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2
\]
Since every estimator is a statistic, properties such as consistency and sufficiency, defined below, can be used to compare competing estimators.
Definition.
A sequence \(\{X_n\}\) converges in
probability to \(b\) if
\[
\Pr(|X_n - b| > \epsilon) \to 0 \quad \text{as } n \to \infty, \;
\forall \epsilon > 0.
\]
This is denoted \(X_n \xrightarrow{p} b\).
Definition.
An estimator sequence \(\{T_n\}\) is
consistent for \(\theta\) if \(T_n
\xrightarrow{p} \theta\).
Theorem (weak law of large numbers).
If \(X_1, X_2, \dots\) are iid with \(E[X_i] = \mu\) and \(\text{Var}(X_i)=\sigma^2 < \infty\),
then
\[\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i
\xrightarrow{p} \mu.\]
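Proof.
Since \(E[\bar{X}_n] = \mu\) and \(\text{Var}(\bar{X}_n) = \sigma^2/n\), Chebyshev’s inequality gives, for every \(\epsilon>0\):
\[
\Pr(|\bar{X}_n - \mu| > \epsilon) \le \frac{\sigma^2}{n\epsilon^2}
\to 0 \quad \text{as } n \to \infty.
\]
Thus \(\bar{X}_n \xrightarrow{p}
\mu\).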
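The Chebyshev bound used in this proof can be compared with the exact tail probability in a case where the latter is computable. The sketch below assumes iid \(\text{Bernoulli}(p)\) observations, so that \(\mu = p\) and \(\sigma^2 = p(1-p)\); the function names and the values \(p = 0.5\), \(\epsilon = 0.1\) are illustrative choices, not part of the notes.

```python
from math import comb

def exact_tail(n, p, eps):
    """Exact Pr(|Xbar_n - p| > eps) for the mean of n iid Bernoulli(p) draws."""
    total = 0.0
    for k in range(n + 1):
        if abs(k / n - p) > eps:
            total += comb(n, k) * p**k * (1 - p)**(n - k)  # Binomial(n, p) pmf at k
    return total

def chebyshev_bound(n, p, eps):
    """Chebyshev upper bound sigma^2 / (n * eps^2), with sigma^2 = p(1 - p)."""
    return p * (1 - p) / (n * eps**2)

p, eps = 0.5, 0.1
for n in (10, 100, 1000):
    print(n, exact_tail(n, p, eps), chebyshev_bound(n, p, eps))
```

The bound is loose (for small \(n\) it can exceed 1), but it is enough for the proof because it tends to 0 as \(n\) grows, and the exact probability always sits below it.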
Theorem. If \(\lim_{n\to\infty} E[T_n] = \theta\) and \(\lim_{n\to\infty} \text{Var}(T_n) = 0\), then \(\{T_n\}\) is consistent for \(\theta\).
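Proof. By Chebyshev’s inequality in its mean-squared-error form (Markov’s inequality applied to \((T_n - \theta)^2\)), for every \(\epsilon > 0\):
\[
\Pr(|T_n - \theta| > \epsilon) \le \frac{E[(T_n - \theta)^2]}{\epsilon^2}
= \frac{\text{Var}(T_n) + (E[T_n] - \theta)^2}{\epsilon^2}
\to 0.
\]
Both terms in the numerator vanish as \(n \to \infty\), the first by the variance assumption and the second because \(E[T_n] \to \theta\), so
\(T_n \xrightarrow{p} \theta\).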
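As a worked illustration of this criterion, consider a hypothetical estimator that is not part of the original notes: for \(X_1,\dots,X_n\) iid \(\text{Bernoulli}(\theta)\), let \(T_n = (\sum_{i=1}^n X_i + 1)/(n+2)\). Then
\[
E[T_n] = \frac{n\theta + 1}{n+2} \to \theta, \qquad
\text{Var}(T_n) = \frac{n\theta(1-\theta)}{(n+2)^2} \to 0,
\]
so \(T_n\) is consistent for \(\theta\), even though its bias \((1-2\theta)/(n+2)\) is nonzero for every finite \(n\) unless \(\theta = 1/2\).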
Definition.
A statistic \(T(X)\) is
sufficient for \(\theta\) if the conditional distribution of
the sample \(X_1,\dots,X_n\) given \(T\) does not depend
on \(\theta\).
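Theorem (factorization criterion).
\(T(X)\) is sufficient for \(\theta\) if and only if the joint pmf or pdf factors as
\[
f(x_1,\dots,x_n;\theta) = g(T(x);\theta)h(x)
\] where \(x = (x_1,\dots,x_n)\), \(g\) depends on the sample only through \(T(x)\), and \(h(x)\) does not
depend on \(\theta\).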
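Example. If \(X_1,\dots,X_n\) are iid \(\text{Bernoulli}(p)\), then \(T=\sum_{i=1}^n X_i\) is sufficient for \(p\).
Proof.
The joint pmf is
\[
f(x_1,\dots,x_n;p) = p^{\sum x_i}(1-p)^{n-\sum x_i}.
\]
This has the form \(g(T(x);p)h(x)\) with \(g(t;p) = p^{t}(1-p)^{n-t}\) and \(h(x) = 1\),
so \(T\) is sufficient by the factorization criterion.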
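The definition of sufficiency can also be checked directly in this Bernoulli example: the conditional distribution of the sample given \(T = t\) should be the same for every \(p\). The sketch below enumerates it exactly with rational arithmetic; the function name `conditional_given_sum` and the values \(n = 4\), \(t = 2\) are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb

def conditional_given_sum(n, p, t):
    """Return {x: Pr(X = x | sum(X) = t)} for n iid Bernoulli(p) draws."""
    joint = {
        x: p**sum(x) * (1 - p)**(n - sum(x))   # joint pmf at sample point x
        for x in product((0, 1), repeat=n)
        if sum(x) == t
    }
    total = sum(joint.values())                 # Pr(T = t)
    return {x: prob / total for x, prob in joint.items()}

n, t = 4, 2
for p in (Fraction(1, 5), Fraction(1, 2), Fraction(9, 10)):
    cond = conditional_given_sum(n, p, t)
    # Every sample with sum t gets conditional probability 1 / C(n, t).
    print(p, set(cond.values()), Fraction(1, comb(n, t)))
```

For each value of \(p\) the conditional probabilities are all equal to \(1/\binom{n}{t}\), which does not involve \(p\).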
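Example. If \(X_1, X_2\) are iid \(\text{Poisson}(\lambda)\), then \(T=X_1+X_2\) is sufficient for \(\lambda\), since the joint pmf \(e^{-2\lambda}\lambda^{x_1+x_2}/(x_1!\,x_2!)\) factors as \(g(T;\lambda)h(x)\) with \(g(t;\lambda) = e^{-2\lambda}\lambda^{t}\) and \(h(x) = 1/(x_1!\,x_2!)\).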
Definition.
A sufficient statistic is minimal sufficient if it is a function
of every other sufficient statistic.
Construction (Lehmann-Scheffé likelihood ratio method).
Let \(L(x;\theta)\) denote the likelihood at the sample point \(x = (x_1,\dots,x_n)\). Two sample points \(x\) and \(y\) are
equivalent if
\[
\frac{L(x;\theta)}{L(y;\theta)} \text{ does not depend on } \theta.
\]
A statistic that is constant on each equivalence class and takes different values on different classes is minimal sufficient.
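The likelihood ratio method can be sketched for iid Bernoulli samples, where the resulting partition should group sample points by their sum. This is an illustration under a stated simplification: "does not depend on \(\theta\)" is checked only on a finite grid of rational values (`THETAS`), which is an assumption standing in for all \(\theta \in (0,1)\). All names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def likelihood(x, theta):
    """Bernoulli likelihood of the sample point x at parameter theta."""
    ones = sum(x)
    return theta**ones * (1 - theta)**(len(x) - ones)

# Grid of trial parameter values (a finite stand-in for "all theta").
THETAS = [Fraction(k, 10) for k in range(1, 10)]

def equivalent(x, y):
    """True if L(x; theta) / L(y; theta) takes one value on the whole grid."""
    ratios = {likelihood(x, th) / likelihood(y, th) for th in THETAS}
    return len(ratios) == 1

def partition(n):
    """Group all points of {0,1}^n into likelihood-ratio equivalence classes."""
    classes = []
    for x in product((0, 1), repeat=n):
        for cls in classes:
            if equivalent(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes

classes = partition(4)
print(len(classes))  # → 5, one class per value of the sum 0..4
```

Each class contains exactly the sample points with a common value of \(\sum x_i\), which is consistent with \(\sum X_i\) being minimal sufficient in the Bernoulli model.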
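Definition.
A family \(\{f(x;\theta): \theta \in
\Omega\}\) is complete if, for every function \(g\),
\[
E_\theta[g(X)] = 0 \; \forall \theta \in \Omega \;\; \Rightarrow \;\; \Pr_\theta(g(X)=0) = 1
\; \forall \theta \in \Omega.
\]
A statistic \(T\) is called complete if the family of its distributions is complete.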
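Example 1. The family \(\text{Binomial}(n,\theta)\), \(0<\theta<1\) with \(n\) fixed, is complete.
Example 2. The family \(N(\theta,\theta)\), \(\theta>0\), is not complete: \(g(x) = x e^{-x}\) satisfies \(E_\theta[g(X)] = 0\) for every \(\theta > 0\), yet \(g(X) \ne 0\) with probability 1.
Example 3. The family \(\text{Uniform}(0,\theta)\), \(\theta>0\), is complete.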
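Example 1 can be made concrete with a little linear algebra. For \(X \sim \text{Binomial}(n,\theta)\), the condition \(E_\theta[g(X)] = 0\) at \(n+1\) distinct values of \(\theta\) is a linear system in the unknowns \(g(0),\dots,g(n)\); if its coefficient matrix is invertible, the only solution is \(g \equiv 0\). The sketch below checks invertibility exactly for \(n = 4\); the function names and the particular grid of \(\theta\) values are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_matrix(n, thetas):
    """Rows: parameter values; columns: x = 0..n; entries Pr_theta(X = x)."""
    return [
        [comb(n, x) * th**x * (1 - th)**(n - x) for x in range(n + 1)]
        for th in thetas
    ]

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]
    size = len(a)
    det = Fraction(1)
    for col in range(size):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det

n = 4
thetas = [Fraction(k, n + 2) for k in range(1, n + 2)]  # n + 1 distinct values in (0, 1)
det = determinant(binomial_matrix(n, thetas))
print(det != 0)  # → True
```

A nonzero determinant means that \(E_\theta[g(X)] = 0\) at these five parameter values already forces \(g(0) = \dots = g(4) = 0\), which is the completeness property for this family.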
Definition.
A family of distributions is a one-parameter exponential family if its pmf or pdf can be written
as:
\[
f(x;\theta) = \exp\{Q(\theta)T(x) + D(\theta) + S(x)\}.
\]
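Main Theorem.
In exponential families, the canonical statistic \(T\) (for a random sample of size \(n\), \(\sum_{i=1}^n T(X_i)\)) is sufficient.
If the range of \(Q(\theta)\), \(\theta \in \Omega\), contains an open
interval, then \(T\) is also
complete.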
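As a check of the exponential family form, two of the earlier models can be rewritten as follows (a worked sketch; the identification of \(Q\), \(T\), \(D\), \(S\) is spelled out here and is not in the original notes). For \(\text{Bernoulli}(\theta)\) with \(x \in \{0,1\}\):
\[
f(x;\theta) = \theta^{x}(1-\theta)^{1-x}
= \exp\left\{ x \log\frac{\theta}{1-\theta} + \log(1-\theta) \right\},
\]
so \(Q(\theta) = \log\frac{\theta}{1-\theta}\), \(T(x) = x\), \(D(\theta) = \log(1-\theta)\), \(S(x) = 0\). For \(\text{Poisson}(\lambda)\) with \(x \in \{0,1,2,\dots\}\):
\[
f(x;\lambda) = \frac{e^{-\lambda}\lambda^{x}}{x!}
= \exp\{ x \log\lambda - \lambda - \log x! \},
\]
so \(Q(\lambda) = \log\lambda\), \(T(x) = x\), \(D(\lambda) = -\lambda\), \(S(x) = -\log x!\). In both cases \(Q\) ranges over all of \(\mathbb{R}\), which contains an open interval, so for a random sample the statistic \(\sum_{i=1}^n X_i\) is complete and sufficient.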