To estimate the weight of women graduates (assumed normally distributed), we took a random sample of weights (in pounds):
125 132 120 137 123 159 113 165 143 176
Q: What is the population mean weight of women graduates?
Our intuition tells us to take the average of the sample to estimate the population mean. And this is correct.
But why does this work? Because the sample mean is the best guess for the population mean: among all candidate values of the population mean, it is the one that maximizes the probability (or "likelihood") of observing the sample we actually got. This is the basic idea of the "method of maximum likelihood". In applications, we need to prove it mathematically.
Why do we need the "method of moments" then? Because sometimes the maximum likelihood estimate is too difficult to calculate.
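In short: the intuition is to estimate the population mean weight with the sample mean.
Why does it work? Because the sample mean is the value of the population mean under which the observed sample is most likely. This is the theoretical basis of maximum likelihood.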
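As a quick check of the intuition above, the point estimate can be computed directly from the ten sampled weights. This is a minimal sketch; the variable names `weights` and `sample_mean` are illustrative choices, not part of the original notes.

```python
# Sample of weights (in pounds) from the example above.
weights = [125, 132, 120, 137, 123, 159, 113, 165, 143, 176]

# The sample mean serves as the point estimate of the population mean.
sample_mean = sum(weights) / len(weights)

print(f"sample mean = {sample_mean:.1f} pounds")  # prints: sample mean = 139.3 pounds
```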
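1. Write the density function f(xi;θ)=…
2. Build the joint density function (the likelihood) L(θ)=f(x1;θ)f(x2;θ)…f(xn;θ)
3. Set the derivative of L(θ) to zero and solve for θ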
1. Write the density function of the normal distribution, with θ1 = μ and θ2 = σ²:
\(\begin{equation*} f(x_i;\theta_1,\theta_2)=\dfrac{1}{\sqrt{\theta_2}\sqrt{2\pi}}\text{exp}\left[-\dfrac{(x_i-\theta_1)^2}{2\theta_2}\right] \end{equation*}\)
2. Build the joint probability mass (or density) function of X1, X2, …, Xn:
\(\begin{equation*} L(\theta_1,\theta_2)=\prod\limits_{i=1}^n f(x_i;\theta_1,\theta_2)=\theta^{-n/2}_2(2\pi)^{-n/2}\text{exp}\left[-\dfrac{1}{2\theta_2}\sum\limits_{i=1}^n(x_i-\theta_1)^2\right] \end{equation*}\)
3. Differentiate.
First take the natural logarithm:
\(\begin{equation*} \text{log} L(\theta_1,\theta_2)=-\dfrac{n}{2}\text{log}\theta_2-\dfrac{n}{2}\text{log}(2\pi)-\dfrac{\sum(x_i-\theta_1)^2}{2\theta_2} \end{equation*}\)
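Then take the partial derivatives with respect to θ1 and θ2 and set each to zero.
The partial derivative with respect to θ1 gives \(\begin{equation*} \sum x_i-n\theta_1=0 \end{equation*}\)
So: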
\[\begin{equation*} \hat{\theta}_1=\hat{\mu}=\dfrac{\sum x_i}{n}=\bar{x} \end{equation*}\]
The partial derivative with respect to θ2 gives \(\begin{equation*} -n\theta_2+\sum(x_i-\theta_1)^2=0 \end{equation*}\)
So, substituting the estimate of θ1:
\[\begin{equation*} \hat{\theta}_2=\hat{\sigma}^2=\dfrac{\sum(x_i-\bar{x})^2}{n} \end{equation*}\]
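The closed-form estimates above can be checked numerically on the sample of weights. The sketch below evaluates the log-likelihood from the derivation and confirms that nudging either parameter away from the estimates lowers it. The function name `log_likelihood` and the nudge sizes are assumptions made for illustration; this is a sanity check, not a proof.

```python
import math

weights = [125, 132, 120, 137, 123, 159, 113, 165, 143, 176]
n = len(weights)


def log_likelihood(mu, var, xs):
    """Normal log-likelihood log L(theta1, theta2), with theta1 = mu, theta2 = var."""
    m = len(xs)
    squared_dev = sum((x - mu) ** 2 for x in xs)
    return (-m / 2 * math.log(var)
            - m / 2 * math.log(2 * math.pi)
            - squared_dev / (2 * var))


# Closed-form maximum likelihood estimates from the derivation.
mu_hat = sum(weights) / n
var_hat = sum((x - mu_hat) ** 2 for x in weights) / n  # divides by n, not n - 1

best = log_likelihood(mu_hat, var_hat, weights)

# Log-likelihood after moving one parameter at a time (arbitrary nudge sizes).
nudged = [
    log_likelihood(mu_hat + d_mu, var_hat + d_var, weights)
    for d_mu, d_var in [(1, 0), (-1, 0), (0, 10), (0, -10)]
]
is_max = all(value < best for value in nudged)

print(f"mu_hat = {mu_hat:.1f}, var_hat = {var_hat:.2f}, local maximum: {is_max}")
```

For this sample the estimates come out to about 139.3 pounds for the mean and 400.21 squared pounds for the variance.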
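Treat the parameters as unknowns and build equations to solve for them: one equation for one parameter, two equations for two parameters, and so on. Each equation sets a population moment equal to the corresponding sample moment.
First equation: the first moment about the origin, \(\begin{equation*} E(X) \end{equation*}\), set equal to the sample moment \(\begin{equation*} M_1 \end{equation*}\): \(\begin{equation*} E(X)=\mu=M_1=\dfrac{1}{n}\sum\limits_{i=1}^n X_i \end{equation*}\)
Second equation: the second moment about the origin, \(\begin{equation*} E(X^2) \end{equation*}\), set equal to the sample moment \(\begin{equation*} M_2 \end{equation*}\): \(\begin{equation*} E(X^2)=\sigma^2+\mu^2=M_2=\dfrac{1}{n}\sum\limits_{i=1}^n X_i^2 \end{equation*}\)
And so on.
k-th equation: \(\begin{equation*} E[X^k] \end{equation*}\), set equal to the sample moment \(\begin{equation*} M_k=\dfrac{1}{n}\sum\limits_{i=1}^n X_i^k \end{equation*}\)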
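Alternatively, the equations can use moments about the mean.
First equation: \(\begin{equation*} E(X) \end{equation*}\), set equal to the sample moment \(\begin{equation*} M_1=\dfrac{1}{n}\sum\limits_{i=1}^n X_i=\bar{X} \end{equation*}\)
Second equation: \(\begin{equation*} E[(X-\mu)^2] \end{equation*}\), set equal to the sample moment \(\begin{equation*} M_2^\ast=\dfrac{1}{n}\sum\limits_{i=1}^n (X_i-\bar{X})^2 \end{equation*}\)
And so on.
k-th equation: \(\begin{equation*} E[(X-\mu)^k] \end{equation*}\), set equal to the sample moment \(\begin{equation*} M^\ast_k=\dfrac{1}{n}\sum\limits_{i=1}^n (X_i-\bar{X})^k \end{equation*}\)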
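The two families of sample moments can be sketched as a pair of small helper functions. The names `raw_moment` and `central_moment` are hypothetical, chosen here for illustration, and the data are the weights from the opening example.

```python
def raw_moment(xs, k):
    """k-th sample moment about the origin: (1/n) * sum(x_i ** k)."""
    return sum(x ** k for x in xs) / len(xs)


def central_moment(xs, k):
    """k-th sample moment about the mean: (1/n) * sum((x_i - x_bar) ** k)."""
    x_bar = raw_moment(xs, 1)
    return sum((x - x_bar) ** k for x in xs) / len(xs)


weights = [125, 132, 120, 137, 123, 159, 113, 165, 143, 176]

m1 = raw_moment(weights, 1)           # sample mean
m2 = raw_moment(weights, 2)           # mean of the squared observations
m2_star = central_moment(weights, 2)  # mean squared deviation from the sample mean

print(m1, m2, m2_star)
```

The first moment about the mean is always zero (up to rounding), which is why the equations about the mean still start from \(\begin{equation*} E(X) \end{equation*}\).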
There are two parameters to estimate, so two equations are enough:
The first equation is \(\begin{equation*} E(X)=\mu=\dfrac{1}{n}\sum\limits_{i=1}^n X_i \end{equation*}\)
This gives the method of moments estimate of μ: \(\begin{equation*} \hat{\mu}_{MM}=\dfrac{1}{n}\sum\limits_{i=1}^n X_i=\bar{X} \end{equation*}\)
The second equation is \(\begin{equation*} E(X^2)=\sigma^2+\mu^2=\dfrac{1}{n}\sum\limits_{i=1}^n X_i^2 \end{equation*}\)
Use it to solve for the second parameter, substituting the estimate of μ: \(\begin{equation*} \hat{\sigma}^2_{MM}=\dfrac{1}{n}\sum\limits_{i=1}^n X_i^2-\hat{\mu}_{MM}^2=\dfrac{1}{n}\sum\limits_{i=1}^n X_i^2-\bar{X}^2 \end{equation*}\)
That is:
\(\begin{equation*} \hat{\sigma}^2_{MM}=\dfrac{1}{n}\sum\limits_{i=1}^n( X_i-\bar{X})^2 \end{equation*}\)
With moments about the mean, μ is estimated the same way as in the first approach.
The first equation is \(\begin{equation*} E(X)=\mu=\dfrac{1}{n}\sum\limits_{i=1}^n X_i \end{equation*}\)
This gives the method of moments estimate of μ: \(\begin{equation*} \hat{\mu}_{MM}=\dfrac{1}{n}\sum\limits_{i=1}^n X_i=\bar{X} \end{equation*}\)
The second equation is \(\begin{equation*} \text{Var}(X_i)=E[(X_i-\mu)^2]=\sigma^2 \end{equation*}\)
This gives the method of moments estimate of σ²: \(\begin{equation*} \hat{\sigma}^2_{MM}=\dfrac{1}{n}\sum\limits_{i=1}^n (X_i-\bar{X})^2 \end{equation*}\)
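Both routes to the variance estimate should agree on any data set. The sketch below checks this on the weights from the opening example and compares the result with `statistics.pvariance` from the Python standard library, which also divides by n. The variable names are illustrative assumptions.

```python
import math
import statistics

weights = [125, 132, 120, 137, 123, 159, 113, 165, 143, 176]
n = len(weights)

# Method of moments estimate of the mean (identical in both routes).
mu_mm = sum(weights) / n

# Route 1: second moment about the origin minus the squared mean.
var_mm_raw = sum(x * x for x in weights) / n - mu_mm ** 2

# Route 2: second moment about the mean.
var_mm_central = sum((x - mu_mm) ** 2 for x in weights) / n

# The two routes match up to floating-point rounding.
routes_agree = math.isclose(var_mm_raw, var_mm_central)

# The population variance in the standard library uses the same 1/n divisor.
matches_pvariance = math.isclose(var_mm_central, statistics.pvariance(weights))

print(mu_mm, var_mm_raw, var_mm_central, routes_agree, matches_pvariance)
```

For the normal model these values also coincide with the maximum likelihood estimates, since both methods lead to the same formulas.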
Meaning: if A is a single-valued function of B with a single-valued inverse, then knowing the value of A is equivalent to knowing the value of B. Hence, if B is sufficient for a parameter, A is also sufficient for that parameter.
or
If the conditional distribution of the sample, given Y, does not depend on the parameter p, then Y is a sufficient statistic for p.
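Suppose the joint density f(x1, x2, …, xn; θ) depends on θ. If it can be factored into two parts, where φ depends on the data only through u(x1, x2, …, xn) and h does not depend on θ, then the statistic u(X1, X2, …, Xn) is sufficient for θ. That is: \(\begin{equation*} f(x_1, x_2, \ldots , x_n;\theta) = \phi [ u(x_1, x_2, \ldots , x_n);\theta ] h(x_1, x_2, \ldots , x_n) \end{equation*}\)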
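As a worked illustration of the factorization, consider a random sample from a Poisson distribution with mean λ. The Poisson model is an assumption chosen for this example and does not appear in the notes above. The joint probability mass function splits as:
\(\begin{equation*} f(x_1, x_2, \ldots , x_n;\lambda)=\prod\limits_{i=1}^n \dfrac{e^{-\lambda}\lambda^{x_i}}{x_i!}=\left[e^{-n\lambda}\lambda^{\sum x_i}\right]\cdot\dfrac{1}{\prod\limits_{i=1}^n x_i!} \end{equation*}\)
The first factor depends on the data only through \(\begin{equation*} u(x_1, x_2, \ldots , x_n)=\sum\limits_{i=1}^n x_i \end{equation*}\), and the second factor does not involve λ, so \(\begin{equation*} \sum\limits_{i=1}^n X_i \end{equation*}\) is sufficient for λ.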
If f(x;θ), with a support that does not depend on θ, can be written in the form: \(\begin{equation*} f(x;\theta) =\text{exp}\left[K(x)p(\theta) + S(x) + q(\theta) \right] \end{equation*}\)
then \(\begin{equation*} \sum_{i=1}^{n} K(X_i) \end{equation*}\) is sufficient for θ.
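As a worked illustration of the exponential form, consider a Bernoulli distribution with success probability θ. The Bernoulli model is an assumption chosen for this example and does not appear in the notes above. For x in {0, 1}:
\(\begin{equation*} f(x;\theta)=\theta^x(1-\theta)^{1-x}=\text{exp}\left[x\,\text{log}\left(\dfrac{\theta}{1-\theta}\right)+\text{log}(1-\theta)\right] \end{equation*}\)
Matching terms gives \(\begin{equation*} K(x)=x,\quad p(\theta)=\text{log}\left(\dfrac{\theta}{1-\theta}\right),\quad S(x)=0,\quad q(\theta)=\text{log}(1-\theta) \end{equation*}\)
so \(\begin{equation*} \sum_{i=1}^{n} K(X_i)=\sum_{i=1}^{n} X_i \end{equation*}\), the number of successes, is sufficient for θ.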