Ch8: 6b, 16d, 21c, 47d, 52d, 53, 69, 70, 71.

6

Suppose that \(X\sim bin(n,p)\)

  1. Show that the mle of \(p\) is \(\hat{p} = X/n\)
  2. Show that the mle of part (a) attains the Cramér-Rao Lower Bound
  3. If \(n=10\) and \(X = 5\), plot the log-likelihood

Solution

  1. We can write the log likelihood function as: \[l(p) = \ln\binom{n}{x} + x\ln(p) + (n-x)\ln(1-p)\] Taking the first derivative and setting it to zero to maximize, we get \[l'(p) = \frac{x}{p} - \frac{n-x}{1-p} = 0,\] which yields the estimate of \(p\): \[\hat{p} = X/n\] We check that this is a maximum by taking the second derivative and evaluating it at the derived estimate: \[l''(p) = -\frac{x}{p^2} - \frac{n-x}{(1-p)^2} \;\Rightarrow\; l''(\hat{p}) = -\frac{n^2}{x} - \frac{n^2}{n-x} < 0,\] since both terms are negative (for \(0 < x < n\)), so \(\hat{p}\) is indeed a maximum. (The log-likelihood of part 3 is plotted in the sketch after part 2.)

  2. To show that the mle above attains the Cramér-Rao Lower Bound we first find the bound, \(1/I(p)\), where \(I(p)\) is the Fisher information carried by \(X\). \[I(p) = -E\Big[\frac{\partial^2}{\partial p^2} \ln f(X|p)\Big] = -E\Big[-\frac{X}{p^2} - \frac{n-X}{(1-p)^2}\Big] = \frac{E(X)}{p^2} + \frac{n-E(X)}{(1-p)^2}\] \[ = \frac{np}{p^2} + \frac{n(1-p)}{(1-p)^2} = \frac{n}{p} + \frac{n}{1-p} = \frac{n}{p(1-p)}\] \[\therefore \frac{1}{I(p)} = \frac{p(1-p)}{n}\] Now we find the variance of the mle \(\hat{p}\) to compare: \[Var(\hat{p}) = Var\Big(\frac{1}{n}X\Big) = \frac{Var(X)}{n^2} = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n},\] and so indeed the mle attains the Cramér-Rao Lower Bound. (A numerical sketch of the part-3 plot and of this variance comparison follows below.)
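
The following is a brief numerical sketch (an addition to the solution, assuming `numpy` and `matplotlib` are available): it plots the log-likelihood for the part-3 values \(n=10\), \(X=5\), marks the analytic mle \(\hat{p} = X/n\), and checks by simulation that \(Var(\hat{p})\) matches the bound \(p(1-p)/n\); the true \(p\) used in the simulation is arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt
from math import comb

# Part 3: plot the binomial log-likelihood l(p) for n = 10, X = 5.
n, x = 10, 5
p = np.linspace(0.01, 0.99, 500)
loglik = np.log(comb(n, x)) + x * np.log(p) + (n - x) * np.log(1 - p)

plt.plot(p, loglik)
plt.axvline(x / n, linestyle="--", label=r"$\hat{p} = X/n$")
plt.xlabel("p")
plt.ylabel("log-likelihood")
plt.legend()
plt.show()

# Part 2: Monte Carlo check that Var(p_hat) is close to p(1 - p)/n.
rng = np.random.default_rng(0)
p_true = 0.3
p_hat = rng.binomial(n, p_true, size=200_000) / n
print("simulated Var(p_hat):", p_hat.var())               # approximately 0.021
print("CRLB p(1-p)/n      :", p_true * (1 - p_true) / n)  # 0.021
```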

16

Consider an i.i.d sample of random variables with density function \[f(x|\sigma) = \frac{1}{2\sigma}\exp\Big(-\frac{|x|}{\sigma}\Big)\]

  1. Find a sufficient statistic for \(\sigma\)

Solution

  1. To apply the factorization theorem we write the joint density: \[f(x_1, x_2, \cdots, x_n| \sigma) = \prod_{i=1}^n\frac{1}{2\sigma}\exp\Big(-\frac{|x_i|}{\sigma}\Big) = \frac{1}{(2\sigma)^n}\exp\Big( -\frac{\sum |x_i|}{\sigma} \Big)\] We have now factored the joint density into the form \[h(x)\,g(Y;\sigma)\] where \[h(x) = 1/2^n \;\;\text{ and }\;\;g(Y;\sigma) = \frac{1}{\sigma^n}e^{-Y/\sigma},\] and so by the factorization theorem \(Y = \sum |X_i|\) is sufficient for \(\sigma\). (A numerical check of this factorization is sketched below.)
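
As an illustration (an addition, not part of the original solution, assuming `numpy`), the sketch below checks that the log-likelihood computed term by term from a raw sample agrees with the factored form built only from \(Y=\sum|x_i|\); the Laplace sample and the grid of \(\sigma\) values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.laplace(loc=0.0, scale=2.0, size=50)  # i.i.d. sample with true sigma = 2
Y = np.abs(x).sum()                           # candidate sufficient statistic
n = x.size

for sigma in [0.5, 1.0, 2.0, 4.0]:
    direct = np.sum(-np.log(2 * sigma) - np.abs(x) / sigma)      # log-likelihood from raw data
    factored = -n * np.log(2.0) - n * np.log(sigma) - Y / sigma  # log[h(x) g(Y; sigma)]
    print(sigma, np.isclose(direct, factored))                   # True for every sigma
```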

21

Suppose that \(X_1, X_2, \cdots, X_n\) are i.i.d with density function \[f(x|\theta) = e^{-(x-\theta)}, \;\;\; x\geq \theta\] and \(f(x|\theta) = 0\) otherwise.

  1. Find a sufficient statistic for \(\theta\).

Solution

Building upon the reasoning behind the mle, the smallest observation carries all the information about \(\theta\); knowing the rest of the sample adds nothing. To make this precise, write the joint density as \[f(x_1,\cdots,x_n|\theta) = \prod_{i=1}^n e^{-(x_i-\theta)} = e^{n\theta}e^{-\sum x_i} \;\;\text{ whenever } \min_i x_i \geq \theta,\] and \(0\) otherwise. This has the form \(h(x)\,g(\min_i x_i;\theta)\) with \(h(x) = e^{-\sum x_i}\) and \(g(m;\theta) = e^{n\theta}\mathbf{1}\{m \geq \theta\}\), so by the factorization theorem \(\min(X_1,\cdots, X_n)\) is a sufficient statistic for \(\theta\). (The sketch below illustrates this numerically.)
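
As a small numerical illustration (an addition to the solution, assuming `numpy`), the two made-up samples below share the same minimum but differ otherwise; because the minimum is sufficient, the ratio \(L(\theta_1)/L(\theta_2)\) is the same for both samples, since the factor \(h(x)\) cancels.

```python
import numpy as np

def loglik(x, theta):
    """Log-likelihood for f(x|theta) = exp(-(x - theta)), x >= theta."""
    x = np.asarray(x)
    return -np.inf if x.min() < theta else np.sum(theta - x)

# Two made-up samples with the same minimum (1.0) but different remaining values.
a = np.array([1.0, 2.5, 3.0, 4.2])
b = np.array([1.0, 1.3, 5.7, 9.9])

t1, t2 = 0.2, 0.8
print(loglik(a, t1) - loglik(a, t2))  # log L(t1)/L(t2) for sample a: n*(t1 - t2) = -2.4
print(loglik(b, t1) - loglik(b, t2))  # identical for sample b, even though b differs
```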

47

The Pareto Distribution has been used in economics as a model for a density function with slowly decaying tail: \[f(x|x_0, \theta) = \theta x_0^\theta x^{-\theta-1}; \;\;\; x\geq x_0, \;\; \theta > 1\]

  1. Find a sufficient statistic for \(\theta\).

Solution

  1. We write the joint distribution as: \[f(x_1, ..., x_n|x_0, \theta) = \prod_{i=1}^n \frac{\theta x_0^\theta}{x_i^{\theta+1}} = \frac{\theta^n x_0^{n\theta}}{\big(\prod x_i\big)^{\theta+1}}\times (1)\] This has the form \(g\big(\prod x_i;\theta\big)\,h(x)\) with \(h(x) = 1\), so by the factorization theorem \(Y = \prod X_i\) is sufficient for \(\theta\). (Equivalently, \(\sum \ln X_i\) is sufficient; the sketch below checks this numerically.)
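
As a quick numerical check (an addition, assuming `numpy`): since \(h(x)=1\) here, two hypothetical samples with the same product \(\prod x_i\) have exactly the same likelihood at every \(\theta\), illustrating that the product carries all the information about \(\theta\).

```python
import numpy as np

def pareto_loglik(x, theta, x0=1.0):
    """Log-likelihood for f(x|x0, theta) = theta * x0**theta * x**(-theta - 1), x >= x0."""
    x = np.asarray(x)
    return np.sum(np.log(theta) + theta * np.log(x0) - (theta + 1) * np.log(x))

# Two made-up samples (all >= x0 = 1) with the same product, 12 in both cases.
a = np.array([1.2, 2.0, 5.0])
b = np.array([1.5, 2.0, 4.0])
print(np.prod(a), np.prod(b))  # both approximately 12.0

for theta in [1.5, 2.0, 3.0]:
    print(theta, pareto_loglik(a, theta), pareto_loglik(b, theta))  # matching pairs (up to rounding)
```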

52

Let \(X_1, \cdots, X_n\) be i.i.d random variables with density function \[f(x|\theta) = (\theta + 1)x^\theta, \;\;\; 0\leq x \leq 1\]

  1. Find a sufficient statistic for \(\theta\).

Solution

We write the pdf in exponential form: \[f(x|\theta) = \exp\Big\{ \ln[(\theta + 1)x^\theta] \Big\} = \exp\Big\{ \ln(\theta + 1) + \theta\ln(x) \Big\}\] By independence the joint distribution is the product of the marginals, so we have \[f(x_1, \cdots, x_n|\theta) = \prod_{i=1}^n\exp\Big\{ \ln(\theta + 1) + \theta\ln(x_i) \Big\} = \exp\Big\{ n\ln(\theta + 1) + \theta\sum\ln(x_i) \Big\},\] which depends on the data only through \(\sum\ln(x_i)\), and so by the factorization theorem (with \(h(x)=1\)) \(Y = \sum\ln(X_i)\) is a sufficient statistic for \(\theta\).

53

Let \(X_1, X_2, \cdots , X_n\) be i.i.d uniform on \([0,\theta]\).

  1. Find the method of moments estimate of \(\theta\) and its mean and variance.
  2. Find the mle of \(\theta\)
  3. Find the probability density of the mle, and calculate its mean and variance. Compare the variance, the bias, and the mean squared error to those of the method of moments estimate.
  4. Find a modification of the mle that renders it unbiased.

Solution

  1. First we find the mme of \(\theta\). We set \(M_1 = \bar{X}\) equal to \(\mu_1\), which we compute: \[\mu_1 = E(X) = \int_0^\theta x\frac{1}{\theta}dx = \frac{1}{\theta}\int_0^\theta x\,dx = \frac{1}{2\theta}\big[x^2\big]_0^\theta = \frac{\theta}{2}\] \[\Rightarrow \mu_1 = \theta/2 \Rightarrow \hat{\theta} = 2\bar{X}\] Now we find its mean and variance. \[E(\hat{\theta}) = E(2\bar{X}) = 2E(\bar{X}) = 2E(X_1) = 2\cdot\frac{\theta}{2} = \theta,\] so the mme is unbiased. For the variance, \[Var(\hat{\theta}) = Var(2\bar{X}) = 4Var(\bar{X}) = \frac{4}{n}Var(X_1),\] so by i.i.d it suffices to find \(Var(X_1) = E(X_1^2) - (E(X_1))^2\). We have \[E(X_1^2) = \frac{1}{\theta}\int_0^\theta x^2dx = \frac{1}{3\theta} x^3\Big|_0^\theta = \frac{\theta^2}{3}\] \[\therefore Var(X_1) = \frac{\theta^2}{3} - \frac{\theta^2}{4} = \frac{\theta^2}{12}\] \[\Rightarrow Var(\bar{X}) = \frac{\theta^2}{12n}\Rightarrow Var(\hat{\theta}) = 4\cdot\frac{\theta^2}{12n} = \frac{\theta^2}{3n}\]

  2. To find the mle of \(\theta\) we express the likelihood function as \[L(x_1,\cdots,x_n;\theta) = 1/\theta^n \;\; \text{ for } 0 \leq x_i \leq \theta \text{ for all } i, \;\; \text{ and } 0\;\;\text{o/w}\] The likelihood increases as \(\theta\) decreases, but we are restricted to \(\theta \geq X_i\;\forall i\), so the mle of \(\theta\) is \(\hat{\theta} = \max(X_1, \cdots, X_n)\).

  3. First we formulate the CDF of the mle. Let \(X_m = \max(X_1,\cdots,X_n)\) be the maximum of the observed data. Then \[F_{X_m}(x) = P(X_m \leq x) = P(X_1 \leq x, X_2 \leq x, ..., X_n \leq x) = \prod_{i=1}^n P(X_i \leq x) = \Big(\frac{x}{\theta}\Big)^n\] by i.i.d. Now to find the pdf we take the derivative of this: \[\partial F/\partial x = \frac{n}{\theta}\Big(\frac{x}{\theta}\Big)^{n-1}\] and we formally write the pdf as: \[f_\theta(x) = \begin{cases} \frac{n}{\theta}(\frac{x}{\theta})^{n-1}\;\; \text{ for }\;\; x\in [0,\theta]\\ 0 \;\; \text{ otherwise } \end{cases} \] Now we find its mean and variance to compare. For the mean we do: \[E(\hat{\theta}) = \int_0^\theta x \frac{n}{\theta}\Big(\frac{x}{\theta}\Big)^{n-1}dx = \frac{n}{\theta^n}\int_0^\theta x^n dx = \frac{n\theta}{n+1}\] To find the variance we use \(Var(\hat{\theta}) = E(\hat{\theta}^2) - (E(\hat{\theta}))^2\), so we find \(E(\hat{\theta}^2)\): \[E(\hat{\theta}^2) = \frac{n}{\theta^n}\int_0^\theta x^{n+1}dx = \frac{n}{\theta^n(n+2)}x^{n+2}\Big|_0^\theta = \frac{n\theta^2}{n+2}\] Applying the formula we get: \[Var(\hat{\theta}) = \frac{n\theta^2}{n+2} - \Big(\frac{n\theta}{n+1}\Big)^2 = \frac{n\theta^2}{(n+2)(n+1)^2}\] Lastly we calculate the bias: \[E(\hat{\theta}) - \theta = \frac{n\theta}{n+1} - \frac{(n+1)\theta}{n+1} = -\frac{\theta}{n+1}\] So the mme is unbiased while the mle is biased (it underestimates \(\theta\)). The MSE of the mme is therefore just its variance, \(\theta^2/(3n)\), while the MSE of the mle is \[Var(\hat{\theta}) + \text{bias}^2 = \frac{n\theta^2}{(n+2)(n+1)^2} + \frac{\theta^2}{(n+1)^2} = \frac{2\theta^2}{(n+1)(n+2)},\] which is of order \(1/n^2\) and hence much smaller than \(\theta^2/(3n)\) for large \(n\): despite its bias, the mle has the smaller mean squared error. (The simulation after part 4 illustrates this comparison.)

  4. Since \(E(\hat{\theta}) = \frac{n\theta}{n+1}\), all we have to do is multiply the mle by \(\frac{n+1}{n}\): then \(E\big(\frac{n+1}{n}\max(X_1,\cdots,X_n)\big) = \theta\), so the modified estimator is unbiased.
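
The simulation below (an addition to the solution, assuming `numpy`; the choices \(\theta = 3\), \(n = 20\) are arbitrary) compares the three estimators empirically: the mme \(2\bar{X}\), the mle \(\max X_i\), and the bias-corrected mle \(\frac{n+1}{n}\max X_i\).

```python
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 3.0, 20, 200_000

x = rng.uniform(0.0, theta, size=(reps, n))
mme = 2 * x.mean(axis=1)            # method-of-moments estimate, 2 * Xbar
mle = x.max(axis=1)                 # maximum likelihood estimate, max(X_i)
corrected = (n + 1) / n * mle       # bias-corrected mle from part 4

for name, est in [("mme", mme), ("mle", mle), ("corrected mle", corrected)]:
    bias = est.mean() - theta
    mse = np.mean((est - theta) ** 2)
    print(f"{name:13s} bias={bias:+.4f}  var={est.var():.5f}  mse={mse:.5f}")

# Theoretical values for comparison:
print("mme mse (its variance):", theta**2 / (3 * n))                  # 0.15
print("mle mse               :", 2 * theta**2 / ((n + 1) * (n + 2)))  # ~0.039
```

The mle's negative bias shows up clearly, yet its mse comes out far below the mme's, matching the comparison in part 3.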

69

Use the factorization theorem to conclude that \(T = \sum_{i = 1}^n X_i\) is a sufficient statistic when \(X_i\) are an i.i.d sample from a geometric distribution.


Solution

The geometric distribution has pmf \[f(x;p) = p(1-p)^{x-1}, \;\;\; x = 1, 2, \ldots\] We write it in exponential form and apply the factorization theorem: \[f(x_1,...,x_n; p) = \prod_{i=1}^n \exp\Big\{ x_i\ln(1-p) + \ln\Big(\frac{p}{1-p}\Big) \Big\} = \exp\Big\{ \ln(1-p)\sum x_i + n\ln\Big(\frac{p}{1-p}\Big) \Big\},\] which depends on the data only through \(\sum x_i\), and so by the factorization theorem (with \(h(x)=1\)) \(T = \sum X_i\) is a sufficient statistic for \(p\). (The rewritten pmf is checked numerically in the sketch below.)
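
A small sanity check (an addition, assuming `numpy`; the sample values and \(p\) are made up): the exponential-form pmf matches \(p(1-p)^{x-1}\) term by term, and the joint log-likelihood can be reconstructed from \(T=\sum x_i\) alone.

```python
import numpy as np

p = 0.3
x = np.array([1, 4, 2, 7, 3])  # made-up geometric sample (support 1, 2, ...)
n, T = x.size, x.sum()

# pmf written directly vs. in exponential form -- identical values
direct = p * (1 - p) ** (x - 1)
expform = np.exp(x * np.log(1 - p) + np.log(p / (1 - p)))
print(np.allclose(direct, expform))  # True

# joint log-likelihood from the raw data vs. from T alone
from_data = np.sum(np.log(direct))
from_T = T * np.log(1 - p) + n * np.log(p / (1 - p))
print(np.isclose(from_data, from_T))  # True
```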

70

Use the factorization theorem to find a sufficient statistic for the exponential distribution.

Solution

Consider an i.i.d sample \(X_1, X_2, ..., X_n\) from an exponential distribution with density \(f(x|\theta) = \frac{1}{\theta}e^{-x/\theta}\), \(x \geq 0\) (the scale parameterization). We can write the joint distribution as \[f(x_1, ..., x_n|\theta) = \prod_{i=1}^n \frac{1}{\theta}\exp\Big(-\frac{x_i}{\theta}\Big) = \frac{1}{\theta^n}\exp\Big( -\frac{1}{\theta}\sum x_i \Big) \times 1\] So we have \(g\big(\sum x_i;\theta\big) = \frac{1}{\theta^n}\exp\big( -\frac{1}{\theta}\sum x_i \big)\) and \(h(x) = 1\), and so \(T = \sum_{i = 1}^n X_i\) is sufficient for \(\theta\). (This factorization is checked numerically below.)
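
A quick numerical check of this factorization (an addition, assuming `numpy`; the sample and the grid of \(\theta\) values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.exponential(scale=2.0, size=40)  # i.i.d. exponential sample
n, T = x.size, x.sum()

for theta in [0.5, 1.0, 2.0, 5.0]:
    direct = np.sum(np.log(1 / theta) - x / theta)  # log-likelihood summed over the raw data
    factored = -n * np.log(theta) - T / theta       # log of g(T; theta), with h(x) = 1
    print(theta, np.isclose(direct, factored))      # True for every theta
```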

71

Let \(X_1, ..., X_n\) be an i.i.d sample taken from a distribution with density function: \[f(x|\theta) = \frac{\theta}{(1+x)^{\theta+1}}\] Find a sufficient statistic for \(\theta\)


Solution

By i.i.d we can write the joint distribution of the sample as \[f(x_1, ..., x_n|\theta) = \theta^n\Big( \prod(1+x_i) \Big)^{-\theta -1}\] and so by the factorization theorem we have that \(\prod (1+X_i)\) is sufficient for \(\theta\).