5.1

The Rayleigh distribution has pdf: \[f(x) = xe^{-x^2/2}, \;\; x > 0\]

  1. Find \(P(1 < X < 3)\)
  2. Find the first quartile, median and third quartile of \(X\).

Solution

  1. To find this probability we set up the integral \[\int_1^3 xe^{-x^2/2}\,dx\] To evaluate it we use the substitution \(u = x^2/2\), so that \(du = x\,dx\). The limits become \(u = 1/2\) (at \(x = 1\)) and \(u = 9/2\) (at \(x = 3\)), which allows us to rewrite the integral as: \[\int_{1/2}^{9/2}e^{-u}\,du = -\Big[ e^{-u}\Big]_{1/2}^{9/2} = -e^{-9/2} + e^{-1/2} \approx 0.5954217\]

  2. First we find the CDF of the Rayleigh distribution: \[F(x_0) = \int_0^{x_0} xe^{-x^2/2}\,dx = -e^{-x^2/2}\Big|_0^{x_0} = 1 - e^{-x_0^2/2}\] For the first quartile we solve: \[.25 = F(p_1) \Rightarrow .25 = 1 - e^{-p_1^2/2} \Rightarrow e^{-p_1^2/2} = .75 \Rightarrow -p_1^2/2 = \ln(.75) \Rightarrow p_1 \approx 0.7585276\] Similarly, for the median, \[.5 = F(p_2) \Rightarrow e^{-p_2^2/2} = .5 \Rightarrow -p_2^2/2 = \ln(.5) \Rightarrow p_2 \approx 1.17741\] Lastly, the same steps for the third quartile give \[p_3 \approx 1.665109\] In general, \(F^{-1}(q) = \sqrt{-2\ln(1-q)}\); a quick R check of the three quantiles follows.
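
A minimal R sketch of that check (the helper name `rayleigh_q` is just for illustration):

# quantile function of the Rayleigh: F^{-1}(q) = sqrt(-2*log(1 - q))
rayleigh_q <- function(q) sqrt(-2 * log(1 - q))
# first quartile, median, third quartile -- approximately 0.759, 1.177, 1.665
rayleigh_q(c(.25, .5, .75))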

5.6

The \(68-95-99.7\%\) rule gives approximate probabilities of a Normal r.v. being within 1, 2 and 3 standard deviations of its mean. Derive analogous rules for the following distributions.

  1. \(Unif(0,1)\)
  2. \(Expo(1)\)
  3. \(Expo(1/2)\)

Solution

  1. For the standard uniform distribution we know that \[f(x) = 1 \;\;\text{ for }\;\; 0 \leq x \leq 1\] Moreover, the mean is \(1/2\) and the standard deviation is \(1/(2\sqrt{3})\approx 0.2886751\), so the problem reduces to calculating how much probability lies within the following intervals: \[(0.2113249, 0.7886751),\; (-0.0773502, 1.0773502),\; (-0.3660253, 1.3660253)\]

We can easily calculate this using R

Unif_mean <- 1/2
Unif_sd <- c(-1, 1) * 1/(2*sqrt(3))   # +/- one standard deviation
# declare the intervals
first_int <- Unif_mean + Unif_sd
second_int <- Unif_mean + 2*Unif_sd
third_int <- Unif_mean + 3*Unif_sd
# calculate the probability in these intervals
first_rule <- punif(first_int[2]) - punif(first_int[1])
second_rule <- punif(second_int[2]) - punif(second_int[1])
third_rule <- punif(third_int[2]) - punif(third_int[1])

So the rule is as follows: for \(Unif(0,1)\), the probabilities of being within 1, 2 and 3 standard deviations of the mean are approximately \(57.7\%\), \(100\%\) and \(100\%\) (the 2 and 3 standard deviation intervals already contain all of \((0,1)\)).

  2. \(X \sim Expo(1) \Rightarrow f(x) = e^{-x}\) for \(x \geq 0\). Moreover, the mean is 1 and the s.d. is 1. So, as above, we find the probabilities of being within 1, 2 and 3 standard deviations of the mean. We do this in R:
e_mean <- c(1,1); e_sd <- c(-1,1)
# intervals to evaluate over
within_1 <- e_mean + e_sd
within_2 <- e_mean + 2*e_sd
within_3 <- e_mean + 3*e_sd
# rules
e_rule_1 <- pexp(within_1[2]) - pexp(within_1[1])
e_rule_2 <- pexp(within_2[2]) - pexp(within_2[1])
e_rule_3 <- pexp(within_3[2]) - pexp(within_3[1])

So the rules are as follows: for \(Expo(1)\), the probabilities of being within 1, 2 and 3 standard deviations of the mean are \(1 - e^{-2}\approx 86.5\%\), \(1 - e^{-3}\approx 95.0\%\) and \(1 - e^{-4}\approx 98.2\%\).

  3. \(X\sim Expo(1/2) \Rightarrow f(x) = \frac{1}{2}e^{-x/2}\) for \(x \geq 0\). Moreover, the mean is 2 and the s.d. is also 2. We complete the exercise as we did above:
e2_mean <- c(2,2); e2_sd <- c(-2,2)
# intervals to evaluate over
within_11 <- e2_mean + e2_sd
within_22 <- e2_mean + 2*e2_sd
within_33 <- e2_mean + 3*e2_sd
# rules
e2_rule_1 <- pexp(within_11[2], .5) - pexp(within_11[1], .5)
e2_rule_2 <- pexp(within_22[2], .5) - pexp(within_22[1], .5)
e2_rule_3 <- pexp(within_33[2], .5) - pexp(within_33[1], .5)

So the rules are as follows: rescaling the Exponential scales the intervals and the rate by the same factor, so the probabilities match the \(Expo(1)\) case: approximately \(86.5\%\), \(95.0\%\) and \(98.2\%\).

5.8

The Beta distribution with parameters \(a = 3\) and \(b = 2\) has the PDF \[f(x) = 12x^2(1 - x); \;\; 0 \leq x \leq 1\]

  1. Find the CDF of \(X\)
  2. Find \(P(0 < X < 1/2)\)
  3. Find the mean and variance of \(X\)

Solution

  1. To find the CDF we integrate the PDF: \[F(x) = \int_0^{x}12t^2(1 - t)\,dt = 12\Big[ \int_0^{x}t^2\,dt - \int_0^{x}t^3\,dt \Big] = 12\Big[ \frac{x^3}{3} - \frac{x^4}{4} \Big] = 4x^3 - 3x^4\] for \(0 \leq x \leq 1\) (with \(F(x) = 0\) for \(x < 0\) and \(F(x) = 1\) for \(x > 1\)).

  2. To find the desired probability we use the CDF derived above: \[F(1/2) - F(0) = [4(1/2)^3 - 3(1/2)^4] - 0 = .3125\]

  3. To find the mean we do: \[E(X) = \int_0^1 x12x^2(1-x)dx = 12\int_0^1 x^3(1-x)dx = 12\Big[ \int_0^1x^3dx - \int_0^1x^4dx \Big]\] \[ = 12\Big[ \frac{x^4}{4}\Big|_0^1 - \frac{x^5}{5}\Big|_0^1 \Big] = 12(\frac{1}{4} - \frac{1}{5}) = .6\] To find the variance we use: \(Var(X) = E(X^2) - [E(X)]^2\), so we need to find \(E(X^2)\) \[E(X^2) = \int_0^1 x^2 12x^2(1-x)dx\] \[12 \int_0^1 x^4(1-x)dx = 12\Big[ \frac{x^5}{5}\Big|_0^1 - \frac{x^6}{6}\Big|_0^1 \Big] = 12(1/5 - 1/6) = .4\] Plugging the results into the Variance formula we get: \[Var(X) = E(X^2) - [E(X)]^2 = .4 - .6^2 = .04\]
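
These answers can be sanity-checked against R's built-in Beta(3, 2) functions (a short sketch; for a Beta(\(a,b\)) r.v. the mean is \(a/(a+b)\) and the variance is \(ab/[(a+b)^2(a+b+1)]\)):

a <- 3; b <- 2
pbeta(0.5, a, b)                     # P(0 < X < 1/2), should be 0.3125
a / (a + b)                          # mean, should be 0.6
(a * b) / ((a + b)^2 * (a + b + 1))  # variance, should be 0.04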

5.11

Let \(U\) be a Uniform r.v. on the interval \((-1,1)\).

  1. Compute \(E(U), Var(U)\), and \(E(U^4)\)
  2. Find the CDF and PDF of \(U^2\). Is the distribution of \(U^2\) Uniform on (0,1)?

Solution

In the PDF calculations below I use \(x\) as the variable of integration in place of \(u\).

  1. Although \(E(U)\) is obviously \(0\), here is the calculation: \[E(U) = \int_{-1}^1 \frac{x}{2}\, dx = \frac{1}{2}\int_{-1}^1 x\,dx = \frac{1}{4}x^2\Big|_{-1}^1 = \frac{1}{4}(1 - 1) = 0\] For \(Var(U)\) we first need to calculate \(E(U^2)\): \[E(U^2) = \frac{1}{2}\int_{-1}^1 x^2dx = \frac{1}{2}\Big(\frac{x^3}{3}\Big)\Big|_{-1}^1 = \frac{1}{6}(2) = 1/3\] Plugging this into \(Var(U) = E(U^2) - [E(U)]^2\) we have: \[Var(U) = (1/3) - 0 = 1/3\] Lastly we compute \(E(U^4)\): \[E(U^4) = \frac{1}{2}\int_{-1}^1 x^4dx = \frac{1}{2}\Big(\frac{x^5}{5}\Big)\Big|_{-1}^1 = \frac{1}{10}(2) = 1/5\]

  2. Let \(Y = U^2\); we first find the CDF of \(Y\). Note that the support of \(U\) is \([-1,1]\), so the support of \(Y\) is \([0,1]\). We derive the CDF now: \[F_Y(y) = P(Y \leq y) = P(U^2 \leq y) = P(-\sqrt{y} \leq U \leq \sqrt{y}) = F_U(\sqrt{y}) - F_U(-\sqrt{y})\] \[ = \frac{\sqrt{y}+1}{2} - \frac{-\sqrt{y}+1}{2} = \sqrt{y}\] So we can write out the CDF as follows: \[F_Y(y) = \begin{cases} 0 \;\; \text{ for } y\leq 0\\ \sqrt{y} \;\;\text{ for }\;\; y \in (0,1)\\ 1 \;\; \text{ for } y \geq 1 \end{cases} \] Now we find the PDF by differentiating the CDF: \[\frac{d}{dy}F_Y(y) = \frac{d}{dy}\sqrt{y} = \frac{1}{2\sqrt{y}}\] We can write out the PDF as follows: \[f_Y(y) = \begin{cases} \frac{1}{2\sqrt{y}}\;\; \text{ for } \;\; y\in (0,1)\\ 0 \;\;\text{ otherwise } \end{cases} \] We can check that this integrates to 1: \[\int_0^1 \frac{1}{2\sqrt{y}}dy = \frac{1}{2}\int_0^1 y^{-1/2}dy = \sqrt{y}\big|_0^1 = 1\] This distribution is clearly not \(Unif(0,1)\): the density \(\frac{1}{2\sqrt{y}}\) is far from constant, blowing up near 0 and decreasing to \(1/2\) at \(y = 1\) (in fact \(U^2 \sim Beta(1/2, 1)\)). A quick plot of the density confirms this.
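
A minimal plotting sketch (using qplot as in the other chunks here, which assumes ggplot2 is available):

library(ggplot2)
# density of Y = U^2 on (0,1): it blows up near 0, so Y is clearly not Unif(0,1)
y <- seq(0.001, 1, by = 0.001)
qplot(y, 1 / (2 * sqrt(y)), geom = "line", ylab = "f_Y(y)")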

5.14

Let \(U_1,U_2, \cdots, U_n\) be i.i.d \(Unif(0,1)\) and \(X = max(U_1, U_2, \cdots, U_n)\). What is the pdf of \(X?\) What is \(EX?\)


Solution

We first find the CDF as follows: \[F_X(x) = P(X \leq x) = P(\max(U_1, U_2, \cdots, U_n) \leq x) = P(U_1 \leq x, \cdots, U_n \leq x)\] and by the i.i.d. property we can write this as: \[\prod_{i = 1}^n P(U_i \leq x) = \prod_{i=1}^n F_U(x) = x^n\] And we can write the CDF as follows: \[F_X(x) = \begin{cases} 0; \;\; x \leq 0\\ x^n; \;\; x \in (0,1)\\ 1; \;\; x \geq 1 \end{cases} \] To find the PDF we simply take the derivative of the CDF we derived above: \[f_X(x) = nx^{n-1}\;\; \text{for}\;\; 0 \leq x \leq 1\] And to find the expected value of \(X\) we simply apply the definition to yield: \[E(X) = \int_0^1 x\,nx^{n-1}dx = n\int_0^1 x^n dx = \frac{n}{n+1}x^{n+1}\Big|_0^1= \frac{n}{n+1}\]
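
As a quick check, a small simulation sketch (with an arbitrary choice of \(n = 5\)) agrees with \(E(X) = n/(n+1)\):

n <- 5
# simulate the max of n i.i.d. Unif(0,1) draws many times
sims <- replicate(100000, max(runif(n)))
mean(sims)     # should be close to n/(n+1)
n / (n + 1)    # theoretical value, about 0.833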

5.15

Let \(U\sim Unif(0,1)\). Using \(U\), construct \(X\sim Exp(\lambda)\).


Solution

First we want to find the quantile function of the exponential (i.e., \(F^{-1}\)). We know from previous problems that the CDF of an exponential is given by: \[F(t) = 1 - e^{-\lambda t}\] Using algebra we derive the inverse: \[y = 1 - e^{-\lambda t} \Rightarrow e^{-\lambda t} = 1 - y\] \[\Rightarrow -\lambda t = \ln(1 - y) \Rightarrow t = - \ln(1-y)/\lambda\] \[\therefore F^{-1}(y) = - \ln(1-y)/\lambda\] Now, by the universality of the Uniform, \(X = F^{-1}(U)\) is an r.v. with CDF \(F\). Carrying this out yields: \[F^{-1}(U) = -\frac{\ln(1 - U)}{\lambda}\sim Exp(\lambda)\] A quick simulation can validate this:

library(ggplot2)   # provides qplot (loaded here in case it is not already)
F_exp <- function(x, l) { -log(1 - x) / l }   # inverse CDF of Exp(l)
ru <- runif(100000, 0, 1)
unif_to_exp <- F_exp(ru, 1)
# mean is about right
mean(unif_to_exp)
## [1] 1.002318
# and a plot 
qplot(unif_to_exp)

5.18

The Pareto distribution with \(a > 0\) has PDF \(f(x) = a/x^{a+1}\) for \(x\geq 1\) and 0 otherwise.

  1. Find the CDF of the Pareto r.v. with parameter \(a\) and check that it is a valid CDF.
  2. Suppose that for a simulation you want to run, you need to generate i.i.d. Pareto(a) r.v.s. You have a computer that knows how to generate i.i.d. Unif(0, 1) r.v.s but does not know how to generate Pareto r.v.s. Show how to do this.

Solution

  1. We find the CDF in the usual fashion (for \(x \geq 1\)): \[F_X(x)= \int_1^x a t^{-(a+1)}\,dt = a\Big[\frac{t^{-a}}{-a}\Big]_1^x = 1 - x^{-a}\] which yields \[F_X(x) = \begin{cases} 0 \;\; \text{ for }\;\; x < 1\\ 1-x^{-a}\;\; \text{ for }\;\; x \geq 1 \end{cases}\] We now show that this is a valid CDF. First we show that it is nondecreasing: let \(1 \leq x_1 \leq x_2\); we show that \(F_X(x_1) \leq F_X(x_2)\), or equivalently that \(F_X(x_1) - F_X(x_2) \leq 0\). So, \[F_X(x_1) - F_X(x_2) = -x_1^{-a} + x_2^{-a} = \frac{1}{x_2^a} - \frac{1}{x_1^a}\] Since \(x_1 \leq x_2 \Rightarrow 1/x_1 \geq 1/x_2\), and recalling that the parameter \(a > 0\), it follows that: \[\frac{1}{x_2^a} - \frac{1}{x_1^a} \leq 0 \Rightarrow F_X(x_1) - F_X(x_2) \leq 0\] Now we check the two limit criteria: \[\lim_{x \rightarrow +\infty}F_X(x) = \lim_{x \rightarrow +\infty}\Big(1 - \frac{1}{x^a}\Big) = 1 - 0 = 1 \;\;\text{ as desired}\] and, since \(F_X(x) = 0\) for all \(x < 1\), \[\lim_{x \rightarrow -\infty}F_X(x) = 0\;\;\text{ as desired}\] (indeed \(F_X\) is continuous at 1, where \(1 - x^{-a} \rightarrow 0\)).

  2. By the universality of the Uniform: first we find the quantile function of the Pareto. \[F_X(x) = 1 - x^{-a}\] \[ y = 1 - x^{-a} \Rightarrow x^{-a} = 1 - y \Rightarrow -a\ln(x) = \ln(1-y)\] \[\ln(x) = -\ln(1-y)/a \Rightarrow x = \exp\Big\{ -\ln(1-y)/a \Big\} = (1-y)^{-1/a}\] It follows then that: \[X = (1 - U)^{-1/a} = \exp\Big\{-\frac{\ln(1 - U)}{a}\Big\}\sim Pareto(a)\] So one would simply generate, say, 100000 \(Unif(0,1)\) random variables \(u\) using the computer and transform them with \((1-u)^{-1/a}\); the transformed values follow the Pareto(\(a\)) distribution, as sketched below.
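
A short R sketch of this recipe (with an arbitrary choice \(a = 2\)); the empirical CDF of the transformed draws should match \(1 - x^{-a}\):

a <- 2
u <- runif(100000)
x <- (1 - u)^(-1 / a)      # Pareto(a) draws from Unif(0,1) draws
# compare empirical and theoretical CDF at a couple of points
mean(x <= 2); 1 - 2^(-a)   # both about 0.75
mean(x <= 5); 1 - 5^(-a)   # both about 0.96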

5.24

The distance between two points needs to be measured in meters. The true distance between the points is 10 meters, but due to measurement error we can’t measure the distance exactly. Instead we will observe a value of \(10+\varepsilon\), where \(\varepsilon\) is distributed \(N(0, 0.04)\). Find the probability that the observed distance is within \(0.4\) meters of the true distance (10 meters). Give both an exact answer in terms of \(\Phi\) and an approximate numerical answer.


Solution

What we are looking for is the probability \[P(-.4 \leq \varepsilon \leq .4)\] We are given that \(\varepsilon \sim N(0, .04)\), so its standard deviation is \(\sqrt{.04} = .2\). Standardizing gives the exact answer in terms of the standard Normal CDF \(\Phi\): \[P(-.4 \leq \varepsilon \leq .4) = P\Big(-2 \leq \frac{\varepsilon}{.2} \leq 2\Big) = \Phi(2) - \Phi(-2) = 2\Phi(2) - 1\] The numerical approximation is:

upper_b <- pnorm(.4, mean = 0, sd = sqrt(.04))
lower_b <- pnorm(-.4, mean = 0, sd = sqrt(.04))
is_approx <- upper_b - lower_b

So we have that \(\Phi(2) - \Phi(-2) = 2\Phi(2) - 1 \approx\) 0.9544997.

5.30

Let \(Y\sim N(\mu, \sigma^2)\). Use the fact that \(P(|Y-\mu| < 1.96\sigma) \approx 0.95\) to construct a random interval \((a(Y), b(Y))\) such that the probability that \(\mu\) is in this interval is approximately \(0.95\).


Solution

We can achieve this by manipulating the expression \(P(|Y-\mu| < 1.96\sigma) \approx 0.95\) algebraically: \[P(|Y-\mu| < 1.96\sigma) \approx 0.95\] \[P(-1.96\sigma < Y-\mu < 1.96\sigma) \approx 0.95\] \[P(-1.96\sigma - Y< -\mu < 1.96\sigma - Y )\approx 0.95\] \[P(Y - 1.96\sigma < \mu < Y + 1.96\sigma)\approx 0.95\] So define: \[a(x) = x - 1.96\sigma \;\;\text{ and }\;\;b(x) = x + 1.96\sigma\] then we have that \(P(a(Y) < \mu < b(Y)) \approx 0.95\) as desired.
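
A quick simulation sketch of the coverage (with \(\sigma = 1\) and an arbitrary choice of \(\mu\)):

mu <- 10
y <- rnorm(100000, mean = mu, sd = 1)
# fraction of random intervals (Y - 1.96, Y + 1.96) that contain mu
mean(y - 1.96 < mu & mu < y + 1.96)   # approximately 0.95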

5.31

Let \(Y = |X|\), with \(X\sim N(\mu, \sigma^2)\). This is a well-defined continuous r.v. even though the absolute value function is not differentiable at 0.

  1. Find the CDF of \(Y\) in terms of \(\Phi\)
  2. Find the PDF of \(Y\)
  3. Is the PDF of \(Y\) continuous at 0? If not, is this a problem as far as using the PDF to find probabilities?

Solution

  1. We approach this in the usual way. For \(y > 0\), \[F_Y(y) = P(Y \leq y) = P(|X| \leq y) = P(-y \leq X \leq y) = \Phi\Big(\frac{y - \mu}{\sigma}\Big) - \Phi\Big(\frac{-y - \mu}{\sigma}\Big)\] Now, the support of \(X\) is \((-\infty, +\infty)\), therefore the support of \(Y = |X|\) is \([0, +\infty)\). We formally write out the CDF as: \[F_Y(y) = \begin{cases} 0 \;\;\text{ for } \;\;y \leq 0\\ \Phi\Big(\frac{y - \mu}{\sigma}\Big) - \Phi\Big(\frac{-y - \mu}{\sigma}\Big)\;\; \text{ for }\;\;y \in (0, \infty) \end{cases} \] (In the special case \(\mu = 0, \sigma = 1\) this simplifies, by the symmetry of the Normal, to \(\Phi(y) - \Phi(-y) = 1 - 2\Phi(-y)\).)

  2. To find the PDF of \(Y\) we simply differentiate the CDF derived above, writing \(\varphi\) for the standard Normal PDF: \[f_Y(y) = \frac{d}{dy}\Big[\Phi\Big(\frac{y - \mu}{\sigma}\Big) - \Phi\Big(\frac{-y - \mu}{\sigma}\Big)\Big] = \frac{1}{\sigma}\varphi\Big(\frac{y - \mu}{\sigma}\Big) + \frac{1}{\sigma}\varphi\Big(\frac{y + \mu}{\sigma}\Big)\;\; \text{ for }\;\; y > 0\] using the chain rule and the fact that the Normal density is an even function (\(\varphi(-z) = \varphi(z)\)). In the special case \(\mu = 0, \sigma = 1\) this reduces to \(2\varphi(y)\) for \(y > 0\).

  3. The PDF of \(Y\) is not continuous at 0: it equals 0 for \(y < 0\) but approaches \(\frac{2}{\sigma}\varphi\big(\frac{\mu}{\sigma}\big) > 0\) as \(y \rightarrow 0^+\). This is not a problem for using the PDF to find probabilities, since probabilities come from integrating the PDF and a jump at a single point does not affect the integral. A plot of the \(\mu = 0, \sigma = 1\) case confirms the jump at 0:

x <- seq(0,5, by = .001)
qplot(x, 2*dnorm(x),geom = "line")

5.32

Let \(Z\sim N(0,1)\) and let \(S\) be a random sign independent of \(Z\). Show that \(SZ\sim N(0,1)\).


Solution

We can describe the random sign via a Bernoulli trial: let \(B\sim Bern(1/2)\) and set \(S = 1\) if \(B = 1\) and \(S = -1\) if \(B = 0\), so that \(P(S = 1) = P(S = -1) = 1/2\). Now, conditioning on \(S\) (law of total probability) and using the independence of \(S\) and \(Z\), \[P(SZ \leq x) = P(Z \leq x)P(S = 1) + P(-Z \leq x)P(S = -1) = \frac{1}{2}\Phi(x) + \frac{1}{2}\Phi(x) = \Phi(x)\] where we used the fact that \(-Z \sim N(0,1)\) by the symmetry of the standard Normal. Since \(SZ\) has CDF \(\Phi\), we conclude \(SZ\sim N(0,1)\).
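
A quick simulation sketch is consistent with this (`sample` draws the random sign):

z <- rnorm(100000)
s <- sample(c(-1, 1), 100000, replace = TRUE)   # random sign, independent of z
sz <- s * z
mean(sz); sd(sz)   # approximately 0 and 1, as for a standard Normal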

6.3

Solution

By a previous problem (5.18) we have that the CDF of the Pareto is given by: \[F_X(x) = 1 - 1/x^a\] To find the median we set this equal to \(.5\) and solve: \[.5 = 1 - 1/x^a \Rightarrow 1/x^a = .5 \Rightarrow x^a = 2\] \[\Rightarrow a\ln(x) = \ln(2) \Rightarrow \ln(x) = \ln(2)/a \Rightarrow m = \exp(\ln(2)/a) = 2^{1/a}\]
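
A quick numerical check of the median formula (a sketch with an arbitrary \(a = 3\), generating Pareto draws as in problem 5.18):

a <- 3
u <- runif(100000)
x <- (1 - u)^(-1 / a)   # Pareto(a) draws
median(x)               # empirical median
2^(1 / a)               # theoretical median, about 1.26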

6.7

Solution

As recommended, we take the log of the kernel \(x^{a-1}(1-x)^{b-1}\) (the part of the Beta(\(a, b\)) PDF that depends on \(x\)): \[g(x) = \log[x^{a-1}(1-x)^{b-1}] = (a-1)\log(x) + (b-1)\log(1-x)\] Now we take the first derivative, set it to zero and solve for \(x\) to find the desired value. \[g'(x) = (a-1)/x + (b-1)/(x-1) = 0 \Rightarrow (x-1)(a-1) + x(b-1) = 0\] \[\Rightarrow xa - x - a + 1 + bx -x = 0 \Rightarrow xa - 2x + bx = a - 1\] \[\Rightarrow x(a + b - 2) = a -1 \Rightarrow x = \frac{a-1}{a + b - 2}\] which is the desired mode.
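
A quick numerical check of this formula (a sketch with arbitrary \(a = 3, b = 2\), maximizing the built-in Beta density):

a <- 3; b <- 2
# numerically maximize the Beta(a, b) density over (0, 1)
optimize(function(x) dbeta(x, a, b), interval = c(0, 1), maximum = TRUE)$maximum
(a - 1) / (a + b - 2)   # formula above, should agree (2/3 here)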

6.8

Solution

Using the setup of the previous problem and plugging in the parameter values, the PDF of this Beta is proportional to \(x^2\) on \((0,1)\); including the normalizing constant it is \(3x^2\), so the CDF is \[F(x) = x^3, \;\; 0 \leq x \leq 1\] (note that \(\frac{1}{3}x^3\) cannot be a CDF on \((0,1)\), since it never reaches 1). Setting this equal to \(.5\) and solving gives \[x^3 = .5 \Rightarrow \text{median} = .5^{1/3} \approx 0.7937\]

6.15

Solution

Find the MGF of W. We simply apply the definition of the MGF: \[M_W(t) = E(e^{tW}) = E(e^{t(X^2 + Y^2)}) = E(e^{tX^2}e^{tY^2})\] Then, by independence and the fact that \(E(e^{tZ^2}) = (1 - 2t)^{-1/2}\) for \(Z\sim N(0,1)\) and \(t < 1/2\): \[E(e^{tX^2})E(e^{tY^2}) = (1 - 2t)^{-1/2}(1 - 2t)^{-1/2} = \frac{1}{1-2t}, \;\; t < 1/2\]
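
A simulation sketch is consistent with this MGF, assuming (as the calculation above does) that \(X\) and \(Y\) are i.i.d. \(N(0,1)\):

w <- rnorm(100000)^2 + rnorm(100000)^2   # draws of W = X^2 + Y^2
t <- 0.2
mean(exp(t * w))   # empirical MGF at t = 0.2
1 / (1 - 2 * t)    # theoretical value, about 1.67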

6.17

Solution

  1. \[E(Z_n) = E(\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}) = (\frac{\sqrt{n}}{\sigma})E[(\bar{X}_n) - \mu] = 0\] and \[Var(Z_n) = (n/\sigma^2)Var(\bar{X}_n - \mu) = (n/\sigma^2)(\sigma^2/n) = 1\]

  2. \[Z_n = \frac{(X_1+\cdots + X_n)/n - \mu}{\sigma/\sqrt{n}} = \frac{\sum X_i}{\sigma\sqrt{n}} - \frac{\mu\sqrt{n}}{\sigma}\] By the properties of MGFs and the independence of the \(X_i\) we have that: \[M_{Z_n}(t) = \exp\Big\{ -t(\mu\sqrt{n}/\sigma)\Big\}\,M\big(t/(\sigma\sqrt{n})\big)^n\]

6.21

Solution

With \(X_n \sim Bin(n, p_n)\) and \(p_n = \lambda/n\), \[E(e^{tX_n}) = (1 - p_n + p_ne^t)^n = \big(1 + \lambda(e^t - 1)/n\big)^n\] Now, as \(n\rightarrow \infty\), we have that: \[\big(1 + \lambda(e^t - 1)/n\big)^n \rightarrow e^{\lambda(e^t - 1)} = E(e^{tX})\] which is the MGF of \(X \sim Pois(\lambda)\).
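
A quick numerical sketch of this convergence (with arbitrary choices \(\lambda = 3\), \(t = 0.5\)):

lambda <- 3; t <- 0.5
# Bin(n, lambda/n) MGF at t for increasing n
sapply(c(10, 100, 10000), function(n) (1 - lambda/n + (lambda/n) * exp(t))^n)
# Pois(lambda) MGF at t -- the limit, about 7.0
exp(lambda * (exp(t) - 1))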