Draw a picture of the CDF of X over the interval x ∈ [0, 8]. What are P(2 < X ≤ 4.5) and P(2 ≤ X < 4.5)?
See Submission Doc for CDF.
\(P(2 < X \leq 4.5)\) is equal to \(F_{X}(4.5) - F_{X}(2)\). Reading the values off the CDF,
\(P(2 < X \leq 4.5) = F_{X}(4.5) - F_{X}(2) = 0.2 - 0.1 = 0.1\).
Similarly, for \(P(2 \leq X < 4.5)\) we adjust both endpoints using the identity \(P(X=x) = P(X \leq x) - P(X < x)\), which gives \(P(X < x) = F_{X}(x) - P(X = x)\). Writing \(f_{X}(x)\) for the point mass \(P(X = x)\):
\(P(2 \leq X < 4.5) = (F_{X}(4.5) - f_{X}(4.5)) - (F_{X}(2) - f_{X}(2)) = (0.2 - 0) - (0.1 - 0.1) = 0.2\).
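The two interval formulas can be sketched in code. The PMF below is hypothetical: it is chosen only so that \(F_X(2) = 0.1\), \(P(X = 2) = 0.1\), \(F_X(4.5) = 0.2\), and \(P(X = 4.5) = 0\), matching the values used above. It is not the distribution from the assignment.

```python
from fractions import Fraction

# Hypothetical PMF (an assumption for illustration only). It reproduces
# F(2) = 0.1, P(X = 2) = 0.1, F(4.5) = 0.2 and P(X = 4.5) = 0.
pmf = {2: Fraction(1, 10), 3: Fraction(1, 10), 6: Fraction(8, 10)}


def cdf(x):
    """F(x) = P(X <= x)."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


def point_mass(x):
    """P(X = x); zero wherever the CDF has no jump."""
    return pmf.get(x, Fraction(0))


def prob_open_closed(a, b):
    """P(a < X <= b) = F(b) - F(a)."""
    return cdf(b) - cdf(a)


def prob_closed_open(a, b):
    """P(a <= X < b) = (F(b) - P(X = b)) - (F(a) - P(X = a))."""
    return (cdf(b) - point_mass(b)) - (cdf(a) - point_mass(a))


print(prob_open_closed(2, 4.5))  # 1/10
print(prob_closed_open(2, 4.5))  # 1/5
```

The only difference between the two results is the jump of the CDF at \(x = 2\), which is included in the second interval and excluded from the first.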
\(X \sim U(0,1)\). Per class slides, the uniform distribution has the following PDF:
\(f_{X}(x)= \begin{cases} \frac{1}{1-0}, & \text{if}\ 0 \leq x \leq 1 \\ 0, & \text{else}. \\ \end{cases}\)
And CDF:
\(F_{X}(x)= \begin{cases} 0, & \text{if} \ x < 0 \\ \frac{x}{1-0}, & \text{if}\ 0 \leq x \leq 1 \\ 1, & \text{if} \ x > 1 \\ \end{cases}\)
So,
\(\begin{aligned} Pr(X^2 \leq 1/4) &= Pr(X \leq \sqrt{1/4}) \\ &= F_{X}(\sqrt{1/4}) \\ &= \sqrt{1/4} \\ &= 1/2. \end{aligned}\)
Similar to above, for \(a \geq 0\) we have \(P(X^2 \leq a) = P(X \leq \sqrt{a})\), since \(X\) is non-negative. So,
\(F_{X^2}(a)= \begin{cases} 0, & \text{if} \ a < 0 \\ \frac{\sqrt{a}}{1-0}, & \text{if}\ 0 \leq a \leq 1,\\ 1, & \text{if} \ a > 1. \\ \end{cases}\)
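As a rough check on this CDF, the sketch below compares \(\sqrt{a}\) with the simulated fraction of draws for which \(X^2 \leq a\). The function names, the number of draws, and the seed are illustrative choices and not part of the assignment.

```python
import math
import random


def empirical_cdf_of_square(a, n_draws=200_000, seed=0):
    """Estimate P(X^2 <= a) for X ~ U(0, 1) by simulation."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_draws) if rng.random() ** 2 <= a)
    return hits / n_draws


def theoretical_cdf_of_square(a):
    """Piecewise CDF of X^2: 0 below 0, sqrt(a) on [0, 1], 1 above 1."""
    if a < 0:
        return 0.0
    if a > 1:
        return 1.0
    return math.sqrt(a)


for a in (0.04, 0.25, 0.81):
    print(a, empirical_cdf_of_square(a), theoretical_cdf_of_square(a))
```

The case \(a = 1/4\) corresponds to the probability \(Pr(X^2 \leq 1/4) = 1/2\) computed earlier.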
For \(Y = X^2\), \(f_{Y}(y) = F'_{Y}(y)\). Thus, for \(0 < y \leq 1\),
\(\begin{aligned} F_{Y}(y) &= \sqrt{y} \\ F'_{Y}(y) &= \frac{d}{dy}\sqrt{y} \\ &= \frac{1}{2\sqrt{y}}. \\ \end{aligned}\)
\(\begin{aligned} E[Y] &= \int_{-\infty}^{\infty}yf_Y(y)dy \\ &= \int_{0}^{1}yf_Y(y)dy \\ &= \int_{0}^{1}y \cdot \frac{1}{2\sqrt{y}}dy \\ &= \int_{0}^{1}\frac{\sqrt{y}}{2}dy \\ &= \left[\frac{y^{3/2}}{3}\right]_0^1 \\ &= \frac{(1)^{3/2}}{3} - \frac{(0)^{3/2}}{3} \\ &= 1/3. \end{aligned}\)
\(\begin{aligned} Var[Y] &= E[(Y - E[Y])^2] \\ &= \int_{0}^{1}(y - E[Y])^2f_Y(y)dy \\ &= \int_{0}^{1}(y - 1/3)^2\frac{1}{2\sqrt{y}}dy \\ \end{aligned}\)
\(\int_{0}^{1}(y - 1/3)^2\frac{1}{2\sqrt{y}}dy\) is tedious to evaluate by hand. Evaluating it with Symbolab gives \(Var[Y] = \frac{4}{45}.\)
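For reference, the integral can also be worked by hand by expanding the square and integrating term by term. This is a sketch of one possible route, not necessarily the intended one:

\(\begin{aligned} \int_{0}^{1}(y - 1/3)^2\frac{1}{2\sqrt{y}}dy &= \int_{0}^{1}\left(\frac{y^{3/2}}{2} - \frac{y^{1/2}}{3} + \frac{y^{-1/2}}{18}\right)dy \\ &= \left[\frac{y^{5/2}}{5} - \frac{2y^{3/2}}{9} + \frac{y^{1/2}}{9}\right]_0^1 \\ &= \frac{1}{5} - \frac{2}{9} + \frac{1}{9} \\ &= \frac{4}{45}. \end{aligned}\)

The same value follows from \(Var[Y] = E[Y^2] - E[Y]^2\) with \(E[Y^2] = E[X^4] = \int_0^1 x^4 dx = 1/5\), since \(1/5 - (1/3)^2 = 4/45\).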
Suppose that we have \(d\) independent standard normal random variables \(Z_1, \ldots, Z_d\), where each \(Z_i \sim N(0, 1)\). We say that a continuous random variable \(X\) follows a chi-squared (\(\chi^2\)) distribution with \(d\) degrees of freedom if \(X \cong Z_1^2 + \cdots + Z_d^2\). Compute \(E(X)\) using the relationship between the normal and the chi-squared.
Given \(X \sim \chi_d^2\), we can use what we know about the mean and variance of a standard normal \(Z \sim N(0, 1)\). To start, \(E[X] = E[Z_1^2 + ... + Z_d^2].\) By linearity of expectation, the expectation of a sum is the sum of the expectations. Thus,
\(E[X] = E[Z_1^2] + ... + E[Z_d^2]\)
We know that \(Var(Z) = E[Z^2]-E[Z]^2\). So,
\(\begin{aligned} E[Z^2] &= Var(Z) + E[Z]^2 \\ &= \sigma^2 + \mu^2 \\ &= (1)^2 + (0)^2 = 1. \end{aligned}\)
Therefore, plugging in,
\(\begin{aligned} E[X] &= E[Z_1^2] + ... + E[Z_d^2] \\ &= \underbrace{1 + ... + 1}_{d \text{ terms}} \\ &= d. \end{aligned}\)
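A quick simulation can confirm that the mean of a sum of \(d\) squared standard normals is close to \(d\). This is a minimal sketch. The function name, draw count, and seed are illustrative assumptions.

```python
import random


def simulate_chi_squared_mean(d, n_draws=50_000, seed=0):
    """Average of n_draws simulated values of Z_1^2 + ... + Z_d^2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # One chi-squared draw: the sum of d squared N(0, 1) draws.
        total += sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d))
    return total / n_draws


for d in (1, 3, 10):
    print(d, simulate_chi_squared_mean(d))
```

Each printed average should land near the corresponding value of \(d\), up to Monte Carlo error.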
Markov converts his expected speed into an expected time, but the expected speed is already a probability-weighted average. Because time is a nonlinear function of speed, the time at the expected speed is not the expected time, so the transformation is incorrect. I would convert each speed to a travel time first and then take the expected value, as follows:
\(\begin{aligned} E[T] &= 0.4(2*(1/5)) + 0.6(2*(1/10)) \\ &= \frac{8}{50} + \frac{12}{100} \\ &= \frac{7}{25}. \\ \end{aligned}\)
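The gap between the two approaches can be seen with exact arithmetic. The values below are read off the calculation above (a distance of 2, speed 5 with probability 0.4, speed 10 with probability 0.6). The "naive" calculation, distance divided by expected speed, is my reading of Markov's approach and is an assumption. No units are assumed.

```python
from fractions import Fraction

# Values taken from the expected-time calculation above; units are not stated.
distance = Fraction(2)
speeds = {Fraction(5): Fraction(4, 10), Fraction(10): Fraction(6, 10)}

# Probability-weighted average speed.
expected_speed = sum(v * p for v, p in speeds.items())

# Assumed form of the incorrect shortcut: time at the expected speed.
naive_time = distance / expected_speed

# Convert each speed to a time first, then average.
expected_time = sum((distance / v) * p for v, p in speeds.items())

print(expected_speed)  # 8
print(naive_time)      # 1/4
print(expected_time)   # 7/25
```

The shortcut gives \(1/4\), which understates the expected time of \(7/25\).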
Suppose that U is a continuous random variable with a uniform distribution on [0, 1]. Now suppose that f is the PDF of some continuous random variable of interest, that F is the corresponding CDF, and assume that F is invertible, so that the function \(F^{-1}(u)\) exists and assigns a unique value to each \(u \in (0, 1)\). Show that the random variable \(X = F^{-1}(U)\) has PDF \(f(x)\).
Begin with the CDF of \(X\), \(Pr(X \leq x)\). Substituting for \(X\), we have \(Pr(F^{-1}(U) \leq x)\). Because \(F\) is an invertible CDF, it is strictly increasing, so applying \(F\) to both sides preserves the inequality and removes the inverse: \(Pr(F(F^{-1}(U)) \leq F(x)) = Pr(U \leq F(x))\). Since \(U \sim U(0,1)\) has CDF \(Pr(U \leq u) = u\) for \(u \in [0, 1]\), we get \(Pr(U \leq F(x)) = F(x)\). So \(X\) has CDF \(F\), and differentiating gives \(F'(x) = f(x)\). Therefore, \(X = F^{-1}(U)\) has PDF \(f(x)\).
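The result is the basis of inverse-transform sampling. As an illustration, the sketch below assumes an exponential target with rate \(\lambda\), for which \(F(x) = 1 - e^{-\lambda x}\) and \(F^{-1}(u) = -\ln(1-u)/\lambda\). The choice of target, the rate, and the function names are assumptions made for the example.

```python
import math
import random


def inverse_exponential_cdf(u, rate):
    """F^{-1}(u) for F(x) = 1 - exp(-rate * x), valid for 0 <= u < 1."""
    return -math.log(1.0 - u) / rate


def sample_by_inverse_transform(n_draws, rate, seed=0):
    """Push U(0, 1) draws through F^{-1} to get draws with CDF F."""
    rng = random.Random(seed)
    return [inverse_exponential_cdf(rng.random(), rate) for _ in range(n_draws)]


rate = 2.0  # illustrative rate
draws = sample_by_inverse_transform(100_000, rate)

# Compare the empirical CDF at one point with the target CDF F(x).
x = 0.5
empirical = sum(1 for value in draws if value <= x) / len(draws)
print(empirical, 1.0 - math.exp(-rate * x))
```

The two printed numbers should agree up to Monte Carlo error, which is what \(Pr(F^{-1}(U) \leq x) = F(x)\) predicts.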
Let \(X_N \sim Binom(N, P)\) be the (random) number of successes in a sequence of N binary trials. Let \(\hat{p}_N = X_N/N\) denote the proportion of observed successes. Calculate \(E[\hat{p}_N]\) and \(sd(\hat{p}_N)\).
\(E[\hat{p}_N]\) can be calculated as follows. Expected value is a linear operator, and \(\frac{1}{N}\) is a constant, so we can pull \(\frac{1}{N}\) outside the expectation. Thus,
\(E[\hat{p}_N] = E[\frac{X_N}{N}] = \frac{1}{N}E[X_N]\)
We are given \(X_N \sim Binom(N, P)\). From the text we know that \(E[X_N] = NP\). Thus,
\(E[\hat{p}_N] = \frac{NP}{N} = P\).
\(sd(\hat{p}_N)\) can be calculated as follows. Let us start by taking \(Var[\hat{p}_N] = Var[\frac{X_N}{N}]\). As with \(E[\hat{p}_N]\), we can factor out the constant \(\frac{1}{N}\), but because variance scales with the square of a constant, it comes out as \(\frac{1}{N^2}\). We are left with
\(Var[\hat{p}_N] = \frac{1}{N^2}Var[X_N]\)
Plugging in for \(Var[X_N] = NP(1-P)\) and taking square root, we have,
\(sd[\hat{p}_N] = \sqrt{Var[\hat{p}_N]} = \sqrt{\frac{P(1-P)}{N}}.\)
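Both formulas can be checked exactly, without simulation, by summing over the binomial PMF. This is a minimal sketch. The function name and the values of \(N\) and \(P\) are illustrative.

```python
import math


def exact_mean_and_sd_of_proportion(n, p):
    """Mean and sd of X/n for X ~ Binom(n, p), summed over the binomial PMF."""
    pmf = [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum((k / n) * w for k, w in enumerate(pmf))
    var = sum((k / n - mean) ** 2 * w for k, w in enumerate(pmf))
    return mean, math.sqrt(var)


n, p = 5, 0.5
print(exact_mean_and_sd_of_proportion(n, p))
print(p, math.sqrt(p * (1 - p) / n))
```

The two printed pairs should match up to floating-point rounding.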
Verify that the Monte Carlo mean and standard deviation of your simulated \(\hat{p}_5\)'s agree, at least approximately, with the theoretical mean and standard deviation computed from your result in (A).
The first six simulated values of \(\hat{p}_5\) and the requested summary statistics from the Monte Carlo simulation are shown below.
## result
## 1 0.0
## 2 0.8
## 3 0.6
## 4 0.0
## 5 0.4
## 6 0.8
## mean sd
## 0.4884 0.227678
As shown above, the Monte Carlo simulation yielded a mean of \(0.4884\), close to our calculated mean of \(P = 0.5\). The Monte Carlo simulation yielded a standard deviation of \(\approx 0.2277\), which is also close to our calculated sd of \(\sqrt{\frac{P(1-P)}{N}} = \sqrt{\frac{(0.5)(0.5)}{5}} \approx 0.2236\).
Now repeat the process in part (B) for \(\hat{p}_{10}\), \(\hat{p}_{25}\), \(\hat{p}_{50}\), and \(\hat{p}_{100}\). Make a graph that overlays two sets of points: (i) the sample standard deviation of \(\hat{p}_N\) versus N for the five different values of N that you used in your simulations; and (ii) the corresponding theoretical standard deviations versus N, calculated from your result in Part A. Comment on the patterns you see in the graph.
Here, we see that both the Monte Carlo and the calculated standard deviations decrease as N increases. The decrease is not exponential: from Part A, the sd is proportional to \(1/\sqrt{N}\). It falls quickly for small N, the change in sd with respect to N shrinks as N grows, and the sd approaches zero.
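The comparison behind the graph can be sketched as follows. The original simulation code is not shown in this write-up, so this is an independent sketch. It uses \(P = 0.5\) as in the \(N = 5\) check above, while the number of replications and the seed are assumptions.

```python
import math
import random
import statistics


def simulate_proportions(n, p, n_reps, rng):
    """Draw n_reps values of p-hat = (successes in n Bernoulli(p) trials) / n."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(n_reps)]


rng = random.Random(0)
p = 0.5         # success probability, as in the N = 5 check
n_reps = 2_000  # assumed number of Monte Carlo replications

results = {}
for n in (5, 10, 25, 50, 100):
    draws = simulate_proportions(n, p, n_reps, rng)
    mc_sd = statistics.stdev(draws)
    theory_sd = math.sqrt(p * (1 - p) / n)
    results[n] = (mc_sd, theory_sd)
    # Last column: theory_sd * sqrt(n) is constant, i.e. sd scales as 1/sqrt(n).
    print(n, round(mc_sd, 4), round(theory_sd, 4), round(theory_sd * math.sqrt(n), 4))
```

Plotting the second and third columns against \(N\) would give the two overlaid sets of points. The constant last column reflects the \(1/\sqrt{N}\) scaling.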
Suppose that \(X_1, \ldots, X_N\) are a set of \(N\) independent samples from an exponential distribution with rate λ. Let \(Y_N = \max\{X_1, \ldots, X_N\}\) be the maximum value in your sample. Derive the PDF of \(Y_N\) for fixed \(N\).
Logic for the following derived with help from ChatGPT:
Begin with CDF of \(Y_N\), or \(F_{Y_N}(y) = Pr(Y_N \leq y)\). Since \(Y_N\) is a max, we can construct the following multivariate case:
\(F_{Y_N}(y) = Pr(Y_N \leq y) = Pr(X_1 \leq y, ... , X_N \leq y)\)
Since all X are independent, we can rewrite the joint CDF \(Pr(X_1 \leq y, ... , X_N \leq y)\) as a product of all the marginal probabilities. Thus,
\(F_{Y_N}(y) = Pr(X_1 \leq y)*...*Pr(X_N \leq y)\).
To simplify further, we rewrite each factor using the definition of the CDF, adjusted for the domain of X, and the identity provided in the problem: \(F_{X}(y) = \int_0^{y}\lambda e^{-\lambda x}dx = 1-e^{-\lambda y}\) for \(y \geq 0\).
\(\begin{aligned} F_{Y_N}(y) &= \int_0^y\lambda e^{-\lambda x_1}dx_1 \cdot ... \cdot \int_0^y\lambda e^{-\lambda x_N}dx_N \\ &= [1-e^{-\lambda y}]^N \end{aligned}\)
Lastly, we take the derivative of the CDF with respect to y, using the chain rule:
\(f_{Y_N}(y) = F'_{Y_N}(y) = N[1-e^{-\lambda y}]^{N-1} \cdot \lambda e^{-\lambda y}, \quad y \geq 0.\)
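The differentiation step can be sanity-checked numerically. The sketch below compares a central-difference derivative of the CDF \([1-e^{-\lambda y}]^N\) with the chain-rule density \(N \lambda e^{-\lambda y}[1-e^{-\lambda y}]^{N-1}\). The rate, \(N\), step size, and function names are illustrative assumptions.

```python
import math


def cdf_of_max(y, rate, n):
    """F_{Y_N}(y) = (1 - exp(-rate * y))^n for y >= 0."""
    return (1.0 - math.exp(-rate * y)) ** n


def pdf_of_max(y, rate, n):
    """Chain-rule derivative of cdf_of_max with respect to y."""
    base = 1.0 - math.exp(-rate * y)
    return n * rate * math.exp(-rate * y) * base ** (n - 1)


def central_difference(f, y, h=1e-6):
    """Numerical derivative of f at y."""
    return (f(y + h) - f(y - h)) / (2.0 * h)


rate, n = 1.5, 4  # illustrative values
for y in (0.2, 1.0, 3.0):
    numeric = central_difference(lambda t: cdf_of_max(t, rate, n), y)
    print(y, numeric, pdf_of_max(y, rate, n))
```

The numerical and analytic columns should agree to several decimal places. Dropping the \(\lambda e^{-\lambda y}\) factor from the density would break that agreement.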