library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
set.seed(905)

Question 1

The price of one share of stock in the Pilsdorff Beer Company (see Exercise 8.2.12) is given by \(Y_n\) on the \(n\)th day of the year. Finn observes that the differences \(X_n = Y_{n+1} - Y_n\) appear to be independent random variables with a common distribution having mean \(\mu = 0\) and variance \(\sigma^2 = 1/4\). If \(Y_1 = 100\), estimate the probability that \(Y_{365}\) is…

I’d like to approach this problem intuitively first, then arrive at the ‘true’ theoretical value.

Given a variance of \(1/4\), we know that the standard deviation of \(X_n\) is \(1/2\) or \(0.5\).

Experimentally, we can approach our probability by generating random normal draws with our mean and standard deviation, using a sample size of \(364\) (the number of daily changes from \(Y_1\) to \(Y_{365}\)). For each trial, the sum of those draws equals the total change from \(Y_1\) to \(Y_{365}\), which we can add to \(100\) to get \(Y_{365}\). From there, the percentage of trials at or above a certain value gives us the rough probability associated with each end value.

Y_365s <- c()

for (i in 1:100000) {
  # 364 independent daily changes between day 1 and day 365, each with mean 0 and sd 0.5
  dist <- rnorm(n=364, mean=0, sd=0.5)
  # the sum of the daily changes is the total change over the year
  full_year_change <- sum(dist)
  Y_365_trial <- 100 + full_year_change
  
  Y_365s <- c(Y_365s, Y_365_trial)
}

\(Y_{365}\geq100\):

sum(ifelse(Y_365s >= 100, 1, 0)) / 100000
## [1] 0.49731

\(Y_{365}\geq110\):

sum(ifelse(Y_365s >= 110, 1, 0)) / 100000
## [1] 0.14694

\(Y_{365}\geq120\):

sum(ifelse(Y_365s >= 120, 1, 0)) / 100000
## [1] 0.01845

So our experimental probabilities are roughly \(49.7\%\), \(14.7\%\), and \(1.8\%\) respectively. Let’s see how they work out through calculation.

a) >= 100

If the mean change is \(0\) over a normal distribution, the expected total change over any number of days is itself \(0\), with equal likelihood of ending above or below it. Therefore, the probability of \(Y_{365}\) being greater than or equal to \(Y_1\) at 100 is \(50\%\), very close to my experimental value.

b) >= 110

We are essentially looking for the probability that the sum of all the daily changes is greater than or equal to 10.

Because the daily changes are independent, the total variance of their sum is given by \(n * \sigma^2\), so the standard deviation is the square root of that variance: \(\sqrt{n * \sigma^2}=\sqrt{364 * (1/4)}=\sqrt{91}\approx9.54\).

If one standard deviation above 100 is 109.54, how many standard deviations take us to 110? \(10 / 9.54 \approx 1.05\) standard deviations, which corresponds to about \(35.3\%\) of the area under the curve between the mean and that point. Adding the \(50\%\) of the area to the left of the mean, about \(85.3\%\) of outcomes fall below 110, and \(1-0.853=0.147\), or \(14.7\%\), fall above it. This corresponds nicely to my experimental value above.
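
For reference, the same tail area can be pulled straight from pnorm; this is a small check I’m adding, using the \(\sqrt{91}\) standard deviation derived above:

# P(total change >= 10) under a normal with mean 0 and sd sqrt(364 * 1/4)
pnorm(10, mean = 0, sd = sqrt(364 * 0.25), lower.tail = FALSE)

This should land right around the \(14.7\%\) found by hand and by simulation.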

c) >= 120

We are now looking for the probability that the sum of all the daily changes is greater than or equal to 20.

We know from the calculation in b.) that the standard deviation of the total change is about \(9.54\). 20 is \(20 / 9.54 \approx 2.1\) standard deviations above the mean of 0.

A z-score (the number of standard deviations above the mean) of 2.1 accounts for roughly \(98.2\%\) of the area under the curve, leaving \(1.8\%\) to the right of our value. Again, this corresponds to our experimental value.
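
The same quick check for this part (again my own addition, assuming the same normal approximation):

# P(total change >= 20), roughly 2.1 standard deviations above 0
pnorm(20, mean = 0, sd = sqrt(364 * 0.25), lower.tail = FALSE)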

Question 2

Calculate the expected value and variance of the binomial distribution using the moment generating function.

In general, for a discrete distribution the moment generating function is given by the sum from \(x=0\) to \(n\) of \(e^{xt}\) times the probability mass function. Using the binomial distribution, this gives us:

\[ \sum_{x=0}^{n}e^{xt}\frac{n!}{x!(n-x)!}p^x q^{n-x} \] Recognizing this as the binomial expansion of \((pe^t+q)^n\), it can be simplified to the following:

\[ (pe^t+q)^n \]

In order to solve for any given moment, including the first (the expected value), we must first take the derivative of this formula, giving us:

\[ n(pe^t+q)^{n-1} * pe^t \] Finally, the first moment is given by evaluating this derivative at \(t=0\) (remembering that \(q=1-p\)):

\[ n(pe^0+q)^{n-1} * pe^0 \\ n(p+q)^{n-1}*p \\ n(p+1-p)^{n-1}*p \\ n*p \] So the expected value of the binomial distribution is simply \(n*p\), or the number of trials times the probability of success in each trial.
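
As a rough numerical sanity check (my own sketch; the values \(n=10\) and \(p=0.3\) are arbitrary, not from the problem), we can differentiate the MGF \((pe^t+q)^n\) numerically at \(t=0\) and compare it to \(np\):

# arbitrary example values for the check
n <- 10; p <- 0.3; q <- 1 - p
mgf_binom <- function(t) (p * exp(t) + q)^n

h <- 1e-4
# first derivative of the MGF at t = 0 via a central difference; should be near n*p = 3
(mgf_binom(h) - mgf_binom(-h)) / (2 * h)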

The variance is given by taking the second moment (the second derivative of the MGF at \(t=0\)) and subtracting the square of the first moment. Evaluated at \(t=0\), the second derivative of the MGF works out to:

\[ np + n^2p^2-np^2 \] Subtracting the square of the first moment, we are left with:

\[ np + n^2p^2-np^2 - (np)^2 \\ np + n^2p^2-np^2 - n^2p^2 \\ np-np^2 \\ np(1-p) \]

…or, the expected value times the probability of failure of any given trial.
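
The same kind of numerical check works for the variance: take the second derivative of the MGF at \(t=0\) (the second moment) and subtract the squared mean. Again, this is just a sketch with the arbitrary \(n=10\) and \(p=0.3\):

n <- 10; p <- 0.3; q <- 1 - p
mgf_binom <- function(t) (p * exp(t) + q)^n
h <- 1e-4

# second central difference approximates the second derivative of the MGF at t = 0
second_moment <- (mgf_binom(h) - 2 * mgf_binom(0) + mgf_binom(-h)) / h^2
# subtracting the squared first moment should recover n*p*(1-p) = 2.1
second_moment - (n * p)^2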

Question 3

Calculate the expected value and variance of the exponential distribution using the moment generating function.

The exponential distribution is continuous, so its MGF uses an integral over the density rather than a sum. Knowing the exponential density is \(\lambda e^{-\lambda x}\) for \(x \geq 0\), we arrive at the following:

\[ \int_{0}^{\infty}e^{xt}\lambda e^{-\lambda x}dx \] For \(t < \lambda\), this simplifies down to the following:

\[ \frac{\lambda}{\lambda - t} \]

In order to get the expected value (the first moment, found by evaluating the first derivative at \(t=0\)), we take the first derivative of the MGF. This works out to:

\[ \frac{\lambda}{(\lambda - t)^2} \] Evaluating this at \(t=0\) as above, we are left with:

\[ \frac{\lambda}{(\lambda)^2} = \frac{1}{\lambda} \]

So the expected value (mean) is the reciprocal of the rate.
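
A quick numerical check of this result (my own addition, with an arbitrary rate \(\lambda = 2\)): differentiating \(\lambda/(\lambda - t)\) at \(t=0\) should return \(1/\lambda = 0.5\).

lambda <- 2                                     # arbitrary rate for the check
mgf_exp <- function(t) lambda / (lambda - t)    # valid for t < lambda
h <- 1e-4

# first derivative of the MGF at t = 0; compare to 1 / lambda = 0.5
(mgf_exp(h) - mgf_exp(-h)) / (2 * h)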

As above, the variance is given by the second moment (the second derivative at \(t=0\)) minus the square of the first moment (the first derivative at \(t=0\)).

The second derivative of the MGF is \(\frac{2\lambda}{(\lambda - t)^3}\), which evaluated at \(t=0\) gives:

\[ \frac{2}{\lambda^2} \] Subtracting the square of the first moment, we are left with:

\[ \frac{2}{\lambda^2} - (\frac{1}{\lambda })^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2} \]
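
And the matching check for the variance, again a sketch with the arbitrary \(\lambda = 2\), where the answer should come out near \(1/\lambda^2 = 0.25\):

lambda <- 2
mgf_exp <- function(t) lambda / (lambda - t)
h <- 1e-4

# second moment from the second derivative at t = 0, minus the squared mean
second_moment <- (mgf_exp(h) - 2 * mgf_exp(0) + mgf_exp(-h)) / h^2
second_moment - (1 / lambda)^2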