Background

Within statistics and data science, a number of distributions recur frequently. The purpose of this assignment is to explore these important distributions and observe some of their properties.


(1) Uniform distribution

Let \(X_1\), \(X_2\), …, \(X_n\) be n mutually independent random variables, each of which is uniformly distributed on the integers from 1 to k. Let Y denote the minimum of the \(X_i\)’s. Find the distribution of Y.

Each \(X_i\) takes one of k values (1, 2, …, k), so the total number of equally likely assignments of \(X_1\), \(X_2\), …, \(X_n\) is \(k^n\). To find P(Y = j), we count the assignments whose minimum is exactly j. The number of ways of getting:

  • Y = 1 is \(k^n - (k-1)^n\). The \((k-1)^n\) assignments in which no \(X_i\) = 1 are subtracted from the total number of assignments.

  • Y = 2 is \((k^n - (k-2)^n)-(k^n - (k-1)^n)\) = \((k-1)^n-(k-2)^n\). The \((k-2)^n\) assignments in which every \(X_i\) ≥ 3 are subtracted from the \((k-1)^n\) assignments in which every \(X_i\) ≥ 2.

  • Y = j is \((k-j+1)^n - (k-j)^n\). The \((k-j)^n\) assignments in which every \(X_i\) ≥ j+1 are subtracted from the \((k-j+1)^n\) assignments in which every \(X_i\) ≥ j.

Thus, for 1 ≤ j ≤ k, we divide the number of assignments with minimum j by the total number of assignments (\(k^n\)) to obtain the distribution function: \[m(j) = \frac{(k-j+1)^n - (k-j)^n}{k^n}\]
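The counting argument above can be checked by brute force for small cases. The following is a minimal sketch that enumerates every assignment and compares the tabulated minimum against the closed form; the values k = 4 and n = 3 and all variable names are illustrative assumptions, not part of the assignment.

```r
# Exact check of m(j) = ((k-j+1)^n - (k-j)^n) / k^n by enumeration.
# k and n are small illustrative values (assumed, not from the assignment).
k <- 4
n <- 3

# All k^n equally likely assignments of (X_1, ..., X_n), one per row
grid <- expand.grid(rep(list(1:k), n))

# Minimum of each assignment, tabulated over j = 1..k
y <- apply(grid, 1, min)
enumerated <- as.numeric(table(factor(y, levels = 1:k))) / k^n

# Closed-form distribution of the minimum
j <- 1:k
m <- ((k - j + 1)^n - (k - j)^n) / k^n

print(rbind(j = j, enumerated = enumerated, formula = m))
```

Because the terms telescope, the probabilities sum to \((k^n - 0^n)/k^n = 1\), which is a quick sanity check on the formula.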

course text solution reference

(2) Machine failure

Your organization owns a copier (future lawyers, etc.) or MRI (future doctors). This machine has a manufacturer’s expected lifetime of 10 years. This means that we expect one failure every ten years. (Include the probability statements and R code for each part.)

(a) Geometric distribution

What is the probability that the machine will fail after 8 years? Provide also the expected value and standard deviation. Model as a geometric. (Hint: the probability is equivalent to not failing during the first 8 years.)

pgeom function reference

We calculate our probability using the pgeom() function and then we calculate our expected value and standard deviation using the equations noted below (in the comments) and visualize using visualize.geom(). The visualization confirms our expected value 10 and standard deviation 9.49.

#R's geometric counts the failure-free years before the first failure (0, 1, 2, ...),
#so not failing during the first 8 years is X >= 8, i.e. P(X > 7) with lower.tail = FALSE
pgeom(7, 0.1, lower.tail = FALSE) 
## [1] 0.4304672
#Calculate expected value (year of the first failure): E(x) = 1/p
ex_val <- 1 / 0.1
ex_val
## [1] 10
#Calculate standard deviation: SD(x) = sqrt(q/p^2)
std <- sqrt(0.9 / (0.1^2))
std
## [1] 9.486833
#Visualize probability of X >= 8 using visualize.geom function
visualize.geom(8, 0.1, section = "upper")

#(For fun) visualize probability of X <= 7
#visualize.geom(7, 0.1, section = "lower")
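Off-by-one errors are easy to make here because base R's geometric distribution counts the failure-free periods before the first failure, so its support starts at 0. The sketch below, with illustrative variable names, checks the upper tail of pgeom() against the closed form \((1-p)^{x+1}\) and contrasts the two mean conventions.

```r
p <- 0.1   # yearly failure probability
x <- 0:12  # illustrative range of quantiles (assumed)

# pgeom's upper tail P(X > x) equals (1 - p)^(x + 1), because X counts the
# failure-free years before the first failure (support 0, 1, 2, ...).
tail_r <- pgeom(x, p, lower.tail = FALSE)
tail_closed <- (1 - p)^(x + 1)
print(all.equal(tail_r, tail_closed))

# "No failure during the first 8 years" is X >= 8, i.e. P(X > 7)
surv_8 <- pgeom(7, p, lower.tail = FALSE)
print(surv_8)

# Mean under R's convention versus the trial-count convention
mean_failures <- (1 - p) / p  # failure-free years before the first failure
mean_trials <- 1 / p          # year in which the first failure occurs
print(c(mean_failures = mean_failures, mean_trials = mean_trials))
```

The standard deviation sqrt(q/p^2) is the same under both conventions, since shifting a variable by 1 does not change its spread.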

(b) Exponential distribution

What is the probability that the machine will fail after 8 years? Provide also the expected value and standard deviation. Model as an exponential.

We calculate our probability using the pexp() function, our expected value and standard deviation via \(\mu\) = \(\sigma\) = 1 / \(\lambda\), and then we visualize using visualize.exp(). With \(\lambda\) = 0.1 failures per year, the expected value and standard deviation are both 10 years (recall sd = sqrt(var)).

pexp(8, 0.1, lower.tail = FALSE)
## [1] 0.449329
#Calculate expected value and standard deviation: 1 / lambda
both <- 1 / 0.1
both
## [1] 10
visualize.exp(8, 0.1, section = "upper")
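As a cross-check on the exponential model, the survival probability has the closed form \(e^{-\lambda t}\), and the mean and standard deviation can be recovered by numerically integrating the density. This is a minimal sketch using base R's integrate(); the variable names are illustrative.

```r
rate <- 0.1  # failures per year
t0 <- 8      # years

# Survival probability P(T > t0) = exp(-rate * t0)
surv_closed <- exp(-rate * t0)
surv_r <- pexp(t0, rate, lower.tail = FALSE)

# Mean and variance by numerical integration of the density
mean_num <- integrate(function(t) t * dexp(t, rate), 0, Inf)$value
var_num <- integrate(function(t) (t - mean_num)^2 * dexp(t, rate), 0, Inf)$value

print(c(survival = surv_r, mean = mean_num, sd = sqrt(var_num)))
```

Both the mean and the standard deviation come out at approximately 1 / rate, which is 10 years for this machine.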

(c) Binomial distribution

What is the probability that the machine will fail after 8 years? Provide also the expected value and standard deviation. Model as a binomial. (Hint: 0 success in 8 years)

We calculate our probability using the pbinom() function and then we calculate our expected value and standard deviation via the equations noted below (in the comments). Calculations provided below:

#probability can also be found via q^n
pbinom(0, 8, 0.1)
## [1] 0.4304672
#calculate expected value: E(x) = np
ex_val <- 8 * 0.1
ex_val
## [1] 0.8
#calculate our standard deviation: SD(x) = sqrt(npq)
std <- sqrt(8 * 0.1 * 0.9)
std
## [1] 0.8485281
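The binomial moments can also be computed directly from the probability mass function, which makes the distinction between the variance npq and the standard deviation sqrt(npq) explicit. A short sketch with illustrative variable names:

```r
n <- 8    # years observed
p <- 0.1  # yearly failure probability

x <- 0:n
pmf <- dbinom(x, n, p)

# P(0 failures in 8 years), equal to q^n
p_zero <- pmf[1]

# Moments computed from the definition rather than the shortcut formulas
mean_pmf <- sum(x * pmf)
var_pmf <- sum((x - mean_pmf)^2 * pmf)
sd_pmf <- sqrt(var_pmf)

print(c(p_zero = p_zero, mean = mean_pmf, variance = var_pmf, sd = sd_pmf))
```

The variance here is npq = 0.72, and the standard deviation is its square root.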

(d) Poisson distribution

What is the probability that the machine will fail after 8 years? Provide also the expected value and standard deviation. Model as a Poisson.

We model the number of failures in 8 years as Poisson with \(\lambda\) = 8 × 0.1 = 0.8, since we expect 0.1 failures per year. Failing after 8 years means zero failures in that window, so we calculate P(X = 0) using the ppois() function, our expected value and standard deviation via E(x) = \(\lambda\) and SD(x) = sqrt(\(\lambda\)), then we visualize using visualize.pois() (recall sd = sqrt(var)).

ppois(0, 0.8) #P(X = 0): no failures in 8 years
## [1] 0.449329
#Calculate expected value and standard deviation
exp_val <- 0.8
exp_val
## [1] 0.8
std <- sqrt(0.8)
std
## [1] 0.8944272
visualize.pois(0, 0.8, section = "lower")
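To compare the four models, the same event (no failure during the first 8 years) can be computed under each one. The sketch below assumes a yearly failure probability or rate of 0.1 throughout and a Poisson mean of 0.1 × 8 for the 8-year window; the variable names are illustrative.

```r
p <- 0.1             # yearly failure probability (also used as the rate)
years <- 8
lambda <- p * years  # expected number of failures in the 8-year window

probs <- c(
  geometric   = pgeom(years - 1, p, lower.tail = FALSE),  # P(X >= 8)
  exponential = pexp(years, p, lower.tail = FALSE),       # P(T > 8)
  binomial    = dbinom(0, years, p),                      # 0 failures in 8 trials
  poisson     = dpois(0, lambda)                          # 0 events, mean 0.8
)
print(round(probs, 4))
```

The two discrete-time models agree on \(0.9^8\), and the two continuous-time models agree on \(e^{-0.8}\). The Poisson count being zero over a window is the same event as the exponential waiting time exceeding that window.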