I don’t know a formula for the expected minimum of a set of random variables, and I don’t remember seeing one in the readings, so let’s try simulating it…
# k = range of possible values
# n = number of mutually independent random variables
# m = number of values in each of the n variables
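sim <- function(k, n, m) {
  # One column per random variable X_1, ..., X_n; one row per draw
  df <- data.frame(matrix(ncol = n, nrow = m))
  name_list <- character(n)
  Y <- integer(n)
  for (i in seq_len(n)) {
    name_list[i] <- paste0("X_", i)
    # m independent draws from the discrete uniform distribution on 1..k
    ran_var <- sample(1:k, m, replace = TRUE)
    df[, i] <- ran_var
    # Y[i] is the smallest of the m draws in column i
    Y[i] <- min(ran_var)
  }
  colnames(df) <- name_list
  print(df)
  # cat() prints the values; print("Y =", Y) would pass Y as the digits argument
  cat("Y =", Y, "\n")
  return(Y)
}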
If the number of values in each variable is >= k, you would expect the min to be 1 or close to it. So let’s try our simulation function with 5 random variables, \(X_1, \ldots, X_5\), with 12 values each in the range 1 through 10.
set.seed(10)
# k = 10
# n = 5
# m = 12
sim1 <- sim(10, 5, 12)
## X_1 X_2 X_3 X_4 X_5
## 1 6 2 5 9 2
## 2 4 6 8 10 9
## 3 5 4 9 7 4
## 4 7 5 3 6 10
## 5 1 1 8 3 3
## 6 3 3 4 3 5
## 7 3 4 6 1 2
## 8 3 9 1 8 6
## 9 7 9 2 3 5
## 10 5 7 9 2 5
## 11 7 8 5 1 4
## 12 6 4 8 5 6
## Y = 1 1 1 1 2
counts <- table(sim1)
barplot(counts)
As expected, most of our minimums are equal to 1 with only one of the minimum values equal to 2.
However, if the number of values in each variable is < k, you would expect a much wider range of values in Y. Let’s try our simulation function again, this time with 5 random variables, \(X_1, \ldots, X_5\), with 12 values each in the range 1 through 100.
set.seed(10)
# k = 100
# n = 5
# m = 12
sim2 <- sim(100, 5, 12)
## X_1 X_2 X_3 X_4 X_5
## 1 51 12 41 83 11
## 2 31 60 71 96 81
## 3 43 36 84 69 36
## 4 70 43 24 51 94
## 5 9 6 78 28 25
## 6 23 27 36 23 48
## 7 28 40 54 2 20
## 8 28 84 10 73 59
## 9 62 87 17 25 46
## 10 43 62 90 17 47
## 11 66 78 43 2 40
## 12 57 36 75 49 51
## Y = 9 6 10 2 11
sim2 <- sort(sim2)
counts <- table(sim2)
barplot(counts)
Once again, as expected, this time we get 5 different minimum values within the range 2 through 11.
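A single run only shows five minimums, so it says little about the *expected* minimum. One way to get closer, sketched here under the assumption that each minimum is taken over 12 independent draws as in the columns above, is to repeat the draw many times and average. The helper name `estimate_expected_min` and the choice of 10,000 repetitions are illustrative, not part of the original exercise.

```r
# Hypothetical helper: average the minimum of `draws` uniform draws on 1..k
# over `reps` independent repetitions (a Monte Carlo estimate of E[min]).
estimate_expected_min <- function(k, draws, reps = 10000) {
  mins <- replicate(reps, min(sample.int(k, draws, replace = TRUE)))
  mean(mins)
}

set.seed(10)
est_small_range <- estimate_expected_min(k = 10, draws = 12)   # range 1..10
est_large_range <- estimate_expected_min(k = 100, draws = 12)  # range 1..100
print(c(k10 = est_small_range, k100 = est_large_range))
```

The estimate for the range 1 through 10 should sit just above 1, while the estimate for 1 through 100 should be several times larger, in line with the two bar plots.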
Here’s a relatively simple way, which I found online in the comments of this post, of computing the expected minimum of a set of independent, identically distributed (i.i.d.) random variables…
Add another \(X_{n+1}\) random variable which is also uniformly distributed on the integers from 1 to \(k\) to the collection, let \(Y = \min\{X_1, \ldots, X_n\}\), and compute \(P(X_{n+1} < Y)\) in two ways:
The first part makes sense to me, but I am not so sure about the second part… Also, the post was for uniform distributions from 0 to 1, so maybe that makes a difference?
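For the discrete case there is also a direct route, which is a different technique from the one in the post: for a variable taking values in \(1, \ldots, k\), \(E[Y] = \sum_{j=1}^{k} P(Y \ge j)\), and for the minimum of \(n\) independent uniform draws \(P(Y \ge j) = \left(\frac{k-j+1}{k}\right)^n\). The sketch below evaluates that sum; the function name `expected_min` is made up for illustration.

```r
# Hypothetical helper: exact E[min] of n independent draws from the
# discrete uniform distribution on 1..k, using E[Y] = sum_j P(Y >= j).
expected_min <- function(k, n) {
  j <- seq_len(k)
  sum(((k - j + 1) / k)^n)
}

expected_min(10, 5)    # minimum of 5 draws on 1..10
expected_min(10, 12)   # minimum of 12 draws on 1..10
expected_min(100, 12)  # minimum of 12 draws on 1..100
```

Note that in the `sim` function above each `Y[i]` is the minimum of the `m` draws in one column, so `m` plays the role of `n` in this formula when comparing against those simulations.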
\[ P(T > 8) = (1-0.1)^8 = 0.9^8 = 0.4304672 \]
\[ \mu = E[X] = \frac{1}{p} = \frac{1}{0.1} = 10\\ \sigma^2 = V[X] = \frac{1-p}{p^2} = \frac{1-0.1}{0.1^2} = 90\\ \sigma = \sqrt{90} = 9.486833 \]
(1-0.1)^8
## [1] 0.4304672
pgeom(7, 0.1, lower.tail = FALSE)
## [1] 0.4304672
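R's geometric functions count the number of failures *before* the first success, so if \(T\) is the trial on which the first success occurs, then \(T = X + 1\) for R's \(X\). The sketch below, with illustrative variable names, rebuilds \(P(T > 8)\) from the probability mass function under that convention and compares it with the closed form.

```r
p <- 0.1

# P(T = t) for t = 1..8 corresponds to dgeom(t - 1, p) in R's parametrization
t_vals <- 1:8
pmf_T <- dgeom(t_vals - 1, p)

# P(T > 8) = 1 - P(T <= 8)
surv_from_pmf <- 1 - sum(pmf_T)
surv_closed_form <- (1 - p)^8
surv_from_pgeom <- pgeom(7, p, lower.tail = FALSE)

print(c(surv_from_pmf, surv_closed_form, surv_from_pgeom))
```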
\[ P(T > 8) = 1-(1 - e^{- \lambda x}) = e^{- 0.1 \times 8} = 0.449329 \]
\[ \mu = E[X] = \frac{1}{\lambda} = \frac{1}{0.1} = 10\\ \sigma^2 = V[X] = \frac{1}{\lambda^2} = \frac{1}{0.1^2} = 100\\ \sigma = \sqrt{100} = 10 \]
exp(- 0.1*8)
## [1] 0.449329
pexp(8,0.1, lower.tail = FALSE)
## [1] 0.449329
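As a sanity check on the exponential mean, variance, and tail probability, the same quantities can be recovered by numerical integration of the density. This is only a sketch: the variable names are illustrative, and `integrate()` returns approximations rather than exact values.

```r
lambda <- 0.1

# E[X] = integral of x * f(x) over [0, Inf)
mean_num <- integrate(function(x) x * dexp(x, rate = lambda), 0, Inf)$value

# V[X] = integral of (x - 1/lambda)^2 * f(x) over [0, Inf)
var_num <- integrate(function(x) (x - 1 / lambda)^2 * dexp(x, rate = lambda),
                     0, Inf)$value

# P(T > 8) = integral of f(x) over [8, Inf)
surv_num <- integrate(dexp, 8, Inf, rate = lambda)$value

print(c(mean = mean_num, variance = var_num, survival = surv_num))
```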
\[ P(T > 8) = P(X = 0) = {8 \choose 0} \times 0.1^0 \times (1-0.1)^{8-0} = 1 \times 1 \times 0.9^8 = 0.4304672 \]
\[ \mu = E[X] = np = 8 \times 0.1 = 0.8\\ \sigma^2 = V[X] = np(1-p) = 8 \times 0.1 \times 0.9 = 0.72\\ \sigma = \sqrt{0.72} = 0.8485281 \]
choose(8,0)*0.1^0*(1-0.1)^8
## [1] 0.4304672
pbinom(0, 8, 0.1)
## [1] 0.4304672
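The binomial mean and variance can also be checked directly from the probability mass function instead of the \(np\) and \(np(1-p)\) shortcuts. A minimal sketch, with illustrative variable names:

```r
n <- 8
p <- 0.1

x <- 0:n                 # possible numbers of failures in 8 years
pmf <- dbinom(x, n, p)   # P(X = x) for each value

mu <- sum(x * pmf)                # E[X]
sigma2 <- sum((x - mu)^2 * pmf)   # V[X]
sigma <- sqrt(sigma2)

print(c(mean = mu, variance = sigma2, sd = sigma))
```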
\[ P(T > 8) = P(X = 0) = \frac{e^{-\lambda} \lambda^k}{k!} = \frac{e^{-(0.1 \times 8)} (0.1 \times 8)^0}{0!} = 0.449329 \]
\[ \mu = E[X] = \lambda = 0.1 \times 8 = 0.8\\ \sigma^2 = V[X] = \lambda = 0.1 \times 8 = 0.8\\ \sigma = \sqrt{\lambda} = \sqrt{0.1 \times 8} = \sqrt{0.8} = 0.8944272 \]
\[ \mu = E[X] = \lambda = 0.1\\ \sigma^2 = V[X] = \lambda = 0.1\\ \sigma = \sqrt{\lambda} = \sqrt{0.1} = 0.3162278 \]
# lambda = the expected number of failures in an 8 year period, 0.1 per year times 8 years
(exp(-(0.1*8))*(0.1*8)^0)/factorial(0)
## [1] 0.449329
ppois(0, 0.1*8)
## [1] 0.449329
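Putting the four answers side by side makes the pattern visible: the two models that count discrete yearly trials (geometric and binomial) both give \(0.9^8\), while the two models based on a continuous rate (exponential and Poisson) both give \(e^{-0.8}\). The sketch below just collects the calls in one named vector; the names `p`, `years`, and `probs` are illustrative.

```r
p <- 0.1     # failure probability (or rate) per year
years <- 8

probs <- c(
  geometric   = pgeom(years - 1, p, lower.tail = FALSE),  # first failure after year 8
  exponential = pexp(years, rate = p, lower.tail = FALSE),
  binomial    = pbinom(0, years, p),                      # zero failures in 8 trials
  poisson     = ppois(0, p * years)                       # zero events at rate 0.8
)
print(round(probs, 7))
```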