library(ggplot2)
population <- c(3, 6, 7, 9, 11, 14)
# Enumerate all choose(6, 3) = 20 samples of size 3; each column is one sample
samples <- combn(population, 3)
sample_mins <- numeric(ncol(samples))
for (i in seq_len(ncol(samples))) {
  # Minimum of the i-th sample, which is stored in column i
  sample_mins[i] <- min(samples[, i])
}
ggplot(data.frame(X = sample_mins), aes(x = X)) +
  geom_histogram() +
  theme_bw()
# Mean of the sampling distribution of the minimum
x_bar <- mean(sample_mins)
The sampling distribution of the sample minimum is shown above; it is positively skewed. The mean of the sampling distribution is 4.8.
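As a cross-check, the sampling distribution can also be tabulated without an explicit loop. The sketch below assumes the statistic of interest is the sample minimum, as in the loop above; the names `mins` and `dist` are illustrative.

```r
population <- c(3, 6, 7, 9, 11, 14)

# Apply min() to each column (each size-3 sample) of the combn() matrix
mins <- apply(combn(population, 3), 2, min)

# Exact sampling distribution: each of the 20 samples is equally likely
dist <- prop.table(table(mins))
print(dist)

# Mean of the sampling distribution of the minimum
mean(mins)
```

Because every one of the 20 samples is enumerated, the table is the exact distribution rather than an estimate.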
# Density of Y
f_y <- function(y) (3/8) * y^2
# P(0 <= Y <= 1/5) by numerical integration
probability <- integrate(f_y, 0, 1/5)$value
\(P(0 \leq Y \leq 1/5) = 0.001\)
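The integral also has a closed form, \(\int_0^{1/5} \tfrac{3}{8} y^2 \, dy = \tfrac{1}{8}(1/5)^3 = 0.001\), which the sketch below compares against `integrate()`. The support \([0, 2]\) used for the normalization check is an assumption, inferred from the density integrating to 1 there; it is not stated above.

```r
f_y <- function(y) (3 / 8) * y^2

# Antiderivative of f_y: F(y) = y^3 / 8
F_y <- function(y) y^3 / 8

numeric_prob <- integrate(f_y, 0, 1 / 5)$value
exact_prob <- F_y(1 / 5) - F_y(0)

# Assumed support [0, 2]: the density should integrate to 1 there
total_mass <- integrate(f_y, 0, 2)$value

c(numeric = numeric_prob, exact = exact_prob, total = total_mass)
```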
A friend claims that she has drawn a random sample of size 30 from the exponential distribution with \(\lambda = 1/10\). The mean of her sample is 12.
The theoretical value is \(E(\bar{X}) = \mu_{\bar{X}} = \mu_X = 1/\lambda = 10\), so the expected value of the sample mean is 10.
library(tidyverse)
set.seed(123)
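# 9,999 simulated samples; together with the observed sample that makes 10^4
sims <- 10^4 - 1
x_bar <- numeric(sims)
for (i in 1:sims) {
  # Mean of one sample of size 30 from the exponential with rate 1/10
  x_bar[i] <- mean(rexp(30, 1/10))
}
sample_mean <- mean(x_bar)
# Proportion of means at least 12, counting the observed mean itself
sample_mean_greater <- (sum(x_bar >= 12) + 1)/(sims + 1)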
The mean of the simulated sample means is 9.9707052, and 13.41% of them are greater than or equal to 12.
With an estimated probability of 0.1341 of observing a sample mean of 12 or more, we have insufficient evidence to suggest that a mean of 12 is unusual, so the friend's claim is plausible.
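The simulated tail probability can be checked against an exact calculation. The sum of 30 independent exponential variables with rate \(1/10\) follows a gamma distribution with shape 30 and rate \(1/10\), so the sample mean is gamma with shape 30 and rate \(30/10 = 3\). The sketch below is a cross-check rather than part of the original solution, and the variable names are illustrative.

```r
n <- 30
lambda <- 1 / 10

# Exact P(X-bar >= 12): X-bar ~ Gamma(shape = n, rate = n * lambda)
exact_tail <- pgamma(12, shape = n, rate = n * lambda, lower.tail = FALSE)

# Normal (CLT) approximation: X-bar is roughly N(1/lambda, (1/lambda)^2 / n)
clt_tail <- pnorm(12, mean = 1 / lambda, sd = (1 / lambda) / sqrt(n),
                  lower.tail = FALSE)

c(exact = exact_tail, clt = clt_tail)
```

Both values should land near the simulated proportion, which supports the conclusion that a sample mean of 12 is not unusual.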
Let \(X_1, X_2, \ldots , X_{10} \overset{i.i.d}\sim N(20, 8)\) and \(Y_1, Y_2, \ldots, Y_{15} \overset{i.i.d}\sim N(16, 7)\). Let \(W = \bar{X} + \bar{Y}.\)
Simulate the sampling distribution of \(W\) in R and plot your results. Check that the simulated mean and standard error are close to the theoretical mean and standard error.
\(W \sim N(36, 3.11)\), where 3.11 is the standard error.
As \(W\) is the sum of two independent normal random variables, it is normal. Its mean is the sum of the two means, \(20 + 16 = 36\). Reading 8 and 7 as standard deviations, as the simulation code does, its variance is the sum of the variances of the two sample means, \(8^2/10 + 7^2/15 \approx 9.67\), so the standard error is \(\sqrt{9.67} \approx 3.11\).
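The theoretical mean and standard error of \(W = \bar{X} + \bar{Y}\) follow from \(E(W) = \mu_X + \mu_Y\) and \(\mathrm{Var}(W) = \sigma_X^2/n_X + \sigma_Y^2/n_Y\). A minimal sketch, assuming the second parameter of each normal distribution is a standard deviation (which is how the `rnorm()` calls in the simulation treat it); the variable names are illustrative.

```r
# Parameters from the problem statement; 8 and 7 are treated as standard deviations
n_x <- 10; mu_x <- 20; sd_x <- 8
n_y <- 15; mu_y <- 16; sd_y <- 7

# E(W) = E(X-bar) + E(Y-bar)
mean_w <- mu_x + mu_y

# Var(W) = Var(X-bar) + Var(Y-bar), by independence
var_w <- sd_x^2 / n_x + sd_y^2 / n_y
se_w <- sqrt(var_w)

c(mean = mean_w, variance = var_w, se = se_w)
```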
sims <- 10^4 - 1
w <- numeric(sims)
for (i in 1:sims) {
  # W is the sum of the two sample means, each computed from its own sample
  w[i] <- mean(rnorm(10, 20, 8)) + mean(rnorm(15, 16, 7))
}
sample_mean <- mean(w)
standard_deviation <- sd(w)
ggplot(data.frame(W = w), aes(x = W)) +
  geom_histogram() +
  theme_bw()
The simulated mean and standard deviation of \(W\) should be close to the theoretical values of 36 and 3.11.
# Simulated P(W < 40)
prob <- mean(w < 40)
# Exact P(W < 40), using the standard error of W
exact_prob <- pnorm(40, 36, sqrt(8^2/10 + 7^2/15))
The simulated probability that \(W\) is below 40 should be close to the exact probability, which is approximately 0.901.
Let \(X_1, X_2, \ldots , X_{9} \overset{i.i.d}\sim N(7, 3)\) and \(Y_1, Y_2, \ldots, Y_{12} \overset{i.i.d}\sim N(10, 5)\). Let \(W = \bar{X} - \bar{Y}.\)
Simulate the sampling distribution of \(W\) in R and plot your results using ggplot2. Check that the simulated mean and the standard error are close to the theoretical mean and the standard error.
\(W \sim N(-3, 1.76)\), where 1.76 is the standard error.
As \(W\) is the difference of two independent normal random variables, it is normal. Its mean is the difference of the two means, \(7 - 10 = -3\). Reading 3 and 5 as standard deviations, as the simulation code does, its variance is the sum of the variances of the two sample means, \(3^2/9 + 5^2/12 \approx 3.08\), so the standard error is \(\sqrt{3.08} \approx 1.76\).
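For a difference of independent sample means, the means subtract but the variances still add: \(E(W) = \mu_X - \mu_Y\) and \(\mathrm{Var}(W) = \sigma_X^2/n_X + \sigma_Y^2/n_Y\). A minimal sketch of that calculation, again assuming the second parameter of each normal distribution is a standard deviation; the variable names are illustrative.

```r
# Parameters from the problem statement; 3 and 5 are treated as standard deviations
n_x <- 9;  mu_x <- 7;  sd_x <- 3
n_y <- 12; mu_y <- 10; sd_y <- 5

# E(W) = E(X-bar) - E(Y-bar)
mean_w <- mu_x - mu_y

# Var(W) = Var(X-bar) + Var(Y-bar): variances add even though the means subtract
var_w <- sd_x^2 / n_x + sd_y^2 / n_y
se_w <- sqrt(var_w)

c(mean = mean_w, variance = var_w, se = se_w)
```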
# part b.
sims <- 10^4 - 1
w <- numeric(sims)
for (i in 1:sims) {
  # W is the difference of the two sample means
  w[i] <- mean(rnorm(9, 7, 3)) - mean(rnorm(12, 10, 5))
}
sample_mean <- mean(w)
standard_deviation <- sd(w)
ggplot(data.frame(W = w), aes(x = W)) +
  geom_histogram() +
  theme_bw()
# part c.
# simulated answer
prob <- mean(w < -1.5)
# Exact answer
exact_prob <- pnorm(-1.5, -3, sqrt(3^2/9 + 5^2/12))
The simulated mean and standard deviation of \(W\) should be close to the theoretical values of -3 and 1.76.
The simulated probability that \(W\) is below -1.5 should be close to the exact probability, which is approximately 0.80.
sims <- 10^4
# Run for n = 2 (the same block is repeated below for each value of n)
w <- numeric(sims)
n <- 2
for (i in 1:sims) {
  # Square of the mean of n standard normal draws
  w[i] <- mean(rnorm(n, 0, 1))^2
}
sample_mean1 <- mean(w)
variance1 <- var(w)
ggplot(data.frame(W = w), aes(x = W)) +
geom_histogram() +
theme_bw()
# Run for n = 4
w <- numeric(sims)
n <- 4
for (i in 1:sims) {
w[i] <- mean(rnorm(n, 0, 1))^2
}
sample_mean2 <- mean(w)
variance2 <- var(w)
ggplot(data.frame(W = w), aes(x = W)) +
geom_histogram() +
theme_bw()
# Run for n = 5
w <- numeric(sims)
n <- 5
for (i in 1:sims) {
w[i] <- mean(rnorm(n, 0, 1))^2
}
sample_mean3 <- mean(w)
variance3 <- var(w)
ggplot(data.frame(W = w), aes(x = W)) +
geom_histogram() +
theme_bw()
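As \(n\) increases, both the mean and the variance of the squared sample mean decrease. This matches theory: \(\bar{X} \sim N(0, 1/n)\), so \(n\bar{X}^2 \sim \chi^2_1\), which gives \(E(\bar{X}^2) = 1/n\) and \(\mathrm{Var}(\bar{X}^2) = 2/n^2\). The simulated values below are close to these.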
Mean 1 (\(n = 2\)) = 0.5079857, Mean 2 (\(n = 4\)) = 0.2496777, Mean 3 (\(n = 5\)) = 0.2027067
Variance 1 (\(n = 2\)) = 0.5097832, Variance 2 (\(n = 4\)) = 0.126344, Variance 3 (\(n = 5\)) = 0.0799799
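The three simulation blocks differ only in \(n\), so they can be wrapped in one function and compared with theory. For standard normal data \(\bar{X} \sim N(0, 1/n)\), so the squared sample mean has mean \(1/n\) and variance \(2/n^2\). The sketch below is one possible refactoring, not the original approach; `simulate_sq_mean` and `results` are illustrative names.

```r
set.seed(123)

# Simulate the squared sample mean of n standard normal draws
simulate_sq_mean <- function(n, sims = 10^4) {
  w <- replicate(sims, mean(rnorm(n))^2)
  c(n = n, sim_mean = mean(w), theory_mean = 1 / n,
    sim_var = var(w), theory_var = 2 / n^2)
}

# One row per sample size
results <- t(sapply(c(2, 4, 5), simulate_sq_mean))
print(round(results, 4))
```

Using `replicate()` avoids preallocating `w` by hand, and adding another sample size only requires extending the vector passed to `sapply()`.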
Let \(X\) be a uniform random variable on the interval \([40, 60]\) and \(Y\) a uniform random variable on \([45, 80].\) Assume that \(X\) and \(Y\) are independent.
expected <- (60+40)/2 + (80+45)/2
variance <- ((60-40)^2)/12 + ((80-45)^2)/12
The expected value of \(X + Y\) is the sum of the two expected values, \(50 + 62.5 = 112.5\). Because \(X\) and \(Y\) are independent, the variance of \(X + Y\) is the sum of the two variances, \(20^2/12 + 35^2/12 = 135.4166667\).
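sims <- 10^4
sample <- numeric(sims)
for (i in 1:sims) {
  # Mean of 30 independent draws of X + Y
  sample[i] <- mean(runif(30, 40, 60) + runif(30, 45, 80))
}
sample_mean <- mean(sample)
# Separate name so the theoretical variance computed earlier is not overwritten
sim_variance <- var(sample)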
ggplot(data.frame(W = sample), aes(x = W)) +
geom_histogram() +
theme_bw()
The simulated mean was 112.5128753 and the simulated variance of the mean of 30 sums was 4.5060025, close to the theoretical values of 112.5 and \(135.42/30 \approx 4.51\).
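The simulated variance refers to the mean of 30 sums, so the matching theoretical value is \(\mathrm{Var}(X + Y)/30\) rather than \(\mathrm{Var}(X + Y)\) itself. A short sketch of that calculation, assuming a sample size of 30 as in the simulation; the variable names are illustrative.

```r
n <- 30

# Var(X + Y) for independent uniforms on [40, 60] and [45, 80]
var_sum <- (60 - 40)^2 / 12 + (80 - 45)^2 / 12

# Variance and standard error of the mean of n independent sums
var_mean <- var_sum / n
se_mean <- sqrt(var_mean)

c(var_sum = var_sum, var_mean = var_mean, se_mean = se_mean)
```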
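## Part C answer
sims <- 10^4
# Single draws of X and Y: the event concerns X + Y itself, not a mean of sums
x <- runif(sims, 40, 60)
y <- runif(sims, 45, 80)
prob <- mean(x + y < 90)
The simulated estimate should be close to the exact value \(P(X + Y < 90) = 12.5/700 = 1/56 \approx 0.0179\), which is the area of the triangle \(x + y < 90\) divided by the area of the rectangle \([40, 60] \times [45, 80]\).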
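The probability \(P(X + Y < 90)\) can also be computed exactly, since \((X, Y)\) is uniform on the rectangle \([40, 60] \times [45, 80]\). The sketch below conditions on \(X\) and integrates \(P(Y < 90 - x)\) against the density of \(X\), then compares the result with the geometric answer (area of the triangle where \(x + y < 90\) divided by the area of the rectangle). It is a cross-check under the stated independence assumption, and the names are illustrative.

```r
# P(X + Y < 90) = integral over x of P(Y < 90 - x) * f_X(x)
integrand <- function(x) {
  punif(90 - x, min = 45, max = 80) * dunif(x, min = 40, max = 60)
}
exact_numeric <- integrate(integrand, lower = 40, upper = 60)$value

# Geometric answer: right triangle with legs of length 5 inside a 20 x 35 rectangle
exact_geometric <- (5^2 / 2) / (20 * 35)

c(numeric = exact_numeric, geometric = exact_geometric)
```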