1 Question 1

  1. Consider the population \(\{3, 6, 7, 9, 11, 14\}\). For samples of size 3 without replacement, find (and plot) the sampling distribution of the minimum. What is the mean of the sampling distribution?

1.1 Answer

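library(ggplot2)

population <- c(3, 6, 7, 9, 11, 14)
# combn() returns a matrix with one sample of size 3 per column
samples <- combn(population, 3)
sample_mins <- numeric(ncol(samples))
for (i in 1:ncol(samples)) {
  # take the minimum of the i-th column (the i-th sample)
  sample_mins[i] <- min(samples[, i])
}
ggplot(data.frame(X = sample_mins), aes(x = X)) +
  geom_histogram(binwidth = 1) +
  theme_bw()

# mean of the sampling distribution of the minimum
x_bar <- mean(sample_mins)

The sampling distribution is plotted above; it is positively skewed. The minimum takes the values 3, 6, 7, and 9 with probabilities 10/20, 6/20, 3/20, and 1/20, so the mean of the sampling distribution is 4.8.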
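As a cross-check on the sampling distribution of the minimum, the sketch below enumerates all 20 samples and tabulates their minimums directly. The names `mins`, `min_dist`, and `mean_of_min` are introduced here for illustration only.

```r
# Enumerate all choose(6, 3) = 20 samples and take the minimum of each.
population <- c(3, 6, 7, 9, 11, 14)
mins <- apply(combn(population, 3), 2, min)  # one minimum per column (sample)

# Every sample is equally likely, so relative frequencies are exact probabilities.
min_dist <- table(mins) / length(mins)
print(min_dist)

# Mean of the sampling distribution of the minimum.
mean_of_min <- mean(mins)
print(mean_of_min)
```

Because the population is small, full enumeration is cheap and no simulation is needed.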

2 Question 2

  1. Let \(X_1, X_2, \ldots, X_n\) be a random sample from some distribution and suppose \(Y = T(X_1, X_2, \ldots, X_n)\) is a statistic. Suppose the sampling distribution of \(Y\) has pdf \(f(y) = (3/8)y^2\) for \(0 \leq y \leq 2.\) Find \(P(0 \leq Y \leq 1/5)\).

2.1 Answer

f_y = function(y){(3/8)*(y^2)}
probability <- integrate(f_y, 0, 1/5)$value

\(P(0 \leq Y \leq 1/5) = 0.001\)
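The numerical integral can be checked against the closed form. Since the antiderivative of \((3/8)y^2\) is \(y^3/8\), the probability is \((1/5)^3/8\). The function name `F_y` below is an assumption made for this sketch.

```r
# Antiderivative (CDF on [0, 2]) of f(y) = (3/8) * y^2.
F_y <- function(y) y^3 / 8

# P(0 <= Y <= 1/5) as a difference of CDF values.
exact <- F_y(1/5) - F_y(0)
print(exact)
```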

3 Question 3

  1. A friend claims that she has drawn a random sample of size 30 from the exponential distribution with \(\lambda = 1/10\). The mean of her sample is 12.

    1. What is the expected value of a sample mean?
    2. Run a simulation by drawing 10,000 random samples, each of size 30, from \(\text{Exp}(\lambda = 1/10)\) and then compute the mean. What proportion of the sample means are as large as or larger than 12?
    3. Is a mean of 12 unusual for a sample of size 30 from \(\text{Exp}(\lambda = 1/10)\)?

3.1 Part A Answer

The theoretical value is \(E(\bar{X}) = \mu_{\bar{X}} = \mu_X = 1/\lambda\). With \(\lambda = 1/10\), the expected value of the sample mean is 10.

3.2 Part B Answer

library(tidyverse)
set.seed(123)
sims <- 10^4 - 1
x_bar = numeric(sims)
for (i in 1:sims) {
  x_bar[i] <- mean(rexp(30, 1/10))
}
sample_mean <- mean(x_bar)
sample_mean_greater <- (sum(x_bar >= 12) + 1)/(sims + 1)

The mean of the simulated sample means is 9.97, and 13.41% of them are greater than or equal to 12.

3.3 Part C Answer

Given a probability of 0.1341 for a mean value at or above 12, we have insufficient evidence to suggest that a mean of 12 is unusual.
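The simulated proportion can also be compared with an exact tail probability. The sum of \(n\) independent \(\text{Exp}(\lambda)\) variables follows a Gamma distribution with shape \(n\) and rate \(\lambda\), so the sample mean is Gamma with shape \(n\) and rate \(n\lambda\). The sketch below relies on that fact; the variable names are illustrative.

```r
# Exact P(Xbar >= 12) for the mean of 30 i.i.d. Exp(rate = 1/10) draws.
n <- 30
lambda <- 1 / 10

# Xbar ~ Gamma(shape = n, rate = n * lambda); take the upper tail at 12.
exact_tail <- pgamma(12, shape = n, rate = n * lambda, lower.tail = FALSE)
print(exact_tail)
```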

4 Question 4

  1. Let \(X_1, X_2, \ldots , X_{10} \overset{i.i.d}\sim N(20, 8)\) and \(Y_1, Y_2, \ldots, Y_{15} \overset{i.i.d}\sim N(16, 7)\). Let \(W = \bar{X} + \bar{Y}.\)

    1. Give the exact sampling distribution of \(W\)
    2. Simulate the sampling distribution in R and plot your results. Check that the simulated mean and standard error are close to the theoretical mean and standard error.
    3. Use your simulation to find \(P(W < 40).\) Calculate an exact answer and compare.

4.1 Part A Answer

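\(W \sim N(36, 3.11)\), where the second parameter is the standard deviation (the standard error of \(W\)), treating 8 and 7 as standard deviations as the code does.

Since \(\bar{X}\) and \(\bar{Y}\) are independent and normally distributed, their sum \(W\) is normal. Its mean is the sum of the two means, \(20 + 16 = 36\). Its variance is the sum of the variances of the two sample means, \(8^2/10 + 7^2/15 \approx 9.67\), so the standard error is \(\sqrt{9.67} \approx 3.11\).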
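The parameters of \(W = \bar{X} + \bar{Y}\) can be worked out with a few lines of arithmetic. This sketch assumes that 8 and 7 are standard deviations (the way `rnorm()` is called elsewhere in this document); the variable names are illustrative.

```r
# Assumption: X has sd 8 with n = 10, and Y has sd 7 with n = 15.
mu_w  <- 20 + 16               # E(W) = E(Xbar) + E(Ybar)
var_w <- 8^2 / 10 + 7^2 / 15   # Var(W) = Var(Xbar) + Var(Ybar), by independence
se_w  <- sqrt(var_w)           # standard error of W
print(c(mean = mu_w, variance = var_w, se = se_w))
```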

4.2 Part B Answer

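sims <- 10^4 - 1
w <- numeric(sims)
for (i in 1:sims) {
  # the two samples have different sizes, so average each one before adding
  w[i] <- mean(rnorm(10, 20, 8)) + mean(rnorm(15, 16, 7))
}
sample_mean <- mean(w)
standard_deviation <- sd(w)

ggplot(data.frame(W = w), aes(x = W)) +
  geom_histogram() +
  theme_bw()

The simulated mean (sample_mean) and standard error (standard_deviation) should agree with the theoretical values of 36 and 3.11 up to simulation error.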

4.3 Part C Answer

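# simulated answer: proportion of simulated W values below 40
prob <- mean(w < 40)
# exact answer: W is normal with mean 36 and standard error sqrt(8^2/10 + 7^2/15)
exact_prob <- pnorm(40, 36, sqrt(8^2/10 + 7^2/15))

The exact probability that \(W\) is below 40 is about 0.90; the simulated proportion (prob) should be close to it.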

5 Question 5

  1. Let \(X_1, X_2, \ldots , X_{9} \overset{i.i.d}\sim N(7, 3)\) and \(Y_1, Y_2, \ldots, Y_{12} \overset{i.i.d}\sim N(10, 5)\). Let \(W = \bar{X} - \bar{Y}.\)

    1. Give the exact sampling distribution of \(W\).
    2. Simulate the sampling distribution of \(W\) in R and plot your results using ggplot2. Check that the simulated mean and the standard error are close to the theoretical mean and the standard error.
    3. Use your simulation to find \(P(W < -1.5)\). Calculate an exact answer and compare.

5.1 Part A Answer

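\(W \sim N(-3, 1.76)\), where the second parameter is the standard deviation (the standard error of \(W\)), treating 3 and 5 as standard deviations as the code does.

Since \(\bar{X}\) and \(\bar{Y}\) are independent and normally distributed, their difference \(W\) is normal. Its mean is the difference of the two means, \(7 - 10 = -3\). Variances add even for a difference, so the variance is \(3^2/9 + 5^2/12 \approx 3.08\) and the standard error is \(\sqrt{3.08} \approx 1.76\).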
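For a difference of independent sample means, the means subtract while the variances still add. The helper below is a hypothetical convenience function written for this sketch, not part of any package, and it assumes the second parameter of each normal distribution is a standard deviation.

```r
# Hypothetical helper: mean and standard error of Xbar - Ybar
# for independent normal samples (sd_x and sd_y are standard deviations).
diff_of_means <- function(mu_x, sd_x, n_x, mu_y, sd_y, n_y) {
  list(mean = mu_x - mu_y,
       se   = sqrt(sd_x^2 / n_x + sd_y^2 / n_y))
}

# Question 5: X ~ N(7, 3) with n = 9, Y ~ N(10, 5) with n = 12.
w_params <- diff_of_means(7, 3, 9, 10, 5, 12)
print(w_params)
```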

5.2 Code for B and C

# part b.
sims <- 10^4 - 1
w <- numeric(sims)
for (i in 1:sims) {
  # difference of the two sample means
  w[i] <- mean(rnorm(9, 7, 3)) - mean(rnorm(12, 10, 5))
}
sample_mean <- mean(w)
standard_deviation <- sd(w)

ggplot(data.frame(W = w), aes(x = W)) +
  geom_histogram() +
  theme_bw()

# part c.
# simulated answer: proportion of simulated W values below -1.5
prob <- mean(w < -1.5)
# exact answer: W is normal with mean -3 and standard error sqrt(3^2/9 + 5^2/12)
exact_prob <- pnorm(-1.5, -3, sqrt(3^2/9 + 5^2/12))

5.3 Part B Answer

The simulated mean (sample_mean) and standard error (standard_deviation) should agree with the theoretical values of -3 and 1.76 up to simulation error.

5.4 Part C Answer

The exact probability that \(W\) is below \(-1.5\) is about 0.80; the simulated proportion (prob) should be close to it.

6 Question 6

  1. Let \(X_1, X_2, \ldots , X_n\) be a random sample from \(N(0, 1)\). Let \(W = X_1^2 + X_2^2 + \cdots + X_n^2.\) Describe the sampling distribution of \(W\) by running a simulation, using \(n = 2.\) What are the mean and variance of the sampling distribution of \(W\)? Repeat using \(n = 4, n = 5.\) What observations or conjectures do you have for general \(n\)?

6.1 Code and Answer

sims <- 10^4

# Run for n = 2 (it's easier to copy and paste the code for each n)
w <- numeric(sims)
n <- 2
for (i in 1:sims) {
  # W is the sum of the squared observations
  w[i] <- sum(rnorm(n, 0, 1)^2)
}
sample_mean1 <- mean(w)
variance1 <- var(w)
ggplot(data.frame(W = w), aes(x = W)) +
  geom_histogram() +
  theme_bw()

# Run for n = 4
w <- numeric(sims)
n <- 4
for (i in 1:sims) {
  w[i] <- sum(rnorm(n, 0, 1)^2)
}
sample_mean2 <- mean(w)
variance2 <- var(w)
ggplot(data.frame(W = w), aes(x = W)) +
  geom_histogram() +
  theme_bw()

# Run for n = 5
w <- numeric(sims)
n <- 5
for (i in 1:sims) {
  w[i] <- sum(rnorm(n, 0, 1)^2)
}
sample_mean3 <- mean(w)
variance3 <- var(w)
ggplot(data.frame(W = w), aes(x = W)) +
  geom_histogram() +
  theme_bw()

In each case the distribution of \(W\) is non-negative and right-skewed, and the skew becomes less pronounced as \(n\) grows. The simulated means (sample_mean1, sample_mean2, sample_mean3) should be close to 2, 4, and 5, and the simulated variances (variance1, variance2, variance3) should be close to 4, 8, and 10.

This suggests the conjecture that, for general \(n\), \(W\) has mean \(n\) and variance \(2n\), which matches the chi-square distribution with \(n\) degrees of freedom.
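For \(W\) defined as the sum of \(n\) squared standard normal draws, the natural reference is the chi-square distribution with \(n\) degrees of freedom, which has mean \(n\) and variance \(2n\). The sketch below compares a vectorized simulation with those values; the seed and the choice \(n = 5\) are arbitrary assumptions made here.

```r
set.seed(1)  # arbitrary seed, used only for reproducibility
sims <- 10^4
n <- 5

# Each column is one sample of size n; colSums of the squares gives one W per sample.
w <- colSums(matrix(rnorm(n * sims), nrow = n)^2)

# Simulated moments next to the chi-square(n) moments.
print(c(sim_mean = mean(w), theory_mean = n))
print(c(sim_var = var(w), theory_var = 2 * n))
```

Filling a matrix with all draws at once avoids the explicit loop and makes it easy to change `n`.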

7 Question 7

  1. Let \(X\) be a uniform random variable on the interval \([40, 60]\) and \(Y\) a uniform random variable on \([45, 80].\) Assume that \(X\) and \(Y\) are independent.

    1. Compute the expected value and variance of \(X + Y.\)
    2. Simulate the sampling distribution of \(X + Y.\) Describe the graph of the distribution of \(X + Y\). Compute the mean and variance of the sampling distribution and compare this to the theoretical mean and variance.
    3. Suppose the time (in minutes) Jack takes to complete his statistics homework is \(\text{Unif}[40, 60]\) and the time Jill takes is \(\text{Unif}[40, 60].\) Assume they work independently. One day they announce that their total time to finish an assignment was less than 90 min. How likely is this?

7.1 Part A Answer

expected <- (60+40)/2 + (80+45)/2
variance <- ((60-40)^2)/12 + ((80-45)^2)/12

The expected value of \(X + Y\) is the sum of the two expected values, which is 112.5. Because \(X\) and \(Y\) are independent, the variance of \(X + Y\) is the sum of the two variances, which is approximately 135.42.

7.2 Part B Answer

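sims <- 10^4
# each simulated value of X + Y uses one draw from each uniform distribution
sample <- runif(sims, 40, 60) + runif(sims, 45, 80)
sample_mean <- mean(sample)
sample_variance <- var(sample)
ggplot(data.frame(W = sample), aes(x = W)) +
  geom_histogram() +
  theme_bw()

The distribution of \(X + Y\) is symmetric and roughly trapezoidal on \([85, 140]\): it rises from 85 to 105, is flat from 105 to 120, and falls from 120 to 140. The simulated mean (sample_mean) and variance (sample_variance) should be close to the theoretical values of 112.5 and 135.42.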

7.3 Part C Answer

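sims <- 10^4
# one total time per simulated assignment: Jack's time plus Jill's time
total <- runif(sims, 40, 60) + runif(sims, 40, 60)
prob <- mean(total < 90)

\(P(X + Y < 90) = 0.125\), where \(X\) and \(Y\) are the two \(\text{Unif}[40, 60]\) times. The region with both times at least 40 and a total under 90 is a right triangle with legs of length 10 (area 50) inside the \(20 \times 20\) square of possible outcomes (area 400). The simulated proportion (prob) should be close to this value.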
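The probability that two independent \(\text{Unif}[40, 60]\) times total less than 90 can also be obtained numerically by conditioning on the first time: \(P(X + Y < 90) = \int f_X(x)\, P(Y < 90 - x)\, dx\). The sketch below uses base R's `integrate()`; the names `integrand` and `exact` are illustrative.

```r
# Density of X at x, times the probability that Y falls below 90 - x.
integrand <- function(x) dunif(x, 40, 60) * punif(90 - x, 40, 60)

# Integrate over the support of X to get P(X + Y < 90).
exact <- integrate(integrand, lower = 40, upper = 60)$value
print(exact)
```

The same pattern works for other independent pairs by swapping in the relevant density and distribution functions.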