Question 1

The question is asking to generate 1000 samples from the distribution of \(X\). However, each time \(X\) takes on different n or p values. \(X\)~\(Binom(n,p)\) would take on different n and p values each time and a different graph will be plotted for every condition. The values for \(n\) and \(p\) are as follow:

\(p = 0.2, 0.3, 0.4, 0.5\)

\(n = 10, 20, 100, 200, 1,000, 2,000, 10,000, 20,000, 100,000, 200,000\)

The code chunk below computes normal distribution for all the 40 possibilities and plots a Q-Q graph for each. Also the mean and variance of the binomial distributions were calculated as follow: \(mean = n * p\) and \(variance = n * p * (1-p)\).

bi.p <- as.array(c(0.2,0.3,0.4,0.5))
bi.n <- as.array(c(10,20,100,200,1000,2000,10000,20000,100000,200000))
bi.mean <- vector("list")
bi.var <- vector("list")
bi.counter = 1
par(mfrow = c(3,3))
for (j in bi.n) {
  for (i in bi.p) {
    RandomData <- rbinom(1000, j, i)
    qqnorm(RandomData, main = bquote("Q-Q Plot of Binom(" ~ .(as.character(j)) ~ "," ~ .(as.character(i)) ~ ")"))
    qqline(RandomData, col = "red")
    bi.mean[bi.counter] <- j * i #mean = n * p
    bi.var[bi.counter] <- j * i * (1-i) #variance = n * p * q
    bi.counter = bi.counter + 1
  }
}

To interpret a Q-Q plot we must observe the cluster of points on the graph as close as possible to the diagonal line (Q-Q line) on the graph. We have 1000 samples from the distribution of \(X\) for each graph. As the number of observations (\(n\)) increases, we observe more points around the diagonal line. The close proximity of the observation to the Q-Q line is a measurement of how well the sample was approximated by the normal distribution. It intuitively makes sense to have a better approximation if the number of observations is bigger. This would basically give us more chances of predicting a closer observation (depending on the probability of the observation).

bi.mean = matrix(bi.mean, ncol = 10, byrow = TRUE)
colnames(bi.mean) = c("n=10","n=20","n=100","n=200","n=1000","n=2000","n=10000","n=20000","n=100000","n=200000")
rownames(bi.mean) = c("p=0.2","p=0.3","p=0.4","p=0.5")
bi.var = matrix(bi.var, ncol = 10, byrow = FALSE)
colnames(bi.var) = c("n=10","n=20","n=100","n=200","n=1000","n=2000","n=10000","n=20000","n=100000","n=200000")
rownames(bi.var) = c("p=0.2","p=0.3","p=0.4","p=0.5")

Mean

n=10 n=20 n=100 n=200 n=1000 n=2000 n=10000 n=20000 n=100000 n=200000
p=0.2 2 3 4 5 4 6 8 10 20 30
p=0.3 40 50 40 60 80 100 200 300 400 500
p=0.4 400 600 800 1000 2000 3000 4000 5000 4000 6000
p=0.5 8000 10000 20000 30000 40000 50000 40000 60000 80000 1e+05

Variance

n=10 n=20 n=100 n=200 n=1000 n=2000 n=10000 n=20000 n=100000 n=200000
p=0.2 1.6 3.2 16 32 160 320 1600 3200 16000 32000
p=0.3 2.1 4.2 21 42 210 420 2100 4200 21000 42000
p=0.4 2.4 4.8 24 48 240 480 2400 4800 24000 48000
p=0.5 2.5 5 25 50 250 500 2500 5000 25000 50000

The mean and variance above was computed to give a better undestading of the data.

Question 2

Just like for question 1 we must iterate through different values. Poisson distribution was calculated for different values of \(n\) and \(λ\) The values are as follow:

\(λ = 10, 15, 20\)

\(n=10, 100, 1,000, 10,000, 100,000\)

The code chunk below iterates through possibilities of \(λ\) and \(n\) to generate \(Y\)~\(Poisson(λ)\) distribution. The specific command to generate the plots was commented out, however, a supplementary pdf file was attached to the assignment including all the plotted graphs.

The unbiased estimate of Poisson distribution is the mean of the distribution. Within the same loop, the 95% confidence interval of unbiased estimate was computed. It’s noteworthy to mention, in order to compute the 95% CI, this formula was used: \(95\% \ CI = λ - 1.96 * \sqrt{λ/n}\) and \(95\% \ CI = λ + 1.96 * \sqrt{λ/n}\). Unlike the exactpoissonCI() function, this formula would take into account the different n values. Otherwise, the exactpoissonCI() function would give the same result for all the distributions that have same \(λ\) values but different \(n\).

library(exactci)
lambda <- c(10,15,20)
poiss.n <- c(10,100,1000,10000,100000)
poiss.data <- vector("list", length(poiss.n))
CI <- vector("list")
poiss.estimate <- vector("list")
poiss.counter = 1
par(mfrow = c(2,2))
for(x in 1:length(poiss.n)){
  for(y in lambda){
    poiss.data <- rpois(n = poiss.n[x], lambda = y)
    poiss.estimate[poiss.counter] <- mean(poiss.data)
    poiss.data <- table(poiss.data)
    #plot(poiss.data, main = bquote("Poisson(" ~ .(as.character(poiss.n[x])) ~ "," ~ .(as.character(y)) ~ ")"), xlab = "X", ylab = "Density")
    CI[[poiss.counter]] <- toString(formatC(c(y-1.96*sqrt(y/poiss.n[x]),y+1.96*sqrt(y/poiss.n[x]))), digits = 2, format = "f")
    poiss.counter = poiss.counter + 1
  }
}

This code chunk below organizes all the confidence intervals and unbiased estimates into two tables according to the appropriate \(n\) and \(λ\) values.

CInterval <- matrix(c(CI[1], CI[2], CI[3], CI[4], CI[5], CI[6], CI[7], CI[8], CI[9], CI[10], CI[11], CI[12],CI[13], CI[14], CI[15]), ncol = 3, byrow = TRUE)
colnames(CInterval) <- c("λ=10","λ=15","λ=20")
rownames(CInterval) <- c("n=10","n=100","n=1000","n=10000","n=100000")
poiss.estimate <- matrix(poiss.estimate, ncol = 3, byrow = TRUE)
colnames(poiss.estimate) <- c("λ=10","λ=15","λ=20")
rownames(poiss.estimate) <- c("n=10","n=100","n=1000","n=10000","n=100000")

Unbiased Estimator

λ=10 λ=15 λ=20
n=10 10.5 14.9 21.5
n=100 9.64 15.27 19.66
n=1000 10.093 14.811 20.065
n=10000 10.0302 15.0396 20.0123
n=100000 10.01345 14.99205 20.0137

The table above shows that the more we increase the \(n\), the closer the estimator gets to \(λ\) value.

95% Confidence Interval

λ=10 λ=15 λ=20
n=10 8.04, 11.96 12.6, 17.4 17.23, 22.77
n=100 9.38, 10.62 14.24, 15.76 19.12, 20.88
n=1000 9.804, 10.2 14.76, 15.24 19.72, 20.28
n=10000 9.938, 10.06 14.92, 15.08 19.91, 20.09
n=100000 9.98, 10.02 14.98, 15.02 19.97, 20.03

It can be inferred from the table above that for a specific \(λ\) value, as the \(n\) increases the 95% confidence interval gets narrower. Mathematically speaking, increase in n would make \(\sqrt{λ/n}\) smaller, thus the narrower smaller interval. Intuitively, increasing the number of values to be predicted (\(n\)) would ensure there will be more chances of getting a number closer to \(λ\) every time, thus narrowing the interval.