Ch 6 Lab

Page 4

This code is simulating 100 samples, each of sizes 5, 25, and 100. Vectors are created for the storage of the simulated means. Then in a for loop, for every simulation a sample of 5 is drawn from an exponential distribution with lambda=2. The mean is taken of this sample and stored in the vector mean5 in the ith simulation position. The same is done for samples of 25 and 100 and stored in mean25 and mean100.

set.seed(12345)

lambda<-2
numsim<-100

#creating vectors
mean5<-rep(0,numsim)
mean25<-mean5
mean100<-mean5

#for loop to fill vectors with means of samples
for (i in 1:numsim){
  mean5[i]<-mean(rexp(5,lambda))
  mean25[i]<-mean(rexp(25,lambda))
  mean100[i]<-mean(rexp(100,lambda))
  
}
#boxplot for distribution of means
boxplot(mean5,mean25,mean100,names=c("n=5","n=25","n=100"))
title("Distribution of Mean of Exponentials")

y <- seq(0,1,.1)

n <- c(5,25,100)
mean <- c(mean(mean5),mean(mean25),mean(mean100))
sd <- c(sd(mean5),sd(mean25),sd(mean100))
kable(cbind(n,mean,sd)) %>%
  kable_paper("hover", full_width = F)

n	mean	sd
5	0.5352591	0.2551066
25	0.5027771	0.1011640
100	0.4996425	0.0458466

Page 5
First it creates three vectors: x4, x5, and x6 with 3 zeros each contained in them. Then a 3x5 matrix is created filled with randomly generated numbers from N(0,1). A table is made with this matrix and the three vectors as columns. Then to occupy the 6th, 7th, and 8th columns (x4, x5, and x6), the mean, 20% trimmed mean (manually constructed), and median functions are applied to the values in the first row of columns 1 to 5, the randomly generated sample. Vectors are then created for theta, bias, SD, and MSE with three zeros in each. Another for loop is used to enter these values.

set.seed(12345)

s <- 3
n <- 5

#creating vectors
x4 <- rep(0,s)
x5 <- rep(0,s)
x6 <- rep(0,s)

#creating the matrix that will hold the sample values
mymatrix <- matrix(rnorm(s*n), nrow=s,ncol=n)

#creating the table holding the samples and estimators
mydata <- cbind(mymatrix,x4,x5,x6)

#filling the estimator columns with the estimator values
mydata[,6] <- apply(mydata[,1:5],1,mean)
mydata[,7] <- apply(mydata[,1:5],1,function(s){mean(s,trim=.2)} )
mydata[,8] <- apply(mydata[,1:5],1,median)

#creating vectors for other measures
thetaMC <- rep(0,3)
biasMC<- rep(0,3)
SDMC<- rep(0,3)
MSEMC<- rep(0,3)

#for loop for filling these vectors
for (i in 1:3) {
  thetaMC[i] <- mean(mydata[,n+i])
  biasMC[i] <- thetaMC[i] - 0
  SDMC[i] <- sd(mydata[,n+i])
  MSEMC[i] <- (SDMC[i])^2 + (biasMC[i])^2
}
estimator <- c("sample mean", "trimmed mean", "median")

kable(round(mydata,3)) %>%
  kable_paper("hover", full_width = F)

					x4	x5	x6
0.586	-0.453	0.630	-0.919	0.371	0.043	0.168	0.371
0.709	0.606	-0.276	-0.116	0.520	0.289	0.337	0.520
-0.109	-1.818	-0.284	1.817	-0.751	-0.229	-0.381	-0.284

kable(cbind(estimator,round(thetaMC,3),round(biasMC,3),round(SDMC,3),round(MSEMC,3)))%>%
  kable_paper("hover", full_width = F)

estimator
sample mean	0.034	0.034	0.259	0.068
trimmed mean	0.041	0.041	0.375	0.143
median	0.202	0.202	0.428	0.224

Page 8
This generates 1000 samples of size 2 from N(0,1). The for loop stores the samples in x, takes the absolute value of the difference, and calculates the MSE. The mean is then taken of the difference, resulting in the MC estimate. The Bias, SE, theoretical variance, and MSE are also calculated.

m <- 1000
mu <- 0
g <- numeric(m)
MSE.Num<-numeric(m)

#for loop to generate samples of size 2, take the abs value of the difference, and finding the MSE of the numerator
for (i in 1:m) {
    x <- rnorm(2)
    g[i] <- abs(x[1] - x[2])
    MSE.Num[i]<-(g[i]-mu)^2 # MSE calculation of the numerator
       
}
#finding the mean difference and bias
est <- mean(g)
est

## [1] 1.110427

bias.g<-est-0 # Monte Carlo Bias, note mu ==0

bias.g

## [1] 1.110427

#finding SE and variance
MC.SE<-sd(g) # MC Standard error
var.theor<-2-(4/pi) # theroetical variance

#finding the MSE
MC.MSE<-sum(MSE.Num)/m # Monte Carlo MSE


kable(cbind(c("estimate","bias","MC Standard Error","theoretical variance","MC MSE"),round(c(est,bias.g,MC.SE,var.theor,MC.MSE),3))) %>%
  kable_paper("hover", full_width = F)

estimate	1.11
bias	1.11
MC Standard Error	0.846
theoretical variance	0.727
MC MSE	1.948

Page 10

This generates samples of 20 from N(0,2). The variance of the sample is multiplied by 19 and divided by a \(\chi ^2\) sample generated from \(\chi ^2(.05,19)\) to produce the 95% upper confidence limit. This is done 1000 times. The mean is taken of those samples that contain 4, the variance. The resulting number is the coverage probability/empirical confidence level.

n <- 20
alpha <- .05

#replicates 1000 times 
#generates from normal and chisq to calculate the upper confidence limit
UCL <- replicate(1000, expr = {
    x <- rnorm(n, mean = 0, sd = 2)
    (n-1) * var(x) / qchisq(alpha, df = n-1) } )

#compute the mean of those UCL values greater than 4 to get the percentage of times the variance is contained in the interval, i.e. confidence level
mean(UCL > 4)

## [1] 0.948

Ch 6 Lab

Amelia McCarthy

5/5/2021