FA 13 - CLT

MD AND EVAN H

LECTURE NOTES

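n=25     # number of die tosses averaged in each replication
m=1000   # number of replications

# rdu(n): mean of n tosses of a fair six-sided die
rdu<-function(n) mean(sample(1:6,n,replace=TRUE))
dat<-sapply(rep(n,m), rdu)   # m simulated sample means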

cat("The True Mean is:", 3.5, "\nThe Simulated Mean is:", mean(dat),
    "\n\nThe True Standard Deviation is:", 1.71/sqrt(n),
    "\nThe Simulated Standard Deviation is: ", sd(dat),
    "\n\nA Histogram is given by:\n")
## The True Mean is: 3.5 
## The Simulated Mean is: 3.49136 
## 
## The True Standard Deviation is: 0.342 
## The Simulated Standard Deviation is:  0.3429119 
## 
## A Histogram is given by:
hist(dat,main=paste("Histogram of the Average of",n,"Tosses of a Die \nReplicated",m,"times with CLT Normal Distribution"),col="blue",prob=TRUE)
curve(dnorm(x,3.5,1.71/sqrt(n)),add=TRUE,col="red")
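
The constants 3.5 and 1.71 in the code above are the mean and standard deviation of a single toss of a fair die. As a quick check, the sketch below recomputes them from the six faces; the variable names `faces`, `die_mean`, `die_var`, and `die_sd` are introduced here for illustration and do not appear in the notes.

```r
# Faces of a fair six-sided die, each with probability 1/6
faces <- 1:6

die_mean <- mean(faces)                  # expected value of one toss
die_var  <- mean((faces - die_mean)^2)   # population variance (divide by 6, not 5)
die_sd   <- sqrt(die_var)                # population standard deviation

cat("Die mean:", die_mean, "\nDie SD:", die_sd, "\n")
```

The exact standard deviation is sqrt(35/12), about 1.708, which the notes round to 1.71. Note that `sd(faces)` would give a different value because it divides by 5 instead of 6.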

######################################

1. A sample of size n is to be drawn from a population with a known mean of μ = 25.4 and a standard deviation of σ = 2.8. Using the Central Limit Theorem …

a. What is the probability that the average of the n observations is less than 25.3 when …

1a. n=10?

n=10
m=10000

rdu<-function(n) mean(sample(1:6, n, replace = TRUE))
dat<-sapply(rep(n, m), rdu)

cat("The True Mean is:",3.5,
    "\nThe Simulated Mean is:", mean(dat),
    "\n\nThe True Standard Deviation is:", 1.71 / sqrt(n),
    "\nThe Simulated Standard Deviation is:", sd(dat),
    "\n\nEstimated P(average < 2.53):", mean(dat < 2.53),
    "\nCLT Normal Approx P(average < 2.53):", pnorm(2.53, mean = 3.5, sd = 1.71 / sqrt(n)),
    "\n")
## The True Mean is: 3.5 
## The Simulated Mean is: 3.49508 
## 
## The True Standard Deviation is: 0.5407495 
## The Simulated Standard Deviation is: 0.5373553 
## 
## Estimated P(average < 2.53): 0.0385 
## CLT Normal Approx P(average < 2.53): 0.03642202
hist(dat,
     main=paste("Histogram of the Average of",n,"Dice Tosses\nReplicated",m,"times"),
     col="blue", prob = TRUE)
curve(dnorm(x,3.5,1.71/sqrt(n)),add=TRUE,col="red")

2a. n=25?

n=25
m=10000
rdu<-function(n) mean(sample(1:6,n,replace=TRUE))
dat<-sapply(rep(n,m),rdu)
cat("True Mean:",3.5,"\nSim Mean:",mean(dat),"\nTrue SD:",1.71/sqrt(n),"\nSim SD:",sd(dat),"\n")
## True Mean: 3.5 
## Sim Mean: 3.504608 
## True SD: 0.342 
## Sim SD: 0.3398365
hist(dat,main=paste("Histogram of Average of",n,"Dice Tosses\n",m,"Replications"),col="blue",prob=TRUE)
curve(dnorm(x,3.5,1.71/sqrt(n)),add=TRUE,col="red")

# The histogram and the CLT normal curve are both drawn here, as in parts 1a and 3a.

3a. n=50?

n=50
m=10000
dat<-sapply(rep(n,m),rdu)
cat("True Mean:",3.5,"\nSim Mean:",mean(dat),"\nTrue SD:",1.71/sqrt(n),"\nSim SD:",sd(dat),"\n")
## True Mean: 3.5 
## Sim Mean: 3.497906 
## True SD: 0.2418305 
## Sim SD: 0.2442692
hist(dat,main=paste("Histogram of Average of",n,"Dice Tosses\n",m,"Replications"),col="blue",prob=TRUE)
curve(dnorm(x,3.5,1.71/sqrt(n)),add=TRUE,col="red")

b. Comment on anything you notice about Pr(X̄<25.3) as n is increased

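The distribution of the sample means becomes more tightly concentrated around the true mean.
The variance of the sample mean decreases in proportion to 1/n.
The probability that the sample mean deviates significantly from the true mean (3.5 in the dice simulation) becomes smaller.
For the stated problem, 25.3 lies below μ = 25.4, so Pr(X̄<25.3) decreases as n is increased.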
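
The simulations above average dice tosses (mean 3.5), not draws from the population stated in problem 1. As a sketch of the direct CLT calculation for that problem, assuming X̄ is approximately normal with mean μ = 25.4 and standard deviation σ/sqrt(n) with σ = 2.8, the probabilities for the three sample sizes can be computed with `pnorm`. The names `mu`, `sigma`, `threshold`, `sizes`, and `clt_prob` are illustrative and not part of the original file.

```r
# Values taken from the problem statement
mu        <- 25.4   # population mean
sigma     <- 2.8    # population standard deviation
threshold <- 25.3   # cutoff for the sample mean
sizes     <- c(10, 25, 50)

# CLT approximation: the sample mean is roughly Normal(mu, sigma / sqrt(n)).
# pnorm is vectorized over sd, so this gives one probability per sample size.
clt_prob <- pnorm(threshold, mean = mu, sd = sigma / sqrt(sizes))
names(clt_prob) <- paste0("n=", sizes)

print(round(clt_prob, 3))
```

Because the cutoff is below the mean, each probability is under 0.5 and the values shrink as n grows, which matches the comments in part b.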

c. Using R, plot on the same graph the pdf of X̄ over the range X̄∈(24,27) for the three different sample sizes n. Make the plots three different colors. Copy and paste the plot into a Word document. (hint: use the following syntax in R …

curve(,,_,col="")
curve(,,_,col="",add=TRUE)
curve(,,_,col="",add=TRUE) )
Not needed: the plots are produced by the R code file, so run them from within that file, as instructed by Professor Matis.
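
For reference, the three-color plot that part c describes could be sketched as below, assuming the CLT normal approximation with μ = 25.4 and σ = 2.8 from problem 1. The helper `xbar_pdf` and the color choices are assumptions made for illustration; they are not taken from the original R file.

```r
# Values taken from the problem statement
mu    <- 25.4
sigma <- 2.8
sizes <- c(10, 25, 50)
cols  <- c("red", "blue", "darkgreen")

# Density of the sample mean under the CLT normal approximation
xbar_pdf <- function(x, n) dnorm(x, mean = mu, sd = sigma / sqrt(n))

# Draw the largest n first: it has the tallest peak, so the y-axis fits all curves
curve(xbar_pdf(x, sizes[3]), from = 24, to = 27, col = cols[3],
      xlab = "Sample mean", ylab = "Density",
      main = "CLT pdf of the sample mean for n = 10, 25, 50")
curve(xbar_pdf(x, sizes[2]), from = 24, to = 27, col = cols[2], add = TRUE)
curve(xbar_pdf(x, sizes[1]), from = 24, to = 27, col = cols[1], add = TRUE)
legend("topright", legend = paste("n =", sizes), col = cols, lty = 1)
```

Plotting the n = 50 curve first is a design choice: `curve` sets the y-axis from the first call, and later calls with `add=TRUE` do not rescale it.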