rnorm2 <- function(n,mean,sd) { mean+sd*scale(rnorm(n)) } # generate n values whose sample mean and sd equal the given mean and sd exactly
set.seed(1239)
r1 <- rnorm2(100,25,4)
r2 <- rnorm2(50,10,3)
samplingframe <- c(r1,r2) # combine r1 and r2
hist(samplingframe, breaks=20,col = "pink")
According to the code, r1 holds 100 values with mean 25 and standard deviation 4, and r2 holds 50 values with mean 10 and standard deviation 3. The histogram of the combined data shows a negatively skewed bimodal distribution: it is neither bell-shaped nor symmetric, with a large peak near 25 and a smaller peak near 10.
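The description above can be checked numerically. The following is a minimal sketch that rebuilds the sampling frame with the same seed and confirms the component means and standard deviations; the name pop_mean is introduced here for illustration only. Comparing the mean with the median is used as a rough indicator of negative skew, which is an assumption of this sketch rather than a formal skewness test.

```r
# Rebuild the sampling frame exactly as in the code above.
rnorm2 <- function(n, mean, sd) { mean + sd * scale(rnorm(n)) }
set.seed(1239)
r1 <- rnorm2(100, 25, 4)
r2 <- rnorm2(50, 10, 3)
samplingframe <- c(r1, r2)

# scale() standardizes the draws, so each component hits its targets
# exactly (up to floating-point error).
c(mean(r1), sd(r1))   # 25 and 4
c(mean(r2), sd(r2))   # 10 and 3

# The pooled mean is the size-weighted average of the component means:
# (100 * 25 + 50 * 10) / 150 = 20.
pop_mean <- mean(samplingframe)

# A mean below the median is consistent with a longer left tail.
pop_mean < median(samplingframe)
```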
Define the function first.
samp1m <- function(si, sa){
  samp1 <- function(si){
    samp <- sample(samplingframe, size = si, replace = FALSE) # draw "si" values without replacement
    return(mean(samp)) # return the mean of the sample
  }
  set.seed(1239) # set the seed so the results are reproducible
  replicate(sa, samp1(si)) # repeat the sampling "sa" times and collect the means
}
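As a quick usage check, the sketch below calls a condensed but equivalent version of the function and inspects what it returns. The variable name means15 is hypothetical. Because the seed is set inside the function, two calls with the same arguments should return identical results.

```r
# Rebuild the sampling frame with the same seed as above.
rnorm2 <- function(n, mean, sd) { mean + sd * scale(rnorm(n)) }
set.seed(1239)
samplingframe <- c(rnorm2(100, 25, 4), rnorm2(50, 10, 3))

# Condensed version of samp1m: "sa" sample means, each from a sample of size "si".
samp1m <- function(si, sa) {
  set.seed(1239)  # fixed seed, so repeated calls are reproducible
  replicate(sa, mean(sample(samplingframe, size = si, replace = FALSE)))
}

means15 <- samp1m(15, 50)
length(means15)                     # one mean per replicate
identical(means15, samp1m(15, 50))  # same seed, same result
```

One consequence of seeding inside the function is that every call restarts the random number stream, so calls with different sizes are not independent draws from one stream.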
Output a histogram of the means of 50 samples of size 15.
hist(samp1m(15,50), xlab = "Sample Mean", main = "Distribution of Sample Mean (Sample: 50, Size: 15)", col = "blue") # histogram of the means of 50 samples of size 15, with the x-axis label, title, and color set
Use the same function as in Part B. Set the size to 45, change the color to red, and output the result.
hist(samp1m(45,50), xlab = "Sample Mean", main = "Distribution of Sample Mean (Sample: 50, Size: 45)", col = "red") # histogram of the means of 50 samples of size 45, with the x-axis label, title, and color set
par(mfrow=c(1,2))
hist(samp1m (15,50), xlab = "Sample Mean", main = "Distribution of Sample Mean (Sample: 50, Size: 15)", col = "blue")
hist(samp1m (45,50), xlab = "Sample Mean", main = "Distribution of Sample Mean (Sample: 50, Size: 45)", col = "red")
The histogram from Part A, which is built from more data points, looks smoother than the other two. The distributions of sample means from the other two parts are closer to a normal distribution.
The Central Limit Theorem states that, for sufficiently large samples drawn from a population with finite variance, the distribution of the sample means is approximately normal, regardless of the shape of the population distribution. Additionally, the mean of the sample means is approximately equal to the population mean.
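The second part of that statement can be illustrated with the data from this exercise. This is a rough sketch, not a proof: it compares the average of the 50 sample means with the population mean for both sample sizes. The names pop_mean, center15, and center45 are introduced here for illustration.

```r
# Rebuild the sampling frame and the sampling function with the same seed.
rnorm2 <- function(n, mean, sd) { mean + sd * scale(rnorm(n)) }
set.seed(1239)
samplingframe <- c(rnorm2(100, 25, 4), rnorm2(50, 10, 3))

samp1m <- function(si, sa) {
  set.seed(1239)
  replicate(sa, mean(sample(samplingframe, size = si, replace = FALSE)))
}

pop_mean <- mean(samplingframe)    # 20 by construction
center15 <- mean(samp1m(15, 50))   # average of 50 sample means, size 15
center45 <- mean(samp1m(45, 50))   # average of 50 sample means, size 45

# Both averages should land close to the population mean.
round(c(population = pop_mean, size15 = center15, size45 = center45), 2)
```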
Yes. In this exercise, the distribution of sample means for size 45 is visibly closer to a normal distribution than the one for size 15. This supports the theory that the larger the sample size, the closer the distribution of sample means is to a normal distribution.
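The comparison can also be made in numbers rather than by eye. The sketch below assumes the standard error formula for sampling without replacement from a finite population, sigma / sqrt(n) * sqrt((N - n) / (N - 1)), and sets it against the observed spread of the sample means. The helper se_fpc and the data frame spread are hypothetical names, and with only 50 replicates the observed values are expected to match the theory only roughly.

```r
# Rebuild the sampling frame and the sampling function with the same seed.
rnorm2 <- function(n, mean, sd) { mean + sd * scale(rnorm(n)) }
set.seed(1239)
samplingframe <- c(rnorm2(100, 25, 4), rnorm2(50, 10, 3))

samp1m <- function(si, sa) {
  set.seed(1239)
  replicate(sa, mean(sample(samplingframe, size = si, replace = FALSE)))
}

N <- length(samplingframe)
# Population standard deviation (divisor N, not N - 1).
sigma <- sqrt(mean((samplingframe - mean(samplingframe))^2))

# Standard error of the mean with the finite population correction.
se_fpc <- function(n) sigma / sqrt(n) * sqrt((N - n) / (N - 1))

spread <- data.frame(
  size           = c(15, 45),
  observed_sd    = c(sd(samp1m(15, 50)), sd(samp1m(45, 50))),
  theoretical_se = c(se_fpc(15), se_fpc(45))
)
print(spread)
```

A smaller spread for size 45 than for size 15 is what the theory predicts: the sample means concentrate more tightly around the population mean as the sample size grows.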