Task 1: Visualizing the Exponential Distribution

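We first create a \(1000 \times 40\) matrix of random draws from an exponential distribution with rate 0.2.

The whole matrix can be treated as the population, while each row is treated as an independent random sample of size 40 from that population. The density of the exponential distribution with rate \(\lambda\) is \(f(x) = \lambda e^{-\lambda x}\) for \(x \ge 0\); here \(\lambda = 0.2\).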
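As a quick sanity check on the formula, the sketch below compares the density written out by hand with R's built-in dexp() and then integrates numerically to recover the mean and standard deviation, both of which equal one over the rate for an exponential distribution. The names lambda, x, manual_density, theo_mean and theo_var are illustrative and not part of the original analysis; the rate 0.2 is the value passed to rexp() in the simulation.

```r
# Rate parameter; 0.2 is the value used in the simulation
lambda <- 0.2

# Density written out by hand: rate * exp(-rate * x) for x >= 0
x <- seq(0, 30, by = 0.5)
manual_density <- lambda * exp(-lambda * x)

# TRUE if the hand-written formula agrees with R's exponential density
all.equal(manual_density, dexp(x, rate = lambda))

# Mean and variance by numerical integration over [0, Inf)
theo_mean <- integrate(function(t) t * dexp(t, rate = lambda), 0, Inf)$value
theo_var <- integrate(function(t) (t - theo_mean)^2 * dexp(t, rate = lambda), 0, Inf)$value

# Both entries should come out at about 1 / lambda = 5
c(mean = theo_mean, sd = sqrt(theo_var))
```

These theoretical values give a reference point for the simulated mean and standard deviation reported further down.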

Plotting the first sample of 40 observations gives:

# 1000 rows (samples) of 40 exponential draws each, rate = 0.2
mat <- matrix(rexp(40 * 1000, rate = 0.2), nrow = 1000, ncol = 40)
hist(mat[1, ], xlab = "First 40 samples", main = "First 40 samples from exp distribution")

Plotting all 1000 samples of 40 observations (40,000 values in total) gives:

library(ggplot2)
# Flatten the matrix into a single column so ggplot can map it to x
pop <- data.frame(x = as.vector(mat))
g <- ggplot(data = pop, aes(x = x))
h <- g + geom_histogram(color = "red", fill = "blue") +
  labs(x = "All 40 k samples", title = "Whole population of exp distribution", y = "Frequency")
popsd <- sd(as.vector(mat))
popmean <- mean(as.vector(mat))

# Add vertical lines at the mean and at one standard deviation on either side
i <- h +
  geom_vline(aes(xintercept = popmean), col = "green", lwd = 1, linetype = "dashed") +
  geom_vline(aes(xintercept = popmean + popsd), col = "black", lwd = 1, linetype = "dashed") +
  geom_vline(aes(xintercept = popmean - popsd), col = "black", lwd = 1, linetype = "dashed")
i

The population mean and standard deviation are, respectively:

popmean ; popsd

## [1] 5.056989

## [1] 5.033594

Now we plot the average of each row:

# One mean per row, i.e. one mean per sample of 40
mns <- apply(mat, 1, mean)
distsd <- sd(mns)
distmean <- mean(mns)

g <- ggplot(data = data.frame(x = mns), aes(x = x))
h <- g + geom_histogram(color = "red", fill = "blue") +
  labs(x = "1000 means of the 40 samples", title = "Sampling distribution of the sample means", y = "Frequency")

# Add vertical lines at the mean and at one standard deviation on either side
i <- h +
  geom_vline(aes(xintercept = distmean), col = "green", lwd = 1, linetype = "dashed") +
  geom_vline(aes(xintercept = distmean + distsd), col = "black", lwd = 1, linetype = "dashed") +
  geom_vline(aes(xintercept = distmean - distsd), col = "black", lwd = 1, linetype = "dashed")
i

The mean and standard deviation of the sampling distribution are, respectively:

distmean ; distsd

## [1] 5.056989

## [1] 0.7653707
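One way to judge how close the distribution of the sample means is to a normal curve is to overlay the normal density suggested by the central limit theorem and to draw a normal Q-Q plot. The following is a minimal base-graphics sketch, not part of the original analysis: the seed is arbitrary (so the numbers will not match the output shown here), and clt_mean and clt_sd are illustrative names for the theoretical mean and standard deviation of the sample mean.

```r
set.seed(1)  # arbitrary seed, used only to make this sketch repeatable

# Same setup as in the text: 1000 samples of 40 draws, rate = 0.2
mat <- matrix(rexp(40 * 1000, rate = 0.2), nrow = 1000, ncol = 40)
mns <- rowMeans(mat)

# Normal approximation for the sample mean: mean 1/rate, sd (1/rate)/sqrt(n)
clt_mean <- 1 / 0.2
clt_sd <- (1 / 0.2) / sqrt(40)

# Histogram on the density scale with the normal curve drawn on top
hist(mns, freq = FALSE, breaks = 30,
     xlab = "Sample means", main = "Sample means with normal approximation")
curve(dnorm(x, mean = clt_mean, sd = clt_sd), add = TRUE, col = "green", lwd = 2)

# Normal Q-Q plot: points close to the line indicate approximate normality
qqnorm(mns)
qqline(mns)
```

With samples of size 40 the fit is usually close, though a slight right skew inherited from the exponential distribution may still be visible in the upper tail of the Q-Q plot.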

Analysis

As we can see, the sampling distribution looks approximately normal. Also:

The population mean and the mean of the sampling distribution are identical (both 5.056989). This is expected: every sample has the same size, so the mean of the row means equals the mean of all 40,000 values.
The standard deviation of the sampling distribution should be roughly equal to the standard deviation of the population divided by the square root of the sample size (\(n = 40\)). We verify this using R.

popsd; distsd; popsd/sqrt(40)

## [1] 5.033594

## [1] 0.7653707

## [1] 0.7958811

The two values, 0.765 and 0.796, differ by about 4% and can be treated as nearly equal.
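The relationship checked here follows from the independence of the observations within a sample. As a short derivation, writing \(\sigma\) for the population standard deviation, \(X_1, \dots, X_n\) for the observations in one sample and \(\bar{X}\) for their mean (symbols introduced here for illustration):

\[
\operatorname{Var}(\bar{X}) = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i) = \frac{\sigma^2}{n},
\qquad
\operatorname{sd}(\bar{X}) = \frac{\sigma}{\sqrt{n}}.
\]

For an exponential distribution with rate 0.2, the theoretical standard deviation is \(\sigma = 1/0.2 = 5\), so with \(n = 40\) the theoretical value is \(5/\sqrt{40} \approx 0.791\), which lies between the two simulated figures.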