Sampling Error App

Demonstrating the alpha risk in sampling

Pieter Musters

1 Introduction

This app demonstrates the statistical concept of Type I error, or the alpha risk: there is always a chance that, due to bad luck, the confidence interval of the mean of a sample does not include the actual average value of the population. This risk is the alpha risk and is usually set at 5%.

The plot below visualises this risk in the context of hypothesis testing.

[Figure: the alpha risk in hypothesis testing]
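The claim that the interval misses the population average in roughly a fraction alpha of all samples can be checked with a quick simulation. The sketch below is not part of the app: the helper name MissRate and the use of a t-based interval are assumptions made for illustration.

```r
# Hypothetical helper (not part of the app): fraction of samples whose
# t-based confidence interval for the mean fails to cover the true mean of 0.
MissRate <- function(stdev, nsamples, nobs, alpha) {
  # One column per sample, one row per observation
  X <- matrix(rnorm(nsamples * nobs, 0, stdev), nrow = nobs, ncol = nsamples)
  m <- colMeans(X)
  se <- apply(X, 2, sd) / sqrt(nobs)
  half <- qt(1 - alpha / 2, df = nobs - 1) * se
  # An interval misses when it lies entirely above or entirely below zero
  mean(m - half > 0 | m + half < 0)
}

set.seed(1234)
rate <- MissRate(stdev = 3, nsamples = 10000, nobs = 50, alpha = 0.05)
print(rate)  # expected to be close to 0.05
```

With many samples the observed miss rate should settle near the chosen alpha, whatever the standard deviation.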

2 Sampling

The app samples with a function SimulateData from a normal distribution with a mean of zero and a standard deviation as defined by the user. The user can define the number of samples, the number of observations within each sample and the alpha risk. If seed=TRUE, the seed is fixed with set.seed(1234), so that multiple simulations can be compared.

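# Draw nsamples samples of nobs observations each from N(0, stdev^2).
# alphavalue is not used in the sampling itself.
SimulateData<-function(stdev,nsamples,nobs,alphavalue,seed){
  # Fix the seed so that repeated simulations can be compared
  if (seed) {set.seed(1234)}
  # One column per sample, one row per observation
  Simulations<-matrix(rnorm(nsamples*nobs,0,stdev),nrow=nobs,ncol=nsamples)
  return(Simulations)
}
X<-SimulateData(3,100,50,5,FALSE)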

The sampling matrix X is a random sample from the normal distribution, with nsamples columns and nobs rows. This is evident from the dimensions of X:

dim(X)
## [1]  50 100
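The effect of the seed argument can be checked directly. The following sketch repeats the function definition so that it runs on its own, and compares repeated calls; the variable names A, B, C and D are only illustrative.

```r
SimulateData <- function(stdev, nsamples, nobs, alphavalue, seed) {
  if (seed) {set.seed(1234)}
  matrix(rnorm(nsamples * nobs, 0, stdev), nrow = nobs, ncol = nsamples)
}

# With seed=TRUE both calls restart the generator at 1234, so they agree
A <- SimulateData(3, 100, 50, 5, TRUE)
B <- SimulateData(3, 100, 50, 5, TRUE)
print(identical(A, B))

# With seed=FALSE the second call continues the random stream, so it differs
C <- SimulateData(3, 100, 50, 5, FALSE)
D <- SimulateData(3, 100, 50, 5, FALSE)
print(identical(C, D))
```

Fixing the seed is what makes it possible to change one input, such as the alpha risk, while keeping the underlying samples the same.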

3 Plotting

We use the error.bars function from the psych package to plot the mean of each column with its confidence interval. The population average at zero is plotted as a blue horizontal line.

library(psych)
error.bars(X,eyes=FALSE,xlab="Sample number",ylab="Average",alpha=0.05,main="Confidence intervals")  
abline(h=0,col="blue",lwd=2)

[Figure: confidence intervals of the sample means, with the population average at zero as a blue horizontal line]
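The number of intervals that miss the population average can also be counted rather than read off the plot. The sketch below assumes the usual t-based interval, mean ± t·sd/√n; error.bars may compute its intervals slightly differently, and MissedSamples is a hypothetical helper name that does not appear in the app.

```r
# Hypothetical helper: indices of the columns of X whose t-based confidence
# interval for the mean does not contain the population average mu.
MissedSamples <- function(X, alpha = 0.05, mu = 0) {
  n <- nrow(X)
  m <- colMeans(X)
  half <- qt(1 - alpha / 2, df = n - 1) * apply(X, 2, sd) / sqrt(n)
  which(m - half > mu | m + half < mu)
}

# Same kind of matrix as in the text: 100 samples of 50 observations, sd 3
set.seed(1234)
X <- matrix(rnorm(100 * 50, 0, 3), nrow = 50, ncol = 100)
missed <- MissedSamples(X, alpha = 0.05)
print(missed)          # sample numbers whose interval misses zero
print(length(missed))  # on average about 5 of 100 at alpha = 0.05
```

The count varies from run to run; only its long-run average is tied to the alpha risk.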

4 Using the app