1 Introduction

I will construct two 95% confidence intervals for the mean of the variable Spices in the data set protein: one assuming a normal population distribution, the other using the bootstrap method. Spices measures the percentage of protein intake that spices account for in countries around the world (relative to other types of food). The sample contains 170 observations of this variable. The values range from 0.0000 to 1.8604, with a mean of 0.2318, a median of 0.1129, and a standard deviation of 0.3288.

2 Constructing a Confidence Interval for the Mean Assuming Population Normality

protein <- read.csv("https://raw.githubusercontent.com/pengdsci/sta321/main/ww02/w02-Protein_Supply_Quantity_Data.csv", header = TRUE)  # read in the data set
Spices <- protein$Spices          # vector of values for the variable Spices
n <- length(Spices)               # sample size
SpicesSD <- sd(Spices)            # sample standard deviation
CV.95 <- qnorm(0.975, 0, 1)       # standard normal critical value for a 95% CI
E <- CV.95 * SpicesSD / sqrt(n)   # margin of error under the assumption of normality
xbar <- mean(Spices)              # sample mean
LCL <- xbar - E                   # lower limit of the CI
UCL <- xbar + E                   # upper limit of the CI
CI <- cbind(LCL, UCL)             # combine lower and upper limits into the CI

Using this method, the 95% confidence interval for the mean percentage of protein intake for which spices account in countries around the world is [0.1824, 0.2812].
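As a quick arithmetic check, the interval can be reproduced from the rounded summary statistics reported in the introduction, without loading the data. The sketch below also computes the interval with a t critical value, which is a common alternative when the standard deviation is estimated from the sample. The t-based comparison is an addition for illustration and is not part of the original analysis, and the object names (`se`, `ci_z`, `ci_t`) are hypothetical.

```r
# Summary statistics as reported in the introduction (rounded to 4 decimals)
n    <- 170
xbar <- 0.2318
s    <- 0.3288

se     <- s / sqrt(n)            # standard error of the mean
z      <- qnorm(0.975)           # standard normal critical value (about 1.96)
t_crit <- qt(0.975, df = n - 1)  # t critical value with n - 1 degrees of freedom

ci_z <- xbar + c(-1, 1) * z * se       # interval using the normal critical value
ci_t <- xbar + c(-1, 1) * t_crit * se  # interval using the t critical value

print(round(ci_z, 4))  # 0.1824 0.2812, matching the interval reported above
print(round(ci_t, 4))  # slightly wider, since t_crit > z
```

With 170 observations the two critical values are nearly equal, so the choice between them has little practical effect on the interval here.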

3 Constructing a Confidence Interval for the Mean Using the Bootstrapping Method

B <- 1000                 # number of bootstrap samples
bootmean <- numeric(B)    # preallocated vector to store the means of the bootstrap samples
for (i in 1:B) {          # loop to generate the bootstrap samples
  # resample from the original sample with replacement, keeping the original sample size
  bootsample <- sample(Spices, length(Spices), replace = TRUE)
  bootmean[i] <- mean(bootsample)  # store the mean of each bootstrap sample
}
bootCI <- quantile(bootmean, c(0.025, 0.975))  # 2.5% and 97.5% quantiles of the bootstrap means define the CI
hist(bootmean)  # histogram of the bootstrap sampling distribution of the mean

The bootstrapping method yields a 95% confidence interval of [0.1854, 0.2839]. Its width (0.0985) is nearly identical to that of the first interval, constructed under an assumption of population normality (0.0988), but both of its limits sit about 0.003 higher. That is, the bootstrap interval places the plausible range for the mean percentage of protein intake accounted for by spices in countries around the world slightly higher. Because the resampling is random and no seed is set, these limits will change slightly each time the code is run, so a difference this small should not be over-interpreted.
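To see how much a percentile bootstrap interval can move from run to run, the resampling can be wrapped in a function that accepts a seed. This is a minimal sketch and not the code that produced the interval above: `boot_mean_ci` is a hypothetical helper, and the data are simulated right-skewed values used only as a stand-in for Spices.

```r
# Hypothetical helper (not part of the original analysis):
# percentile bootstrap confidence interval for the mean of x.
boot_mean_ci <- function(x, B = 1000, level = 0.95, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # fix the random number stream for reproducibility
  n <- length(x)
  # draw B resamples of size n with replacement and record each resample's mean
  boot_means <- replicate(B, mean(sample(x, n, replace = TRUE)))
  alpha <- 1 - level
  quantile(boot_means, c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

# Simulated right-skewed stand-in data; these are NOT the Spices values.
set.seed(1)
x <- rexp(170, rate = 4)

ci_a       <- boot_mean_ci(x, seed = 101)
ci_b       <- boot_mean_ci(x, seed = 202)  # different seed, same data
ci_a_again <- boot_mean_ci(x, seed = 101)  # same seed as ci_a

print(rbind(ci_a, ci_b))             # limits differ slightly between seeds
print(identical(ci_a, ci_a_again))   # same seed reproduces the same interval
```

Calling `boot_mean_ci(Spices, seed = ...)` on the real data would make a reported bootstrap interval reproducible. Increasing `B` would be expected to reduce the run-to-run variation in the limits.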