Introduction

Over the recent years, the poultry industry is growing rapidly and so is the demand. To get the full potential out of broiler chickens it is important that the target weight at 7 days of age should be reached.These days the interest in early nutrition research has increased due to the high correlation between the weight and the diet given during the initial stage.

In this project we aim to find out if the different protein diets have different effects on the weights of the chicks.

Data

The data provides information on the body weights of the chicks measured at birth and every second day thereafter until day 20. The weight is also measured on day 21. There are four groups of chicks on different protein diets.

The data set includes the following variables:

Analysis

library(tidyr)
head(ChickWeight)
##   weight Time Chick Diet
## 1     42    0     1    1
## 2     51    2     1    1
## 3     59    4     1    1
## 4     64    6     1    1
## 5     76    8     1    1
## 6     93   10     1    1

Empirical CDF

The empirical CDF is plotted below

Bootstrap standard errors and confidence intervals

The estimated sample mean is found to be 121.82 gm. After performing bootstrap to find the standard error, we get a result of 2.94 gm. At a confidence level of 0.95 , we get the following confidence intervals for the mean :

Pivotal gives the best approximate for the confidence interval hence we can say that the sample mean is 121.82 gms and 95% of the times the interval 116.15gms to 127.61gms will have the population mean.

library(bootstrap)
data1=ChickWeight$weight
theta <- function(x){
mean(x)
}
results <- bootstrap(x=data1,3200,theta)
th.hat=mean(data1)#theta hat
Tboot<-results$thetastar#vector of bootstrap results
se.boot <- sqrt(var(Tboot))
##Normal
CI.normal<-c(th.hat-2*se.boot, th.hat+2*se.boot)
##Pivotal
CI.pivotal<-2*th.hat-quantile(Tboot,probs = c(0.975, 0.025))
#Quantile
CI.quantile<-quantile(Tboot,probs = c(0.025, 0.975) )
hist(Tboot,probability = T)

MLE and its asymptotic distributions

The MLE of the mean is 121.82 gms The asymptotic distribution is shown below. It can be seen that asymptotically the MLE of mean follows a normal distribution.

n<-length(data1)
mean.mle<-mean(data1)
sigma_hat<-sqrt((1/n)*sum((data1-mean(data1))^2))
data.norm<-rnorm(10000,mean.mle,sigma_hat)
hist(data.norm)

#

Hypothesis Testing

Hypothesis test is conducted to see if there is any significant difference in the weights of the chicks that were on two different diets.

diet.1<-dplyr::filter(ChickWeight, Diet==1)
diet.2<-dplyr::filter(ChickWeight, Diet==2)

mean.diet.1<- mean(diet.1$weight)
mean.diet.2<- mean(diet.2$weight)
sd.diet.1<-sd(diet.1$weight)
sd.diet.2<-sd(diet.2$weight)

z<-(mean.diet.1-mean.diet.2)/sqrt(((sd.diet.1^2)/length(diet.1$weight))+((sd.diet.2^2)/length(diet.2$weight)))

c<-qnorm(0.975)

hyp<-abs(z)>c
#We reject the null hypotheses

diff<-mean.diet.1-mean.diet.2
se.hat<-sqrt(((sd.diet.1^2)/length(diet.1$weight))+((sd.diet.2^2)/length(diet.2$weight)))
CI<-c(diff-2*se.hat,diff+2*se.hat)

Wald Test

Wald Test is used to compare the means of the two samples when the parameter is known. The results show the test statistic as -2.63 and the critical value is 1.96 . As the absolute value of the observed test statistic is more extreme than the critical value, there is strong evidence against the null. Hence, we reject the null hypothesis at a significance level of 0.5% and conclude that both the means are different.

#wald_test=(mean(data1)-mu_hat)/sqrt(mean(data1/n))

Bayesian analysis

In the Bayesian framework:

  1. We choose a probability density (prior distribution)-that expresses our beliefs about a parameter.

  2. We choose a statistical model that reflects our beliefs about the distribution given the parameter.

  3. After observing the data, we update our beliefs and calculate the posterior distribution.

Here, we have considered the prior to be 𝑓(𝜇)=1. For a normal distribution, the posterior distribution follows N(𝑥̅,𝜎2𝑛) where 𝑥̅ is the mean of the data, 𝜎2𝑛 is the variance. On approximating the posterior distribution using simulation (1000 times), we get the mean to be 121.818 gms.

simu.mu=rnorm(1000,mean = mean(data1), sd = 1/sqrt(length(diet.1$weight)))
hist(simu.mu, probability = T)

mean.bayesian<-mean(simu.mu)

Conclusions:

  1. Using MLE, the mean weight of the chicks was found to be 121.82 gms and this result is accurate within 116.15 and 127.61 gms 95% of the time.
  2. We also conducted a hypothesis test to see if there is any significant difference in the weights of the chicks that were on two different diets.