1. Introduction

This report examines the relationship between Calories and Sugar content (in grams) in a sample of 36 breakfast cereals. Both frequentist and Bayesian methods are used to assess the correlation between the two variables.

2. Load the Data and Necessary Libraries

2.1 Load the Data

##                  Cereal Calories Sugar Fiber
## 1 Common Sense Oat Bran      100     6     3
## 2            Product 19      100     3     1
## 3   All Bran Xtra Fiber       50     0    14
## 4            Just Right      140     9     2
## 5     Original Oat Bran       70     5    10
## 6             Heartwise       90     5     6
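The loading step itself does not appear in the rendered output. A minimal sketch of how it might look, assuming the data are stored as a CSV file with the four columns displayed above. The temporary file used here is a hypothetical stand-in that holds only the six rows shown, not the full 36-cereal data set:

```r
# Hypothetical stand-in for the real data file: only the six rows shown above.
csv_lines <- c(
  "Cereal,Calories,Sugar,Fiber",
  "Common Sense Oat Bran,100,6,3",
  "Product 19,100,3,1",
  "All Bran Xtra Fiber,50,0,14",
  "Just Right,140,9,2",
  "Original Oat Bran,70,5,10",
  "Heartwise,90,5,6"
)
csv_path <- tempfile(fileext = ".csv")  # assumed path; replace with the real file
writeLines(csv_lines, csv_path)

# Read the file, then inspect the first rows and the column types.
cereal_data <- read.csv(csv_path, stringsAsFactors = FALSE)
head(cereal_data)
str(cereal_data)
```

Checking `str()` right after loading confirms that Calories, Sugar and Fiber were read as numeric columns before any correlation is computed.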

3. a) Scatterplot and Correlation Coefficient

We begin with a scatterplot of Calories against Sugar and then calculate the Pearson correlation coefficient between the two variables.

Create a scatterplot of Calories vs Sugar

Calculate the Pearson correlation coefficient

## [1] 0.5154008
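The two steps above can be sketched as follows. This is an illustration only: it uses the Sugar and Calories values of the six cereals listed in Section 2.1, so its result will not match the full-sample coefficient reported above. The helper `pearson_r` and the choice of Sugar for the x-axis are assumptions made for this example.

```r
# Illustrative subset: the six cereals listed in Section 2.1, not all 36.
sugar    <- c(6, 3, 0, 9, 5, 5)
calories <- c(100, 100, 50, 140, 70, 90)

# Scatterplot. The pdf()/dev.off() wrapper only lets the sketch run as a
# plain script; inside an R Markdown chunk plot() alone is enough.
pdf(tempfile(fileext = ".pdf"))
plot(sugar, calories,
     xlab = "Sugar (g)", ylab = "Calories",
     main = "Calories vs Sugar")
dev.off()

# Pearson's r from its definition: the sum of cross-products of deviations,
# divided by the square root of the product of the sums of squared deviations.
pearson_r <- function(x, y) {
  dx <- x - mean(x)
  dy <- y - mean(y)
  sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
}

r_manual  <- pearson_r(sugar, calories)
r_builtin <- cor(sugar, calories, method = "pearson")
all.equal(r_manual, r_builtin)  # the two computations should agree
```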

4. b) Hypothesis Test for Correlation

We conduct a hypothesis test to determine whether Calories and Sugar are significantly correlated. The null hypothesis is that the correlation is zero in the population of all breakfast cereals; the alternative is two-sided (the true correlation is not equal to 0).

Conduct a hypothesis test for correlation

## 
##  Pearson's product-moment correlation
## 
## data:  cereal_data$Sugar and cereal_data$Calories
## t = 3.5069, df = 34, p-value = 0.001296
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.2249563 0.7217280
## sample estimates:
##       cor 
## 0.5154008

Extract the p-value and 95% confidence interval

## [1] 0.001295907

## [1] 0.2249563 0.7217280
## attr(,"conf.level")
## [1] 0.95
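As a check on the output above, the test statistic, p-value and confidence interval can be recomputed from the reported correlation and sample size alone. This sketch uses the standard t statistic for a Pearson correlation and the Fisher z-transformation for the interval. The variable names are chosen for this example.

```r
# Values reported in the output above.
r <- 0.5154008   # sample correlation
n <- 36          # number of cereals

# t statistic with n - 2 degrees of freedom, and the two-sided p-value.
df     <- n - 2
t_stat <- r * sqrt(df) / sqrt(1 - r^2)
p_val  <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval: transform r to the z scale with atanh(),
# add and subtract the normal margin, then transform back with tanh().
z    <- atanh(r)
se_z <- 1 / sqrt(n - 3)
ci   <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

round(c(t = t_stat, p = p_val), 6)
round(ci, 4)
```

With the object returned by `cor.test()`, the same quantities are available as its `statistic`, `p.value` and `conf.int` components.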

5. c) Bayesian Correlation Test

Using the bayes.cor.test function from the BayesianFirstAid package, we perform a Bayesian analysis to estimate the population correlation coefficient. This approach provides a 95% credible interval for the correlation and the posterior probability that the correlation is positive.

## 
##  Bayesian First Aid Pearson's Correlation Coefficient Test
## 
## data: cereal_data$Sugar and cereal_data$Calories (n = 36)
## Estimated correlation:
##   0.49 
## 95% credible interval:
##   0.20 0.72 
## The correlation is more than 0 by a probability of 0.999 
## and less than 0 by a probability of 0.001

Plot the Bayesian analysis results
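A hedged sketch of how the fit and the plot might be produced. It assumes the BayesianFirstAid package (distributed via GitHub, not CRAN) and a working JAGS installation are available. The wrapper `run_bayes_cor` is a hypothetical helper written for this example, and it does nothing if the package is missing.

```r
# Hypothetical wrapper: fit, print and plot the Bayesian correlation test.
# Returns NULL (invisibly) when BayesianFirstAid is not installed.
run_bayes_cor <- function(x, y) {
  if (!requireNamespace("BayesianFirstAid", quietly = TRUE)) {
    message("BayesianFirstAid is not installed; skipping the Bayesian fit.")
    return(invisible(NULL))
  }
  library(BayesianFirstAid)     # also attaches the JAGS interface it depends on
  fit <- bayes.cor.test(x, y)   # same x, y calling convention as cor.test()
  print(fit)                    # estimate, credible interval, tail probabilities
  plot(fit)                     # posterior of the correlation and data summary
  invisible(fit)
}

# Usage, assuming the full data set is loaded as cereal_data:
# fit <- run_bayes_cor(cereal_data$Sugar, cereal_data$Calories)
```

Because the estimate comes from MCMC sampling, repeated runs can differ slightly in the last reported digit.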

6. d) Comparison of Frequentist and Bayesian Results

We will now compare the p-value from the frequentist test with the Bayesian probability that the true correlation is negative, and compare the frequentist 95% confidence interval with the Bayesian 95% credible interval.

Bayesian credible interval

Frequentist interpretation: Compare p-value and Bayesian probability

## P-value from frequentist approach: 0.001295907

## Bayesian probability that true correlation is negative: 0.001

Compare confidence interval vs credible interval

## 95% Confidence Interval (Frequentist): 0.2249563 0.721728

## 95% Credible Interval (Bayesian): 0.20 0.72
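The comparison can be laid out side by side using only the figures reported in Sections 4 and 5. One assumption in this sketch: the two-sided p-value is halved to obtain a one-sided value, since that is the frequentist quantity closest in form to the posterior probability of a negative correlation. The two numbers still answer different questions. The p-value is the probability of data at least this extreme if the true correlation were zero, whereas the Bayesian figure is the probability, given the data and the model, that the true correlation is below zero.

```r
# Figures copied from the frequentist and Bayesian outputs above.
p_two_sided   <- 0.001295907
p_one_sided   <- p_two_sided / 2        # one-sided, since the observed r is positive
post_negative <- 0.001                  # P(correlation < 0) from the Bayesian fit

conf_int <- c(0.2249563, 0.7217280)     # 95% confidence interval
cred_int <- c(0.20, 0.72)               # 95% credible interval

comparison <- data.frame(
  method        = c("Frequentist", "Bayesian"),
  lower         = c(conf_int[1], cred_int[1]),
  upper         = c(conf_int[2], cred_int[2]),
  width         = c(diff(conf_int), diff(cred_int)),
  tail_quantity = c(p_one_sided, post_negative)
)
print(comparison)
```

Both intervals exclude zero and have similar widths, and both tail quantities are on the order of 0.001, so the two approaches point to the same conclusion of a positive correlation.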

Bayesian Analysis of Correlation Between Calories and Sugar in Breakfast Cereals

Guy Buhendwa

2024-09-29