This section investigates the correlation between calories and sugar content in breakfast cereals. The Pearson correlation coefficient is calculated to quantify the strength and direction of the linear relationship between the two variables.
## [1] 0.5154008
The correlation coefficient between calories and sugar content in cereals is 0.5154008, a moderate positive linear association: cereals with more sugar tend to have more calories.
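As a minimal sketch of what the coefficient measures, the snippet below computes a Pearson correlation directly from its definition and compares it with R's built-in cor(). The two short vectors are hypothetical toy values chosen only for illustration; they are not the cereal data used in this report.

```r
# Hypothetical toy values for illustration only, NOT the Cereal data set.
sugar    <- c(1, 3, 6, 9, 12)
calories <- c(90, 100, 95, 120, 130)

# Deviations of each variable from its own mean.
dx <- sugar - mean(sugar)
dy <- calories - mean(calories)

# Pearson r: sum of cross-products scaled by the two sums of squares.
r_manual  <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
r_builtin <- cor(calories, sugar)

print(c(manual = r_manual, builtin = r_builtin))
```

The two values agree up to floating-point error, and the order of the arguments to cor() does not matter because the coefficient is symmetric.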
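We test the null hypothesis that there is no linear relationship between calories and sugar in breakfast cereals, that is, that the slope of Calories on Sugar is zero. The results of fitting a linear model are shown below; the slope estimate of 2.4808 has t = 3.507 and P = 0.0013, so the null hypothesis is rejected at the 5% level.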
##
## Call:
## lm(formula = Calories ~ Sugar, data = Cereal)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -37.428  -9.832   0.245   8.909  40.322
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  87.4277     5.1627  16.935   <2e-16 ***
## Sugar         2.4808     0.7074   3.507   0.0013 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 19.27 on 34 degrees of freedom
## Multiple R-squared: 0.2656, Adjusted R-squared: 0.244
## F-statistic: 12.3 on 1 and 34 DF, p-value: 0.001296
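The quantities in this summary are tied together in a simple regression, and a short check makes the links explicit. The sketch below uses only numbers copied from the output above and from the correlation reported earlier; the variable names are illustrative.

```r
# Values copied from the lm() summary and the reported correlation.
estimate  <- 2.4808     # slope on Sugar
std_error <- 0.7074     # standard error of the slope
r         <- 0.5154008  # sample correlation

t_slope   <- estimate / std_error  # t value for the slope
f_stat    <- t_slope^2             # with one predictor, F equals t squared
r_squared <- r^2                   # with one predictor, R-squared equals r squared

print(round(c(t = t_slope, F = f_stat, R2 = r_squared), 4))
```

Up to rounding in the printed inputs, these reproduce the t value of 3.507, the F-statistic of 12.3, and the Multiple R-squared of 0.2656.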
We perform a hypothesis test to determine if there is a significant correlation between calories and sugar content in breakfast cereals. The 95% confidence interval is also calculated.
##
## Pearson's product-moment correlation
##
## data: Cereal$Calories and Cereal$Sugar
## t = 3.5069, df = 34, p-value = 0.001296
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2249563 0.7217280
## sample estimates:
## cor
## 0.5154008
The P-value of the test is 0.0012959, so the null hypothesis of zero correlation is rejected at the 5% level. The 95% confidence interval for the correlation is from 0.2249563 to 0.721728, which excludes zero.
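The test statistic, P-value, and confidence interval can be recovered from the sample correlation and sample size alone. The sketch below assumes the standard approach for a Pearson test: a t statistic on n - 2 degrees of freedom and a confidence interval built on the Fisher z (atanh) scale. The inputs are copied from the output above, with n = 36 taken from df + 2.

```r
# Summary values copied from the cor.test() output above.
r <- 0.5154008  # sample correlation
n <- 36         # sample size (df = 34, so n = df + 2)

# t statistic and two-sided P-value for H0: true correlation = 0.
t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# 95% confidence interval via the Fisher z transformation.
z    <- atanh(r)
se_z <- 1 / sqrt(n - 3)
ci   <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

print(round(c(t = t_stat, p = p_value, lower = ci[1], upper = ci[2]), 4))
```

The interval is asymmetric around r because it is symmetric on the z scale and then transformed back to the correlation scale.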
In this part, we use Bayesian methods to analyze the correlation between calories and sugar content. We compare the P-value from part (b) with the Bayesian probability that the true correlation is negative.
## Loading required package: rjags
## Loading required package: coda
## Linked to JAGS 4.3.2
## Loaded modules: basemod,bugs
## Data
## Cereal$Sugar and Cereal$Calories, n = 36
##
## Model parameters
## rho: the correlation between Cereal$Sugar and Cereal$Calories
## mu[1]: the mean of Cereal$Sugar
## sigma[1]: the scale of Cereal$Sugar , a consistent
## estimate of SD when nu is large.
## mu[2]: the mean of Cereal$Calories
## sigma[2]: the scale of Cereal$Calories
## nu: the degrees-of-freedom for the bivariate t distribution
## xy_pred[1]: the posterior predictive distribution of Cereal$Sugar
## xy_pred[2]: the posterior predictive distribution of Cereal$Calories
##
## Measures
##               mean     sd  HDIlo   HDIup %<comp %>comp
## rho          0.473  0.133  0.203   0.714  0.002  0.998
## mu[1]        5.655  0.812  4.046   7.237  0.000  1.000
## mu[2]      101.331  3.798 93.878 108.744  0.000  1.000
## sigma[1]     4.702  0.610  3.566   5.928  0.000  1.000
## sigma[2]    22.083  3.095 16.278  28.297  0.000  1.000
## nu          40.228 30.711  2.607 102.393  0.000  1.000
## xy_pred[1]   5.674  5.042 -4.471  15.399  0.120  0.880
## xy_pred[2] 101.401 23.903 54.275 148.753  0.000  1.000
##
## 'HDIlo' and 'HDIup' are the limits of a 95% HDI credible interval.
## '%<comp' and '%>comp' are the probabilities of the respective parameter being
## smaller or larger than 0.
##
## Quantiles
##             q2.5%   q25%  median    q75%  q97.5%
## rho         0.183  0.389   0.485   0.568   0.702
## mu[1]       4.071  5.122   5.652   6.195   7.271
## mu[2]      93.960 98.824 101.305 103.818 108.844
## sigma[1]    3.658  4.272   4.647   5.089   6.046
## sigma[2]   16.657 19.964  21.854  23.956  28.762
## nu          6.119 18.226  31.893  53.078 124.380
## xy_pred[1] -4.315  2.438   5.725   8.907  15.602
## xy_pred[2] 54.422 86.109 101.283 116.416 149.061
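The P-value from part (b), 0.0013, is the probability of observing a sample correlation at least as extreme as 0.515 if the true correlation were zero; it is not the probability that there is no correlation. The Bayesian analysis instead gives a direct posterior probability of about 0.002 (the '%<comp' entry for rho) that the true correlation is negative, and about 0.998 that it is positive. Both results point clearly toward a positive correlation.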
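Because the Bayesian probability concerns one direction only (a negative correlation), a closer frequentist counterpart is the one-sided P-value rather than the two-sided one. The sketch below derives it from the t statistic and degrees of freedom reported in part (b) and sets it next to the posterior probability read from the '%<comp' column for rho. The two numbers answer different questions, so this is a side-by-side view, not an equivalence.

```r
# Values copied from the frequentist and Bayesian outputs above.
t_stat        <- 3.5069  # t statistic from the Pearson test
df            <- 34      # degrees of freedom
posterior_neg <- 0.002   # posterior P(rho < 0), '%<comp' entry for rho

# Two-sided P-value, as reported in part (b).
p_two_sided <- 2 * pt(-abs(t_stat), df = df)

# One-sided P-value: P(T >= t) under H0, for the alternative rho > 0.
p_one_sided <- pt(t_stat, df = df, lower.tail = FALSE)

print(signif(c(two_sided = p_two_sided,
               one_sided = p_one_sided,
               posterior_negative = posterior_neg), 3))
```

The one-sided P-value is half the two-sided value, and it is smaller than the posterior probability of a negative correlation, although both are well below 0.01.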
This section compares the results obtained from the frequentist and Bayesian approaches, focusing on how the 95% confidence interval from part (b) compares to the Bayesian 95% credible interval.
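The 95% confidence interval from part (b) ranges from 0.2249563 to 0.721728, while the Bayesian 95% credible interval (the HDI for rho) ranges from 0.203 to 0.714. The two intervals are similar; the credible interval is slightly wider and shifted slightly toward zero, consistent with the posterior mean of 0.473 lying a little below the sample correlation of 0.515. Their interpretations differ: the credible interval contains the true correlation with 95% posterior probability given the model and priors, whereas the confidence interval comes from a procedure that covers the true correlation in 95% of repeated samples.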
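The comparison of the two intervals can be made concrete with a few lines of arithmetic. The sketch below uses the confidence limits from part (b) and the HDIlo and HDIup values for rho from the Bayesian output; the variable names are illustrative.

```r
# Interval endpoints copied from the outputs above.
ci  <- c(lower = 0.2249563, upper = 0.7217280)  # frequentist 95% CI
hdi <- c(lower = 0.203,     upper = 0.714)      # Bayesian 95% HDI for rho

ci_width  <- unname(ci["upper"] - ci["lower"])
hdi_width <- unname(hdi["upper"] - hdi["lower"])

# Region covered by both intervals.
overlap <- c(lower = max(ci["lower"], hdi["lower"]),
             upper = min(ci["upper"], hdi["upper"]))

print(round(c(ci_width = ci_width, hdi_width = hdi_width), 4))
print(overlap)
```

Both intervals exclude zero, and most of each interval lies inside the other.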
In conclusion, both the frequentist and Bayesian methods indicate a moderate positive correlation between calories and sugar in breakfast cereals. The frequentist method provides a P-value and confidence interval, while the Bayesian method provides a posterior distribution from which probabilities about the true correlation can be read directly.