HW5_RAITSES.knit

#Question 6
#   The `PlantGrowth` data set contains three different groups, 
#   with each representing various plant food diets (you may need 
#   to type `data(PlantGrowth)` to activate it).
#   The group labeled “`ctrl`” is the control group, while 
#   “`trt1`” and “`trt2`” are different types of experimental treatment. 
#
#   As a reminder, this subsetting statement accesses the weight
#   data for the control group:
# 
#    `PlantGrowth$weight[PlantGrowth$group=="ctrl"]`  
#   and this subsetting statement accesses the weight data for treatment group 1: `PlantGrowth$weight[PlantGrowth$group=="trt1"]`
#
#    Run a *t*-test to compare the means of the control group (“`ctrl`”) and treatment group 1 (“`trt1`”) in the `PlantGrowth` data. Report the observed value of *t*, the degrees of
#   freedom, and the *p*-value associated with the observed value. 
#
#   Assuming an alpha threshold of .05, decide whether you should reject the null hypothesis or fail to reject the null hypothesis. In addition, report the upper and lower bound of the confidence interval.

# Load the PlantGrowth dataset
data(PlantGrowth)

# Subset the weight data for the control group
ctrl_weight <- PlantGrowth$weight[PlantGrowth$group == "ctrl"]

# Subset the weight data for treatment group 1
trt1_weight <- PlantGrowth$weight[PlantGrowth$group == "trt1"]

# Perform the t-test
t_test_result <- t.test(ctrl_weight, trt1_weight)

# Extract the observed value of t
t_value <- t_test_result$statistic

# Extract the degrees of freedom
df <- t_test_result$parameter

# Extract the p-value
p_value <- t_test_result$p.value

# Report the observed value of t, degrees of freedom, and p-value
{
  cat("T-test results:\n")
  cat("Observed t-value:", t_value, "\n")
  cat("Degrees of Freedom:", df, "\n")
  cat("p-value:", p_value, "\n")
}

## T-test results:
## Observed t-value: 1.19126 
## Degrees of Freedom: 16.52359 
## p-value: 0.2503825
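
# The t-value and the fractional degrees of freedom reported above can be
# reproduced by hand. This is a minimal sketch, assuming t.test() ran in its
# default Welch (unequal-variance) form; the names ctrl, trt1, se_diff,
# t_manual, and df_manual are illustrative and not part of the assignment.

```r
data(PlantGrowth)
ctrl <- PlantGrowth$weight[PlantGrowth$group == "ctrl"]
trt1 <- PlantGrowth$weight[PlantGrowth$group == "trt1"]

# Variance of each group mean (s^2 / n)
v_ctrl <- var(ctrl) / length(ctrl)
v_trt1 <- var(trt1) / length(trt1)

# Standard error of the difference in means, and the Welch t statistic
se_diff  <- sqrt(v_ctrl + v_trt1)
t_manual <- (mean(ctrl) - mean(trt1)) / se_diff

# Welch-Satterthwaite approximation for the degrees of freedom
df_manual <- (v_ctrl + v_trt1)^2 /
  (v_ctrl^2 / (length(ctrl) - 1) + v_trt1^2 / (length(trt1) - 1))

cat("Manual t:", t_manual, "\n")
cat("Manual df:", df_manual, "\n")
```

# The fractional df is a consequence of the Welch correction, which is why it
# is not simply n1 + n2 - 2.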

# Determine whether to reject the null hypothesis
alpha <- 0.05
if (p_value < alpha) {
  cat("Reject the null hypothesis\n")
} else {
  cat("Fail to reject the null hypothesis\n")
}

## Fail to reject the null hypothesis

# Report the confidence interval
lower_bound <- t_test_result$conf.int[1]
upper_bound <- t_test_result$conf.int[2]
cat("95% Confidence Interval:", lower_bound, "to", upper_bound, "\n")

## 95% Confidence Interval: -0.2875162 to 1.029516
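
# The interval bounds come from the difference in means plus or minus a
# critical t-value times the standard error. This is a minimal sketch of that
# arithmetic, assuming the default Welch test at the 95% level; ctrl, trt1,
# mean_diff, t_crit, and ci_manual are illustrative names.

```r
data(PlantGrowth)
ctrl <- PlantGrowth$weight[PlantGrowth$group == "ctrl"]
trt1 <- PlantGrowth$weight[PlantGrowth$group == "trt1"]

# Take the Welch degrees of freedom from t.test()
welch    <- t.test(ctrl, trt1)
welch_df <- unname(welch$parameter)

# Difference in sample means and its standard error
mean_diff <- mean(ctrl) - mean(trt1)
se_diff   <- sqrt(var(ctrl) / length(ctrl) + var(trt1) / length(trt1))

# Two-sided 95% critical value, then the interval itself
t_crit    <- qt(0.975, df = welch_df)
ci_manual <- mean_diff + c(-1, 1) * t_crit * se_diff

cat("Manual 95% CI:", ci_manual[1], "to", ci_manual[2], "\n")
```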

#Question 7

# Install the HDInterval package if not already installed
if(!require(HDInterval)) install.packages("HDInterval")

## Loading required package: HDInterval

# Load the HDInterval package
library(HDInterval)

# Install the coda package if not already installed
if(!require(coda)) install.packages("coda")

## Loading required package: coda

# Load the coda package
library(coda)

# Install the rjags package if not already installed
if(!require(rjags)) install.packages("https://cran.r-project.org/src/contrib/Archive/rjags/rjags_4-14.tar.gz", repos = NULL, type = "source")

## Loading required package: rjags

## Linked to JAGS 4.3.2

## Loaded modules: basemod,bugs

# Load the rjags package
library(rjags)

# Install the BEST package if not already installed
if(!require(BEST)) install.packages("https://cran.r-project.org/src/contrib/Archive/BEST/BEST_0.5.4.tar.gz", repos = NULL, type = "source")

## Loading required package: BEST

# Load the BEST package
library(BEST)

# Install the BaylorEdPsych package if not already installed
if(!require(BaylorEdPsych)) install.packages("https://cran.r-project.org/src/contrib/Archive/BaylorEdPsych/BaylorEdPsych_0.5.tar.gz", repos = NULL, type = "source")

## Loading required package: BaylorEdPsych

# Load the BaylorEdPsych package
library(BaylorEdPsych)

# Perform Bayesian estimation using BESTmcmc()
result <- BESTmcmc(ctrl_weight, trt1_weight)

## Waiting for parallel processing to complete...

## done.

# Plot the result

plot(result)

# Document the boundary values for the HDI

"Boundary Values for the HDI == -.372 and 1.13"

## [1] "Boundary Values for the HDI == -.372 and 1.13"

# Write a brief definition of the meaning of the HDI and interpret the results from this comparison.

"The HDI (Highest Density Interval) is the narrowest range of parameter values that contains a stated share
of the posterior distribution (95% here); every value inside it is more credible than any value outside it.
Given the data and the model, there is a 95% probability that the true difference in means lies between -0.372 and 1.13.
About 85% of the posterior distribution lies above zero, so there is roughly an 85% probability that the mean weight
of the control group is greater than the mean weight of treatment group 1. However, because the HDI includes zero,
a difference of zero remains credible, and the comparison does not establish a difference between the two groups."

## [1] "The HDI (Highest Density Interval) is the narrowest range of parameter values that contains a stated share\nof the posterior distribution (95% here); every value inside it is more credible than any value outside it.\nGiven the data and the model, there is a 95% probability that the true difference in means lies between -0.372 and 1.13.\nAbout 85% of the posterior distribution lies above zero, so there is roughly an 85% probability that the mean weight\nof the control group is greater than the mean weight of treatment group 1. However, because the HDI includes zero,\na difference of zero remains credible, and the comparison does not establish a difference between the two groups."
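
# An HDI can be computed from posterior draws as the shortest interval that
# covers the chosen share of them. This is a minimal sketch of that idea, not
# the HDInterval implementation: hdi_from_draws() is a hypothetical helper, and
# the draws below are simulated stand-ins rather than BESTmcmc() output.

```r
# Shortest interval containing cred_mass of the draws
hdi_from_draws <- function(draws, cred_mass = 0.95) {
  sorted <- sort(draws)
  n      <- length(sorted)
  window <- ceiling(cred_mass * n)      # number of draws each candidate interval spans

  # Every candidate interval runs from sorted[i] to sorted[i + window - 1]
  lower <- sorted[seq_len(n - window + 1)]
  upper <- sorted[window:n]

  best <- which.min(upper - lower)      # the narrowest candidate is the HDI
  c(lower = lower[best], upper = upper[best])
}

set.seed(42)
# Simulated stand-in draws only; NOT the posterior from this assignment
fake_draws <- rnorm(10000, mean = 0.5, sd = 0.5)

hdi_95          <- hdi_from_draws(fake_draws)
prop_above_zero <- mean(fake_draws > 0)  # share of draws above zero

cat("HDI:", hdi_95[["lower"]], "to", hdi_95[["upper"]], "\n")
cat("Share of draws above zero:", prop_above_zero, "\n")
```

# The share of draws above zero is the kind of quantity behind a statement such
# as "about 85% of the posterior lies above zero".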

#Question 8
#   Compare and contrast the results of Exercise 6 and Exercise 7. 
#   Using the three types of evidence: 
#   Results of the null hypothesis test 
#   Confidence interval 
#   HDI from the BESTmcmc() procedure

"The 95% confidence interval for the difference in means (-0.29 to 1.03) lies mostly above zero,
which hints at higher plant weight in the control group, but it includes zero,
so a difference of zero is consistent with the data.

The highest density interval (-0.372 to 1.13) is centered on a positive difference between the
control group and treatment group 1, with a mean of 0.38 and about an 85% probability
that the control group has a slightly higher yield than treatment group 1.
However, since the HDI bounds contain 0, a true population mean difference of zero remains credible.
This suggests that the treatment may not have a meaningful effect on yield compared to the control group.

The null hypothesis test agrees: with p = 0.25, above the alpha of .05, we failed to reject the null hypothesis.
Failing to reject does not prove that the means are equal, but it provides no evidence
that treatment 1 changes yield relative to the control group.

Overall, the combined evidence from the null hypothesis test, the confidence interval,
and the HDI suggests that treatment 1 does not offer a demonstrable improvement in yield
compared to the control group in the PlantGrowth dataset."

## [1] "The 95% confidence interval for the difference in means (-0.29 to 1.03) lies mostly above zero,\nwhich hints at higher plant weight in the control group, but it includes zero,\nso a difference of zero is consistent with the data.\n\nThe highest density interval (-0.372 to 1.13) is centered on a positive difference between the\ncontrol group and treatment group 1, with a mean of 0.38 and about an 85% probability\nthat the control group has a slightly higher yield than treatment group 1.\nHowever, since the HDI bounds contain 0, a true population mean difference of zero remains credible.\nThis suggests that the treatment may not have a meaningful effect on yield compared to the control group.\n\nThe null hypothesis test agrees: with p = 0.25, above the alpha of .05, we failed to reject the null hypothesis.\nFailing to reject does not prove that the means are equal, but it provides no evidence\nthat treatment 1 changes yield relative to the control group.\n\nOverall, the combined evidence from the null hypothesis test, the confidence interval,\nand the HDI suggests that treatment 1 does not offer a demonstrable improvement in yield\ncompared to the control group in the PlantGrowth dataset."

#Question 9
#   Using the same `PlantGrowth` data set, 
#   compare the “`ctrl`” group to the “`trt2`” group. 
#   Use all of the methods described earlier 
#   (*t*-test, confidence interval, and Bayesian method) 
#   and explain all of the results.

# Subset the weight data for treatment group 2
trt2_weight <- PlantGrowth$weight[PlantGrowth$group == "trt2"]

# Perform the t-test
t_test_result2 <- t.test(ctrl_weight, trt2_weight)
t_test_result2

## 
##  Welch Two Sample t-test
## 
## data:  ctrl_weight and trt2_weight
## t = -2.134, df = 16.786, p-value = 0.0479
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.98287213 -0.00512787
## sample estimates:
## mean of x mean of y 
##     5.032     5.526

# Extract the observed value of t
t_value2 <- t_test_result2$statistic

# Extract the degrees of freedom
df2 <- t_test_result2$parameter

# Extract the p-value
p_value2 <- t_test_result2$p.value

# Report the observed value of t, degrees of freedom, and p-value
{
  cat("T-test results:\n")
  cat("Observed t-value:", t_value2, "\n")
  cat("Degrees of Freedom:", df2, "\n")
  cat("p-value:", p_value2, "\n")
}

## T-test results:
## Observed t-value: -2.13402 
## Degrees of Freedom: 16.78576 
## p-value: 0.04789926

# Determine whether to reject the null hypothesis
alpha <- 0.05
if (p_value2 < alpha) {
  cat("Reject the null hypothesis\n")
} else {
  cat("Fail to reject the null hypothesis\n")
}

## Reject the null hypothesis

# Extract the confidence interval from the t-test already run above
ci2 <- t_test_result2$conf.int

# Report the confidence interval
cat("\nConfidence Interval (95%):\n", ci2[1], "to", ci2[2], "\n\n")

## 
## Confidence Interval (95%):
##  -0.9828721 to -0.00512787
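
# The same t-test steps are applied to both treatment comparisons, so they can
# be wrapped in one function. This is a minimal sketch; report_welch() and its
# field names are hypothetical helpers, not part of the assignment or of any
# package.

```r
# Run a Welch t-test and collect the values the assignment asks for
report_welch <- function(x, y, alpha = 0.05) {
  res <- t.test(x, y, conf.level = 1 - alpha)
  list(
    t_value     = unname(res$statistic),
    df          = unname(res$parameter),
    p_value     = res$p.value,
    conf_int    = as.numeric(res$conf.int),
    reject_null = res$p.value < alpha
  )
}

data(PlantGrowth)
# Split the weights into a named list: ctrl, trt1, trt2
weights <- split(PlantGrowth$weight, PlantGrowth$group)

vs_trt1 <- report_welch(weights$ctrl, weights$trt1)
vs_trt2 <- report_welch(weights$ctrl, weights$trt2)

cat("ctrl vs trt1: reject null?", vs_trt1$reject_null, "\n")
cat("ctrl vs trt2: reject null?", vs_trt2$reject_null, "\n")
```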

# Perform Bayesian estimation using BESTmcmc()
result2 <- BESTmcmc(ctrl_weight, trt2_weight)

## Waiting for parallel processing to complete...done.

# Plot the result
plot(result2)

# Document the boundary values for the HDI

cat("\nBoundary Values for the HDI == -1.06, 0.0609\n")

## 
## Boundary Values for the HDI == -1.06, 0.0609

# Analysis of results

"The observed t-value (-2.134) on 16.79 degrees of freedom gives a p-value of 0.048. Because the p-value is less than the alpha (assumed to be 0.05), we reject the null hypothesis: the difference between the control group and treatment group 2 is statistically significant, though only narrowly. The degrees of freedom come from the Welch correction for unequal variances; a value of 16.79, close to the maximum of 18 for two groups of 10, indicates that the two group variances are similar.

The 95% CI ranges from -0.98 to -0.005 and gives a plausible range of values for the true difference in means between the control group and treatment group 2. Since this interval does not include zero, it agrees with the t-test, but its upper bound sits very close to zero, so the true difference could be very small.

The HDI (-1.06 to 0.061) is slightly wider than the CI and narrowly includes zero, so by the HDI criterion a difference of zero is still just credible. About 96% of the posterior distribution lies below zero, meaning there is roughly a 96% probability that the mean of treatment group 2 is greater than the mean of the control group.

Overall, the t-test and the confidence interval support the conclusion that treatment group 2 produces a slightly greater yield than the control group, and the Bayesian method makes that conclusion highly probable without ruling out a zero difference. The evidence points in one direction but is borderline."

## [1] "The observed t-value (-2.134) on 16.79 degrees of freedom gives a p-value of 0.048. Because the p-value is less than the alpha (assumed to be 0.05), we reject the null hypothesis: the difference between the control group and treatment group 2 is statistically significant, though only narrowly. The degrees of freedom come from the Welch correction for unequal variances; a value of 16.79, close to the maximum of 18 for two groups of 10, indicates that the two group variances are similar.\n\nThe 95% CI ranges from -0.98 to -0.005 and gives a plausible range of values for the true difference in means between the control group and treatment group 2. Since this interval does not include zero, it agrees with the t-test, but its upper bound sits very close to zero, so the true difference could be very small.\n\nThe HDI (-1.06 to 0.061) is slightly wider than the CI and narrowly includes zero, so by the HDI criterion a difference of zero is still just credible. About 96% of the posterior distribution lies below zero, meaning there is roughly a 96% probability that the mean of treatment group 2 is greater than the mean of the control group.\n\nOverall, the t-test and the confidence interval support the conclusion that treatment group 2 produces a slightly greater yield than the control group, and the Bayesian method makes that conclusion highly probable without ruling out a zero difference. The evidence points in one direction but is borderline."