install.packages("readxl")
install.packages("ggpubr")
library(readxl)
library(ggpubr)
## Loading required package: ggplot2
Loading Dataset A2:
DatasetA2 <- read_excel("D:/DatasetA2.xlsx")
Dataset A2 contains each student's ID and their favorite drink.
Creating a Frequency Table:
table(DatasetA2$FavoriteDrink)
##
## Coffee Soda Tea Water
## 26 29 28 17
Observed frequencies: Coffee = 26, Soda = 29, Tea = 28, Water = 17
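The counts above can also be expressed as proportions, which makes them easier to compare against a hypothesized distribution. The following is a minimal sketch: `drinks` is a hypothetical stand-in for `DatasetA2$FavoriteDrink`, rebuilt from the counts reported above because the spreadsheet itself is not available here.

```r
# Hypothetical stand-in for DatasetA2$FavoriteDrink, rebuilt from the reported counts
drinks <- rep(c("Coffee", "Soda", "Tea", "Water"), times = c(26, 29, 28, 17))

freq  <- table(drinks)       # observed counts per drink (sorted alphabetically)
props <- prop.table(freq)    # observed proportions (count / N)

freq
round(props, 2)
```

With the real data, `table(DatasetA2$FavoriteDrink)` would take the place of `table(drinks)`.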
Plotting a Bar Chart:
ggplot(DatasetA2, aes(x = FavoriteDrink, fill = FavoriteDrink)) +
  geom_bar() +
  labs(
    x = "Favorite Drink",
    y = "Frequency",
    title = "Preference of Favorite Drink among Students"
  ) +
  theme(
    text = element_text(size = 14),
    axis.title = element_text(size = 14),
    axis.text = element_text(size = 14),
    plot.title = element_text(size = 14),
    legend.position = "none"
  )
Chi-square Goodness-of-Fit Test:
Under the null hypothesis of equal preference, each of the four drinks should account for 25% of responses, so the expected proportions are 0.25, 0.25, 0.25, 0.25. With N = 100 students, the expected count in each category is 100 × 0.25 = 25.
Conducting the Chi-square Goodness-of-Fit test:
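# Observed counts, in the order Soda, Tea, Coffee, Water
# (the order does not affect the result when all expected proportions are equal)
observed <- c(29, 28, 26, 17)
# Expected proportions under the null hypothesis (proportions, not counts)
expected <- c(0.25, 0.25, 0.25, 0.25)
chisq.test(x = observed, p = expected)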
##
## Chi-squared test for given probabilities
##
## data: observed
## X-squared = 3.6, df = 3, p-value = 0.308
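The statistic reported above can be reproduced by hand from its definition, the sum of (observed − expected)² / expected over the four categories. The sketch below uses illustrative variable names (`observed_counts`, `null_props`, and so on) that are not part of the original analysis.

```r
# Observed counts from the frequency table
observed_counts <- c(Coffee = 26, Soda = 29, Tea = 28, Water = 17)

null_props      <- rep(0.25, 4)          # equal preference under the null hypothesis
n               <- sum(observed_counts)  # total number of students
expected_counts <- n * null_props        # expected count per category

# Pearson chi-square statistic: sum of (O - E)^2 / E
chi_sq <- sum((observed_counts - expected_counts)^2 / expected_counts)

# Degrees of freedom: number of categories minus one
df <- length(observed_counts) - 1

# Upper-tail probability of the chi-square distribution
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

chi_sq
df
round(p_value, 3)
```

The squared deviations are 1, 16, 9, and 64, which sum to 90; dividing by the expected count of 25 gives 3.6, matching the `chisq.test()` output.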
Reporting Results for Dataset A2: A chi-square goodness-of-fit test indicated that the observed frequencies did not differ significantly from the expected frequencies, χ²(3, N = 100) = 3.60, p = .308.
Because the p-value (.308) is greater than .05, we fail to reject the null hypothesis of equal preference. There is no statistically significant evidence that students prefer some drinks over others.
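The values used in the report and the decision rule can be pulled directly from the object that `chisq.test()` returns instead of being copied from the console. This is a sketch under stated assumptions: the names `counts`, `res`, `alpha`, and `decision` are illustrative, and the .05 threshold is the one used above.

```r
# Observed counts from the frequency table
counts <- c(Coffee = 26, Soda = 29, Tea = 28, Water = 17)
res <- chisq.test(x = counts, p = rep(0.25, 4))

res$expected               # expected counts under the null hypothesis
round(res$residuals, 2)    # Pearson residuals: (observed - expected) / sqrt(expected)

# Decision rule at the .05 significance level
alpha <- 0.05
decision <- if (res$p.value < alpha) "reject H0" else "fail to reject H0"

# Assemble a one-line summary of the test
sprintf("X-squared(%d, N = %d) = %.2f, p = %.3f: %s",
        res$parameter, sum(counts), res$statistic, res$p.value, decision)
```

The residuals show which categories deviate most from the expected count of 25; here Water has the largest deviation, although the overall test is not significant.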