Assignment-5

install.packages("readxl")
install.packages("ggpubr")

library(readxl)
library(ggpubr)

## Loading required package: ggplot2

Loading Dataset A2

DatasetA2 <- read_excel("D:/DatasetA2.xlsx")

Dataset A2 contains each student's ID and their favorite drink.

Creating a Frequency Table:

table(DatasetA2$FavoriteDrink)

## 
## Coffee   Soda    Tea  Water 
##     26     29     28     17

Observed frequencies: Coffee = 26, Soda = 29, Tea = 28, Water = 17
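
The counts above can also be expressed as relative frequencies, which makes them easier to compare with a hypothesized share per drink. The following is a minimal sketch, assuming the counts are typed in by hand from the frequency table rather than read from the Excel file; the names `counts` and `proportions` are illustrative and not part of the original script.

```r
# Counts copied by hand from the frequency table above (assumption: not re-read from the file)
counts <- c(Coffee = 26, Soda = 29, Tea = 28, Water = 17)

# Relative frequencies: each count divided by the total number of students
proportions <- prop.table(counts)

# Percentages rounded to one decimal place
round(100 * proportions, 1)
```

With the data frame loaded, the same idea would be `prop.table(table(DatasetA2$FavoriteDrink))`.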

Plotting a Bar Chart:

ggplot(DatasetA2, aes(x = FavoriteDrink, fill = FavoriteDrink)) +
  geom_bar() +
  labs(
    x = "Favorite Drink",
    y = "Frequency",
    title = "Preference of Favorite Drink among Students"
  ) +
  theme(
    text = element_text(size = 14),       
    axis.title = element_text(size = 14),  
    axis.text = element_text(size = 14),  
    plot.title = element_text(size = 14),  
    legend.position = "none"              
  )

Chi-square Goodness-of-Fit test: Equal preference is assumed, so each beverage should account for 25% of responses. Expected proportions for the four categories: 0.25, 0.25, 0.25, 0.25. With N = 100 students, each category is expected to contain 25 students.

Conducting the Chi-square Goodness-of-Fit test:

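# Observed counts in table order: Coffee, Soda, Tea, Water
observed <- c(26, 29, 28, 17)
# Expected proportions under equal preference (proportions, not counts)
expected <- c(0.25, 0.25, 0.25, 0.25)
chisq.test(x = observed, p = expected)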

## 
##  Chi-squared test for given probabilities
## 
## data:  observed
## X-squared = 3.6, df = 3, p-value = 0.308
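
The statistic reported by `chisq.test()` can be checked by hand, which makes the arithmetic behind the test explicit. The sketch below is one way to do so, assuming the counts from the frequency table and an equal 25% share per drink; the names `expected_counts`, `chi_sq`, `df`, and `p_value` are illustrative and do not appear in the original script.

```r
# Observed counts as reported in the frequency table
observed <- c(Coffee = 26, Soda = 29, Tea = 28, Water = 17)

# Expected counts under equal preference: N * 0.25 for each drink
n <- sum(observed)
expected_counts <- n * rep(0.25, length(observed))

# Chi-square statistic: sum over categories of (O - E)^2 / E
chi_sq <- sum((observed - expected_counts)^2 / expected_counts)

# Degrees of freedom: number of categories minus one
df <- length(observed) - 1

# Upper-tail p-value from the chi-square distribution
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

chi_sq
df
round(p_value, 3)
```

The squared deviations are 1, 16, 9, and 64, which sum to 90; dividing by the expected count of 25 gives 3.6, matching the test output.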


Reporting Results for Dataset A2: A chi-square goodness-of-fit test indicated that the observed frequencies did not differ significantly from the expected frequencies, χ²(3, N = 100) = 3.60, p = .308.

Because the p-value (.308) is greater than .05, we fail to reject the null hypothesis of equal preference. The data provide no statistically significant evidence that students prefer any one beverage over the others.

Assignment-5_A2

Hema Vamsinath Reddy

2026-02-16