Research Scenario 1

What are the null and alternate hypotheses for your research?
H0: There is no relationship between the number of minutes the customer stayed and the number of drinks they had.
H1: There is a relationship between the number of minutes the customer stayed and the number of drinks they had.

LOAD THE REQUIRED PACKAGE (RUN install.packages("readxl") FIRST IF IT IS NOT INSTALLED)

library(readxl)

IMPORT THE EXCEL FILE INTO R STUDIO

dataset <- read_excel("C:/Users/burug/Downloads/A5RQ1.xlsx")

LOAD THE PACKAGE

library(psych)

CALCULATE THE DESCRIPTIVE DATA

describe(dataset[, c("Minutes", "Drinks")])

##         vars   n  mean    sd median trimmed   mad min   max range skew kurtosis
## Minutes    1 461 29.89 18.63   24.4   26.99 15.12  10 154.2 144.2 1.79     5.20
## Drinks     2 461  3.00  1.95    3.0    2.75  1.48   0  17.0  17.0 1.78     6.46
##           se
## Minutes 0.87
## Drinks  0.09
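
As a quick sanity check on the se column above, the standard error of the mean is the standard deviation divided by the square root of n. The sketch below recomputes it from the rounded values printed by describe(); the variable names are illustrative and are not part of the original script.

```r
# Summary values copied from the describe() output above
n          <- 461
sd_minutes <- 18.63
sd_drinks  <- 1.95

# Standard error of the mean: sd / sqrt(n)
se_minutes <- sd_minutes / sqrt(n)
se_drinks  <- sd_drinks / sqrt(n)

# Rounded to two decimals, these match the se column (0.87 and 0.09)
round(c(Minutes = se_minutes, Drinks = se_drinks), 2)
```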

CREATE A HISTOGRAM FOR EACH CONTINUOUS VARIABLE

hist(dataset$Minutes,
     main = "Histogram of Minutes",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)

hist(dataset$Drinks,
     main = "Histogram of Drinks",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)

Q1) Check the SKEWNESS of the VARIABLE 1 histogram.
In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
Ans: Positively skewed (skew = 1.79)
Q2) Check the KURTOSIS of the VARIABLE 1 histogram.
In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
Ans: Too tall
Q3) Check the SKEWNESS of the VARIABLE 2 histogram.
In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
Ans: Positively skewed (skew = 1.78)
Q4) Check the KURTOSIS of the VARIABLE 2 histogram.
In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
Ans: Too tall

CONDUCT THE SHAPIRO-WILK TEST

shapiro.test(dataset$Minutes)

## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$Minutes
## W = 0.84706, p-value < 2.2e-16

shapiro.test(dataset$Drinks)

## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$Drinks
## W = 0.85487, p-value < 2.2e-16

Answer the questions below as a comment within the R script:
Was the data normally distributed for Variable 1?
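No. The Shapiro-Wilk test was significant for Minutes (W = 0.847, p < .001), so the data depart from a normal distribution.
Was the data normally distributed for Variable 2?
No. The Shapiro-Wilk test was significant for Drinks (W = 0.855, p < .001), so the data depart from a normal distribution.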
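
The decision rule behind these answers can be sketched as a small helper. This is a minimal illustration, assuming a 0.05 significance level; is_normal() is a hypothetical helper, and the exponential sample is simulated stand-in data, not the A5RQ1 dataset.

```r
# Hypothetical helper (not part of the original script):
# returns TRUE when the Shapiro-Wilk test does NOT reject normality
is_normal <- function(x, alpha = 0.05) {
  result <- shapiro.test(x)
  result$p.value >= alpha
}

set.seed(1)
# Simulated, strongly right-skewed sample standing in for Minutes or Drinks
skewed <- rexp(200)

# A significant result (p < alpha) means the data are treated as non-normal,
# which is what motivates a rank-based method such as Spearman
is_normal(skewed)
```

With the real data, the same call would be is_normal(dataset$Minutes) and is_normal(dataset$Drinks).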

LOAD THE PACKAGE

library(ggplot2)

## 
## Attaching package: 'ggplot2'

## The following objects are masked from 'package:psych':
## 
##     %+%, alpha

library(ggpubr)

CREATE THE SCATTERPLOT

ggscatter(dataset, x = "Minutes", y = "Drinks",
          add = "reg.line",
          conf.int = TRUE,
          cor.coef = TRUE,
          cor.method = "spearman",
          xlab = "Minutes", ylab = "Drinks")

Is the relationship positive (line pointing up), negative (line pointing down), or is there no relationship (line is flat)?
The relationship is positive (line pointing up)

CONDUCT THE SPEARMAN CORRELATION

cor.test(dataset$Minutes, dataset$Drinks, method = "spearman")

## Warning in cor.test.default(dataset$Minutes, dataset$Drinks, method =
## "spearman"): Cannot compute exact p-value with ties

## 
##  Spearman's rank correlation rho
## 
## data:  dataset$Minutes and dataset$Drinks
## S = 1305608, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.9200417
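
The warning above appears because tied values rule out an exact p-value. The sketch below, run on simulated stand-in data (not the A5RQ1 dataset), passes exact = FALSE so that cor.test() goes straight to the approximate p-value. It also checks two relationships: rho equals the Pearson correlation of the ranks, and rho = 1 - 6S / (n^3 - n) for the S that R reports. The last lines apply that formula to the S and n from the output above. All variable names here are illustrative.

```r
set.seed(42)

# Simulated stand-in data with ties (NOT the A5RQ1 data)
minutes <- round(runif(50, min = 10, max = 150), 1)
drinks  <- pmax(0, round(minutes / 15 + rnorm(50, sd = 0.5)))

# exact = FALSE requests the approximate p-value up front,
# so the "Cannot compute exact p-value with ties" warning is not raised
res <- cor.test(minutes, drinks, method = "spearman", exact = FALSE)
rho <- unname(res$estimate)

# Spearman's rho is the Pearson correlation of the ranks
rho_ranks <- cor(rank(minutes), rank(drinks))

# R reports S = (n^3 - n) * (1 - rho) / 6, so rho can be recovered from S
n          <- length(minutes)
rho_from_S <- 1 - 6 * unname(res$statistic) / (n^3 - n)

c(rho = rho, rho_ranks = rho_ranks, rho_from_S = rho_from_S)

# Same formula applied to the values printed above (S = 1305608, n = 461)
rho_reported <- 1 - 6 * 1305608 / (461^3 - 461)
round(rho_reported, 4)
```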

Q1) What is the direction of the effect?
The correlation of 0.92 is positive. As the number of minutes a customer stayed increases, the number of drinks they had also increases.
Q2) What is the size of the effect?
A correlation of 0.92 is a strong relationship.
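
The two answers above can be expressed as a small labelling helper. This is only a sketch: describe_correlation() is a hypothetical function, and the cut-offs (0.10, 0.30, 0.50) are an assumed convention that may differ from the thresholds used in the course materials.

```r
# Hypothetical helper (not part of the original script).
# The cut-offs below are an ASSUMED convention, not taken from the assignment.
describe_correlation <- function(rho) {
  direction <- if (rho > 0) "positive" else if (rho < 0) "negative" else "none"
  size <- abs(rho)
  strength <- if (size >= 0.50) {
    "strong"
  } else if (size >= 0.30) {
    "moderate"
  } else if (size >= 0.10) {
    "weak"
  } else {
    "negligible"
  }
  c(direction = direction, strength = strength)
}

# Spearman's rho from the output above
describe_correlation(0.92)
```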

REPORT
A Spearman correlation was conducted to assess the relationship between the number of minutes customers stayed and the number of drinks they had (n = 461). There was a statistically significant correlation between minutes stayed (M = 29.89, SD = 18.63) and drinks they had (M = 3.00, SD = 1.95), ρ = 0.920, p < .001. The correlation was positive and strong: as the number of minutes a customer stayed increased, the number of drinks they had also increased.

Team 1

2025-11-14