Research Scenario

A café owner wants to know if customers who stay longer tend to buy more drinks.
Data were collected on time spent in the café (in minutes) and the number of drinks purchased.


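Hypotheses

Null hypothesis (H0): There is no relationship between time spent in the café and the number of drinks purchased.
Alternative hypothesis (H1): There is a relationship between time spent in the café and the number of drinks purchased.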

Descriptive Statistics

# Load required packages
library(readxl)
library(psych)

# Load dataset
dataset <- read_excel("C:/Users/rohit/Downloads/A5RQ1.xlsx")

# Display descriptive statistics for both variables
describe(dataset[, c("Minutes", "Drinks")])
##         vars   n  mean    sd median trimmed   mad min   max range skew kurtosis
## Minutes    1 461 29.89 18.63   24.4   26.99 15.12  10 154.2 144.2 1.79     5.20
## Drinks     2 461  3.00  1.95    3.0    2.75  1.48   0  17.0  17.0 1.78     6.46
##           se
## Minutes 0.87
## Drinks  0.09
# Shapiro-Wilk tests
shapiro.test(dataset$Minutes)
## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$Minutes
## W = 0.84706, p-value < 2.2e-16
shapiro.test(dataset$Drinks)
## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$Drinks
## W = 0.85487, p-value < 2.2e-16
# Histograms
hist(dataset$Minutes,
     main = "Histogram of Minutes",
     xlab = "Minutes",
     col = "lightblue",
     border = "black",
     breaks = 20)

hist(dataset$Drinks,
     main = "Histogram of Drinks",
     xlab = "Drinks",
     col = "lightgreen",
     border = "black",
     breaks = 20)

# Q-Q plots
par(mfrow = c(1, 2))  # show both plots side by side
qqnorm(dataset$Minutes, main = "Q-Q Plot of Minutes")
qqline(dataset$Minutes, col = "red")

qqnorm(dataset$Drinks, main = "Q-Q Plot of Drinks")
qqline(dataset$Drinks, col = "red")

# Reset layout
par(mfrow = c(1, 1))

# Pearson correlation
cor.test(dataset$Minutes, dataset$Drinks, method = "pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  dataset$Minutes and dataset$Drinks
## t = 68.326, df = 459, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.9452363 0.9617123
## sample estimates:
##       cor 
## 0.9541922
# Scatterplot
plot(dataset$Minutes, dataset$Drinks,
     main = "Scatterplot of Time Spent vs. Drinks Purchased",
     xlab = "Minutes Spent in Café",
     ylab = "Number of Drinks Purchased",
     pch = 19,
     col = "blue")
abline(lm(Drinks ~ Minutes, data = dataset), col = "red", lwd = 2)

# Reset layout again just in case
par(mfrow = c(1, 1))

Interpretation

Direction: Positive. As time spent in the café increases, the number of drinks purchased increases.
Effect Size: Very strong, r = 0.95.
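
One way to make the effect size more concrete is to square the coefficient, which gives the proportion of variance the two variables share linearly. The sketch below is a minimal illustration that uses only the r value printed by cor.test() above; the variable names are chosen here for illustration and are not part of the original analysis.

```r
# Correlation coefficient as reported by cor.test() above
r <- 0.9541922

# Coefficient of determination: proportion of variance in one variable
# that is linearly associated with the other
r_squared <- r^2

cat(sprintf("r = %.2f, r^2 = %.2f\n", r, r_squared))
```

Under this reading, roughly 91% of the variability in drinks purchased is linearly associated with time spent in the café in this sample.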

Final Report

A Pearson correlation was conducted to examine the relationship between time spent in the café and the number of drinks purchased (n = 461). There was a statistically significant correlation between time spent in minutes (M = 29.89, SD = 18.63) and drinks purchased (M = 3.00, SD = 1.95). The correlation was positive and very strong, r(459) = .95, p < .001.

As time spent in the café increases, the number of drinks purchased also tends to increase.
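
The reported test statistic and confidence interval can be cross-checked from r and n alone, without reloading the data. The sketch below assumes the standard t formula for a correlation coefficient and the Fisher z transformation for the interval; the variable names are illustrative, and the inputs are copied from the cor.test() output above.

```r
# Values reported by cor.test() above
r  <- 0.9541922
n  <- 461
df <- n - 2

# t statistic for testing H0: true correlation equals 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval via the Fisher z transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

cat(sprintf("t(%d) = %.3f\n", df, t_stat))
cat(sprintf("95%% CI: [%.3f, %.3f]\n", ci[1], ci[2]))
```

The results should agree with the cor.test() output to rounding (t = 68.326 on 459 degrees of freedom, interval from about 0.945 to 0.962), which supports the r(459) and p < .001 values quoted in the report.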