RESEARCH SCENARIO 1

A café owner believes that if customers stay in her café longer, they will make more purchases. She plans to make the café more comfortable (adding couches, more electrical outlets for laptops, etc.) so that customers stay longer. Before making this investment, she wants to check whether her belief is true. She buys AI software that collects information from her cash register and cameras to determine how long each customer stayed in the café and how many drinks they bought. Analyze the data to determine whether there is a relationship between time spent in the shop (minutes) and number of drinks purchased. Use the appropriate test to see whether longer visits are associated with more purchases.

Hypotheses:

H0: There is no relationship between time spent in the shop (Minutes) and the number of drinks purchased (Drinks).

H1: There is a positive relationship between time spent in the shop (Minutes) and the number of drinks purchased (Drinks).

# Load the data from the Excel file
library(readxl)
dataset <- read_excel("C:/Users/kodal/Desktop/ASSIGNMENT 5/A5RQ1.xlsx")
# Descriptive statistics for both variables
library(psych)
describe(dataset[, c("Minutes", "Drinks")])
##         vars   n  mean    sd median trimmed   mad min   max range skew kurtosis
## Minutes    1 461 29.89 18.63   24.4   26.99 15.12  10 154.2 144.2 1.79     5.20
## Drinks     2 461  3.00  1.95    3.0    2.75  1.48   0  17.0  17.0 1.78     6.46
##           se
## Minutes 0.87
## Drinks  0.09
# Histogram of visit length to inspect the shape of the distribution
hist(dataset$Minutes,
     main = "Histogram of Minutes",
     xlab = "Time spent (minutes)",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)

# Histogram of drinks purchased to inspect the shape of the distribution
hist(dataset$Drinks,
     main = "Histogram of Drinks",
     xlab = "Number of drinks purchased",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)

# Shapiro-Wilk normality tests (p < .05 indicates a departure from normality)
shapiro.test(dataset$Minutes)
## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$Minutes
## W = 0.84706, p-value < 2.2e-16
shapiro.test(dataset$Drinks)
## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$Drinks
## W = 0.85487, p-value < 2.2e-16
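
Both Shapiro-Wilk tests above reject normality, which is the reason a rank-based (Spearman) correlation is used below instead of Pearson. The decision rule can be sketched as a small helper. This is only an illustration: `choose_cor_method` is a hypothetical name, the 0.05 cutoff is an assumed convention, and the data are simulated stand-ins, not the café data.

```r
# Hypothetical helper (not part of the original analysis):
# pick a correlation method from the Shapiro-Wilk results.
choose_cor_method <- function(x, y, alpha = 0.05) {
  p_x <- shapiro.test(x)$p.value
  p_y <- shapiro.test(y)$p.value
  # If either variable departs from normality, use the rank-based method.
  if (p_x < alpha || p_y < alpha) "spearman" else "pearson"
}

set.seed(42)
# Simulated stand-ins, NOT the cafe data:
# right-skewed visit times and drink counts that grow with visit length.
minutes_sim <- 10 + rexp(200, rate = 1 / 20)
drinks_sim  <- rpois(200, lambda = minutes_sim / 10)

choose_cor_method(minutes_sim, drinks_sim)
```

With large samples, Shapiro-Wilk flags even mild departures from normality, so the histograms above remain a useful second check alongside the test.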
library(ggplot2)
library(ggpubr)
ggscatter(dataset, x = "Minutes", y = "Drinks",
          add = "reg.line",
          conf.int = TRUE,
          cor.coef = TRUE,
          cor.method = "spearman",
          xlab = "Minutes", ylab = "Drinks")

cor.test(dataset$Minutes, dataset$Drinks, method = "spearman")
## Warning in cor.test.default(dataset$Minutes, dataset$Drinks, method =
## "spearman"): Cannot compute exact p-value with ties
## 
##  Spearman's rank correlation rho
## 
## data:  dataset$Minutes and dataset$Drinks
## S = 1305608, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.9200417
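
H1 is directional (a positive relationship), while the output above reports a two-sided alternative, and the warning appears because tied values rule out an exact p-value. One way to line the test up with H1 might be the call sketched here, which uses the `alternative` and `exact` arguments of `cor.test`. The data are simulated stand-ins (assumed, not the café data), so the numbers will not match the output above.

```r
set.seed(1)
# Simulated stand-ins, NOT the cafe data.
minutes_sim <- 10 + rexp(300, rate = 1 / 20)
drinks_sim  <- rpois(300, lambda = minutes_sim / 10)

res <- cor.test(minutes_sim, drinks_sim,
                method = "spearman",
                alternative = "greater",  # one-sided test matching the directional H1
                exact = FALSE)            # ties present: ask for the asymptotic p-value directly

res$estimate     # Spearman's rho for the simulated data
res$p.value      # one-sided p-value
res$alternative  # "greater"
```

With a correlation as strong as the one reported above, the one-sided and two-sided tests would be expected to lead to the same conclusion; the difference is only in how the p-value is defined.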

Review Your Output

Name of test: Spearman Correlation

Variables: Minutes and Drinks

Total sample size (n): 461

Statistically significant? Yes

Mean and SD: Minutes: M = 29.89, SD = 18.63; Drinks: M = 3.00, SD = 1.95

Direction and size: Positive and Strong

Degrees of freedom (df): Not applicable

rho-value: 0.92

EXACT p-value: p < .001 (R reports p < 2.2e-16; an exact value is not printed)

FINAL REPORT

Shapiro-Wilk tests indicated that both variables departed from normality (Minutes: W = 0.85, p < .001; Drinks: W = 0.85, p < .001), so a Spearman correlation was conducted to assess the relationship between time spent in the shop (Minutes) and the number of drinks purchased (Drinks) (n = 461). There was a statistically significant correlation between time spent (M = 29.89, SD = 18.63) and drinks purchased (M = 3.00, SD = 1.95). The correlation was positive and strong, rho = 0.92, p < .001. Customers who spent more time in the shop tended to purchase more drinks.
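
The report above gives a point estimate of rho. If an interval estimate were also wanted, a percentile bootstrap is one possible approach, sketched here under assumptions: the data are simulated stand-ins (not the café data), and the 2000 resamples and 95% level are arbitrary illustrative choices.

```r
set.seed(123)
# Simulated stand-ins, NOT the cafe data.
minutes_sim <- 10 + rexp(300, rate = 1 / 20)
drinks_sim  <- rpois(300, lambda = minutes_sim / 10)

n <- length(minutes_sim)

# Resample customers (rows) with replacement and recompute rho each time.
boot_rho <- replicate(2000, {
  idx <- sample.int(n, replace = TRUE)
  cor(minutes_sim[idx], drinks_sim[idx], method = "spearman")
})

rho_hat <- cor(minutes_sim, drinks_sim, method = "spearman")
ci <- quantile(boot_rho, probs = c(0.025, 0.975))  # percentile 95% interval

rho_hat
ci
```

Resampling whole rows keeps each customer's minutes and drinks paired, which is what preserves the relationship being estimated.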