Research Scenario

A café owner wants to know whether customers who stay longer tend to buy more drinks. Data were collected on the time each customer spent in the café (in minutes) and the number of drinks purchased.

Hypotheses

Null Hypothesis (H0): There is no relationship between the time spent in the café and the number of drinks ordered.
Alternative Hypothesis (H1): There is a relationship between the time spent in the café and the number of drinks ordered.

Summary of the Correlation Test

A Spearman correlation was conducted to examine the relationship between time spent in the café (in minutes) and number of drinks purchased for 461 customers (n = 461). Spearman's test was used because Shapiro-Wilk tests showed that neither variable was normally distributed. There was a statistically significant correlation between time spent (mean = 29.89, sd = 18.63) and number of drinks (mean = 3.00, sd = 1.95). The correlation was positive and strong, ρ(459) = 0.92, p < 0.001, so the null hypothesis is rejected. As time spent in the café increases, the number of drinks purchased also tends to increase.
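
The 459 in ρ(459) is the degrees of freedom, n - 2. As a rough check on the reported significance, the sketch below applies the usual t approximation for a correlation coefficient to the rounded values from the summary. It is an illustration under that approximation, and the variable names are chosen here for the example only.

```r
# Values taken from the summary above (rho rounded to two decimals).
n   <- 461     # number of customers
rho <- 0.92    # reported Spearman correlation

# Degrees of freedom and t approximation: t = rho * sqrt(df / (1 - rho^2))
df     <- n - 2
t_stat <- rho * sqrt(df / (1 - rho^2))

# Two-sided p-value from the t distribution with n - 2 degrees of freedom.
p_two <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

cat("df =", df, "\n")
cat("t =", round(t_stat, 1), "\n")
cat("p < 0.001:", p_two < 0.001, "\n")
```

A t value this large on 459 degrees of freedom gives a p-value far below 0.001, which is consistent with the conclusion stated in the summary.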

Code

library(readxl)
A5RQ1 <- read_excel("C:/Users/armil/Downloads/A5RQ1.xlsx")
library(psych)
describe(A5RQ1[, c("Minutes", "Drinks")])
##         vars   n  mean    sd median trimmed   mad min   max range skew kurtosis
## Minutes    1 461 29.89 18.63   24.4   26.99 15.12  10 154.2 144.2 1.79     5.20
## Drinks     2 461  3.00  1.95    3.0    2.75  1.48   0  17.0  17.0 1.78     6.46
##           se
## Minutes 0.87
## Drinks  0.09
hist(A5RQ1$Minutes,
     main = "Histogram of Minutes",
     xlab = "Time spent (minutes)",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)

hist(A5RQ1$Drinks,
     main = "Histogram of Drinks",
     xlab = "Number of drinks",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)

### Review of the histograms:

# Q1) Check the SKEWNESS of the VARIABLE 1 (Minutes) histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
# The histogram is positively skewed.

# Q2) Check the KURTOSIS of the VARIABLE 1 (Minutes) histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
# The histogram looks too tall (leptokurtic) rather than a proper bell curve.

# Q3) Check the SKEWNESS of the VARIABLE 2 (Drinks) histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
# The histogram is positively skewed.

# Q4) Check the KURTOSIS of the VARIABLE 2 (Drinks) histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
# The histogram looks too tall (leptokurtic) rather than a proper bell curve.
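
The visual judgments about skewness and kurtosis can be cross-checked numerically. The following is a minimal sketch in base R: `moment_skew` and `moment_kurtosis` are hypothetical helper names, the data are simulated rather than taken from A5RQ1, and the formulas are the plain moment-based versions, which may differ slightly from the values that `describe()` reports.

```r
# Hypothetical helper: moment-based skewness (0 for a symmetric sample).
moment_skew <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Hypothetical helper: moment-based excess kurtosis (about 0 for normal data).
moment_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

# Simulated right-skewed data standing in for a variable such as Minutes.
set.seed(1)
simulated <- rexp(500, rate = 1 / 20) + 10

skew_value <- moment_skew(simulated)
kurt_value <- moment_kurtosis(simulated)

# Positive skewness means a long right tail;
# positive excess kurtosis corresponds to a "too tall" histogram.
cat("skewness:", round(skew_value, 2), "\n")
cat("excess kurtosis:", round(kurt_value, 2), "\n")
```

Applied to the real data, the same helpers could be called as `moment_skew(A5RQ1$Minutes)` and `moment_kurtosis(A5RQ1$Drinks)`.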
shapiro.test(A5RQ1$Minutes)
## 
##  Shapiro-Wilk normality test
## 
## data:  A5RQ1$Minutes
## W = 0.84706, p-value < 2.2e-16
shapiro.test(A5RQ1$Drinks)
## 
##  Shapiro-Wilk normality test
## 
## data:  A5RQ1$Drinks
## W = 0.85487, p-value < 2.2e-16
# Analysis of the data
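# Was the data normally distributed for Variable 1 (Minutes)?
# No. The Shapiro-Wilk test was significant (W = 0.85, p < 0.001), so normality is rejected.

# Was the data normally distributed for Variable 2 (Drinks)?
# No. The Shapiro-Wilk test was significant (W = 0.85, p < 0.001), so normality is rejected.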

# Because neither variable is normally distributed, the Spearman correlation test is used instead of the Pearson correlation test.
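
The decision rule used here (Pearson only if both variables pass the normality check, otherwise Spearman) can be written as a small function. This is a sketch under stated assumptions: `choose_cor_method` is a hypothetical helper, the 0.05 cut-off is assumed, and the data are simulated rather than taken from A5RQ1.

```r
# Hypothetical helper: pick a correlation method from Shapiro-Wilk results.
choose_cor_method <- function(x, y, alpha = 0.05) {
  p_x <- shapiro.test(x)$p.value
  p_y <- shapiro.test(y)$p.value
  # Use Pearson only when normality is not rejected for either variable.
  if (p_x > alpha && p_y > alpha) "pearson" else "spearman"
}

# Simulated, clearly right-skewed data (not the A5RQ1 data).
set.seed(42)
minutes_sim <- rexp(300, rate = 1 / 20) + 10
drinks_sim  <- rpois(300, lambda = minutes_sim / 10)

method_sim <- choose_cor_method(minutes_sim, drinks_sim)
print(method_sim)
```

The returned string can be passed directly to the `method` argument of `cor.test()`.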
library(ggplot2)
## 
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
## 
##     %+%, alpha
library(ggpubr)
ggscatter(A5RQ1, x = "Minutes", y = "Drinks",
          add = "reg.line",
          conf.int = TRUE,
          cor.coef = TRUE,
          cor.method = "spearman",
          xlab = "Minutes", ylab = "Drinks")

Is the relationship positive (line pointing up), negative (line pointing down), or is there no relationship (line is flat)?
Answer: The relationship is positive, as the regression line points up.

cor.test(A5RQ1$Minutes, A5RQ1$Drinks, method = "spearman")
## Warning in cor.test.default(A5RQ1$Minutes, A5RQ1$Drinks, method = "spearman"):
## Cannot compute exact p-value with ties
## 
##  Spearman's rank correlation rho
## 
## data:  A5RQ1$Minutes and A5RQ1$Drinks
## S = 1305608, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.9200417
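
The S statistic and rho in this output are two views of the same quantity, related by S = (n^3 - n)(1 - rho) / 6. The sketch below recovers rho from the printed S as a consistency check and then shows, on simulated data (not the A5RQ1 data), that Spearman's rho is Pearson's r computed on ranks. Variable names are chosen here for illustration.

```r
# Values copied from the cor.test() output above.
n <- 461
S <- 1305608

# Rearranging S = (n^3 - n) * (1 - rho) / 6 to solve for rho.
rho_from_S <- 1 - 6 * S / (n^3 - n)
print(round(rho_from_S, 4))   # about 0.92, matching the reported rho

# Spearman's rho equals Pearson's r on the ranks (simulated data).
set.seed(7)
x_sim <- rexp(100)
y_sim <- x_sim + rnorm(100, sd = 0.5)

rho_direct <- cor(x_sim, y_sim, method = "spearman")
rho_ranks  <- cor(rank(x_sim), rank(y_sim))
print(all.equal(rho_direct, rho_ranks))
```

Because the test works on ranks, the extreme values seen in the descriptive statistics (for example a maximum of 154.2 minutes) influence the result less than they would in a Pearson correlation.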
# Review of the correlation test:

# Spearman correlation
# Sample estimates:
# rho: 0.9200417
# Effect size for the Spearman correlation test

# Q1) What is the direction of the effect?
# A correlation of 0.92 is positive. As time spent in the café increases, the number of drinks purchased tends to increase.

# Q2) What is the size of the effect?
# A correlation of 0.92 is a strong relationship.
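
The direction and size judgments can be expressed as a small lookup. This is only a sketch: `describe_correlation` is a hypothetical helper, and the cut-offs (0.10, 0.30, 0.50) follow commonly cited conventions that are assumed here, not taken from this report. Other courses and fields use different thresholds.

```r
# Hypothetical helper: label the size and direction of a correlation.
# Cut-offs are assumed conventions, not values from the report.
describe_correlation <- function(r) {
  direction <- if (r > 0) "positive" else if (r < 0) "negative" else "none"
  size <- cut(abs(r),
              breaks = c(-Inf, 0.10, 0.30, 0.50, Inf),
              labels = c("negligible", "weak", "moderate", "strong"),
              right = FALSE)   # intervals are closed on the left
  paste(as.character(size), direction)
}

label_reported <- describe_correlation(0.92)
print(label_reported)
```

Under these assumed cut-offs, the reported rho of 0.92 falls well inside the strongest band.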