Hypotheses

NULL HYPOTHESIS: There is no relationship between the number of drinks purchased and the time spent in the cafe.

ALTERNATE HYPOTHESIS: There is a positive relationship between the number of drinks purchased and the time spent in the cafe.
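Because the alternate hypothesis is directional (a positive relationship), a one-sided test is one way to match it. The following is a minimal sketch on simulated stand-in data, not the A5RQ1 dataset; the variable names `minutes` and `drinks` and the simulation settings are assumptions made for illustration only.

```r
# Simulated stand-in data (hypothetical; not the A5RQ1 dataset)
set.seed(1)
minutes <- rexp(200, rate = 1 / 30) + 10      # right-skewed visit lengths
drinks  <- rpois(200, lambda = minutes / 10)  # counts that grow with minutes

# One-sided Spearman test for a directional alternative (rho > 0).
# exact = FALSE requests the asymptotic p-value, which avoids the
# "Cannot compute exact p-value with ties" warning on tied data.
res <- cor.test(drinks, minutes, method = "spearman",
                alternative = "greater", exact = FALSE)

res$estimate  # Spearman's rho
res$p.value   # one-sided p-value
```

By default `cor.test()` runs a two-sided test ("true rho is not equal to 0"), so the one-sided version has to be requested explicitly through `alternative`.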

Code

# LOAD THE PACKAGES
library(readxl)
library(psych)

# IMPORT THE EXCEL FILE INTO R STUDIO
dataset <- read_excel("C:/Users/odhee/Downloads/A5RQ1.xlsx")

CALCULATE THE DESCRIPTIVE STATISTICS

describe(dataset[, c("Drinks", "Minutes")])
##         vars   n  mean    sd median trimmed   mad min   max range skew kurtosis
## Drinks     1 461  3.00  1.95    3.0    2.75  1.48   0  17.0  17.0 1.78     6.46
## Minutes    2 461 29.89 18.63   24.4   26.99 15.12  10 154.2 144.2 1.79     5.20
##           se
## Drinks  0.09
## Minutes 0.87

CREATE A HISTOGRAM FOR DRINKS VARIABLE

hist(dataset$Drinks,
     main = "Histogram of Drinks",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)

CREATE A HISTOGRAM FOR MINUTES VARIABLE

hist(dataset$Minutes,
     main = "Histogram of Minutes",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)

CONDUCT THE SHAPIRO-WILK TEST

shapiro.test(dataset$Drinks)
## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$Drinks
## W = 0.85487, p-value < 2.2e-16
shapiro.test(dataset$Minutes)
## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$Minutes
## W = 0.84706, p-value < 2.2e-16

Normality Test
Minutes: W = 0.847, p < .001 - Not normally distributed
Drinks: W = 0.855, p < .001 - Not normally distributed

Decision: Since neither variable is normally distributed, we will use the Spearman correlation test.
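The decision rule above can be written as a small helper. This is a sketch under stated assumptions: the function name `choose_cor_method`, the 0.05 cutoff, and the simulated inputs are all hypothetical and are not part of the original analysis.

```r
# Hypothetical helper: pick a correlation method from Shapiro-Wilk results.
# Returns "pearson" only when both variables pass the normality check.
choose_cor_method <- function(x, y, alpha = 0.05) {
  p_x <- shapiro.test(x)$p.value
  p_y <- shapiro.test(y)$p.value
  if (p_x > alpha && p_y > alpha) "pearson" else "spearman"
}

# Simulated right-skewed data (illustration only)
set.seed(42)
skewed_x <- rexp(300)
skewed_y <- skewed_x + rexp(300)

method <- choose_cor_method(skewed_x, skewed_y)
method
```

The returned string can be passed straight to the `method` argument of `cor.test()`.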

#Visually display the data
library(ggplot2)
## 
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
## 
##     %+%, alpha
library(ggpubr)
#CREATE THE SCATTERPLOT

ggscatter(dataset, x = "Drinks", y = "Minutes",
          add = "reg.line",
          conf.int = TRUE,
          cor.coef = TRUE,
          cor.method = "spearman",
          xlab = "Drinks", 
          ylab = "Minutes")

Scatterplot Observation: The fitted line slopes upward, so the relationship is positive. As the number of minutes increases, the number of drinks purchased also increases.

# CONDUCT THE SPEARMAN CORRELATION TEST
cor.test(dataset$Drinks, dataset$Minutes, method = "spearman")
## Warning in cor.test.default(dataset$Drinks, dataset$Minutes, method =
## "spearman"): Cannot compute exact p-value with ties
## 
##  Spearman's rank correlation rho
## 
## data:  dataset$Drinks and dataset$Minutes
## S = 1305608, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.9200417

The name of the inferential test used is the Spearman correlation test.
The two variables analyzed are time spent (Minutes) and drinks purchased (Drinks).
The total sample size is n = 461.
The test results are statistically significant (p < .05).
The mean and SD for each variable are: Minutes: M = 29.89, SD = 18.63; Drinks: M = 3.00, SD = 1.95.
The direction and size of the correlation are positive and strong.
rho-value: 0.92
p-value: p < .001 (R reports p < 2.2e-16, so an exact value is not available)
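For intuition about what the rho value measures: Spearman's rho is the Pearson correlation computed on the ranks of the data, with tied values given their average rank. The sketch below checks this on simulated data; the variables `x` and `y` are hypothetical and unrelated to the dataset used in this report.

```r
# Simulated data with many ties in x (illustration only)
set.seed(7)
x <- rpois(100, lambda = 3)
y <- 10 * x + rnorm(100, sd = 5)

# Built-in Spearman correlation
rho_builtin <- cor(x, y, method = "spearman")

# Same quantity by hand: Pearson correlation of the (average) ranks
rho_ranks <- cor(rank(x), rank(y), method = "pearson")

all.equal(rho_builtin, rho_ranks)
```

Because only ranks enter the calculation, rho describes how consistently one variable increases with the other, not whether the relationship is a straight line.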

Final Report

A Spearman correlation was conducted to assess the relationship between Drinks and Minutes (n = 461). There was a statistically significant correlation between drinks (M = 3.00, SD = 1.95) and minutes (M = 29.89, SD = 18.63). The correlation was positive and very strong, rho = 0.92, p < .001. As minutes increase, drinks also increase substantially.