What are the null and alternative hypotheses for your research?
H0: There is no relationship between the number of minutes customers stayed and the number of drinks they had.
H1: There is a relationship between the number of minutes customers stayed and the number of drinks they had.
LOAD THE REQUIRED PACKAGE
library(readxl)
IMPORT THE EXCEL FILE INTO R STUDIO
dataset <- read_excel("C:/Users/burug/Downloads/A5RQ1.xlsx")
LOAD THE PACKAGE
library(psych)
CALCULATE THE DESCRIPTIVE DATA
describe(dataset[, c("Minutes", "Drinks")])
## vars n mean sd median trimmed mad min max range skew kurtosis
## Minutes 1 461 29.89 18.63 24.4 26.99 15.12 10 154.2 144.2 1.79 5.20
## Drinks 2 461 3.00 1.95 3.0 2.75 1.48 0 17.0 17.0 1.78 6.46
## se
## Minutes 0.87
## Drinks 0.09
CREATE A HISTOGRAM FOR EACH CONTINUOUS VARIABLE
hist(dataset$Minutes,
main = "Histogram of Minutes",
xlab = "Value",
ylab = "Frequency",
col = "lightblue",
border = "black",
breaks = 20)
hist(dataset$Drinks,
main = "Histogram of Drinks",
xlab = "Value",
ylab = "Frequency",
col = "lightgreen",
border = "black",
breaks = 20)
Q1) Check the SKEWNESS of the VARIABLE 1 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
Ans: Positively skewed
Q2) Check the KURTOSIS of the VARIABLE 1 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
Ans: Too tall
Q3) Check the SKEWNESS of the VARIABLE 2 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?
Ans: Positively skewed
Q4) Check the KURTOSIS of the VARIABLE 2 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?
Ans: Too tall
CONDUCT THE SHAPIRO-WILK TEST
shapiro.test(dataset$Minutes)
##
## Shapiro-Wilk normality test
##
## data: dataset$Minutes
## W = 0.84706, p-value < 2.2e-16
shapiro.test(dataset$Drinks)
##
## Shapiro-Wilk normality test
##
## data: dataset$Drinks
## W = 0.85487, p-value < 2.2e-16
Answer the questions below as a comment within the R script:
Was the data normally distributed for Variable 1?
No (p < .05, so normality is rejected)
Was the data normally distributed for Variable 2?
No (p < .05, so normality is rejected)
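The step from the Shapiro-Wilk results to the choice of correlation method can be sketched in code. This is a minimal illustration, not the assignment's required procedure: the data are simulated stand-ins for the Excel file, and the names `minutes`, `drinks`, `alpha`, and `cor_method` are assumptions introduced here.

```r
# Hypothetical right-skewed stand-in data; the real values come from A5RQ1.xlsx.
set.seed(1)
minutes <- rexp(200, rate = 1 / 30)
drinks  <- rpois(200, lambda = 3)

alpha <- 0.05  # assumed significance level

# shapiro.test() returns an "htest" list; p.value holds the test's p-value.
p_minutes <- shapiro.test(minutes)$p.value
p_drinks  <- shapiro.test(drinks)$p.value

# Use Pearson only when neither variable departs from normality;
# otherwise fall back to the rank-based Spearman correlation.
cor_method <- if (p_minutes > alpha && p_drinks > alpha) "pearson" else "spearman"
cor_method
```

Because both variables in this report failed the normality check, the same rule leads to `method = "spearman"` in the steps that follow.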
LOAD THE PACKAGE
library(ggplot2)
##
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
##
## %+%, alpha
library(ggpubr)
CREATE THE SCATTERPLOT
ggscatter(dataset, x = "Minutes", y = "Drinks",
add = "reg.line",
conf.int = TRUE,
cor.coef = TRUE,
cor.method = "spearman",
xlab = "Minutes", ylab = "Drinks")
Is the relationship positive (line pointing up), negative (line pointing down), or is there no relationship (line is flat)?
The relationship is positive (line pointing up).
CONDUCT THE SPEARMAN CORRELATION
cor.test(dataset$Minutes, dataset$Drinks, method = "spearman")
## Warning in cor.test.default(dataset$Minutes, dataset$Drinks, method =
## "spearman"): Cannot compute exact p-value with ties
##
## Spearman's rank correlation rho
##
## data: dataset$Minutes and dataset$Drinks
## S = 1305608, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.9200417
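To make clearer what the rho value measures, the sketch below shows that Spearman's rho equals the Pearson correlation computed on the ranks of the data, with tied values given average ranks. The data are simulated for illustration only, and `x`, `y`, `rho_spearman`, and `rho_ranks` are hypothetical names, not variables from the dataset.

```r
# Simulated data with ties (hypothetical, not the A5RQ1 values).
set.seed(42)
x <- rexp(100, rate = 1 / 30)
y <- round(x / 10 + rnorm(100))  # rounding creates tied values

# Spearman's rho as computed by cor().
rho_spearman <- cor(x, y, method = "spearman")

# Pearson correlation of the ranks; rank() assigns average ranks to ties by default.
rho_ranks <- cor(rank(x), rank(y))

# The two values agree up to floating-point tolerance.
all.equal(rho_spearman, rho_ranks)
```

Because only the ordering of the values matters, a few very long stays or very large drink counts affect rho less than they would affect a Pearson correlation on the raw values.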
Q1) What is the direction of the effect?
The correlation of 0.92 is positive. As the number of minutes a customer stayed increases, the number of drinks they had also increases.
Q2) What is the size of the effect?
A correlation of 0.92 is a strong relationship.
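One way to make the size judgment explicit is to compare the absolute correlation against cutoffs. The sketch below assumes Cohen's commonly cited conventions (0.10 small, 0.30 medium, 0.50 large); the course may use different thresholds, and `effect_size_label` is a hypothetical helper, not a function from any package.

```r
# Hypothetical helper: label a correlation using assumed Cohen-style cutoffs.
effect_size_label <- function(r) {
  label <- cut(abs(r),
               breaks = c(0, 0.1, 0.3, 0.5, 1),
               labels = c("negligible", "small", "medium", "large"),
               right = FALSE,          # intervals are closed on the left: [0.1, 0.3), ...
               include.lowest = TRUE)  # makes the last interval [0.5, 1] include 1
  as.character(label)
}

effect_size_label(0.92)
```

Under these assumed cutoffs, 0.92 falls in the top band, which matches the "strong" description above.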
REPORT
A Spearman correlation was conducted to assess the relationship between the number of minutes customers stayed and the number of drinks they had (n = 461). There was a statistically significant correlation between minutes stayed (M = 29.89, SD = 18.63) and drinks consumed (M = 3.00, SD = 1.95), p < .001. The correlation was positive and strong, ρ = 0.92. As the number of minutes a customer stayed increased, the number of drinks they had also increased.