QUESTION

What are the null and alternate hypotheses for your research?

H0: “There is no relationship between time spent in the café and number of drinks purchased.”

H1: “There is a relationship between time spent in the café and number of drinks purchased.”

======================

IMPORT EXCEL FILE CODE

======================

PURPOSE OF THIS CODE

Imports your Excel dataset into RStudio.

You need to import your dataset every time you start a new RStudio session to analyze your data.

INSTALL REQUIRED PACKAGE

The package only needs to be installed once.

The code for this task is provided below. Remove the hashtag below to convert the note into code.

# install.packages("readxl")

LOAD THE PACKAGE

You must always reload the package you want to use.

library(readxl)

IMPORT THE EXCEL FILE INTO RSTUDIO

Download the Excel file from OneDrive and save it to your desktop.

Right-click the Excel file and click “Copy as path” from the menu.

In RStudio, replace the example path below with your actual path.

Replace backslashes with forward slashes (/) or double them (\\):

✘ WRONG "C:\Users\Joseph\Desktop\mydata.xlsx"

✔ CORRECT "C:/Users/Joseph/Desktop/mydata.xlsx"

✔ CORRECT "C:\\Users\\Joseph\\Desktop\\mydata.xlsx"

Replace “dataset” with the name of your excel data (without the .xlsx)
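The backslash rule above can be handled in code instead of by hand. The sketch below is one possible approach, not part of the original assignment: it assumes R 4.0.0 or later (for raw strings), and the path and the names `raw_path` and `clean_path` are hypothetical.

```r
# Hypothetical path pasted from the Windows "Copy as path" menu.
# A raw string r"(...)" keeps the single backslashes as typed (needs R >= 4.0.0).
raw_path <- r"(C:\Users\Joseph\Desktop\mydata.xlsx)"

# Swap every backslash for a forward slash so R does not read "\U" as an escape.
clean_path <- gsub("\\", "/", raw_path, fixed = TRUE)

print(clean_path)

# Check that the file is really there before passing clean_path to read_excel().
if (!file.exists(clean_path)) {
  message("File not found, check the path: ", clean_path)
}
```

If the message appears, the path is wrong or the file has not finished syncing from OneDrive; correcting the path string is usually enough.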

An example of the code for this task is provided below.

You can edit the code below and remove the hashtag to use the code below.

A5RQ1 <- read_excel("C:/Users/manit/OneDrive/Desktop/A5RQ1.xlsx")
head(A5RQ1)

======================

DESCRIPTIVE STATISTICS

======================

Calculate the mean, median, SD, and sample size for each variable.

INSTALL THE REQUIRED PACKAGE

Remove the hashtag in front of the code below to install the package once.

After installing the package, put the hashtag in front of the code again.

# install.packages("psych")

LOAD THE PACKAGE

Always reload the package you want to use.

library(psych)

CALCULATE THE DESCRIPTIVE DATA

Replace “dataset” with the name of your excel data (without the .xlsx)

Replace “V1” with the R code name for your first variable.

Replace “V2” with the R code name for your second variable.

describe(A5RQ1[, c("Minutes", "Drinks")])
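To see what the four requested statistics look like without the psych package, here is a minimal base R sketch. The data frame `cafe` and its values are made up for illustration and are not the A5RQ1 data.

```r
# Hypothetical toy data standing in for the real dataset.
cafe <- data.frame(
  Minutes = c(10, 20, 30, 45, 90),
  Drinks  = c(1, 1, 2, 3, 5)
)

# For each column, compute sample size, mean, median, and standard deviation.
desc <- sapply(cafe, function(x) {
  c(n = length(x), mean = mean(x), median = median(x), sd = sd(x))
})

print(round(desc, 2))
```

The result is a small matrix with one column per variable, which can be compared against the corresponding columns of the `describe()` output.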

===============================================

CHECK THE NORMALITY OF THE CONTINUOUS VARIABLES

===============================================

OVERVIEW

Two methods will be used to check the normality of the continuous variables.

First, you will create histograms to visually inspect the normality of the variables.

Next, you will conduct a test called the Shapiro-Wilk test to inspect the normality of the variables.

It is important to know whether or not the data is normal to determine which inferential test should be used.

CREATE A HISTOGRAM FOR EACH CONTINUOUS VARIABLE

A histogram is used to visually check if the data is normally distributed.

Replace “dataset” with the name of your excel data (without the .xlsx)

Replace “V1” with the R code name for your first variable.

Replace “V2” with the R code name for your second variable.

hist(A5RQ1$Minutes, main = "Histogram of Minutes", xlab = "Value", ylab = "Frequency", col = "lightblue", border = "black", breaks = 20)

hist(A5RQ1$Drinks, main = "Histogram of Drinks", xlab = "Value", ylab = "Frequency", col = "lightgreen", border = "black", breaks = 20)
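A visual reading of a histogram can be cross-checked with numbers. The sketch below computes moment-based skewness and excess kurtosis in base R; the vector `x` is hypothetical, and the formulas are the simple population-moment versions, which may differ slightly from what a given package reports.

```r
# Hypothetical values with one long right tail.
x <- c(5, 8, 10, 12, 15, 20, 60)

m <- mean(x)
s <- sqrt(mean((x - m)^2))            # population SD (divides by n)

skewness <- mean((x - m)^3) / s^3     # > 0 means a longer right tail
excess_kurtosis <- mean((x - m)^4) / s^4 - 3   # > 0 means more peaked than normal

print(c(skewness = skewness, excess_kurtosis = excess_kurtosis))
```

A positive skewness value agrees with a histogram that looks positively skewed; a value near zero suggests a roughly symmetrical shape.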

QUESTION

Answer the questions below as comments within the R script:

Q1) Check the SKEWNESS of the VARIABLE 1 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?

Ans) The histogram for Minutes is positively skewed (right-skewed).

Q2) Check the KURTOSIS of the VARIABLE 1 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?

Ans) The distribution looks too tall and peaked, not like a normal bell curve. It has a leptokurtic shape.

Q3) Check the SKEWNESS of the VARIABLE 2 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?

Ans) The histogram for Drinks is also positively skewed (right-skewed).

Q4) Check the KURTOSIS of the VARIABLE 2 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?

Ans) The distribution is tall and peaked, not a normal bell curve.

PURPOSE

Use a statistical test to check the normality of the continuous variables.

The Shapiro-Wilk Test checks whether the overall shape of the data, including its skewness and kurtosis, departs from a normal distribution.

The test is checking “Is this variable the SAME as normal data (null hypothesis) or DIFFERENT from normal data (alternate hypothesis)?”

For this test, if p is GREATER than .05 (p > .05), the data can be treated as NORMAL.

If p is LESS than .05 (p < .05), the data is NOT normal.

CONDUCT THE SHAPIRO-WILK TEST

Replace “dataset” with the name of your excel data (without the .xlsx)

Replace “V1” with the R code name for your first variable.

Replace “V2” with the R code name for your second variable.

shapiro.test(A5RQ1$Minutes)
shapiro.test(A5RQ1$Drinks)

QUESTION

Was the data normally distributed for Variable 1?

Ans) No. The Shapiro-Wilk test for Minutes produced a p-value < .05, indicating that Variable 1 (Minutes) is NOT normally distributed.

Was the data normally distributed for Variable 2?

Ans) No. The Shapiro-Wilk test for Drinks also produced a p-value < .05, indicating that Variable 2 (Drinks) is NOT normally distributed.

If the data is normal for both variables, continue with the Pearson Correlation test.

If one or both variables are NOT normal, change to the Spearman Correlation test.
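The decision rule above can be written out in code. This is a sketch on simulated data, not the A5RQ1 results: the variable names, the seed, and the distributions are assumptions chosen so that one variable is clearly skewed.

```r
set.seed(42)

# Simulated stand-ins: one right-skewed variable, one roughly bell-shaped.
skewed_minutes <- rexp(100, rate = 1 / 30)
bell_shaped    <- rnorm(100, mean = 50, sd = 10)

# Pull the p-value out of each Shapiro-Wilk result.
p_skewed <- shapiro.test(skewed_minutes)$p.value
p_bell   <- shapiro.test(bell_shaped)$p.value

# Pearson only if BOTH variables pass (p > .05); otherwise Spearman.
method <- if (p_skewed > 0.05 && p_bell > 0.05) "pearson" else "spearman"

print(method)
```

The resulting string can be passed as the `method` argument of `cor.test()`, which keeps the choice of test tied to the normality check rather than to the significance of the outcome.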

=========================

VISUALLY DISPLAY THE DATA

=========================

CREATE A SCATTERPLOT

PURPOSE

A scatterplot visually shows the relationship between two continuous variables.

INSTALL THE REQUIRED PACKAGES

Remove the hashtags in front of the code below to install the package once.

After installing the packages, put the hashtag in front of the code again.

# install.packages("ggplot2")
# install.packages("ggpubr")

LOAD THE PACKAGE

Always reload the package you want to use.

library(ggplot2)
library(ggpubr)

CREATE THE SCATTERPLOT

Replace “dataset” with the name of your excel data (without the .xlsx)

Replace “V1” with the R code name for your first variable.

Replace “V2” with the R code name for your second variable.

Replace “pearson” with “spearman” if you are using the spearman correlation.

ggscatter(A5RQ1, x = "Minutes", y = "Drinks", add = "reg.line", conf.int = TRUE, cor.coef = TRUE, cor.method = "spearman", xlab = "Variable Minutes", ylab = "Variable Drinks")

QUESTION

Answer the questions below as a comment within the R script:

Is the relationship positive (line pointing up), negative (line pointing down), or is there no relationship (line is flat)?

Ans) The relationship is positive. The line clearly points upward, showing that as Minutes increases, the number of Drinks also increases.

================================================

PEARSON CORRELATION OR SPEARMAN CORRELATION TEST

================================================

PURPOSE

Check whether there is a statistically significant relationship (correlation) between the two continuous variables.

CONDUCT THE PEARSON CORRELATION OR SPEARMAN CORRELATION

Replace “dataset” with the name of your excel data (without the .xlsx)

Replace “V1” with the R code name for your first variable.

Replace “V2” with the R code name for your second variable.

Replace “pearson” with “spearman” if you are using the spearman correlation.

cor.test(A5RQ1$Minutes, A5RQ1$Drinks, method = "spearman")

DETERMINE STATISTICAL SIGNIFICANCE

If results were statistically significant (p < .05), continue to effect size section below.

NOTE: Getting results that are not statistically significant does NOT mean you switch to Spearman Correlation.

The Spearman Correlation is chosen only when the data are not normally distributed, never because of whether the result was significant.
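The values needed for the significance check and the effect size section can be pulled straight from the test object. The sketch below uses made-up vectors, not the A5RQ1 data; `exact = FALSE` is an added assumption that avoids the warning R gives when tied values prevent an exact p-value.

```r
# Hypothetical toy data with a clear upward trend and some tied values.
minutes <- c(10, 15, 20, 30, 45, 60, 75, 90)
drinks  <- c(1, 1, 2, 2, 3, 3, 4, 5)

# exact = FALSE requests the approximate p-value, which is safe with ties.
res <- cor.test(minutes, drinks, method = "spearman", exact = FALSE)

rho <- unname(res$estimate)   # the "sample estimates: rho" value in the printout
p   <- res$p.value

significant <- p < 0.05
print(c(rho = rho, p = p))
```

Storing `rho` and `p` in variables makes it easier to reuse them when writing up the direction, size, and significance of the effect.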

===============================================

EFFECT SIZE FOR PEARSON & SPEARMAN CORRELATION

===============================================

1) REVIEW THE CORRECT CORRELATION TEST

• For Pearson correlation, find “sample estimates: cor” in your output (when you calculated the Pearson Correlation earlier).

• For Spearman correlation, find “sample estimates: rho” in your output (when you calculated the Spearman Correlation earlier).

2) WRITE THE REPORT

Answer the questions below as a comment within the R script:

Q1) What is the direction of the effect?

Ans) The effect is positive. As Minutes increases, the number of Drinks also increases. The rho value is positive (0.92), indicating a strong upward relationship.

Q2) What is the size of the effect?

Ans) The effect size is strong. A rho value of 0.92 falls in the ±0.50 to ±1.00 range, which indicates a strong relationship between the variables.
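The two answers above can be expressed as a small helper. This is an illustrative sketch: the function name `effect_label` is hypothetical, and only the 0.50 to 1.00 "strong" band comes from this document. The lower cut-offs (0.10 and 0.30) are an assumed common convention and should be replaced with the course's own table if it differs.

```r
# Hypothetical helper: labels the direction and size of a correlation coefficient.
effect_label <- function(r) {
  direction <- if (r > 0) "positive" else if (r < 0) "negative" else "none"

  # Bands on |r|: only [0.50, 1.00] = "strong" is taken from the text above;
  # the other cut-offs are assumptions.
  size <- cut(
    abs(r),
    breaks = c(0, 0.10, 0.30, 0.50, 1),
    labels = c("negligible", "weak", "moderate", "strong"),
    right = FALSE,
    include.lowest = TRUE
  )

  c(direction = direction, size = as.character(size))
}

print(effect_label(0.92))
```

Because the size is based on the absolute value, a coefficient of -0.92 would receive the same size label with a negative direction.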

========================================================

>> WRITTEN REPORT FOR SPEARMAN CORRELATION <<

========================================================

A Spearman correlation was conducted to examine the relationship between time spent in the café (minutes) and number of drinks purchased (n = [INSERT SAMPLE SIZE]). The results showed a statistically significant relationship between the two variables, p < .001. Time spent in the café had a mean of [M1] minutes (SD = [SD1]), and the number of drinks purchased had a mean of [M2] drinks (SD = [SD2]). The correlation was positive and strong, ρ = 0.92, indicating that customers who stayed longer in the café tended to purchase more drinks.
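The bracketed placeholders in the report can be filled from computed values rather than typed by hand. The sketch below shows the idea with `sprintf()` on a hypothetical vector; it does not use the A5RQ1 data, and the sentence wording is only an example.

```r
# Hypothetical values standing in for the Minutes column.
minutes <- c(10, 20, 30, 45, 90)

# %.2f prints two decimals; %d prints the whole-number sample size.
summary_text <- sprintf(
  "Minutes: M = %.2f, SD = %.2f, n = %d",
  mean(minutes), sd(minutes), length(minutes)
)

print(summary_text)
```

The same pattern can be repeated for the second variable, so the reported means and standard deviations always match the output of the analysis.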

Team-6 1st

2025-11-14
