To be more competitive in the market, a technology store wants to start selling its laptops and anti-virus software as a bundle to small businesses. A bundle is two products sold together at a lower price than if they were purchased separately. Before offering the bundle, the store wants to make sure the two products are commonly purchased together. The store has data from the past year showing how many laptops and how many anti-virus software licenses each small business bought from it. Analyze the data to determine whether there is a positive correlation between the number of laptops purchased and the number of anti-virus licenses purchased.
Null Hypothesis (H0): There is no relationship between the number of laptops purchased and the number of anti-virus licenses purchased.
Alternative Hypothesis (H1): There is a positive relationship between the number of laptops purchased and the number of anti-virus licenses purchased.
# Install packages if not already installed
# install.packages(c("readxl", "psych", "ggplot2", "ggpubr", "rmarkdown"))
# Load required packages
library(readxl)
library(psych)
library(ggplot2)
##
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
##
## %+%, alpha
library(ggpubr)
A5RQ2 <- read_excel("C:/Users/saisa/Downloads/A5RQ2.xlsx")
head(A5RQ2)
## # A tibble: 6 × 3
##   Business Antivirus Laptop
##      <dbl>     <dbl>  <dbl>
## 1        1        42     31
## 2        2        47     36
## 3        3        73     68
## 4        4        51     38
## 5        5        52     43
## 6        6        76     61
# Calculate descriptive statistics
describe(A5RQ2[, c("Antivirus", "Laptop")])
##           vars   n  mean    sd median trimmed   mad min max range  skew
## Antivirus    1 122 50.18 13.36     49   49.92 12.60  15  83    68  0.15
## Laptop       2 122 40.02 12.30     39   39.93 11.86   8  68    60 -0.01
##           kurtosis   se
## Antivirus    -0.14 1.21
## Laptop       -0.32 1.11
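The skew and kurtosis columns above summarise the shape of each distribution. As a rough sketch of what they measure, moment-based versions can be computed in base R. The helper name `shape_stats` and the toy vector are hypothetical, and the results can differ slightly from `describe()`, which applies its own sample-size adjustment.

```r
# Hypothetical helper: moment-based skewness and excess kurtosis.
# Values near 0 for both suggest a roughly symmetric, bell-shaped distribution.
shape_stats <- function(x) {
  x  <- x[!is.na(x)]      # drop missing values
  d  <- x - mean(x)       # deviations from the mean
  m2 <- mean(d^2)         # second central moment
  m3 <- mean(d^3)         # third central moment
  m4 <- mean(d^4)         # fourth central moment
  c(skew = m3 / m2^1.5, kurtosis = m4 / m2^2 - 3)
}

# Toy data (not the store's data): a perfectly symmetric vector
toy   <- c(1, 2, 3, 4, 5)
stats <- shape_stats(toy)
stats
```

With the real data loaded, the same helper could be called as `shape_stats(A5RQ2$Antivirus)` or `shape_stats(A5RQ2$Laptop)`.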
# Create histogram for Antivirus Licenses
hist(A5RQ2$Antivirus,
     main = "Histogram of Antivirus Licenses Purchased",
     xlab = "Number of Antivirus Licenses",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)
# Create histogram for Laptops Purchased
hist(A5RQ2$Laptop,
     main = "Histogram of Laptops Purchased",
     xlab = "Number of Laptops",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)
# Conduct Shapiro-Wilk tests
shapiro_antivirus <- shapiro.test(A5RQ2$Antivirus)
shapiro_laptop <- shapiro.test(A5RQ2$Laptop)
# Display results
shapiro_antivirus
##
## Shapiro-Wilk normality test
##
## data: A5RQ2$Antivirus
## W = 0.99419, p-value = 0.8981
shapiro_laptop
##
## Shapiro-Wilk normality test
##
## data: A5RQ2$Laptop
## W = 0.99362, p-value = 0.8559
Normality Test Results:
Antivirus Licenses: W = 0.99419, p-value = 0.8981 → p > .05, so normality is not rejected
Laptops Purchased: W = 0.99362, p-value = 0.8559 → p > .05, so normality is not rejected
Decision: Neither test rejects normality, and the skewness and kurtosis values for both variables are close to 0, so we treat both as approximately normal and use Pearson correlation.
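The decision rule above can be sketched as a small helper. This is an illustration under stated assumptions, not part of the original analysis: the function name `choose_cor_method`, the 0.05 cut-off, and the Spearman fallback for non-normal data are all assumptions, and the two input vectors are synthetic.

```r
# Hypothetical helper: choose a correlation method from Shapiro-Wilk results.
# Assumption: fall back to Spearman if either variable rejects normality.
choose_cor_method <- function(x, y, alpha = 0.05) {
  p_x <- shapiro.test(x)$p.value
  p_y <- shapiro.test(y)$p.value
  if (p_x > alpha && p_y > alpha) "pearson" else "spearman"
}

# Synthetic inputs (not the store's data)
normal_like <- qnorm(ppoints(100))                # evenly spaced normal quantiles
skewed      <- exp(seq(0, 10, length.out = 100))  # heavily right-skewed values

method_a <- choose_cor_method(normal_like, normal_like)
method_b <- choose_cor_method(skewed, normal_like)
c(method_a, method_b)
```

The returned string can be passed straight to the `method` argument of `cor.test()`.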
# Create scatterplot with Pearson correlation
ggscatter(A5RQ2, x = "Antivirus", y = "Laptop",
          add = "reg.line",
          conf.int = TRUE,
          cor.coef = TRUE,
          cor.method = "pearson",
          xlab = "Antivirus Licenses Purchased",
          ylab = "Laptops Purchased",
          title = "Relationship between Antivirus Licenses and Laptop Purchases")
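Scatterplot Observation: The relationship is positive (the regression line slopes upward): as antivirus licenses increase, laptop purchases also increase. The strength of the relationship is judged from how closely the points cluster around the line and from the correlation coefficient printed on the plot, not from how steep the line is.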
# Conduct Pearson correlation test
pearson_result <- cor.test(A5RQ2$Antivirus, A5RQ2$Laptop, method = "pearson")
pearson_result
##
## Pearson's product-moment correlation
##
## data: A5RQ2$Antivirus and A5RQ2$Laptop
## t = 25.16, df = 120, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8830253 0.9412249
## sample estimates:
## cor
## 0.9168679
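The output above reports a two-sided test, while H1 is directional (a positive relationship). As a cross-check, the sketch below recomputes the t statistic, the one-sided and two-sided p-values, and the 95% confidence interval from the reported r and n alone. The variable names are illustrative; the only inputs are the values printed above.

```r
# Values copied from the cor.test() output above
r  <- 0.9168679
n  <- 122
df <- n - 2

# t statistic implied by r and its degrees of freedom
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# One-sided p-value for H1 (true correlation greater than 0), and the
# two-sided p-value that cor.test() reports by default
p_one_sided <- pt(t_stat, df = df, lower.tail = FALSE)
p_two_sided <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

# 95% confidence interval via the Fisher z-transformation
z          <- atanh(r)
half_width <- qnorm(0.975) / sqrt(n - 3)
ci         <- tanh(c(z - half_width, z + half_width))

round(t_stat, 2)
round(ci, 4)
```

Because r is positive, the one-sided p-value is half the two-sided one, so the directional hypothesis is supported at least as strongly. To run the directional test directly on the data, `cor.test()` accepts `alternative = "greater"`.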
Statistical Results:
- Pearson’s r: 0.917
- p-value: < .001 (specifically < 2.2e-16)
- Sample size: n = 122
Effect Size Interpretation:
- Direction: Positive correlation
- Strength: Very strong (r = 0.92 > 0.50 threshold for “strong” relationship)
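To make the effect-size reading concrete, the sketch below squares r to get the proportion of shared variance and applies conventional cut-offs to label the strength. The helper `effect_label` is hypothetical, and the 0.10 / 0.30 / 0.50 thresholds are Cohen's commonly cited conventions, assumed here rather than taken from the assignment.

```r
# Reported sample correlation
r <- 0.9168679

# Proportion of variance the two variables share
r_squared <- r^2

# Hypothetical helper: label |r| using Cohen's conventional cut-offs (assumed)
effect_label <- function(r) {
  a <- abs(r)
  if (a >= 0.50) {
    "large"
  } else if (a >= 0.30) {
    "medium"
  } else if (a >= 0.10) {
    "small"
  } else {
    "negligible"
  }
}

label <- effect_label(r)
round(r_squared, 2)
label
```

An r of 0.92 sits well above the 0.50 cut-off, and the squared value indicates that the two purchase counts share roughly 84% of their variance.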
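Q1) Antivirus skewness: Fairly symmetrical (skew = 0.15)
Q2) Antivirus kurtosis: Close to a normal bell curve (kurtosis = -0.14)
Q3) Laptop skewness: Symmetrical (skew = -0.01)
Q4) Laptop kurtosis: Close to a normal bell curve, slightly flatter (kurtosis = -0.32)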
Was the data normally distributed for Antivirus Licenses? Yes (W = 0.99419, p = 0.8981)
Was the data normally distributed for Laptops Purchased? Yes (W = 0.99362, p = 0.8559)
Is the relationship positive, negative, or no relationship? Strongly positive (the regression line slopes upward and r = 0.92)
Q1) Direction of effect: Positive - as antivirus licenses increase, laptop purchases increase
Q2) Size of effect: Very strong (r = 0.92)
A Pearson correlation was conducted to examine the relationship between the number of antivirus licenses purchased (M = 50.18, SD = 13.36) and the number of laptops purchased (M = 40.02, SD = 12.30) by small businesses (n = 122). The correlation was positive, very strong, and statistically significant, r(120) = .92, 95% CI [.88, .94], p < .001. Businesses that purchased more antivirus licenses also tended to purchase more laptops, which supports the alternative hypothesis.