Research Scenario 2

To be more competitive in the market, a technology store wants to start selling its laptops and anti-virus software as a bundle to small businesses. A bundle is two products sold together at a lower price than if they were purchased separately. Before offering the bundle, the store wants to make sure the two products are commonly purchased together. The store has data from the past year showing how many laptops and how many anti-virus software licenses each small business bought from it. Analyze the data to determine whether there is a positive correlation between the number of laptops purchased and the number of anti-virus licenses purchased.

Hypotheses

Null Hypothesis (H0): There is no relationship between the number of laptops purchased and the number of anti-virus licenses purchased.

Alternative Hypothesis (H1): There is a positive relationship between the number of laptops purchased and the number of anti-virus licenses purchased.

Loading Required Packages

# Install packages if not already installed
# install.packages(c("readxl", "psych", "ggplot2", "ggpubr", "rmarkdown"))

# Load required packages
library(readxl)
library(psych)
library(ggplot2)
## 
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
## 
##     %+%, alpha
library(ggpubr)

Importing and Preparing Data

A5RQ2 <- read_excel("C:/Users/saisa/Downloads/A5RQ2.xlsx")

head(A5RQ2)
## # A tibble: 6 × 3
##   Business Antivirus Laptop
##      <dbl>     <dbl>  <dbl>
## 1        1        42     31
## 2        2        47     36
## 3        3        73     68
## 4        4        51     38
## 5        5        52     43
## 6        6        76     61
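
Before analysis, it can help to confirm that the imported columns look the way the later steps assume. The following is a minimal sketch of such a check, not part of the original workflow. `sample_rows` is a hypothetical stand-in built only from the six rows displayed by `head()` above, so the block runs without the Excel file; in practice the same checks would be applied to `A5RQ2`.

```r
# Hypothetical stand-in: the six rows displayed by head(A5RQ2) above.
# This is NOT the full data set of 122 businesses.
sample_rows <- data.frame(
  Business  = 1:6,
  Antivirus = c(42, 47, 73, 51, 52, 76),
  Laptop    = c(31, 36, 68, 38, 43, 61)
)

# Count missing values per column; all zeros means nothing to drop or impute.
missing_counts <- colSums(is.na(sample_rows))

# Purchase counts should be non-negative whole numbers.
counts <- sample_rows[, c("Antivirus", "Laptop")]
counts_valid <- all(counts >= 0 & counts == round(counts))

missing_counts
counts_valid
```

If either check fails on the real data, the affected rows would need to be inspected before running the correlation, since `cor.test()` drops incomplete pairs.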

Descriptive Statistics

# Calculate descriptive statistics
describe(A5RQ2[, c("Antivirus", "Laptop")])
##           vars   n  mean    sd median trimmed   mad min max range  skew
## Antivirus    1 122 50.18 13.36     49   49.92 12.60  15  83    68  0.15
## Laptop       2 122 40.02 12.30     39   39.93 11.86   8  68    60 -0.01
##           kurtosis   se
## Antivirus    -0.14 1.21
## Laptop       -0.32 1.11
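
As a quick consistency check on the table above, the `se` column is the standard error of the mean, which is the standard deviation divided by the square root of the sample size. The sketch below recomputes it from the rounded values printed by `describe()`; the variable names are illustrative only.

```r
# Values copied from the describe() output above (already rounded to 2 decimals).
n            <- 122
sd_antivirus <- 13.36
sd_laptop    <- 12.30

# Standard error of the mean: sd / sqrt(n)
se_antivirus <- sd_antivirus / sqrt(n)
se_laptop    <- sd_laptop / sqrt(n)

# Rounds to 1.21 and 1.11, matching the se column.
round(c(Antivirus = se_antivirus, Laptop = se_laptop), 2)
```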

Checking Data Normality

Histograms

# Create histogram for Antivirus Licenses
hist(A5RQ2$Antivirus,
     main = "Histogram of Antivirus Licenses Purchased",
     xlab = "Number of Antivirus Licenses",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)

# Create histogram for Laptops Purchased
hist(A5RQ2$Laptop,
     main = "Histogram of Laptops Purchased",
     xlab = "Number of Laptops",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)

Shapiro-Wilk Normality Tests

# Conduct Shapiro-Wilk tests
shapiro_antivirus <- shapiro.test(A5RQ2$Antivirus)
shapiro_laptop <- shapiro.test(A5RQ2$Laptop)

# Display results
shapiro_antivirus
## 
##  Shapiro-Wilk normality test
## 
## data:  A5RQ2$Antivirus
## W = 0.99419, p-value = 0.8981
shapiro_laptop
## 
##  Shapiro-Wilk normality test
## 
## data:  A5RQ2$Laptop
## W = 0.99362, p-value = 0.8559

Normality Test Results:

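  • Antivirus Licenses: W = 0.99419, p-value = 0.8981 → p > .05, so normality is not rejected

  • Laptops Purchased: W = 0.99362, p-value = 0.8559 → p > .05, so normality is not rejected

Decision: Neither Shapiro-Wilk test rejects normality, and the histograms are roughly symmetric, so both variables are treated as normally distributed and we will use Pearson Correlation.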
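
The decision rule above can be sketched in a few lines. This is an illustration under stated assumptions: the p-values are copied from the Shapiro-Wilk output above, the 0.05 cutoff is the conventional one, and falling back to Spearman's rank correlation when normality is rejected is a common choice that the original analysis does not state explicitly.

```r
# Conventional significance level (assumption; not stated in the original).
alpha <- 0.05

# p-values copied from the Shapiro-Wilk output above.
shapiro_p <- c(Antivirus = 0.8981, Laptop = 0.8559)

# TRUE for any variable whose normality is rejected at alpha.
normality_rejected <- shapiro_p < alpha

# Use Pearson only if neither variable departs from normality;
# otherwise fall back to Spearman (assumed fallback).
cor_method <- if (any(normality_rejected)) "spearman" else "pearson"
cor_method
```

With the reported p-values, `cor_method` evaluates to `"pearson"`, and the string can be passed directly as the `method` argument of `cor.test()`.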

Visualizing the Relationship

# Create scatterplot with Pearson correlation
ggscatter(A5RQ2, x = "Antivirus", y = "Laptop",
          add = "reg.line",
          conf.int = TRUE,
          cor.coef = TRUE,
          cor.method = "pearson",
          xlab = "Antivirus Licenses Purchased", 
          ylab = "Laptops Purchased",
          title = "Relationship between Antivirus Licenses and Laptop Purchases")

Scatterplot Observation: The relationship is strongly positive. The points cluster tightly around an upward-sloping regression line, so businesses that bought more antivirus licenses also tended to buy more laptops.

Pearson Correlation Test

# Conduct Pearson correlation test
pearson_result <- cor.test(A5RQ2$Antivirus, A5RQ2$Laptop, method = "pearson")
pearson_result
## 
##  Pearson's product-moment correlation
## 
## data:  A5RQ2$Antivirus and A5RQ2$Laptop
## t = 25.16, df = 120, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.8830253 0.9412249
## sample estimates:
##       cor 
## 0.9168679
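
The test above is two-sided ("true correlation is not equal to 0"), while the alternative hypothesis in this report is directional. A one-sided version can be requested with `alternative = "greater"`. The sketch below uses simulated data, not the store's data: `antivirus_sim` and `laptop_sim` are hypothetical vectors generated only so the block is runnable on its own.

```r
# Simulated stand-in data (hypothetical; NOT the A5RQ2 data set).
set.seed(42)
n <- 122
antivirus_sim <- rnorm(n, mean = 50, sd = 13)
laptop_sim    <- 0.85 * antivirus_sim + rnorm(n, mean = -2, sd = 5)

# Two-sided test, as in the original analysis.
two_sided <- cor.test(antivirus_sim, laptop_sim,
                      method = "pearson", alternative = "two.sided")

# One-sided test matching H1: the true correlation is greater than 0.
one_sided <- cor.test(antivirus_sim, laptop_sim,
                      method = "pearson", alternative = "greater")

# For a positive sample correlation, the one-sided p-value is half
# the two-sided p-value; the estimate of r itself is unchanged.
c(two_sided = two_sided$p.value, one_sided = one_sided$p.value)
one_sided$conf.int
```

With a correlation as strong as the one reported here, both versions lead to the same conclusion; the one-sided form simply lines the test up with the stated hypothesis.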

Results Summary

Statistical Results:

  • Pearson’s r: 0.917

  • p-value: < .001 (specifically < 2.2e-16)

  • Sample size: n = 122
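
The t statistic and 95% confidence interval in the `cor.test()` output can be reproduced from r and n alone, which is a useful check that the reported numbers agree with each other. The sketch below uses the standard t transformation and the Fisher z transformation; the inputs are copied from the output above, and the variable names are illustrative.

```r
# Inputs copied from the cor.test() output above.
r <- 0.9168679
n <- 122

# t statistic for testing rho = 0, with n - 2 degrees of freedom.
t_stat <- r * sqrt((n - 2) / (1 - r^2))

# 95% confidence interval via the Fisher z transformation.
z    <- atanh(r)            # transform r to the z scale
se_z <- 1 / sqrt(n - 3)     # standard error on the z scale
ci   <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)  # back-transform to r

round(t_stat, 2)
round(ci, 3)
```

The results agree with the reported t = 25.16 and the interval 0.883 to 0.941, up to rounding.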

Effect Size Interpretation:

  • Direction: Positive correlation

  • Strength: Very strong (r = 0.92, well above the 0.50 threshold for a “strong” relationship)
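
One further way to express the size of the effect is the coefficient of determination, r squared, which gives the proportion of variance the two variables share. This is an added illustration using the reported r, not a figure from the original output.

```r
# Pearson's r copied from the cor.test() output above.
r <- 0.9168679

# Proportion of variance shared by the two variables.
shared_variance <- r^2

# Rounds to 0.84, i.e. about 84% shared variance.
round(shared_variance, 2)
```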

Answers to Required Questions

Histogram Assessment

Q1) Antivirus skewness: Fairly symmetrical (skew = 0.15)

Q2) Antivirus kurtosis: Proper bell curve (kurtosis = -0.14)

Q3) Laptop skewness: Symmetrical (skew = -0.01)

Q4) Laptop kurtosis: Proper bell curve (kurtosis = -0.32)

Normality Questions

Was the data normally distributed for Antivirus Licenses? Yes (W = 0.99419, p = 0.8981)

Was the data normally distributed for Laptops Purchased? Yes (W = 0.99362, p = 0.8559)

Scatterplot Question

Is the relationship positive, negative, or no relationship? Strongly positive (the points cluster tightly around an upward-sloping regression line)

Effect Size Questions

Q1) Direction of effect: Positive - as antivirus licenses increase, laptop purchases increase

Q2) Size of effect: Very strong (r = 0.92)

Final Written Report

A Pearson correlation was conducted to examine the relationship between the number of antivirus licenses purchased (M = 50.18, SD = 13.36) and the number of laptops purchased (M = 40.02, SD = 12.30) by small businesses (n = 122). The correlation was positive, very strong, and statistically significant, r(120) = .92, 95% CI [.88, .94], p < .001, so the null hypothesis of no relationship is rejected. Businesses that purchased more antivirus licenses also tended to purchase more laptops, which is consistent with the two products being commonly purchased together.