RESEARCH SCENARIO 2
In order to be more competitive in the market, a technology store wants to start selling their laptops and anti-virus software as a bundle to small businesses. A bundle is when two products are sold together at a lower price than if they were purchased separately. Before offering the bundle, the store wants to make sure the two products are commonly purchased together. The store has data from the past year showing how many laptops and how many anti-virus software licenses each small business bought from them. Analyze the data to determine if there is a positive correlation between the number of laptops purchased and the number of anti-virus licenses purchased.
Hypotheses:
H0: There is no relationship between the number of laptops purchased and the number of anti-virus licenses purchased.
H1: There is a positive relationship between the number of laptops purchased and the number of anti-virus licenses purchased.
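Because H1 is directional, it is worth noting that R's cor.test() tests a two-sided alternative unless told otherwise. The sketch below shows how a one-sided test could be requested with alternative = "greater". The vectors antivirus and laptop are simulated stand-ins, not the store's data, so the numbers it prints are illustrative only.

```r
# Hypothetical stand-in data (NOT the store's data), generated only to
# illustrate the one-sided option of cor.test().
set.seed(42)
antivirus <- rnorm(122, mean = 50, sd = 13)
laptop <- 0.8 * antivirus + rnorm(122, mean = 0, sd = 5)

# Default: two-sided alternative (true correlation is not equal to 0).
two_sided <- cor.test(antivirus, laptop, method = "pearson")

# Directional alternative matching H1 (true correlation is greater than 0).
one_sided <- cor.test(antivirus, laptop, method = "pearson",
                      alternative = "greater")

# When the sample correlation is positive, the one-sided p-value is half
# of the two-sided p-value; the estimate of r itself is unchanged.
one_sided$p.value
two_sided$p.value / 2
```

With a correlation as strong as the one in this scenario, both versions lead to the same decision; the one-sided form simply matches the wording of H1 more closely.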
library(readxl)
dataset2 <- read_excel("C:/Users/kodal/Desktop/ASSIGNMENT 5/A5RQ2.xlsx")
library(psych)
describe(dataset2[, c("Antivirus", "Laptop")])
## vars n mean sd median trimmed mad min max range skew
## Antivirus 1 122 50.18 13.36 49 49.92 12.60 15 83 68 0.15
## Laptop 2 122 40.02 12.30 39 39.93 11.86 8 68 60 -0.01
## kurtosis se
## Antivirus -0.14 1.21
## Laptop -0.32 1.11
hist(dataset2$Antivirus,
     main = "Histogram of Antivirus",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)
hist(dataset2$Laptop,
     main = "Histogram of Laptop",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)
shapiro.test(dataset2$Antivirus)
##
## Shapiro-Wilk normality test
##
## data: dataset2$Antivirus
## W = 0.99419, p-value = 0.8981
shapiro.test(dataset2$Laptop)
##
## Shapiro-Wilk normality test
##
## data: dataset2$Laptop
## W = 0.99362, p-value = 0.8559
library(ggplot2)
##
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
##
## %+%, alpha
library(ggpubr)
ggscatter(dataset2, x = "Antivirus", y = "Laptop",
          add = "reg.line",
          conf.int = TRUE,
          cor.coef = TRUE,
          cor.method = "pearson",
          xlab = "Variable Antivirus", ylab = "Variable Laptop")
cor.test(dataset2$Antivirus, dataset2$Laptop, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: dataset2$Antivirus and dataset2$Laptop
## t = 25.16, df = 120, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8830253 0.9412249
## sample estimates:
## cor
## 0.9168679
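As a sanity check, the t statistic and confidence interval in this output can be reproduced from r and n alone. The sketch below assumes the standard formulas: t = r * sqrt(n - 2) / sqrt(1 - r^2) for the test statistic, and the Fisher z-transformation for the interval. The values of r and n are copied from the output above.

```r
# Values copied from the cor.test() output above.
r <- 0.9168679
n <- 122

# t statistic for H0: true correlation = 0, with n - 2 degrees of freedom.
df <- n - 2
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# 95% confidence interval via the Fisher z-transformation:
# transform r, build a normal-theory interval, then transform back.
z <- atanh(r)
se_z <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

round(t_stat, 2)  # 25.16
round(ci, 3)      # 0.883 0.941
```

Both results agree with the reported t = 25.16 on 120 degrees of freedom and the interval of 0.883 to 0.941.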
Review Your Output
Name of test: Pearson Correlation
Variables: Antivirus and Laptop
Total sample size (n): 122
Statistically significant? Yes
Mean and SD: Antivirus: M = 50.18, SD = 13.36; Laptop: M = 40.02, SD = 12.30
Direction and size: Positive and Strong
Degrees of freedom (df): 120
r-value: 0.92
EXACT p-value: p < .001 (R reports p < 2.2e-16)
Final Report
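A Pearson correlation was conducted to examine the relationship between the number of anti-virus licenses purchased and the number of laptops purchased (n = 122). Shapiro-Wilk tests gave no evidence against normality for either variable (anti-virus: W = 0.994, p = .898; laptops: W = 0.994, p = .856), which supports the use of Pearson's r. There was a statistically significant correlation between anti-virus licenses (M = 50.18, SD = 13.36) and laptops (M = 40.02, SD = 12.30). The correlation was positive and strong, r(120) = .92, p < .001, 95% CI [.88, .94], so the null hypothesis of no relationship is rejected. Small businesses that bought more laptops also tended to buy more anti-virus licenses, which is consistent with the store's plan to sell the two products as a bundle.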