Research Question

In order to be more competitive in the market, a technology store wants to start selling their laptops and anti-virus software as a bundle to small businesses. A bundle is when two products are sold together at a lower price than if they were purchased separately. Before offering the bundle, the store wants to make sure the two products are commonly purchased together. The store has data from the past year showing how many laptops and how many anti-virus software licenses each small business bought from them. Analyze the data to determine if there is a positive correlation between the number of laptops purchased and the number of anti-virus licenses purchased.


Hypothesis

H0: There is no correlation between the number of laptops purchased and the number of anti-virus licenses purchased.

H1: There is a positive correlation between the number of laptops purchased and the number of anti-virus licenses purchased.
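Because H1 is directional, it corresponds to a one-sided test. The sketch below shows how that would be requested through the `alternative` argument of `cor.test`. It runs on simulated data: the vectors `laptops` and `antivirus` are hypothetical stand-ins, not the store's records.

```r
# Simulated, hypothetical data; not the store's records
set.seed(42)
laptops   <- round(rnorm(50, mean = 40, sd = 12))
antivirus <- round(laptops * 1.1 + rnorm(50, mean = 6, sd = 5))

# Two-sided test: H1 is "the correlation is not equal to 0"
two_sided <- cor.test(antivirus, laptops, method = "pearson")

# One-sided test: H1 is "the correlation is greater than 0"
one_sided <- cor.test(antivirus, laptops, method = "pearson",
                      alternative = "greater")

two_sided$p.value
one_sided$p.value
```

When the sample correlation is positive, the one-sided p-value is half the two-sided one, so a result that is significant two-sided is also significant in the hypothesized direction.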

Load the packages and data

# Load required packages
library(readxl)
library(psych)
# Import the Excel file
A5RQ2 <- read_excel("C:/Users/sravz/Downloads/A5RQ2.xlsx")

Descriptive statistics

# calculate descriptive statistics
describe(A5RQ2[, c("Antivirus", "Laptop")])
##           vars   n  mean    sd median trimmed   mad min max range  skew
## Antivirus    1 122 50.18 13.36     49   49.92 12.60  15  83    68  0.15
## Laptop       2 122 40.02 12.30     39   39.93 11.86   8  68    60 -0.01
##           kurtosis   se
## Antivirus    -0.14 1.21
## Laptop       -0.32 1.11

Histograms

hist(A5RQ2$Antivirus,
     main = "Histogram of Antivirus",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)

hist(A5RQ2$Laptop,
     main = "Histogram of Laptop",
     xlab = "Value",
     ylab = "Frequency",
     col = "orange",
     border = "black",
     breaks = 20)

QUESTIONS

Q1) Check the SKEWNESS of the VARIABLE 1 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?

The histogram is roughly symmetrical with a slight positive skew.

Q2) Check the KURTOSIS of the VARIABLE 1 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?

The histogram shows a normal-looking, properly shaped bell curve.

Q3) Check the SKEWNESS of the VARIABLE 2 histogram. In your opinion, does the histogram look symmetrical, positively skewed, or negatively skewed?

The histogram appears symmetrical; the skew value of -0.01 is essentially zero.

Q4) Check the KURTOSIS of the VARIABLE 2 histogram. In your opinion, does the histogram look too flat, too tall, or does it have a proper bell curve?

The histogram shows a normal-looking, properly shaped bell curve.
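The visual answers can be cross-checked against the skew and kurtosis values in the describe() output. The sketch below divides each value by an approximate standard error. The formulas sqrt(6/n) and sqrt(24/n) are large-sample approximations, and the 1.96 cutoff is a common rule of thumb; both are assumptions added here, not part of the original analysis.

```r
# Values copied from the describe() output above
n <- 122
shape <- data.frame(
  variable = c("Antivirus", "Laptop"),
  skew     = c(0.15, -0.01),
  kurtosis = c(-0.14, -0.32)
)

# Large-sample approximations of the standard errors (an assumption)
se_skew <- sqrt(6 / n)
se_kurt <- sqrt(24 / n)

# Ratio of each statistic to its approximate standard error
shape$z_skew <- shape$skew / se_skew
shape$z_kurt <- shape$kurtosis / se_kurt

# |z| below about 1.96 is commonly read as no marked departure from a normal shape
shape$within_bounds <- abs(shape$z_skew) < 1.96 & abs(shape$z_kurt) < 1.96
print(shape)
```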

# Shapiro-Wilk normality tests
shapiro.test(A5RQ2$Antivirus)
## 
##  Shapiro-Wilk normality test
## 
## data:  A5RQ2$Antivirus
## W = 0.99419, p-value = 0.8981
shapiro.test(A5RQ2$Laptop)
## 
##  Shapiro-Wilk normality test
## 
## data:  A5RQ2$Laptop
## W = 0.99362, p-value = 0.8559

QUESTIONS

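Was the data normally distributed for Variable 1? Yes. The Shapiro-Wilk test was not significant (W = 0.994, p = .898), so there is no evidence that Antivirus departs from a normal distribution.

Was the data normally distributed for Variable 2? Yes. The Shapiro-Wilk test was not significant (W = 0.994, p = .856), so there is no evidence that Laptop departs from a normal distribution.

Since both variables are normally distributed, we will use the Pearson correlation.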
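The decision rule used here (Pearson when both Shapiro-Wilk tests are non-significant, a rank-based alternative otherwise) can be written as a small helper. This is a minimal sketch: `choose_cor_method` is a hypothetical function, the fallback to Spearman is an assumed convention, and the demonstration vectors are synthetic quantiles rather than the store's data.

```r
# Hypothetical helper: pick a correlation method from two Shapiro-Wilk tests
choose_cor_method <- function(x, y, alpha = 0.05) {
  p_x <- shapiro.test(x)$p.value
  p_y <- shapiro.test(y)$p.value
  # Pearson only if neither variable shows a significant departure from normality
  if (p_x > alpha && p_y > alpha) "pearson" else "spearman"
}

# Synthetic demonstration data: evenly spaced normal quantiles,
# and a strongly right-skewed (lognormal-shaped) version of them
normal_like <- qnorm(ppoints(100))
skewed      <- exp(normal_like)

choose_cor_method(normal_like, normal_like)  # both look normal
choose_cor_method(normal_like, skewed)       # one variable is skewed

# The p-values reported above both exceed 0.05
reported_p <- c(Antivirus = 0.8981, Laptop = 0.8559)
all(reported_p > 0.05)
```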

# Load required packages
library(ggplot2)
## 
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
## 
##     %+%, alpha
library(ggpubr)

SCATTERPLOT

# create scatterplot with pearson correlation
ggscatter(A5RQ2, x = "Antivirus", y = "Laptop",
          add = "reg.line",
          conf.int = TRUE,
          cor.coef = TRUE,
          cor.method = "pearson",
          xlab = "Variable Antivirus", ylab = "Variable Laptop",
          title = "Relationship between Antivirus Licenses and Laptop purchases")

Is the relationship positive (line pointing up), negative (line pointing down), or is there no relationship (line is flat)?

The relationship is positive (line pointing up).

# Conduct pearson correlation test
cor.test(A5RQ2$Antivirus, A5RQ2$Laptop, method = "pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  A5RQ2$Antivirus and A5RQ2$Laptop
## t = 25.16, df = 120, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.8830253 0.9412249
## sample estimates:
##       cor 
## 0.9168679
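As a check on how the numbers in this output relate to each other, the t statistic and the confidence interval can be recomputed from the correlation and the sample size alone. The sketch below uses the standard formula t = r * sqrt(df) / sqrt(1 - r^2) and a Fisher z interval; the variable names are illustrative.

```r
r  <- 0.9168679  # sample correlation from the output above
n  <- 122        # number of businesses
df <- n - 2      # degrees of freedom for the Pearson test

# t statistic for H0: true correlation equals 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# 95% confidence interval via Fisher's z transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

round(t_stat, 2)
round(ci, 4)
```

The rounded results should agree with the t value and the 95 percent confidence interval printed by cor.test.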

QUESTIONS

Q1) What is the direction of the effect?

A correlation of 0.92 is positive. As the number of anti-virus licenses purchased increases, the number of laptops purchased also increases.

Q2) What is the size of the effect?

A correlation of 0.92 indicates a strong relationship.
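One way to make the "strong" label concrete is to compare |r| against Cohen's conventional benchmarks (0.1 small, 0.3 medium, 0.5 large) and to square r to get the share of variance the two counts have in common. In the sketch below, `effect_label` is a hypothetical helper and the "negligible" label for values under 0.1 is an added convenience, not part of Cohen's scheme.

```r
r <- 0.92  # rounded correlation from the test above

# Hypothetical helper applying Cohen's conventional benchmarks to |r|
effect_label <- function(r) {
  a <- abs(r)
  if (a >= 0.5) {
    "large"
  } else if (a >= 0.3) {
    "medium"
  } else if (a >= 0.1) {
    "small"
  } else {
    "negligible"
  }
}

effect_label(r)

# Proportion of variance shared by the two variables
r_squared <- r^2
round(r_squared, 2)
```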

Final Report

A Pearson correlation was conducted to examine the relationship between the number of anti-virus licenses purchased and the number of laptops purchased (n = 122). There was a statistically significant correlation between the number of anti-virus licenses purchased (M = 50.18, SD = 13.36) and the number of laptops purchased (M = 40.02, SD = 12.30). The correlation was positive and strong, r(120) = .92, p < .001. As the number of laptops purchased increases, the number of anti-virus licenses purchased also tends to increase.