An IT manager wants to know whether businesses that purchase more antivirus licenses also tend to purchase more new laptops. Data were collected on the number of antivirus licenses purchased (Antivirus) and the number of new laptops purchased (Laptop).
Null Hypothesis (H₀): There is no relationship between the number of antivirus licenses and the number of laptops purchased.
Alternative Hypothesis (H₁): There is a relationship between the number of antivirus licenses and the number of laptops purchased.
A Pearson correlation was conducted to examine the relationship between Antivirus and Laptop (n = 122). Antivirus (M = 50.18, SD = 13.36) and Laptop (M = 40.02, SD = 12.30) were strongly and positively correlated, r(120) = .92, 95% CI [.88, .94], p < .001, so the null hypothesis was rejected. Businesses that purchased more antivirus licenses also tended to purchase more laptops.
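The degrees of freedom and the significance test behind this result can be traced by hand. As a worked check (using the unrounded r = 0.9168679 from the cor.test output further down), the t statistic for a Pearson correlation is:

```latex
\[
df = n - 2 = 122 - 2 = 120,
\qquad
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
  = \frac{0.9168679\,\sqrt{120}}{\sqrt{1-0.9168679^{2}}}
  \approx 25.16
\]
```

This agrees with the t value that R reports for the same test.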
# Load required packages
library(readxl)
library(psych)
# Load dataset
dataset <- read_excel("C:\\Users\\rohit\\Downloads\\A5RQ2.xlsx")
# Display descriptive statistics for both variables
describe(dataset[, c("Antivirus", "Laptop")])
##           vars   n  mean    sd median trimmed   mad min max range  skew
## Antivirus    1 122 50.18 13.36     49   49.92 12.60  15  83    68  0.15
## Laptop       2 122 40.02 12.30     39   39.93 11.86   8  68    60 -0.01
##           kurtosis   se
## Antivirus    -0.14 1.21
## Laptop       -0.32 1.11
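As a quick sanity check on the describe() output, the se column should equal sd / sqrt(n). The sketch below is only illustrative: the values are typed in by hand from the table above rather than read from the dataset, and the object names `sds` and `se` are placeholders.

```r
# Values copied by hand from the describe() output above
n <- 122
sds <- c(Antivirus = 13.36, Laptop = 12.30)

# Standard error of the mean: sd / sqrt(n)
se <- sds / sqrt(n)
round(se, 2)
```

The rounded results should match the se column (1.21 and 1.11).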
# Histograms for checking normality
hist(dataset$Antivirus,
     main = "Histogram of Antivirus Licenses",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)
hist(dataset$Laptop,
     main = "Histogram of Laptops Purchased",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)
## Review of Histograms
# Q1) VARIABLE 1 SKEWNESS:
# The histogram for Antivirus appears approximately symmetrical (skew = 0.15).
# Q2) VARIABLE 1 KURTOSIS:
# The histogram for Antivirus has a roughly bell-shaped peak, neither too flat nor too tall (kurtosis = -0.14).
# Q3) VARIABLE 2 SKEWNESS:
# The histogram for Laptop appears approximately symmetrical (skew = -0.01).
# Q4) VARIABLE 2 KURTOSIS:
# The histogram for Laptop has a roughly bell-shaped peak, neither too flat nor too tall (kurtosis = -0.32).
## Shapiro-Wilk tests for normality
shapiro.test(dataset$Antivirus)
##
## Shapiro-Wilk normality test
##
## data: dataset$Antivirus
## W = 0.99419, p-value = 0.8981
shapiro.test(dataset$Laptop)
##
## Shapiro-Wilk normality test
##
## data: dataset$Laptop
## W = 0.99362, p-value = 0.8559
## Analysis of data
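# Q1) Was the data normally distributed for Variable 1?
# Yes. The Shapiro-Wilk test for Antivirus was not significant (W = 0.99, p = .898), so normality was not rejected.
# Q2) Was the data normally distributed for Variable 2?
# Yes. The Shapiro-Wilk test for Laptop was not significant (W = 0.99, p = .856), so normality was not rejected.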
library(ggplot2)
##
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
##
## %+%, alpha
library(ggpubr)
# Scatterplot to visually show the relationship between two continuous variables.
ggscatter(dataset, x = "Antivirus", y = "Laptop",
          add = "reg.line",
          conf.int = TRUE,
          cor.coef = TRUE,
          cor.method = "pearson",
          xlab = "Antivirus", ylab = "Laptop")
# The relationship is positive because the scatterplot shows an upward trend (positive slope).
# Decision: Both Shapiro-Wilk p-values were > .05, so we use Pearson.
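The decision rule can be written out explicitly. This is a minimal sketch, not part of the original analysis: the p-values are copied by hand from the Shapiro-Wilk output above, `cor_method` is a placeholder name, and the Spearman fallback is an assumption about what would be used if normality were rejected.

```r
# Shapiro-Wilk p-values copied by hand from the output above
p_values <- c(Antivirus = 0.8981, Laptop = 0.8559)
alpha <- 0.05

# Use Pearson only if neither test rejects normality.
# The "spearman" fallback is an assumption, not stated in the report.
cor_method <- if (all(p_values > alpha)) "pearson" else "spearman"
cor_method
```

The resulting string could then be passed as the method argument of cor.test().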
cor.test(dataset$Antivirus, dataset$Laptop, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: dataset$Antivirus and dataset$Laptop
## t = 25.16, df = 120, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8830253 0.9412249
## sample estimates:
## cor
## 0.9168679
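For the Pearson method, cor.test() bases its confidence interval on Fisher's z transform. The sketch below reproduces the interval by hand as an illustration. The values of r and n are copied from the output above, and the object names are placeholders.

```r
# Values copied by hand from the cor.test() output above
r <- 0.9168679
n <- 122

z <- atanh(r)             # Fisher z transform of r
se_z <- 1 / sqrt(n - 3)   # approximate standard error of z
crit <- qnorm(0.975)      # two-sided 95% critical value

# Build the interval on the z scale, then back-transform to the r scale
ci <- tanh(z + c(-1, 1) * crit * se_z)
round(ci, 4)
```

The result should agree with the reported 95 percent confidence interval to about four decimal places.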
## Report
# Q1) What is the direction of the effect?
# The correlation between Antivirus and Laptop is positive (r = 0.92): as Antivirus values increase,
# Laptop values tend to increase as well.
# Q2) What is the size of the effect?
# The effect is strong. The correlation (r = 0.92) has an absolute value between 0.50 and 1.00,
# which indicates a strong relationship between the two variables.
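The size judgment above can be expressed as a small lookup. This is a hypothetical helper, not something the report uses. Only the 0.50 cutoff for "strong" comes from the text; the 0.30 and 0.10 cutoffs are an assumption based on Cohen's conventional benchmarks.

```r
# Hypothetical helper: label the size of a correlation by its absolute value.
# Only the 0.50 cutoff is taken from the report; 0.30 and 0.10 are assumed
# from Cohen's conventional benchmarks.
effect_label <- function(r) {
  a <- abs(r)
  if (a >= 0.50) {
    "strong"
  } else if (a >= 0.30) {
    "moderate"
  } else if (a >= 0.10) {
    "weak"
  } else {
    "negligible"
  }
}

effect_label(0.9168679)
```

Because the label depends only on the absolute value, a correlation of -0.92 would be labeled the same way. Direction is reported separately.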