# H1: There is a relationship between the number of laptops sold and
# the number of antivirus software units sold
# ===================================================
# PEARSON CORRELATION & SPEARMAN CORRELATION OVERVIEW
# ===================================================
# PURPOSE
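# Used to test the strength and direction of the relationship between two continuous variables.
# Pearson assumes both variables are approximately normally distributed;
# Spearman is the rank-based alternative when they are not.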
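# A minimal sketch of what the two coefficients measure, using made-up toy
# vectors (x and y are hypothetical, not the A5RQ2 data). Pearson's r is the
# covariance scaled by both standard deviations; Spearman's rho is Pearson's r
# computed on the ranks.

```r
# Hypothetical toy data, not the A5RQ2 dataset
x <- c(10, 20, 30, 40, 50, 60)
y <- c(12, 25, 29, 47, 52, 70)

# Pearson r: covariance scaled by both standard deviations
r_manual  <- cov(x, y) / (sd(x) * sd(y))
r_builtin <- cor(x, y, method = "pearson")

# Spearman rho: Pearson r computed on the ranks of the data
rho_manual  <- cor(rank(x), rank(y), method = "pearson")
rho_builtin <- cor(x, y, method = "spearman")

all.equal(r_manual, r_builtin)
all.equal(rho_manual, rho_builtin)
```

# Because y rises whenever x rises in this toy example, rho is exactly 1 even
# though the points do not fall on a perfectly straight line.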
# ==========
# HYPOTHESES
# ==========
# H0: There is no relationship between the number of laptops sold and the number of antivirus software units sold
# H1: There is a relationship between the number of laptops sold and the number of antivirus software units sold
# .................................................................
# ======================
# IMPORT EXCEL FILE CODE
# ======================
library(readxl)
dataset <- read_excel("D:/20251021 AA 5221 Applied Analytics & Methods 1/Week 5/A5RQ2.xlsx")
# ======================
# DESCRIPTIVE STATISTICS
# ======================
# Calculate the mean, median, SD, and sample size for each variable.
library(psych)
describe(dataset[, c("Antivirus", "Laptop")])
##           vars   n  mean    sd median trimmed   mad min max range  skew
## Antivirus    1 122 50.18 13.36     49   49.92 12.60  15  83    68  0.15
## Laptop       2 122 40.02 12.30     39   39.93 11.86   8  68    60 -0.01
##           kurtosis   se
## Antivirus    -0.14 1.21
## Laptop       -0.32 1.11
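# If psych is not installed, the same four statistics can be computed in base R.
# A minimal sketch: `toy` is a hypothetical stand-in for `dataset` with made-up
# values, so the numbers will not match the output above.

```r
# Hypothetical stand-in for `dataset`; the values are made up for illustration
toy <- data.frame(
  Antivirus = c(42, 55, 61, 38, 50),
  Laptop    = c(35, 44, 52, 30, 41)
)

# One row per variable: sample size, mean, median, and standard deviation
summary_table <- data.frame(
  n      = sapply(toy, function(v) sum(!is.na(v))),
  mean   = sapply(toy, mean, na.rm = TRUE),
  median = sapply(toy, median, na.rm = TRUE),
  sd     = sapply(toy, sd, na.rm = TRUE)
)
print(round(summary_table, 2))
```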
# ===============================================
# CHECK THE NORMALITY OF THE CONTINUOUS VARIABLES
# ===============================================
# CREATE A HISTOGRAM FOR EACH CONTINUOUS VARIABLE
hist(dataset$Antivirus,
     main = "Histogram of Antivirus software sold",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)

hist(dataset$Laptop,
     main = "Histogram of sold Laptop",
     xlab = "Value",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)

# ........................................................
# COMMENT:
# Var 1 (Antivirus) histogram looks roughly symmetrical and bell-shaped, with a very slight positive skew (skew = 0.15)
# Var 2 (Laptop) histogram looks roughly symmetrical and bell-shaped, with almost no skew (skew = -0.01)
# ........................................................
# CONDUCT THE SHAPIRO-WILK TEST
shapiro.test(dataset$Antivirus)
##
## Shapiro-Wilk normality test
##
## data: dataset$Antivirus
## W = 0.99419, p-value = 0.8981
shapiro.test(dataset$Laptop)
##
## Shapiro-Wilk normality test
##
## data: dataset$Laptop
## W = 0.99362, p-value = 0.8559
# .........................................................
# COMMENT:
# Both variable 1 (Number of sold Antivirus software) and variable 2 (Number of sold Laptop) are NORMALLY DISTRIBUTED:
# the Shapiro-Wilk test is not significant for either variable.
# Var 1 Number of sold Antivirus software: M = 50.18, SD = 13.36, W = 0.994, p = 0.898 > .05
# Var 2 Number of sold Laptop: M = 40.02, SD = 12.30, W = 0.994, p = 0.856 > .05
# .........................................................
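# Both p-values exceed .05, so continue with the Pearson correlation test
# (the Spearman correlation would be used if either variable were not normal).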
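# The decision rule above can be sketched as a small helper. choose_method() is
# a hypothetical function written for this illustration, and x and y are
# simulated values rather than the A5RQ2 data.

```r
set.seed(42)  # reproducible simulated data (hypothetical, not A5RQ2)
x <- rnorm(100, mean = 50, sd = 13)
y <- 0.8 * x + rnorm(100, mean = 0, sd = 5)

# choose_method() is a hypothetical helper, not part of any package.
# It returns "pearson" only when both Shapiro-Wilk p-values exceed alpha.
choose_method <- function(a, b, alpha = 0.05) {
  p_a <- shapiro.test(a)$p.value
  p_b <- shapiro.test(b)$p.value
  if (p_a > alpha && p_b > alpha) "pearson" else "spearman"
}

method <- choose_method(x, y)
result <- cor.test(x, y, method = method)
print(method)
print(result$estimate)
```

# The histograms are still worth checking by eye: with large samples the
# Shapiro-Wilk test can flag departures from normality that are too small to matter.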
# =========================
# VISUALLY DISPLAY THE DATA
# =========================
library(ggplot2)
##
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
##
## %+%, alpha
library(ggpubr)
ggscatter(dataset, x = "Antivirus", y = "Laptop",
          add = "reg.line",
          conf.int = TRUE,
          cor.coef = TRUE,
          cor.method = "pearson",
          xlab = "Number of sold Antivirus software", ylab = "Number of sold Laptop")

# ........................................................
# The scatterplot shows a positive relationship (the regression line slopes upward)
# ........................................................
# ================================================
# PEARSON CORRELATION OR SPEARMAN CORRELATION TEST
# ================================================
cor.test(dataset$Antivirus, dataset$Laptop, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: dataset$Antivirus and dataset$Laptop
## t = 25.16, df = 120, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8830253 0.9412249
## sample estimates:
## cor
## 0.9168679
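# As a check on the output above, the t statistic and the confidence interval
# can be recomputed from r and n alone. This sketch uses the standard t formula
# for a correlation and the Fisher z-transformation for the interval; the
# variable names are chosen here for illustration.

```r
r <- 0.9168679  # sample correlation reported by cor.test()
n <- 122        # sample size reported by describe()

# t statistic for H0: true correlation = 0, with n - 2 degrees of freedom
df <- n - 2
t_stat <- r * sqrt(df) / sqrt(1 - r^2)
p_value <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

# 95% confidence interval via the Fisher z-transformation
z <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

round(t_stat, 2)
round(ci, 4)
p_value < 2.2e-16
```

# The results should agree with the t = 25.16, df = 120, and the interval from
# 0.883 to 0.941 printed by cor.test().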
# DETERMINE STATISTICAL SIGNIFICANCE
# If results were statistically significant (p < .05), continue to the effect size section below.
# If results were NOT statistically significant (p >= .05), skip to the reporting section below.
# ==============================================
# EFFECT SIZE FOR PEARSON & SPEARMAN CORRELATION
# ==============================================
# ........................................................
# 1) WRITE THE REPORT
# Q1) Direction: the effect is positive (cor > 0)
# Q2) Strength: the effect is strong, cor = 0.917
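# The two questions above can be answered in code. describe_effect() is a
# hypothetical helper, and its cutoffs follow Cohen's conventions (|r| of .10
# small, .30 medium, .50 large). That is an assumption: the course materials
# may use different cutoffs.

```r
# describe_effect() is a hypothetical helper; the cutoffs are an assumption
# based on Cohen's conventions and may differ from the course's own table
describe_effect <- function(r) {
  direction <- if (r > 0) "positive" else if (r < 0) "negative" else "none"
  size <- abs(r)
  strength <- if (size >= 0.50) {
    "strong"
  } else if (size >= 0.30) {
    "moderate"
  } else if (size >= 0.10) {
    "weak"
  } else {
    "negligible"
  }
  c(direction = direction, strength = strength)
}

describe_effect(0.917)
```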
# ========================================================
# >> WRITTEN REPORT FOR PEARSON CORRELATION <<
# ========================================================
# 1) OUTPUT
#    1) Test: Pearson correlation
#    2) Variables: Number of sold Antivirus software and Number of sold Laptop
#    3) n = 122
#    4) The inferential result is statistically significant, p < .001
#    5) Var 1 Number of sold Antivirus: M = 50.18, SD = 13.36, Shapiro-Wilk p = 0.898 > .05
#       Var 2 Number of sold Laptop: M = 40.02, SD = 12.30, Shapiro-Wilk p = 0.856 > .05
#    6) The correlation is positive and strong, cor = 0.917
#    7) df = 120
#    8) cor = 0.917
#    9) p < .001
# ........................................................
# 2) WRITE YOUR FINAL REPORT
# A Pearson correlation was conducted to examine the relationship between
# Number of sold Antivirus software and Number of sold Laptop (n = 122).
# There was a statistically significant correlation between
# Number of sold Antivirus software (M = 50.18, SD = 13.36) and Number of sold Laptop (M = 40.02, SD = 12.30).
# The correlation was positive and strong, r(120) = .92, p < .001.
# As Number of sold Laptop increases, Number of sold Antivirus software tends to increase as well.
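# The statistics line of the report can also be built from the test output so
# that it never drifts from the numbers. A minimal sketch: the values are copied
# from the cor.test() output, p_upper is the bound R printed rather than the
# exact p-value, and the variable names are chosen for illustration.

```r
# Values copied from the cor.test() output
r <- 0.9168679
df <- 120L
p_upper <- 2.2e-16  # R printed "p-value < 2.2e-16", so this is only an upper bound

# Drop the leading zero, a common convention for statistics bounded by 1
fmt_r <- sub("^(-?)0\\.", "\\1.", sprintf("%.2f", r))
fmt_p <- if (p_upper < 0.001) {
  "p < .001"
} else {
  sub("0\\.", ".", sprintf("p = %.3f", p_upper))
}

report <- sprintf("r(%d) = %s, %s", df, fmt_r, fmt_p)
print(report)
```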