RESEARCH SCENARIO 2 - ASSIGNMENT 5

Assess the correlation between the number of antivirus software units sold and the number of laptops sold.

HYPOTHESES:

H0: There is no relationship between the number of laptops sold and the number of antivirus software units sold.

H1: There is a relationship between the number of laptops sold and the number of antivirus software units sold.

# ===================================================
# PEARSON CORRELATION & SPEARMAN CORRELATION OVERVIEW
# ===================================================

# PURPOSE
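# Both tests assess the strength and direction of the relationship between two continuous variables.
# Pearson is used when both variables are normally distributed; Spearman is the rank-based alternative.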
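
As a minimal sketch of what the two coefficients measure, the snippet below computes Pearson's r by hand as the covariance divided by the product of the standard deviations, and Spearman's rho as Pearson's r applied to the ranks. The vectors `x` and `y` are hypothetical toy values, not the A5RQ2 data.

```r
# Hypothetical toy data (NOT the A5RQ2 dataset), used only to illustrate the formulas
x <- c(2, 4, 5, 7, 9)
y <- c(1, 3, 4, 8, 9)

# Pearson's r: covariance scaled by both standard deviations
r_manual  <- cov(x, y) / (sd(x) * sd(y))
r_builtin <- cor(x, y, method = "pearson")

# Spearman's rho: Pearson's r computed on the ranks of the data
rho_manual  <- cor(rank(x), rank(y), method = "pearson")
rho_builtin <- cor(x, y, method = "spearman")

print(c(r_manual = r_manual, r_builtin = r_builtin))
print(c(rho_manual = rho_manual, rho_builtin = rho_builtin))
```

The manual and built-in values should agree, which is a quick way to confirm what `cor()` is doing under each `method`.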

# ==========
# HYPOTHESES
# ==========

# H0: There is no relationship between the number of laptops sold and the number of antivirus software units sold
# H1: There is a relationship between the number of laptops sold and the number of antivirus software units sold
# .................................................................

# ======================
# IMPORT EXCEL FILE CODE
# ======================

library(readxl)

dataset <- read_excel("D:/20251021 AA 5221 Applied Analytics & Methods 1/Week 5/A5RQ2.xlsx")

# ======================
# DESCRIPTIVE STATISTICS
# ======================

# Calculate the mean, median, SD, and sample size for each variable.

library(psych)

describe(dataset[, c("Antivirus", "Laptop")])
##           vars   n  mean    sd median trimmed   mad min max range  skew
## Antivirus    1 122 50.18 13.36     49   49.92 12.60  15  83    68  0.15
## Laptop       2 122 40.02 12.30     39   39.93 11.86   8  68    60 -0.01
##           kurtosis   se
## Antivirus    -0.14 1.21
## Laptop       -0.32 1.11
# ===============================================
# CHECK THE NORMALITY OF THE CONTINUOUS VARIABLES
# ===============================================

# CREATE A HISTOGRAM FOR EACH CONTINUOUS VARIABLE

hist(dataset$Antivirus,
     main = "Histogram of Antivirus Software Sold",
     xlab = "Antivirus units sold",
     ylab = "Frequency",
     col = "lightblue",
     border = "black",
     breaks = 20)

hist(dataset$Laptop,
     main = "Histogram of Laptops Sold",
     xlab = "Laptops sold",
     ylab = "Frequency",
     col = "lightgreen",
     border = "black",
     breaks = 20)

# ........................................................

# COMMENT:
# Var 1 (Antivirus) histogram looks approximately symmetrical and bell-shaped, with a very slight positive skew (skew = 0.15)
# Var 2 (Laptop) histogram looks approximately symmetrical and bell-shaped, with a negligible negative skew (skew = -0.01)

# ........................................................

# CONDUCT THE SHAPIRO-WILK TEST

shapiro.test(dataset$Antivirus)
## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$Antivirus
## W = 0.99419, p-value = 0.8981
shapiro.test(dataset$Laptop)
## 
##  Shapiro-Wilk normality test
## 
## data:  dataset$Laptop
## W = 0.99362, p-value = 0.8559
# .........................................................
# COMMENT:
# Both variable 1 (Number of Antivirus software sold) and variable 2 (Number of Laptops sold) are NORMALLY DISTRIBUTED:
# neither Shapiro-Wilk test is significant (p > .05), so normality is not rejected.
# Var 1 Number of Antivirus software sold: M = 50.18, SD = 13.36, W = 0.994, p = .898 > .05
# Var 2 Number of Laptops sold: M = 40.02, SD = 12.30, W = 0.994, p = .856 > .05
# .........................................................

# Both variables pass the normality check, so continue with the Pearson correlation test.
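
The decision rule this script appears to follow (Pearson when both Shapiro-Wilk p-values exceed .05, Spearman otherwise) can be sketched as a small helper. The function name `choose_cor_method` and the `alpha` argument are hypothetical, introduced here only for illustration; the p-values passed in are the ones reported above.

```r
# Hypothetical helper (not part of the original script): picks the correlation
# method from the two Shapiro-Wilk p-values, assuming the rule
# "both p > alpha -> Pearson, otherwise Spearman".
choose_cor_method <- function(p_var1, p_var2, alpha = 0.05) {
  if (p_var1 > alpha && p_var2 > alpha) "pearson" else "spearman"
}

# Shapiro-Wilk p-values reported above for Antivirus and Laptop
method <- choose_cor_method(0.8981, 0.8559)
print(method)

# With the real data loaded, the p-values could be taken directly from the tests:
# choose_cor_method(shapiro.test(dataset$Antivirus)$p.value,
#                   shapiro.test(dataset$Laptop)$p.value)
```

The returned string can be passed straight to the `method` argument of `cor.test()`.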

# =========================
# VISUALLY DISPLAY THE DATA
# =========================

library(ggplot2)
## 
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
## 
##     %+%, alpha
library(ggpubr)
ggscatter(dataset, x = "Antivirus", y = "Laptop",
          add = "reg.line",
          conf.int = TRUE,
          cor.coef = TRUE,
          cor.method = "pearson",
          xlab = "Number of antivirus software units sold", ylab = "Number of laptops sold")

# ........................................................
# The scatterplot shows a positive relationship: the regression line slopes upward.
# ........................................................


# ================================================
# PEARSON CORRELATION OR SPEARMAN CORRELATION TEST
# ================================================

cor.test(dataset$Antivirus, dataset$Laptop, method = "pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  dataset$Antivirus and dataset$Laptop
## t = 25.16, df = 120, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.8830253 0.9412249
## sample estimates:
##       cor 
## 0.9168679
# DETERMINE STATISTICAL SIGNIFICANCE
# If the result is statistically significant (p < .05), continue to the effect size section below.
# If the result is NOT statistically significant (p >= .05), skip to the reporting section below.
# Here p < 2.2e-16, so the result is statistically significant.
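
The t statistic and 95% confidence interval in the `cor.test()` output above can be reproduced by hand from r and n. The sketch below uses the standard formulas: t = r * sqrt(df) / sqrt(1 - r^2) with df = n - 2, and a confidence interval built on the Fisher z scale. The only inputs are the reported values r = 0.9168679 and n = 122.

```r
# Values taken from the cor.test() output and the descriptive statistics above
r  <- 0.9168679
n  <- 122
df <- n - 2

# t statistic for H0: true correlation equals 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval via the Fisher z-transformation
z    <- atanh(r)            # transform r to the z scale
se_z <- 1 / sqrt(n - 3)     # standard error on the z scale
ci   <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)  # back-transform to the r scale

print(round(t_stat, 2))
print(round(ci, 4))
```

The results should match the reported t = 25.16 and the interval 0.883 to 0.941 up to rounding.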

# ==============================================
# EFFECT SIZE FOR PEARSON & SPEARMAN CORRELATION
# ==============================================

# ........................................................

# 1) WRITE THE REPORT 
#    Q1) Direction: the effect is Positive (r > 0)
#    Q2) Size: the effect is Strong, r = 0.917
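
The direction and size judgments can be sketched as a small function. The cutoffs used here (0.10 weak, 0.30 moderate, 0.50 strong) are an assumption based on Cohen's commonly cited conventions; the course may use different thresholds, and the name `classify_r` is hypothetical. The function also returns r squared, the proportion of variance the two variables share.

```r
# Hypothetical helper: labels the direction and size of a correlation.
# ASSUMPTION: cutoffs follow Cohen's conventions (0.10 / 0.30 / 0.50);
# replace them if the course rubric specifies other values.
classify_r <- function(r) {
  size <- abs(r)
  strength <- if (size >= 0.50) {
    "Strong"
  } else if (size >= 0.30) {
    "Moderate"
  } else if (size >= 0.10) {
    "Weak"
  } else {
    "Negligible"
  }
  direction <- if (r > 0) "Positive" else if (r < 0) "Negative" else "None"
  list(direction = direction, strength = strength, r_squared = r^2)
}

# r as reported by cor.test() above
effect <- classify_r(0.9168679)
print(effect)
```

Under these assumed cutoffs, the reported r is classified as positive and strong, in line with the answers to Q1 and Q2.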

# ========================================================
#     >> WRITTEN REPORT FOR PEARSON CORRELATION <<
# ========================================================

# 1) OUTPUT

#    1) Pearson Correlation Test
#    2) Number of antivirus software units sold and number of laptops sold
#    3) n = 122
#    4) The inferential result is statistically significant, p < .001
#    5) Var 1 Number of Antivirus software sold: M = 50.18, SD = 13.36, p = .898 > .05
#       Var 2 Number of Laptops sold: M = 40.02, SD = 12.30, p = .856 > .05
#    6) The correlation is Positive and Strong, cor = 0.917
#    7) df = 120
#    8) cor = 0.917
#    9) p < .001

# ........................................................

# 2) WRITE YOUR FINAL REPORT

#    A Pearson correlation was conducted to examine the relationship between
#    the number of antivirus software units sold and the number of laptops sold (n = 122).
#    There was a statistically significant correlation between
#    antivirus units sold (M = 50.18, SD = 13.36) and laptops sold (M = 40.02, SD = 12.30).
#    The correlation was positive and strong, r(120) = .92, p < .001.
#    As the number of laptops sold increases, the number of antivirus software units sold also increases.
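
The statistical part of the report sentence can be assembled from the test results rather than typed by hand, which avoids rounding slips. This is a minimal sketch: `format_pearson_report` is a hypothetical helper, and dropping the leading zero from r and p is an assumed reporting style, chosen to match the "p < .001" form already used here.

```r
# Hypothetical helper: builds an "r(df) = .xx, p ..." string.
# ASSUMPTION: leading zeros are dropped, matching the "p < .001" style above.
format_pearson_report <- function(r, df, p) {
  drop_zero <- function(x) sub("^(-?)0\\.", "\\1.", x)
  r_text <- drop_zero(sprintf("%.2f", r))
  p_text <- if (p < 0.001) {
    "p < .001"
  } else {
    paste("p =", drop_zero(sprintf("%.3f", p)))
  }
  sprintf("r(%d) = %s, %s", as.integer(df), r_text, p_text)
}

# Values reported by cor.test() above; the exact p-value is below 2.2e-16,
# so any value under .001 gives the same text.
report_line <- format_pearson_report(0.9168679, 120, 2.2e-16)
print(report_line)
```

With a fitted test object, the same call could take `res$estimate`, `res$parameter`, and `res$p.value` from `res <- cor.test(...)`.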

OUTPUT PARAGRAPH

A Pearson correlation was conducted to examine the relationship between the number of antivirus software units sold and the number of laptops sold (n = 122).

There was a statistically significant correlation between antivirus units sold (M = 50.18, SD = 13.36) and laptops sold (M = 40.02, SD = 12.30).

The correlation was positive and strong, r(120) = .92, p < .001.

As the number of laptops sold increases, the number of antivirus software units sold also increases.