library(ggplot2)
library(ggpubr)
library(readxl)
DatasetA <- read_excel("C:/Users/Admin/Downloads/DatasetA.xlsx")
DatasetB <- read_excel("C:/Users/Admin/Downloads/DatasetB.xlsx")
Variables: Study Hours (Independent Variable), Exam Score (Dependent Variable)
mean(DatasetA$StudyHours, na.rm = TRUE)
## [1] 6.135609
sd(DatasetA$StudyHours, na.rm = TRUE)
## [1] 1.369224
mean(DatasetA$ExamScore, na.rm = TRUE)
## [1] 90.06906
sd(DatasetA$ExamScore, na.rm = TRUE)
## [1] 6.795224
Independent Variable Graph:
Skewness: Symmetrical
Kurtosis: Proper bell curve
hist(DatasetA$StudyHours,
     main = "Study Hours",
     breaks = 20,
     xlab = "Independent Variable Graph: Study Hours",
     col = "lightblue",
     border = "lightyellow",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)
Dependent Variable Graph:
Skewness: Negatively skewed
Kurtosis: Too tall (leptokurtic)
hist(DatasetA$ExamScore,
     main = "Exam Score",
     breaks = 20,
     col = "lightblue",
     xlab = "Dependent Variable Graph: Exam Score",
     border = "lightyellow",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)
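The skewness and kurtosis labels for the histograms are visual judgments. A numeric check is one way to back them up; the sketch below uses moment-based formulas in base R. The helper names `sample_skewness` and `sample_excess_kurtosis` and the example vector are hypothetical and not part of the original analysis.

```r
# Moment-based skewness: third central moment divided by the
# second central moment raised to the power 1.5.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Excess kurtosis: fourth central moment over the squared second
# central moment, minus 3 (a normal distribution scores about 0).
sample_excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

# Hypothetical data for illustration only.
example_scores <- c(1, 2, 3, 4, 5)
sample_skewness(example_scores)         # 0 for this symmetric vector
sample_excess_kurtosis(example_scores)  # negative: flatter than a normal curve
```

On the real data this could be called as `sample_skewness(DatasetA$ExamScore)`; a negative value would agree with the "negatively skewed" label, and a positive excess kurtosis with "too tall".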
shapiro.test(DatasetA$StudyHours)
##
## Shapiro-Wilk normality test
##
## data: DatasetA$StudyHours
## W = 0.99388, p-value = 0.9349
shapiro.test(DatasetA$ExamScore)
##
## Shapiro-Wilk normality test
##
## data: DatasetA$ExamScore
## W = 0.96286, p-value = 0.006465
Because exam scores are not normally distributed (Shapiro-Wilk p = .006), a Spearman correlation will be used.
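The decision rule applied here (Pearson only when both variables pass the Shapiro-Wilk test, Spearman otherwise) can be written as a small helper. This is a sketch: the function name `choose_cor_method`, the 0.05 cutoff, and the example vectors are assumptions for illustration.

```r
# Pick a correlation method from Shapiro-Wilk results on both variables.
# alpha = 0.05 is the conventional cutoff assumed here.
choose_cor_method <- function(x, y, alpha = 0.05) {
  p_x <- shapiro.test(x)$p.value
  p_y <- shapiro.test(y)$p.value
  if (p_x >= alpha && p_y >= alpha) "pearson" else "spearman"
}

# Hypothetical vectors: one built from normal quantiles,
# one strongly right-skewed.
normal_like <- qnorm(ppoints(50))
skewed      <- exp(2 * normal_like)

choose_cor_method(normal_like, normal_like)
choose_cor_method(normal_like, skewed)
```

With the real data the call would be `choose_cor_method(DatasetA$StudyHours, DatasetA$ExamScore)`.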
cor_test_A <- cor.test(DatasetA$StudyHours, DatasetA$ExamScore, method = "spearman")
## Warning in cor.test.default(DatasetA$StudyHours, DatasetA$ExamScore, method =
## "spearman"): Cannot compute exact p-value with ties
cor_test_A
##
## Spearman's rank correlation rho
##
## data: DatasetA$StudyHours and DatasetA$ExamScore
## S = 16518, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.9008825
The Spearman correlation analysis showed a statistically significant relationship between study hours and exam scores (p < .001). The association was positive, indicating that higher study hours were linked to higher exam scores. The relationship was strong, with a Spearman correlation coefficient of ρ = 0.90.
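The "Cannot compute exact p-value with ties" warning appears because the exact Spearman test assumes no tied values; R then falls back to an asymptotic approximation. Passing `exact = FALSE` requests that approximation directly, so the warning is not raised and the estimate is unchanged. The vectors below are hypothetical stand-ins for the real columns.

```r
# Hypothetical data containing tied values in both variables.
hours  <- c(2, 3, 3, 4, 5, 5, 6, 7, 8, 8)
scores <- c(60, 65, 65, 70, 72, 75, 80, 85, 90, 90)

# exact = FALSE asks for the asymptotic p-value directly,
# so the ties warning is not triggered.
res <- cor.test(hours, scores, method = "spearman", exact = FALSE)
res$estimate
res$p.value
```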
ggscatter(DatasetA,
          x = "StudyHours",
          y = "ExamScore",
          add = "reg.line",
          xlab = "Study Hours",
          ylab = "Exam Score (%)",
          title = "Relationship Between Study Hours and Exam Score")
Direction: The relationship between study hours and exam score is positive: higher study hours are associated with higher exam scores.
Strength: The association between the variables is strong (ρ = 0.90).
Linearity: The pattern is monotonic and increasing, which supports the use of a Spearman correlation.
Outliers: No extreme outliers are present that would substantially affect the results.
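The outlier judgment is read off the scatterplot. A rough numeric screen is the 1.5 × IQR rule applied to each variable separately, which base R exposes through `boxplot.stats()`. It only flags univariate outliers, so it complements the visual check rather than replacing it. The example vector is hypothetical.

```r
# Hypothetical exam scores with one unusually low value.
example_scores <- c(88, 90, 91, 92, 93, 94, 95, 96, 97, 40)

# boxplot.stats() returns, in $out, the values lying more than
# 1.5 * IQR beyond the hinges.
flagged <- boxplot.stats(example_scores)$out
flagged
```

On the real data, `boxplot.stats(DatasetA$ExamScore)$out` returning an empty vector would be consistent with the "no extreme outliers" note.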
Variables: Screen Time (Independent Variable), Sleeping Hours (Dependent Variable)
mean(DatasetB$ScreenTime, na.rm = TRUE)
## [1] 5.063296
sd(DatasetB$ScreenTime, na.rm = TRUE)
## [1] 2.056833
mean(DatasetB$SleepingHours, na.rm = TRUE)
## [1] 6.938459
sd(DatasetB$SleepingHours, na.rm = TRUE)
## [1] 1.351332
Independent Variable Graph:
Skewness: Positively Skewed
Kurtosis: Too flat (platykurtic)
hist(DatasetB$ScreenTime,
     main = "Screen Time",
     breaks = 20,
     col = "lightblue",
     xlab = "Independent Variable Graph: Screen Time",
     border = "lightyellow",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)
Dependent Variable Graph:
Skewness: Symmetrical
Kurtosis: Proper Bell Curve
hist(DatasetB$SleepingHours,
     main = "Sleeping Hours",
     breaks = 20,
     col = "lightblue",
     xlab = "Dependent Variable Graph: Sleeping Hours",
     border = "lightyellow",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)
For ScreenTime, p < .05 → the variable is not normally distributed
For SleepingHours, p ≥ .05 → the variable is consistent with a normal distribution
shapiro.test(DatasetB$ScreenTime)
##
## Shapiro-Wilk normality test
##
## data: DatasetB$ScreenTime
## W = 0.90278, p-value = 1.914e-06
shapiro.test(DatasetB$SleepingHours)
##
## Shapiro-Wilk normality test
##
## data: DatasetB$SleepingHours
## W = 0.98467, p-value = 0.3004
Decision: Because screen time was not normally distributed based on the Shapiro–Wilk test, a Spearman correlation will be used.
cor_test_B <- cor.test(DatasetB$ScreenTime, DatasetB$SleepingHours, method = "spearman")
cor_test_B
##
## Spearman's rank correlation rho
##
## data: DatasetB$ScreenTime and DatasetB$SleepingHours
## S = 259052, p-value = 3.521e-09
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.5544674
The Spearman correlation indicated a statistically significant association between screen time and sleeping hours (p < .001). The relationship was negative, meaning that individuals with more screen time generally reported fewer hours of sleep. The strength of the association was moderate, with a correlation coefficient of ρ = −0.55.
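Spearman's ρ is the Pearson correlation computed on ranks, and without ties it relates to the reported S statistic through ρ = 1 − 6S / (n(n² − 1)). The sketch below checks both facts. The vectors and the helper `rho_from_S` are hypothetical, and n = 100 for Dataset B is an assumption inferred from the reported S and ρ, not a value stated in the output.

```r
# Hypothetical data without ties.
screen <- c(1.5, 2.0, 3.2, 4.1, 5.5, 6.3, 7.8, 9.0)
sleep  <- c(8.5, 8.1, 7.9, 7.0, 7.2, 6.1, 5.8, 5.0)

# Spearman's rho equals Pearson's r computed on the ranks.
rho_ranks   <- cor(rank(screen), rank(sleep))
rho_builtin <- cor(screen, sleep, method = "spearman")

# Without ties, rho = 1 - 6 * S / (n * (n^2 - 1)),
# where S is the sum of squared rank differences.
rho_from_S <- function(S, n) 1 - 6 * S / (n * (n^2 - 1))

S_example <- sum((rank(screen) - rank(sleep))^2)
rho_from_S(S_example, length(screen))

# Reported S for Dataset B, with an assumed n = 100.
rho_from_S(259052, 100)
```

Under that assumed sample size, the last call gives roughly −0.554, in line with the ρ printed by `cor.test()`.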
ggscatter(DatasetB,
          x = "ScreenTime",
          y = "SleepingHours",
          add = "reg.line",
          xlab = "Screen Time (Hours)",
          ylab = "Sleeping Hours",
          title = "Relationship Between Screen Time and Sleeping Hours")
Direction: The relationship between screen time and sleeping hours is negative: more screen time is associated with fewer hours of sleep.
Strength: The relationship is moderate in strength (ρ = −0.55).
Linearity: The pattern appears monotonic, which is appropriate for a Spearman correlation.
Outliers: No extreme outliers are observed that would meaningfully influence the relationship.