library(ggplot2)
library(ggpubr)
library(readxl)
DatasetA <- read_excel("C:/Users/Admin/Downloads/DatasetA.xlsx")
Variables: Study Hours (Independent Variable), Exam Score (Dependent Variable)
#Part 2: Descriptive Statistics
mean(DatasetA$StudyHours, na.rm = TRUE)
## [1] 6.135609
sd(DatasetA$StudyHours, na.rm = TRUE)
## [1] 1.369224
mean(DatasetA$ExamScore, na.rm = TRUE)
## [1] 90.06906
sd(DatasetA$ExamScore, na.rm = TRUE)
## [1] 6.795224
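The four calls above can also be run in one pass. The following is a minimal sketch, assuming a data frame with the same two column names; `demo` is simulated placeholder data rather than DatasetA, and `describe_vars` is a hypothetical helper, not part of any package.

```r
# Simulated stand-in for DatasetA (assumption: the real values come from the Excel file)
set.seed(42)
demo <- data.frame(
  StudyHours = rnorm(100, mean = 6, sd = 1.4),
  ExamScore  = rnorm(100, mean = 90, sd = 7)
)

# Hypothetical helper: mean, SD, and non-missing count for each named column
describe_vars <- function(data, vars) {
  t(sapply(data[vars], function(x) {
    c(M = mean(x, na.rm = TRUE), SD = sd(x, na.rm = TRUE), n = sum(!is.na(x)))
  }))
}

desc <- describe_vars(demo, c("StudyHours", "ExamScore"))
round(desc, 2)
```

Each row of `desc` is one variable, so the values can be copied directly into an M/SD write-up.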
#Part 3: Check Normality
hist(DatasetA$StudyHours,
main = "Study Hours",
breaks = 20,
xlab="Independent Variable Graph: Study Hours",
col = "lightblue",
border = "lightyellow",
cex.main = 1,
cex.axis = 1,
cex.lab = 1)
The histogram shows that study hours are approximately normally distributed, with a symmetrical shape and a clear bell-curve pattern.
hist(DatasetA$ExamScore,
main = "Exam Score",
breaks = 20,
col = "lightblue",
xlab="Dependent Variable Graph: Exam Score",
border = "lightyellow",
cex.main = 1,
cex.axis = 1,
cex.lab = 1)
The histogram shows that most exam scores are high, with a tail of lower scores, so the distribution is negatively skewed rather than normally distributed.
shapiro.test(DatasetA$StudyHours)
##
## Shapiro-Wilk normality test
##
## data: DatasetA$StudyHours
## W = 0.99388, p-value = 0.9349
shapiro.test(DatasetA$ExamScore)
##
## Shapiro-Wilk normality test
##
## data: DatasetA$ExamScore
## W = 0.96286, p-value = 0.006465
Because exam scores are not normally distributed (Shapiro-Wilk p = .006), a Spearman correlation will be used.
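The decision rule applied here can be written down explicitly. This is a sketch only: `choose_cor_method` is a hypothetical helper, the 0.05 cutoff is the conventional threshold assumed in this report, and the inputs are simulated rather than taken from DatasetA.

```r
# Hypothetical helper: use Pearson only if both variables pass Shapiro-Wilk at alpha
choose_cor_method <- function(x, y, alpha = 0.05) {
  p_x <- shapiro.test(x)$p.value
  p_y <- shapiro.test(y)$p.value
  if (p_x >= alpha && p_y >= alpha) "pearson" else "spearman"
}

# Simulated example: one roughly normal variable, one strongly right-skewed variable
set.seed(7)
normal_x <- rnorm(100)
skewed_y <- rexp(100)

method <- choose_cor_method(normal_x, skewed_y)
method
```

The returned string can be passed straight to the `method` argument of `cor.test()`.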
cor_test_A <- cor.test(DatasetA$StudyHours, DatasetA$ExamScore, method = "spearman")
## Warning in cor.test.default(DatasetA$StudyHours, DatasetA$ExamScore, method =
## "spearman"): Cannot compute exact p-value with ties
cor_test_A
##
## Spearman's rank correlation rho
##
## data: DatasetA$StudyHours and DatasetA$ExamScore
## S = 16518, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.9008825
The Spearman correlation analysis showed a statistically significant relationship between study hours and exam scores (p < .001). The association was positive, indicating that higher study hours were linked to higher exam scores. The relationship was strong, with a Spearman correlation coefficient of ρ = 0.90.
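The warning "Cannot compute exact p-value with ties" appears because the exact Spearman p-value requires untied ranks; when ties are present, `cor.test()` falls back to an approximation. Passing `exact = FALSE` requests that approximation directly and avoids the warning. The sketch below uses simulated data with deliberate ties, not DatasetA.

```r
# Simulated data (assumption): rounding to whole numbers creates tied values
set.seed(3)
hours  <- round(rnorm(40, mean = 6, sd = 1.4))
scores <- round(70 + 3 * hours + rnorm(40, sd = 2))

# exact = FALSE asks for the approximate p-value up front, so no tie warning is raised
res <- cor.test(hours, scores, method = "spearman", exact = FALSE)
res$estimate
res$p.value
```

The estimate of rho is unaffected by this argument; only the way the p-value is computed changes.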
ggscatter(DatasetA,
x = "StudyHours",
y = "ExamScore",
add = "reg.line",
xlab = "Study Hours",
ylab = "Exam Score (%)",
title = "Relationship Between Study Hours and Exam Score")
The scatterplot shows that as study hours increase, exam scores also increase. The points follow an upward pattern, indicating a strong positive relationship with no major outliers.
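The correlation coefficient can be printed on the scatterplot itself so the figure and the test agree. The sketch below assumes the `ggpubr` package is installed and uses its `cor.coef` and `cor.method` arguments; `demo` is simulated placeholder data, not DatasetA.

```r
# Simulated placeholder data (assumption: not the values in DatasetA)
set.seed(1)
demo <- data.frame(StudyHours = round(rnorm(50, mean = 6, sd = 1.4), 1))
demo$ExamScore <- pmin(100, round(70 + 3 * demo$StudyHours + rnorm(50, sd = 3)))

p <- NULL
if (requireNamespace("ggpubr", quietly = TRUE)) {
  p <- ggpubr::ggscatter(
    demo,
    x = "StudyHours",
    y = "ExamScore",
    add = "reg.line",         # least-squares line, as in the plot above
    cor.coef = TRUE,          # print the coefficient and p-value on the plot
    cor.method = "spearman",  # match the test used in the analysis
    xlab = "Study Hours",
    ylab = "Exam Score (%)"
  )
}
p
```

The regression line is still a linear summary; the Spearman coefficient printed beside it describes the monotonic trend.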
#Part 6: Interpretation
Direction: The relationship between study hours and exam score is positive, meaning higher study hours are associated with higher exam scores.
Strength: The data indicate a strong association between the variables.
Linearity: The pattern follows a monotonic increasing trend, supporting the use of a Spearman correlation.
Outliers: No extreme outliers are present that would significantly affect the results.
#Results
The independent variable, study hours (M = 6.14, SD = 1.37), was correlated with the dependent variable, exam score (M = 90.07, SD = 6.80), ρ(98) = .90, p < .001. The relationship was positive and strong. As study hours increased, exam scores increased.
Variables: Screen Time (IV), Sleeping Hours (DV)
#Importing DatasetB
DatasetB <- read_excel("C:/Users/Admin/Downloads/DatasetB.xlsx")
mean(DatasetB$ScreenTime, na.rm = TRUE)
## [1] 5.063296
sd(DatasetB$ScreenTime, na.rm = TRUE)
## [1] 2.056833
mean(DatasetB$SleepingHours, na.rm = TRUE)
## [1] 6.938459
sd(DatasetB$SleepingHours, na.rm = TRUE)
## [1] 1.351332
#Part 3: Check Normality
hist(DatasetB$ScreenTime,
main = "Screen Time",
breaks = 20,
col = "lightblue",
xlab="Independent Variable Graph: Screen Time",
border = "lightyellow",
cex.main = 1,
cex.axis = 1,
cex.lab = 1)
The histogram shows that screen time is positively skewed, with more values clustered at the lower end and fewer higher values, indicating that the data are not normally distributed.
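A skewness coefficient puts a number on what the histogram shows: positive values indicate a longer right tail, negative values a longer left tail. The sketch below uses the moment-based sample skewness; `sample_skewness` is a hypothetical helper and the vectors are toy data, not DatasetB.

```r
# Hypothetical helper: moment-based sample skewness (0 for a symmetric sample)
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

sym   <- sample_skewness(c(1, 2, 3, 4, 5))      # symmetric: 0
right <- sample_skewness(c(1, 1, 2, 2, 3, 9))   # long right tail: positive
left  <- sample_skewness(c(1, 7, 8, 8, 9, 9))   # long left tail: negative
c(symmetric = sym, right_tail = right, left_tail = left)
```

Applied to a real column, for example `sample_skewness(DatasetB$ScreenTime)`, a clearly positive value would agree with the positively skewed shape described above.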
hist(DatasetB$SleepingHours,
main = "Sleeping Hours",
breaks = 20,
col = "lightblue",
xlab="Dependent Variable Graph: Sleeping Hours",
border = "lightyellow",
cex.main = 1,
cex.axis = 1,
cex.lab = 1)
The histogram shows that sleeping hours are approximately normally distributed, with a symmetrical shape and most values centered around the middle.
For ScreenTime p < .05 → variable is not normal
For SleepingHours p ≥ .05 → variable is normal
shapiro.test(DatasetB$ScreenTime)
##
## Shapiro-Wilk normality test
##
## data: DatasetB$ScreenTime
## W = 0.90278, p-value = 1.914e-06
shapiro.test(DatasetB$SleepingHours)
##
## Shapiro-Wilk normality test
##
## data: DatasetB$SleepingHours
## W = 0.98467, p-value = 0.3004
Decision: Because screen time was not normally distributed based on the Shapiro–Wilk test, a Spearman correlation will be used.
cor_test_B <- cor.test(DatasetB$ScreenTime, DatasetB$SleepingHours, method = "spearman")
cor_test_B
##
## Spearman's rank correlation rho
##
## data: DatasetB$ScreenTime and DatasetB$SleepingHours
## S = 259052, p-value = 3.521e-09
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.5544674
The Spearman correlation indicated a statistically significant association between screen time and sleeping hours (p < .001). The relationship was negative, meaning that individuals with more screen time generally reported fewer hours of sleep. The strength of the association was moderate, with a correlation coefficient of ρ = −0.55.
ggscatter(DatasetB,
x = "ScreenTime",
y = "SleepingHours",
add = "reg.line",
xlab = "Screen Time (Hours)",
ylab = "Sleeping Hours",
title = "Relationship Between Screen Time and Sleeping Hours")
The scatterplot shows a downward trend, meaning that people with more screen time tend to sleep fewer hours.
Direction: The relationship between screen time and sleeping hours is negative, indicating that more screen time is associated with fewer hours of sleep.
Strength: The relationship is moderate in strength.
Linearity: The pattern appears monotonic, which is appropriate for a Spearman correlation.
Outliers: No extreme outliers are observed that would meaningfully influence the relationship.
#Results
The independent variable, screen time (M = 5.06, SD = 2.06), was correlated with the dependent variable, sleeping hours (M = 6.94, SD = 1.35), ρ(98) = −.55, p < .001. The relationship was negative and moderate. As screen time increased, sleeping hours decreased.
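The degrees of freedom reported as ρ(98) imply n = 100, and this can be checked against the S statistics in the output. R reports S = (n³ − n)(1 − ρ) / 6, so plugging in the reported ρ values should reproduce the reported S. In the sketch below, `spearman_S` is a hypothetical helper and n = 100 is an assumption inferred from df = n − 2.

```r
# Relation between the reported S statistic, rho, and the sample size n
spearman_S <- function(rho, n) (n^3 - n) * (1 - rho) / 6

n <- 100  # assumption: implied by the reported df of 98 (df = n - 2)

S_A <- round(spearman_S(0.9008825, n))   # Dataset A, reported S = 16518
S_B <- round(spearman_S(-0.5544674, n))  # Dataset B, reported S = 259052
c(DatasetA = S_A, DatasetB = S_B)
```

Both values match the `cor.test()` output, which supports n = 100 and therefore df = 98 for both datasets.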