RPubs Link:
https://rpubs.com/MianAfzaalZahoor/1392723

Loading Libraries

library(ggplot2)
library(ggpubr)
library(readxl)

Importing Datasets

DatasetA <- read_excel("DatasetA.xlsx")
DatasetB <- read_excel("DatasetB.xlsx")

Research Question 1: What is the relationship between study hours and exam score?

Variables: Study Hours (IV), Exam Score (DV)

Descriptive Statistics

mean(DatasetA$StudyHours, na.rm = TRUE)
## [1] 6.135609
sd(DatasetA$StudyHours, na.rm = TRUE)
## [1] 1.369224
mean(DatasetA$ExamScore, na.rm = TRUE)
## [1] 90.06906
sd(DatasetA$ExamScore, na.rm = TRUE)
## [1] 6.795224
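
The four calls above can also be collapsed into one helper that reports each variable rounded to two decimals, the precision used later in the write-up. This is only a sketch: `describe_vars` and the simulated `demo` data frame are hypothetical stand-ins, not part of the original analysis.

```r
# Hypothetical stand-in for DatasetA; the values are simulated, not the real data.
set.seed(1)
demo <- data.frame(StudyHours = rnorm(100, mean = 6, sd = 1.4),
                   ExamScore  = rnorm(100, mean = 90, sd = 7))

# Mean and standard deviation per variable, rounded (not truncated) to 2 decimals.
describe_vars <- function(df, vars) {
  t(sapply(df[vars], function(x) {
    c(M  = round(mean(x, na.rm = TRUE), 2),
      SD = round(sd(x, na.rm = TRUE), 2))
  }))
}

desc <- describe_vars(demo, c("StudyHours", "ExamScore"))
print(desc)
```

Using `round()` matters for the report: a mean of 6.135609 rounds to 6.14, whereas cutting off the extra digits would give 6.13.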

Normality Check

hist(DatasetA$StudyHours,
     main = "Study Hours",
     breaks = 20,
     xlab = "Independent Variable Graph: Study Hours",
     col = "lightblue",
     border = "blue",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

Independent Variable Graph

Skewness: Symmetrical

Kurtosis: Proper bell curve

hist(DatasetA$ExamScore,
     main = "Exam Score",
     breaks = 20,
     col = "lightblue",
     xlab = "Dependent Variable Graph: Exam Score",
     border = "red",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

Dependent Variable Graph

Skewness: Negatively skewed

Kurtosis: Too tall
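
The visual judgements of skewness and kurtosis can be backed with numbers. The sketch below computes moment-based sample skewness and excess kurtosis in base R. The helper names and the simulated `scores` vector are assumptions for illustration only; packages such as moments or e1071 provide comparable functions.

```r
# Moment-based skewness: 0 for a symmetric sample, < 0 for a longer left tail.
skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Excess kurtosis: about 0 for a normal curve, > 0 "too tall", < 0 "too flat".
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

# Simulated, left-skewed scores (hypothetical; not the real ExamScore column).
set.seed(2)
scores <- 100 - rexp(100, rate = 1 / 8)
c(skewness = skewness(scores), excess_kurtosis = excess_kurtosis(scores))
```

Applied to `DatasetA$ExamScore`, a negative skewness and a positive excess kurtosis would agree with the histogram reading above.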

Conducting Shapiro–Wilk tests

shapiro.test(DatasetA$StudyHours)
## 
##  Shapiro-Wilk normality test
## 
## data:  DatasetA$StudyHours
## W = 0.99388, p-value = 0.9349
shapiro.test(DatasetA$ExamScore)
## 
##  Shapiro-Wilk normality test
## 
## data:  DatasetA$ExamScore
## W = 0.96286, p-value = 0.006465

For StudyHours, p = .935 (p ≥ .05) → no significant departure from normality, so the variable is treated as normal

For ExamScore, p = .006 (p < .05) → variable is not normal

Decision: Because exam scores were not normally distributed, a Spearman correlation will be used.
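
The decision rule used here (Pearson only if both variables pass Shapiro-Wilk, otherwise Spearman) can be written as a small function. This is a sketch under assumptions: `choose_method` is a hypothetical helper, the .05 cutoff mirrors the rule above, and the data are simulated.

```r
# Returns "pearson" only when neither variable departs significantly from
# normality according to Shapiro-Wilk; otherwise falls back to "spearman".
choose_method <- function(x, y, alpha = 0.05) {
  p_x <- shapiro.test(x)$p.value
  p_y <- shapiro.test(y)$p.value
  if (p_x >= alpha && p_y >= alpha) "pearson" else "spearman"
}

# Simulated example (hypothetical data): x is roughly normal, y is skewed.
set.seed(3)
x <- rnorm(100, mean = 6, sd = 1.4)
y <- rexp(100, rate = 1)

method <- choose_method(x, y)
print(method)
cor.test(x, y, method = method, exact = FALSE)
```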

cor_test_A <- cor.test(DatasetA$StudyHours, DatasetA$ExamScore, method = "spearman")
## Warning in cor.test.default(DatasetA$StudyHours, DatasetA$ExamScore, method =
## "spearman"): Cannot compute exact p-value with ties
cor_test_A
## 
##  Spearman's rank correlation rho
## 
## data:  DatasetA$StudyHours and DatasetA$ExamScore
## S = 16518, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.9008825
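
Spearman's rho is the Pearson correlation computed on the ranks of the two variables, which is why it only assumes a monotonic relationship. The sketch below checks this on simulated data (the vectors are hypothetical, not DatasetA) and shows how to pull the estimate and p-value out of the test object. Passing `exact = FALSE` requests the asymptotic p-value up front, which is what R falls back to when ties are present, so the warning shown above does not appear.

```r
# Simulated data; rounding x to one decimal deliberately creates ties.
set.seed(42)
x <- round(rnorm(100, mean = 6, sd = 1.4), 1)
y <- 60 + 5 * x + rnorm(100, mean = 0, sd = 3)

res <- cor.test(x, y, method = "spearman", exact = FALSE)

rho            <- unname(res$estimate)     # Spearman's rho from the test object
p_value        <- res$p.value              # asymptotic p-value
rho_from_ranks <- cor(rank(x), rank(y))    # Pearson correlation of the ranks

c(rho = rho, rho_from_ranks = rho_from_ranks, p_value = p_value)
```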

Correlation Analysis

Statistical Significance: The results were statistically significant, as the p-value was less than .05 (p < .001).

Direction of the Relationship: The relationship between study hours and exam score was positive. This means that as the number of hours students study increases, exam scores also tend to increase.

Strength of the Relationship: The relationship was strong. The Spearman correlation coefficient was ρ = 0.90, which indicates a strong positive association between study time and exam performance.

Scatterplots

ggscatter(DatasetA,
          x = "StudyHours",
          y = "ExamScore",
          add = "reg.line",
          conf.int = TRUE,
          xlab = "Study Hours",
          ylab = "Exam Score (%)",
          title = "Relationship Between Study Hours and Exam Score")

Interpretation:

Direction: The line of best fit slopes upward, indicating a positive relationship. As study hours increase, exam scores increase.

Strength: The points closely hug the line of best fit, indicating a strong relationship.

Linearity: The points form a clear straight-line pattern, showing the relationship is linear.

Outliers: There are no extreme points far away from the main cluster, so no serious outliers are evident.

Report of the Results

Study hours (M = 6.14, SD = 1.37) were significantly correlated with exam scores (M = 90.07, SD = 6.80),
ρ(98) = .90, p < .001.
The relationship was positive and strong.
As the number of hours students studied increased, their exam scores tended to increase.
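
A results line of this kind can be assembled from the test output rather than typed by hand, which avoids mislabelled statistics and raw scientific notation in the p-value. The helper below is a hypothetical sketch: `format_spearman` is not from any package, it writes "rho" in plain ASCII, and it assumes degrees of freedom of n - 2.

```r
# Builds an APA-style string such as "rho(98) = .90, p < .001".
format_spearman <- function(rho, n, p) {
  # Two decimals with the leading zero dropped (correlations cannot exceed 1).
  rho_txt <- sub("^(-?)0\\.", "\\1.", sprintf("%.2f", rho))
  # Very small p-values are reported as "p < .001" instead of e-notation.
  p_txt <- if (p < 0.001) {
    "p < .001"
  } else {
    paste0("p = ", sub("^0\\.", ".", sprintf("%.3f", p)))
  }
  sprintf("rho(%d) = %s, %s", as.integer(n - 2), rho_txt, p_txt)
}

# Values copied from the Spearman output for Research Question 1.
line_A <- format_spearman(rho = 0.9008825, n = 100, p = 2.2e-16)
print(line_A)
```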

Research Question 2: What is the relationship between how much a person uses their phone (hours) and how much they sleep (hours)?

Variables: Screen Time (IV), Sleeping Hours (DV)

Descriptive Statistics

mean(DatasetB$ScreenTime, na.rm = TRUE)
## [1] 5.063296
sd(DatasetB$ScreenTime, na.rm = TRUE)
## [1] 2.056833
mean(DatasetB$SleepingHours, na.rm = TRUE)
## [1] 6.938459
sd(DatasetB$SleepingHours, na.rm = TRUE)
## [1] 1.351332

Normality Check

hist(DatasetB$ScreenTime,
     main = "Screen Time",
     breaks = 20,
     col = "lightblue",
     xlab = "Independent Variable Graph: Screen Time",
     border = "red",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

Independent Variable Graph

Skewness: Positively Skewed

Kurtosis: Too Flat

hist(DatasetB$SleepingHours,
     main = "Sleeping Hours",
     breaks = 20,
     col = "lightblue",
     xlab = "Dependent Variable Graph: Sleeping Hours",
     border = "blue",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

Dependent Variable Graph

Skewness: Symmetrical

Kurtosis: Proper Bell Curve

Conducting Shapiro–Wilk tests

shapiro.test(DatasetB$ScreenTime)
## 
##  Shapiro-Wilk normality test
## 
## data:  DatasetB$ScreenTime
## W = 0.90278, p-value = 1.914e-06
shapiro.test(DatasetB$SleepingHours)
## 
##  Shapiro-Wilk normality test
## 
## data:  DatasetB$SleepingHours
## W = 0.98467, p-value = 0.3004

For ScreenTime, p < .001 (p < .05) → variable is not normal

For SleepingHours, p = .300 (p ≥ .05) → no significant departure from normality, so the variable is treated as normal

Decision: Because screen time was not normally distributed based on the Shapiro–Wilk test, a Spearman correlation will be used.

cor_test_B <- cor.test(DatasetB$ScreenTime, DatasetB$SleepingHours, method = "spearman")
cor_test_B
## 
##  Spearman's rank correlation rho
## 
## data:  DatasetB$ScreenTime and DatasetB$SleepingHours
## S = 259052, p-value = 3.521e-09
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##        rho 
## -0.5544674
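
The S statistic in this output is tied to rho and the sample size by S = (n^3 - n)(1 - rho) / 6, which is how R derives it from the rank correlation. The sketch below uses that relation to recover n, and therefore the degrees of freedom (n - 2), from the printed values. The variable names and the search range of 3 to 500 are arbitrary assumptions.

```r
# Values copied from the Spearman output above.
S_reported   <- 259052
rho_reported <- -0.5544674

# Find the sample size whose implied S is closest to the reported S.
candidates <- 3:500
S_implied  <- (candidates^3 - candidates) * (1 - rho_reported) / 6
n_implied  <- candidates[which.min(abs(S_implied - S_reported))]
df_implied <- n_implied - 2

c(n = n_implied, df = df_implied)
```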

Correlation Analysis

Statistical Significance: The results were statistically significant, as the p-value was less than .05 (p < .001).

Direction of the Relationship: The relationship between screen time and sleeping hours was negative. This indicates that as screen time increases, the amount of sleep tends to decrease.

Strength of the Relationship: The relationship was moderate in strength. The Spearman correlation coefficient was ρ = −0.55, which reflects a moderate negative association between phone use and sleep duration.

Scatterplots

ggscatter(DatasetB,
          x = "ScreenTime",
          y = "SleepingHours",
          add = "reg.line",
          conf.int = TRUE,
          xlab = "Screen Time (Hours)",
          ylab = "Sleeping Hours",
          title = "Relationship Between Screen Time and Sleeping Hours")

Interpretation

Direction: The line of best fit slopes downward, indicating a negative relationship. As screen time increases, sleeping hours decrease.

Strength: The points cluster moderately around the line, indicating a moderate relationship.

Linearity: The points follow a generally straight-line pattern, indicating the relationship is monotonic and approximately linear, which is consistent with the monotonic relationship a Spearman correlation assumes.

Outliers: A few points sit slightly away from the main cluster, but none is extreme enough to distort the relationship.
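
The visual outlier check can be supplemented with a simple numeric screen. The sketch below applies the boxplot rule (values more than 1.5 × IQR beyond the hinges) to one variable at a time. This is a swapped-in univariate technique, not the method used in the analysis above; `flag_outliers` and the `screen_demo` values are hypothetical.

```r
# Returns the values lying beyond 1.5 * IQR from the hinges (the boxplot rule).
flag_outliers <- function(x) {
  x <- x[!is.na(x)]
  boxplot.stats(x)$out
}

# Hypothetical screen-time values with one clearly extreme entry.
screen_demo <- c(2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 16)
flagged <- flag_outliers(screen_demo)
print(flagged)
```

Because Spearman works on ranks, a single extreme value has limited influence on rho, so a screen like this mainly serves to document the data.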

Report of the Results

Screen time (M = 5.06, SD = 2.06) was significantly correlated with sleeping hours (M = 6.94, SD = 1.35),
ρ(98) = −.55, p < .001.
The relationship was negative and moderate.
As screen time increased, the number of hours spent sleeping tended to decrease.