Title: “Correlation Analysis”
Author: “Mohammed Parvez”
output: html_document
date: “2026-02-02”
Web URL: http://rpubs.com/pmohammed/1392646

Overview

The goal of this assignment was to explore relationships between variables using correlation analysis in RStudio. Normality was evaluated with histograms and the Shapiro–Wilk test. A Pearson correlation was to be used when both variables met the normality assumption, and a Spearman correlation otherwise.

The analysis focused on two research questions:

Is there a relationship between the number of hours students study and their exam scores?

Is there a relationship between daily phone use and the number of hours a person sleeps?

Research Question 1: Study Hours and Exam Scores

Descriptive Statistics

On average, students reported studying 6.14 hours (SD = 1.37). The mean exam score was 90.07% (SD = 6.80).

Normality Assessment

Shapiro–Wilk tests indicated that study hours did not depart significantly from a normal distribution (W = 0.99, p = .935), whereas exam scores did (W = 0.96, p = .006). Because exam scores violated the normality assumption, a Spearman correlation was used for further analysis.
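The decision rule applied here (Pearson only when both variables pass the Shapiro–Wilk test, Spearman otherwise) can be sketched as a small helper. This is a minimal illustration rather than the code used for the assignment: the function name `choose_cor_method`, the `alpha` argument, and the simulated vectors are all assumptions introduced for the example.

```r
# Hypothetical helper (not part of the original analysis): choose a
# correlation method from Shapiro-Wilk results on both variables.
choose_cor_method <- function(x, y, alpha = 0.05) {
  p_x <- shapiro.test(x)$p.value
  p_y <- shapiro.test(y)$p.value
  # Pearson only if neither variable departs significantly from normality
  if (p_x > alpha && p_y > alpha) "pearson" else "spearman"
}

# Simulated stand-ins for the real columns (assumed data, illustration only)
set.seed(42)
hours  <- rnorm(100, mean = 6, sd = 1.4)     # roughly normal
scores <- 100 - rexp(100, rate = 1 / 10)     # left-skewed, capped at 100

method <- choose_cor_method(hours, scores)
cor.test(hours, scores, method = method)
```

Wrapping the rule in a function keeps the choice of method tied to the test results instead of being entered by hand for each pair of variables.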

Correlation Results

The Spearman correlation analysis revealed a strong positive relationship between study hours and exam scores, ρ(98) = .90, p < .001.
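The degrees of freedom in ρ(98) are n − 2, which implies a sample of n = 100. As a cross-check, the sketch below backs n out of the S statistic and ρ printed in the R output, using the relation S = (n³ − n)(1 − ρ)/6 that `cor.test` uses for Spearman's test. The helper name `recover_n` is made up for this example, and the recovered value is approximate because the printed S and ρ are rounded.

```r
# Hypothetical helper: back out the sample size from Spearman's S and rho,
# using S = (n^3 - n) * (1 - rho) / 6. Not defined for rho = 1.
recover_n <- function(S, rho) {
  k <- 6 * S / (1 - rho)   # equals n^3 - n
  # Solve n^3 - n = k numerically, then round to the nearest integer
  root <- uniroot(function(n) n^3 - n - k,
                  interval = c(2, 1e6), tol = 1e-9)$root
  round(root)
}

# S and rho as printed in the study-hours / exam-score output
n_a <- recover_n(S = 16518, rho = 0.9008825)
n_a - 2   # degrees of freedom to report alongside rho
```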

Interpretation

This finding suggests that students who spent more time studying generally achieved higher exam scores. The strength of the association is consistent with study time playing an important role in exam performance, although a correlation by itself does not establish that studying more causes higher scores.

Research Question 2: Screen Time and Sleeping Hours

Descriptive Statistics

Participants reported an average daily screen time of 5.06 hours (SD = 2.06). The mean amount of sleep was 6.94 hours per night (SD = 1.35).

Normality Assessment

Shapiro–Wilk tests indicated that screen time departed significantly from a normal distribution (W = 0.90, p < .001), whereas sleeping hours did not (W = 0.98, p = .300). Since at least one variable violated the assumption of normality, a Spearman correlation was again selected.

Correlation Results

The results showed a moderate negative relationship between screen time and sleeping hours, ρ(98) = −.55, p < .001.
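Both correlations in this report follow the same pattern: ρ(df) with df = n − 2, a coefficient without a leading zero, and p < .001 for very small p-values. A minimal sketch of a formatter for that pattern is shown below. The function `format_spearman` is hypothetical, and it writes "rho" and a plain hyphen in ASCII instead of the typeset symbols.

```r
# Hypothetical formatter for a Spearman result in "rho(df) = .xx, p ..." form.
format_spearman <- function(rho, n, p) {
  # Format to fixed decimals, then drop the leading zero (0.55 -> .55)
  drop_zero <- function(v, digits) {
    sub("^(-?)0\\.", "\\1.", formatC(v, format = "f", digits = digits))
  }
  p_txt <- if (p < 0.001) "p < .001" else paste0("p = ", drop_zero(p, 3))
  sprintf("rho(%d) = %s, %s", as.integer(n - 2), drop_zero(rho, 2), p_txt)
}

# Values from the screen-time output; n = 100 follows from df = 98
format_spearman(rho = -0.5544674, n = 100, p = 3.521e-09)
```

Generating the reported string from the test output reduces the chance of transcription slips when copying values into the write-up.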

Interpretation

This result indicates that participants with more daily screen time tended to sleep fewer hours per night.

Conclusion

Overall, the findings from this analysis highlight two clear patterns. Greater study time was strongly associated with better exam performance, emphasizing the importance of consistent studying for academic success. In contrast, increased screen time was linked to reduced sleep, suggesting that excessive phone use may negatively affect sleep habits.

Together, these results underscore the value of maintaining healthy study routines and being mindful of screen use to support both academic outcomes and overall well-being.

Libraries

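# readxl reads the Excel workbooks; ggpubr provides ggscatter() for the scatterplots
library(readxl)
library(ggpubr)
## Loading required package: ggplot2
# Working directory that holds DatasetA.xlsx and DatasetB.xlsx (machine-specific path)
setwd("C:/Users/Mrlaz/Applied Analytics")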

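# Dataset A holds StudyHours and ExamScore; Dataset B holds ScreenTime and SleepingHours
DatasetA <- read_excel("DatasetA.xlsx")
DatasetB <- read_excel("DatasetB.xlsx")

# Descriptive statistics: mean and standard deviation of each variable
mean(DatasetA$StudyHours)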
## [1] 6.135609
sd(DatasetA$StudyHours)
## [1] 1.369224
mean(DatasetA$ExamScore)
## [1] 90.06906
sd(DatasetA$ExamScore)
## [1] 6.795224
mean(DatasetB$ScreenTime)
## [1] 5.063296
sd(DatasetB$ScreenTime)
## [1] 2.056833
mean(DatasetB$SleepingHours)
## [1] 6.938459
sd(DatasetB$SleepingHours)
## [1] 1.351332
# Histograms for a visual check of normality
hist(DatasetA$StudyHours,
     main = "Study Hours",
     xlab = "Hours Studied")

hist(DatasetA$ExamScore,
     main = "Exam Score",
     xlab = "Exam Score (%)")

hist(DatasetB$ScreenTime,
     main = "Screen Time",
     xlab = "Hours")

hist(DatasetB$SleepingHours,
     main = "Sleeping Hours",
     xlab = "Hours")

# Shapiro-Wilk tests: p < .05 indicates a significant departure from normality
shapiro.test(DatasetA$StudyHours)
## 
##  Shapiro-Wilk normality test
## 
## data:  DatasetA$StudyHours
## W = 0.99388, p-value = 0.9349
shapiro.test(DatasetA$ExamScore)
## 
##  Shapiro-Wilk normality test
## 
## data:  DatasetA$ExamScore
## W = 0.96286, p-value = 0.006465
shapiro.test(DatasetB$ScreenTime)
## 
##  Shapiro-Wilk normality test
## 
## data:  DatasetB$ScreenTime
## W = 0.90278, p-value = 1.914e-06
shapiro.test(DatasetB$SleepingHours)
## 
##  Shapiro-Wilk normality test
## 
## data:  DatasetB$SleepingHours
## W = 0.98467, p-value = 0.3004
# Spearman rank correlations (one variable in each pair failed the normality check)
cor.test(
  DatasetA$StudyHours,
  DatasetA$ExamScore,
  method = "spearman"
)
## Warning in cor.test.default(DatasetA$StudyHours, DatasetA$ExamScore, method =
## "spearman"): Cannot compute exact p-value with ties
## 
##  Spearman's rank correlation rho
## 
## data:  DatasetA$StudyHours and DatasetA$ExamScore
## S = 16518, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.9008825
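
The warning above appears because an exact Spearman p-value cannot be computed when the data contain tied values, so `cor.test` falls back to an asymptotic approximation. As a sketch, passing `exact = FALSE` requests that approximation directly and avoids the warning without changing the estimate. The vectors below are made-up toy data with ties, not the assignment data.

```r
# Toy data with repeated values (assumed, for illustration only)
x_tied <- c(1, 2, 2, 3, 4, 5, 5, 6)
y_tied <- c(2, 1, 3, 3, 5, 4, 6, 6)

# Default call: attempts an exact p-value and warns because of the ties
res_default <- suppressWarnings(cor.test(x_tied, y_tied, method = "spearman"))

# Explicitly request the asymptotic p-value: no warning is raised
res_asym <- cor.test(x_tied, y_tied, method = "spearman", exact = FALSE)

# Compare rho and the p-value between the two calls
all.equal(res_default$estimate, res_asym$estimate)
all.equal(res_default$p.value, res_asym$p.value)
```
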
cor.test(
  DatasetB$ScreenTime,
  DatasetB$SleepingHours,
  method = "spearman"
)
## 
##  Spearman's rank correlation rho
## 
## data:  DatasetB$ScreenTime and DatasetB$SleepingHours
## S = 259052, p-value = 3.521e-09
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##        rho 
## -0.5544674
# Scatterplots with a fitted linear regression line and its confidence band
ggscatter(
  DatasetA,
  x = "StudyHours",
  y = "ExamScore",
  add = "reg.line",
  conf.int = TRUE
)

ggscatter(
  DatasetB,
  x = "ScreenTime",
  y = "SleepingHours",
  add = "reg.line",
  conf.int = TRUE
)