Assignment 1 Q2

library(readxl) 
library(ggpubr)

## Loading required package: ggplot2

DatasetB <- read_excel("C:/Users/Joyce/Downloads/DatasetB.xlsx")

mean(DatasetB$ScreenTime)

## [1] 5.063296

sd(DatasetB$ScreenTime)

## [1] 2.056833

mean(DatasetB$SleepingHours)

## [1] 6.938459

sd(DatasetB$SleepingHours)

## [1] 1.351332

hist(DatasetB$ScreenTime,
     main = "ScreenTime",
     breaks = 20,
     col = "lightblue",
     border = "white",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

hist(DatasetB$SleepingHours,
     main = "Sleeping Hours",
     breaks = 20,
     col = "lightcoral",
     border = "white",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

The variable “Screen Time” does not appear to be normally distributed. The histogram is positively skewed: most values are on the left, with a tail extending to the right, so it does not form a proper bell curve. The variable “Sleeping Hours” appears to be normally distributed. The histogram looks symmetrical, with most values in the middle, and forms a proper bell curve.
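A visual reading of skew can be backed up with a number. The sketch below is a minimal, moment-based skewness function in base R; `skewness()` is a hypothetical helper written here for illustration (it is not part of the assignment code), and the two vectors are made-up toy data, not values from DatasetB.

```r
# Hypothetical helper: moment-based sample skewness.
# Positive values indicate a longer right tail; values near 0 indicate symmetry.
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Toy data for illustration only (not from DatasetB)
right_skewed <- c(1, 1, 2, 2, 2, 3, 3, 4, 6, 10)
symmetric <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

skewness(right_skewed)  # positive: long right tail
skewness(symmetric)     # 0: perfectly symmetric
```

Applied to the assignment data, the same helper could be called as `skewness(DatasetB$ScreenTime)` and `skewness(DatasetB$SleepingHours)`.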

shapiro.test(DatasetB$ScreenTime)

## 
##  Shapiro-Wilk normality test
## 
## data:  DatasetB$ScreenTime
## W = 0.90278, p-value = 1.914e-06

shapiro.test(DatasetB$SleepingHours)

## 
##  Shapiro-Wilk normality test
## 
## data:  DatasetB$SleepingHours
## W = 0.98467, p-value = 0.3004

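The Shapiro-Wilk p-value for the Screen Time normality test is less than .05 (p = 1.914e-06), so the data deviate significantly from a normal distribution. The Shapiro-Wilk p-value for the Sleeping Hours normality test is greater than .05 (p = .30), so there is no evidence that the data deviate from a normal distribution, and they are treated as normal.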
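The decision rule used here can be written out as a short sketch. `normality_decision()` is a hypothetical helper, the .05 threshold is the one used in this assignment, and the two example vectors are constructed toy data rather than values from DatasetB. Note that a p-value above the threshold only means the test found no evidence against normality.

```r
# Hypothetical helper: apply the Shapiro-Wilk decision rule at a given alpha.
normality_decision <- function(x, alpha = 0.05) {
  result <- shapiro.test(x)
  if (result$p.value < alpha) {
    "evidence against normality"
  } else {
    "no evidence against normality"
  }
}

# Toy data for illustration only (not from DatasetB)
skewed_sample <- exp(seq(0, 5, length.out = 50))  # strongly right-skewed
normal_like_sample <- qnorm(ppoints(50))          # evenly spaced normal quantiles

normality_decision(skewed_sample)
normality_decision(normal_like_sample)
```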

cor.test(DatasetB$ScreenTime, 
         DatasetB$SleepingHours,
         method = "spearman")

## 
##  Spearman's rank correlation rho
## 
## data:  DatasetB$ScreenTime and DatasetB$SleepingHours
## S = 259052, p-value = 3.521e-09
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##        rho 
## -0.5544674

The Spearman correlation test was selected because Screen Time was not normally distributed according to the histogram and the Shapiro-Wilk test, even though Sleeping Hours was. The p-value (probability value) is 3.521e-09, which is below .05. This means the result is statistically significant, and the alternative hypothesis is supported. The rho value is -.55. The correlation is negative, which means that as Screen Time increases, Sleeping Hours decrease. The absolute value of the correlation is greater than .50, which means the relationship is strong.
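As a check on the output above, rho can be recovered from the reported S statistic with the relation rho = 1 - 6S / (n^3 - n). The sketch below assumes n = 100, which is inferred from the degrees of freedom reported in the write-up (df = n - 2 = 98) and is not printed by `cor.test()`. The second part uses made-up toy vectors to show that Spearman's rho is Pearson's r computed on ranks.

```r
# Values copied from the cor.test() output above
S <- 259052
n <- 100  # assumption: inferred from the reported df of 98 (df = n - 2)

# Recover rho from the S statistic
rho_from_S <- 1 - 6 * S / (n^3 - n)
round(rho_from_S, 4)  # -0.5545, matching the reported rho

# Toy data for illustration only (not from DatasetB)
x <- c(2, 4, 5, 7, 9, 12)
y <- c(8, 7.5, 7, 6, 6.5, 4)

# Spearman's rho is Pearson's r applied to the ranks
spearman_rho <- cor(x, y, method = "spearman")
pearson_on_ranks <- cor(rank(x), rank(y))
all.equal(spearman_rho, pearson_on_ranks)
```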

ggscatter(
  DatasetB,
  x = "ScreenTime",
  y = "SleepingHours",
  add = "reg.line",
  xlab = "Screen Time",
  ylab = "Sleeping Hours"
)

The line of best fit slopes downward to the right. This means the direction of the relationship is negative: as Screen Time increases, Sleeping Hours decrease. The dots fall fairly close to the line. This means there is a moderate to strong relationship between the variables. The dots form a roughly straight-line pattern, which means the relationship is linear. There are possible outliers at very high Screen Time and very high Sleeping Hours. However, they do not appear to strongly affect the overall negative relationship between Screen Time and Sleeping Hours.

Screen time (M = 5.06, SD = 2.06) was significantly correlated with sleeping hours (M = 6.94, SD = 1.35), ρ(98) = -.55, p < .001. The relationship was negative and strong. As screen time increased, sleeping hours decreased.
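Very small p-values are conventionally reported as an inequality (p < .001) rather than as an exact value. A minimal sketch of that formatting rule follows; `format_p()` is a hypothetical helper written for illustration, not a function from any package used above.

```r
# Hypothetical helper: format a p-value for an APA-style write-up.
format_p <- function(p) {
  if (p < 0.001) {
    "p < .001"
  } else {
    # Three decimals, with the leading zero dropped
    paste0("p = ", sub("^0", "", formatC(p, format = "f", digits = 3)))
  }
}

format_p(3.521e-09)  # "p < .001"
format_p(0.3004)     # "p = .300"
```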

Joyce Ben

2026-02-09