Part I

Setting up libraries and importing the dataset

library(readxl)
library(ggpubr)
## Loading required package: ggplot2

Importing Dataset B into the environment

DatasetB <- read_excel("/Users/sarva/Desktop/DatasetB.xlsx")

Part II Descriptive Statistics

After importing, we calculate the mean and standard deviation of each variable: the independent variable (IV), screen time, and the dependent variable (DV), sleeping hours.

mean(DatasetB$ScreenTime)
## [1] 5.063296
sd(DatasetB$ScreenTime)
## [1] 2.056833
mean(DatasetB$SleepingHours)
## [1] 6.938459
sd(DatasetB$SleepingHours)
## [1] 1.351332

Interpretation: Screen time has a mean of 5.06 and a standard deviation of 2.06; sleeping hours has a mean of 6.94 and a standard deviation of 1.35. These values are used as reference points in the analyses that follow.
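The four separate calls above can also be collapsed into one pass over the columns. The following is a minimal sketch, not the method used in this report: `demo` is a made-up stand-in for DatasetB (its values are synthetic), and `describe` is a hypothetical helper name.

```r
# Synthetic stand-in for DatasetB (values are made up for illustration only)
set.seed(1)
demo <- data.frame(
  ScreenTime    = rexp(100, rate = 1 / 5),
  SleepingHours = rnorm(100, mean = 7, sd = 1.3)
)

# Compute mean and SD for every column at once
# (rows = statistics, columns = variables)
describe <- function(df) {
  sapply(df, function(x) c(mean = mean(x, na.rm = TRUE), sd = sd(x, na.rm = TRUE)))
}

summary_table <- describe(demo)
print(round(summary_table, 2))
```

Applied to the real data, `describe(DatasetB[, c("ScreenTime", "SleepingHours")])` should reproduce the means and standard deviations shown above, assuming both columns are numeric.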

Part III Histogram Visualisation

hist(DatasetB$ScreenTime,
     main = "Screen Time",
     breaks = 20,
     col = "lightblue",
     border = "white",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

hist(DatasetB$SleepingHours,
     main = "Sleeping Hours",
     breaks = 20,
     col = "red",
     border = "black",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)

Interpretation: As the histograms above show, the distribution of “screen time” is positively skewed, while the distribution of “sleeping hours” appears approximately normal.
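A visual reading of skew can be backed up with a number. The sketch below computes a moment-based sample skewness in base R; `skewness` is a hypothetical helper written for illustration (it is not part of readxl or ggpubr), and the input vectors are made-up examples rather than values from DatasetB.

```r
# Moment-based sample skewness: positive values indicate a longer right tail,
# values near zero indicate a roughly symmetric distribution
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

right_tailed <- skewness(c(1, 2, 2, 3, 3, 3, 10))  # made-up right-tailed data
symmetric    <- skewness(c(1, 2, 3, 4, 5))         # made-up symmetric data

right_tailed > 0   # TRUE
symmetric          # 0
```

Calling `skewness(DatasetB$ScreenTime)` would be expected to give a positive value if the histogram reading above is right.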

Normality tests

shapiro.test(DatasetB$ScreenTime)
## 
##  Shapiro-Wilk normality test
## 
## data:  DatasetB$ScreenTime
## W = 0.90278, p-value = 1.914e-06
shapiro.test(DatasetB$SleepingHours)
## 
##  Shapiro-Wilk normality test
## 
## data:  DatasetB$SleepingHours
## W = 0.98467, p-value = 0.3004

Interpretation: We run the Shapiro-Wilk test on each variable to check normality. For screen time, p = 1.914e-06, which is below 0.05, so we reject the null hypothesis of normality: the data are not normally distributed. For sleeping hours, p = 0.3004, which is above 0.05, so there is no evidence against normality. Because one of the two variables is not normal, a rank-based (Spearman) correlation is used in the next part.
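The decision rule is easy to get backwards, so it may help to write it out. In the sketch below, `normality_verdict` is a hypothetical helper name and the 0.05 threshold is the conventional significance level; the two p-values are copied from the Shapiro-Wilk output above.

```r
# Return a plain-language verdict from a Shapiro-Wilk test at level alpha.
# The null hypothesis is that the data ARE normal, so a small p-value
# is evidence AGAINST normality.
normality_verdict <- function(x, alpha = 0.05) {
  p <- shapiro.test(x)$p.value
  if (p < alpha) "reject normality" else "no evidence against normality"
}

# The same rule applied directly to the p-values reported above
p_screen <- 1.914e-06
p_sleep  <- 0.3004
p_screen < 0.05   # TRUE:  ScreenTime departs from normality
p_sleep  < 0.05   # FALSE: SleepingHours is consistent with normality
```

With the real data, `normality_verdict(DatasetB$ScreenTime)` wraps the same test that was run above.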

Part IV Correlation Analysis

cor.test(DatasetB$ScreenTime, DatasetB$SleepingHours, method = "spearman")
## 
##  Spearman's rank correlation rho
## 
## data:  DatasetB$ScreenTime and DatasetB$SleepingHours
## S = 259052, p-value = 3.521e-09
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##        rho 
## -0.5544674

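Interpretation: The Spearman correlation between the two variables is rho = -0.55 with p = 3.521e-09, which is below 0.05. There is a statistically significant, moderate negative relationship: more screen time is associated with fewer sleeping hours.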
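As a consistency check on this output, the S statistic and rho can be related to the sample size. The sketch below assumes that `cor.test()` reports S = (n^3 - n)(1 - rho) / 6 for the Spearman method; under that assumption, the values printed above imply roughly how many observations were used. All numbers are copied from the output above.

```r
# Values copied from the cor.test() output above
S   <- 259052
rho <- -0.5544674
p   <- 3.521e-09

# Assuming S = (n^3 - n) * (1 - rho) / 6, recover (n^3 - n) from S and rho
n_cubed_minus_n <- 6 * S / (1 - rho)

# Solve n^3 - n = n_cubed_minus_n numerically for n
n <- uniroot(function(n) n^3 - n - n_cubed_minus_n, interval = c(2, 1e4))$root
round(n)   # implied number of observations

# Decision at the 5% level: TRUE means the correlation is significant
p < 0.05
```

Comparing `round(n)` with `nrow(DatasetB)` is a quick way to confirm that no rows were silently dropped.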

Part V Scatterplot Analysis

ggscatter(
  DatasetB,
  x = "ScreenTime",
  y = "SleepingHours",
  add = "reg.line",
  xlab = "ScreenTime",
  ylab = "SleepingHours"
) 

Interpretation: The scatterplot shows an approximately linear, negative relationship between screen time and sleeping hours. There are a few outliers, none of them extreme.
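The judgement about outliers was made by eye. One common way to make it explicit is Tukey's fence rule, sketched here with base R's `boxplot.stats()`: a multiplier of 1.5 flags outliers, and a multiplier of 3 flags only extreme ones. `flag_outliers` is a hypothetical helper name and the example vector is made up, not taken from DatasetB.

```r
# Flag values outside the Tukey fences around the hinges.
# coef = 1.5 marks outliers; coef = 3 marks only extreme outliers.
flag_outliers <- function(x, coef = 1.5) {
  boxplot.stats(x, coef = coef)$out
}

# Made-up example values
x <- c(4, 5, 5, 6, 6, 7, 7, 8, 14, 30)

mild_or_worse <- flag_outliers(x)            # 14 and 30
extreme_only  <- flag_outliers(x, coef = 3)  # 30
```

If `flag_outliers(DatasetB$SleepingHours, coef = 3)` returns an empty vector, that would support the statement that none of the outliers is extreme.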

Reporting the results

Mean (ScreenTime) = 5.063296; Mean (SleepingHours) = 6.938459; Standard Deviation (ScreenTime) = 2.056833; Standard Deviation (SleepingHours) = 1.351332; rho = -0.5544674 (p = 3.521e-09)
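For a write-up, these figures are usually rounded and combined into one sentence. The following is one possible sketch: the two-decimal rounding and the "p < .001" convention are assumptions about the reporting style, and the values are copied from the outputs earlier in this report.

```r
# Values copied from the outputs earlier in this report
res <- list(
  screen_mean = 5.063296, screen_sd = 2.056833,
  sleep_mean  = 6.938459, sleep_sd  = 1.351332,
  rho = -0.5544674, p = 3.521e-09
)

# Very small p-values are reported as a bound rather than an exact figure
p_text <- if (res$p < 0.001) "p < .001" else sprintf("p = %.3f", res$p)

report <- sprintf(
  "ScreenTime: M = %.2f, SD = %.2f; SleepingHours: M = %.2f, SD = %.2f; rho = %.2f, %s",
  res$screen_mean, res$screen_sd, res$sleep_mean, res$sleep_sd, res$rho, p_text
)
cat(report, "\n")
```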