Part I Setting Up Libraries and Importing the Dataset
library(readxl)
library(ggpubr)
## Loading required package: ggplot2
Importing Dataset B into the environment
DatasetB <- read_excel("/Users/sarva/Desktop/DatasetB.xlsx")
Part II Descriptive Statistics
After importing, we calculate the mean and standard deviation of each variable: the independent variable (IV), ScreenTime, and the dependent variable (DV), SleepingHours.
mean(DatasetB$ScreenTime)
## [1] 5.063296
sd(DatasetB$ScreenTime)
## [1] 2.056833
mean(DatasetB$SleepingHours)
## [1] 6.938459
sd(DatasetB$SleepingHours)
## [1] 1.351332
Interpretation: ScreenTime has a mean of 5.06 (SD = 2.06) and SleepingHours has a mean of 6.94 (SD = 1.35). These values are reported again in the final summary.
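The four separate calls above can be collapsed into one small table. This is a minimal sketch, assuming a data frame with the same two column names; `describe_vars` and the `demo` data frame are hypothetical and exist only so the block runs without the Excel file.

```r
# Hypothetical helper: mean and SD for each listed column, returned as a matrix
describe_vars <- function(df, vars) {
  sapply(df[vars], function(x) {
    c(mean = mean(x, na.rm = TRUE), sd = sd(x, na.rm = TRUE))
  })
}

# Made-up demo data standing in for DatasetB (not the real values)
demo <- data.frame(
  ScreenTime    = c(2, 4, 6, 8),
  SleepingHours = c(8, 7, 7, 6)
)

stats <- describe_vars(demo, c("ScreenTime", "SleepingHours"))
print(round(stats, 3))
```

With the real data, the call would be `describe_vars(DatasetB, c("ScreenTime", "SleepingHours"))`.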
Part III Histogram Visualisation
hist(DatasetB$ScreenTime,
     main = "Screen Time",
     breaks = 20,
     col = "lightblue",
     border = "white",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)
hist(DatasetB$SleepingHours,
     main = "Sleeping Hours",
     breaks = 20,
     col = "red",
     border = "black",
     cex.main = 1,
     cex.axis = 1,
     cex.lab = 1)
Interpretation: As the histograms above show, the distribution of ScreenTime is positively skewed, while the distribution of SleepingHours appears approximately normal.
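Skewness can also be quantified rather than judged by eye. The sketch below computes the moment-based sample skewness in base R; `skewness_moment` is a hypothetical helper and both demo vectors are made up. A positive value indicates a longer right tail, and a value near zero indicates a roughly symmetric distribution.

```r
# Hypothetical helper: moment-based sample skewness (third standardised moment)
skewness_moment <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Made-up demo vectors (not taken from DatasetB)
right_tailed <- c(1, 1, 2, 2, 3, 10)
symmetric    <- c(1, 2, 3, 4, 5)

skew_right <- skewness_moment(right_tailed)  # positive: long right tail
skew_sym   <- skewness_moment(symmetric)     # zero: symmetric around the mean
print(c(right_tailed = skew_right, symmetric = skew_sym))
```

On the real data this would be applied as `skewness_moment(DatasetB$ScreenTime)`.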
Normality tests
shapiro.test(DatasetB$ScreenTime)
##
## Shapiro-Wilk normality test
##
## data: DatasetB$ScreenTime
## W = 0.90278, p-value = 1.914e-06
shapiro.test(DatasetB$SleepingHours)
##
## Shapiro-Wilk normality test
##
## data: DatasetB$SleepingHours
## W = 0.98467, p-value = 0.3004
Interpretation: The Shapiro-Wilk test checks each variable for normality; its null hypothesis is that the data are normally distributed. For ScreenTime, p = 1.914e-06, which is below 0.05, so we reject normality. For SleepingHours, p = 0.3004, which is above 0.05, so there is no evidence against normality. Because one of the two variables is not normally distributed, Spearman's rank correlation is used in the next part.
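The decision rule for the Shapiro-Wilk test can be written out explicitly: a p-value below the chosen significance level (0.05 here) rejects normality. The sketch below applies that rule to the two p-values printed above. The choice between Pearson and Spearman at the end is a common rule of thumb, stated here as an assumption rather than the only valid approach.

```r
alpha <- 0.05

# p-values copied from the shapiro.test() output above
p_values <- c(ScreenTime = 1.914e-06, SleepingHours = 0.3004)

# TRUE means the test found no evidence against normality
looks_normal <- p_values >= alpha
print(looks_normal)

# Rule of thumb (an assumption): Pearson only if both variables look normal,
# otherwise fall back to Spearman's rank correlation
method <- if (all(looks_normal)) "pearson" else "spearman"
print(method)
```

Here ScreenTime fails the normality check, so the rule selects `"spearman"`, the method passed to `cor.test()` in Part IV.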
Part IV Correlation Analysis
cor.test(DatasetB$ScreenTime, DatasetB$SleepingHours, method = "spearman")
##
## Spearman's rank correlation rho
##
## data: DatasetB$ScreenTime and DatasetB$SleepingHours
## S = 259052, p-value = 3.521e-09
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.5544674
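Interpretation: The Spearman correlation between ScreenTime and SleepingHours is rho = -0.55 with p = 3.521e-09. Since p < 0.05, the correlation is statistically significant: higher screen time is associated with fewer sleeping hours, and the relationship is moderate and negative.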
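The `S` statistic and `rho` in the output are two views of the same quantity: R derives `S` from the rank correlation as S = (n^3 - n)(1 - rho) / 6. The check below recovers `rho` from `S`. The sample size is not printed in the output, so n = 100 is an assumption; it is the value that reproduces the reported `rho`.

```r
S <- 259052   # from the cor.test() output above
n <- 100      # assumed sample size, not shown in the output

# Invert S = (n^3 - n) * (1 - rho) / 6 to recover rho
rho <- 1 - 6 * S / (n^3 - n)
print(rho)
```

If the result did not match the reported `rho`, the assumed `n` would be wrong; `nrow(DatasetB)` gives the actual value.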
Part V Scatterplot Analysis
ggscatter(
DatasetB,
x = "ScreenTime",
y = "SleepingHours",
add = "reg.line",
xlab = "ScreenTime",
ylab = "SleepingHours"
)
Interpretation: The scatterplot shows a roughly linear, negative relationship between ScreenTime and SleepingHours. There are a few outliers, none of them extreme.
Reporting the results
Mean (ScreenTime) = 5.063296
Mean (SleepingHours) = 6.938459
Standard Deviation (ScreenTime) = 2.056833
Standard Deviation (SleepingHours) = 1.351332
rho = -0.5544674, p-value = 3.521e-09