ASSIGNMENT 4

```{r}
library(readxl)  # provides read_excel()
```

Dataset A

```{r}
DatasetA <- read_excel("C:/Users/user/Downloads/DatasetA.xlsx")

# Mean and standard deviation of StudyHours
mean(DatasetA$StudyHours)  # Output: [1] 6.135609
sd(DatasetA$StudyHours)    # Output: [1] 1.369224

# Mean and standard deviation of ExamScore
mean(DatasetA$ExamScore)   # Output: [1] 90.06906
sd(DatasetA$ExamScore)     # Output: [1] 6.795224
```

```{r}
hist(DatasetA$StudyHours,
     main = "StudyHours", breaks = 20,
     col = "lightblue", border = "white",
     cex.main = 1, cex.axis = 1, cex.lab = 1)

hist(DatasetA$ExamScore,
     main = "ExamScore", breaks = 20,
     col = "lightcoral", border = "white",
     cex.main = 1, cex.axis = 1, cex.lab = 1)
```

The variable "StudyHours" appears normally distributed. The histogram looks symmetrical (most of the data is in the middle) and has a clear bell shape.

The variable "ExamScore" appears normally distributed. The histogram looks symmetrical (most of the data is in the middle) and has a clear bell shape.

```{r}
shapiro.test(DatasetA$StudyHours)
shapiro.test(DatasetA$ExamScore)
```

The Shapiro-Wilk p-value for the StudyHours normality test is greater than .05 (0.9349), so the data is consistent with a normal distribution.

The Shapiro-Wilk p-value for the ExamScore normality test is less than .05 (0.006465), so the data is not normally distributed.

Correlation test

```{r}
cor.test(DatasetA$StudyHours, DatasetA$ExamScore, method = "pearson")

# Output:
#   Pearson's product-moment correlation
#
#   data:  DatasetA$StudyHours and DatasetA$ExamScore
#   t = 20.959, df = 98, p-value < 2.2e-16
#   alternative hypothesis: true correlation is not equal to 0
#   95 percent confidence interval:
#    0.8606509 0.9346369
#   sample estimates:
#        cor
#   0.904214
```

Scatterplot

```{r}
library(ggpubr)  # ggscatter() comes from the ggpubr package

ggscatter(
  DatasetA,
  x = "StudyHours", y = "ExamScore",
  add = "reg.line",
  xlab = "StudyHours", ylab = "ExamScore"
)
```
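The `t` statistic and confidence interval printed by `cor.test()` can be recovered from the sample correlation and the degrees of freedom alone. The sketch below is a rough check using the values reported above for Dataset A (cor = 0.904214, df = 98). The helper name `check_pearson` is hypothetical, and the interval uses the Fisher z-transformation, which is the approach the `cor.test()` documentation describes for Pearson's correlation.

```{r}
# Hypothetical helper: recompute the t statistic and confidence interval
# for a Pearson correlation from r and the degrees of freedom (df = n - 2).
check_pearson <- function(r, df, conf_level = 0.95) {
  n      <- df + 2
  t_stat <- r * sqrt(df) / sqrt(1 - r^2)    # t = r * sqrt(df) / sqrt(1 - r^2)
  z      <- atanh(r)                        # Fisher z-transformation of r
  se     <- 1 / sqrt(n - 3)                 # standard error on the z scale
  crit   <- qnorm(1 - (1 - conf_level) / 2) # about 1.96 for a 95% interval
  ci     <- tanh(z + c(-1, 1) * crit * se)  # back-transform to the r scale
  list(t = t_stat, ci = ci)
}

res_a <- check_pearson(r = 0.904214, df = 98)
round(res_a$t, 3)   # about 20.959, in line with the printed t
round(res_a$ci, 4)  # roughly 0.8607 to 0.9346
```

Small differences in the last digits are expected because the input correlation is already rounded to six decimal places.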

Dataset B

```{r}
DatasetB <- read_excel("C:/Users/user/Downloads/DatasetB.xlsx")

# Mean and standard deviation of ScreenTime
mean(DatasetB$ScreenTime)     # Output: [1] 5.063296
sd(DatasetB$ScreenTime)       # Output: [1] 2.056833

# Mean and standard deviation of SleepingHours
mean(DatasetB$SleepingHours)  # Output: [1] 6.938459
sd(DatasetB$SleepingHours)    # Output: [1] 1.351332
```

```{r}
hist(DatasetB$ScreenTime,
     main = "ScreenTime", breaks = 20,
     col = "lightblue", border = "white",
     cex.main = 1, cex.axis = 1, cex.lab = 1)

hist(DatasetB$SleepingHours,
     main = "SleepingHours", breaks = 20,
     col = "lightcoral", border = "white",
     cex.main = 1, cex.axis = 1, cex.lab = 1)
```

The variable "ScreenTime" appears normally distributed. The histogram looks symmetrical (most of the data is in the middle) and has a clear bell shape.

The variable "SleepingHours" appears normally distributed. The histogram looks symmetrical (most of the data is in the middle) and has a clear bell shape.

```{r}
shapiro.test(DatasetB$ScreenTime)
shapiro.test(DatasetB$SleepingHours)
```

The Shapiro-Wilk p-value for the ScreenTime normality test is less than .05 (1.914e-06), so the data is not normally distributed.

The Shapiro-Wilk p-value for the SleepingHours normality test is greater than .05 (0.3004), so the data is consistent with a normal distribution.

Correlation test

```{r}
cor.test(DatasetB$ScreenTime, DatasetB$SleepingHours, method = "pearson")

# Output:
#   data:  DatasetB$ScreenTime and DatasetB$SleepingHours
#   t = -8.2538, df = 98, p-value = 7.27e-13
#   alternative hypothesis: true correlation is not equal to 0
#   95 percent confidence interval:
#    -0.7433008 -0.5078341
#   sample estimates:
#          cor
#   -0.6403761
```

Scatterplot

```{r}
library(ggpubr)  # ggscatter() comes from the ggpubr package

ggscatter(
  DatasetB,
  x = "ScreenTime", y = "SleepingHours",
  add = "reg.line",
  xlab = "ScreenTime", ylab = "SleepingHours"
)
```
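The Shapiro-Wilk decision rule compares each p-value with the .05 cutoff: a p-value above the cutoff gives no evidence against normality, while a p-value at or below it indicates a departure from normality. A minimal sketch of that rule, applied to the two p-values reported for Dataset B, is shown here. The function name `normality_verdict` and the wording of the verdicts are illustrative assumptions, not part of any package.

```{r}
# Hypothetical helper: turn a Shapiro-Wilk p-value into a plain-language verdict.
# alpha = 0.05 is the cutoff used in this assignment.
normality_verdict <- function(p, alpha = 0.05) {
  ifelse(p > alpha,
         "no evidence against normality",
         "departs from normality")
}

# p-values as reported for Dataset B
p_values <- c(ScreenTime = 1.914e-06, SleepingHours = 0.3004)
normality_verdict(p_values)
```

With the data loaded, the same rule could be applied directly to a test result, for example `normality_verdict(shapiro.test(DatasetB$ScreenTime)$p.value)`.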

install.packages("rmarkdown")

library(rmarkdown)


JOSEPHINE MAKENA

2026-02-09