This analysis has four sections.
```{r}
library(readxl)  # provides read_excel()
DatasetA <- read_excel("/Users/ha113ab/Desktop/datasets/DatasetA.xlsx")
```
We read the dataset from the folder where it was saved, accessing it through its file path.
In the first section, we compute descriptive statistics: the means and standard deviations of both variables.
```{r}
mean(DatasetA$StudyHours)
```
This finds the average study hours.
```{r}
sd(DatasetA$StudyHours)
```
This finds how spread out the study hours are.
```{r}
mean(DatasetA$ExamScore)
```
This finds the average exam score.
```{r}
sd(DatasetA$ExamScore)
```
This finds how spread out the exam scores are.
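As a minimal sketch of the same calculations, the chunk below computes the mean and standard deviation of every column in one call. The data frame `example_data` and its values are made up for illustration and are not taken from DatasetA.

```{r}
# Hypothetical example data; not the real DatasetA
example_data <- data.frame(
  StudyHours = c(2, 4, 6, 8, 10),
  ExamScore  = c(55, 60, 70, 80, 85)
)

# Apply mean() and sd() to each column and collect the results in a matrix
summary_stats <- sapply(example_data, function(x) c(mean = mean(x), sd = sd(x)))
print(summary_stats)
```

Each column of `summary_stats` holds the two summaries for one variable, which makes it easy to compare the variables side by side.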
We then displayed both variables as histograms.
The first histogram shows how many students studied for different numbers of hours.
```{r}
hist(DatasetA$StudyHours, main = "StudyHours", breaks = 20,
     col = "orange", border = "black",
     cex.main = 1, cex.axis = 1, cex.lab = 1)
```
The second histogram shows how many students received different exam scores.
```{r}
hist(DatasetA$ExamScore,
     main = "ExamScore", breaks = 20, col = "grey", border = "white",
     cex.main = 1, cex.axis = 1, cex.lab = 1)
```
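To make the idea of "how many students fall in each range" concrete, here is a small sketch that asks hist() for the bin counts instead of a drawing. The vector `study_hours` and the bin edges are hypothetical values chosen for illustration, not values from DatasetA.

```{r}
# Hypothetical study hours; not the real DatasetA
study_hours <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 8)

# plot = FALSE returns the bin edges and counts instead of drawing the plot
h <- hist(study_hours, breaks = c(0, 2, 4, 6, 8), plot = FALSE)
print(h$breaks)  # the bin edges
print(h$counts)  # the number of values falling in each bin
```

Here the bin edges are given explicitly. When breaks is a single number such as 20, R treats it as a suggestion and picks rounded bin edges near that count, so the plot may not have exactly 20 bars.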
In the second section, we run normality tests: here we check whether the data are normally distributed, or bell-shaped.
```{r}
shapiro.test(DatasetA$StudyHours)
```
This tests whether study hours follow a normal distribution.
```{r}
shapiro.test(DatasetA$ExamScore)
```
This tests whether exam scores follow a normal distribution.
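The sketch below shows one way to read the result of shapiro.test(). It uses two simulated samples and an assumed significance level of 0.05. Both the samples and the threshold are assumptions made for illustration and do not come from DatasetA.

```{r}
set.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical samples; not the real DatasetA
samples <- list(
  bell_shaped  = rnorm(100, mean = 10, sd = 2),  # drawn from a normal distribution
  right_skewed = rexp(100, rate = 0.5)           # drawn from an exponential distribution
)

alpha <- 0.05  # assumed significance level

for (name in names(samples)) {
  result <- shapiro.test(samples[[name]])
  # A small p-value means the sample is unlikely to come from a normal distribution
  verdict <- if (result$p.value < alpha) {
    "evidence against normality"
  } else {
    "no evidence against normality"
  }
  cat(name, ": W = ", round(result$statistic, 3),
      ", p = ", signif(result$p.value, 3), " -> ", verdict, "\n", sep = "")
}
```

A p-value above the threshold does not prove the data are normal. It only means the test found no strong evidence against normality.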
In the third section, we run a correlational analysis: here we check the relationship between the two variables.
```{r}
cor.test(DatasetA$StudyHours, DatasetA$ExamScore, method = "spearman")
```
This tests the association between study hours and exam scores using Spearman's rank correlation, which does not assume a straight-line relationship.
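To illustrate why a rank-based correlation does not need a straight-line relationship, the sketch below uses made-up data in which y always increases with x, but along a curve. Pearson's correlation is included only for comparison and is my addition, not part of the analysis above.

```{r}
# Hypothetical data: y always increases with x, but not along a straight line
x <- 1:10
y <- x^3

pearson  <- cor(x, y, method = "pearson")   # measures linear association
spearman <- cor(x, y, method = "spearman")  # measures monotonic association via ranks

cat("Pearson r:", round(pearson, 3), "\n")
cat("Spearman rho:", spearman, "\n")
```

Because the ranks of x and y agree perfectly, Spearman's rho is 1 here, while Pearson's r falls below 1 because the points do not lie on a straight line.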
In the fourth and final section, we visualize the data:
ggscatter() creates a scatterplot with a dot for each student and a trend line.
```{r}
library(ggpubr)  # provides ggscatter()
ggscatter(
  DatasetA,
  x = "StudyHours", y = "ExamScore",
  add = "reg.line",  # add a regression (trend) line
  xlab = "StudyHours", ylab = "ExamScore"
)
```
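As a rough sketch of what the trend line represents, the chunk below fits a straight line with lm() and draws it using base R graphics. It assumes the trend line is an ordinary least-squares fit, which is my understanding of add = "reg.line". The data frame `example_data` is hypothetical and not taken from DatasetA.

```{r}
# Hypothetical example data; not the real DatasetA
example_data <- data.frame(
  StudyHours = c(2, 4, 6, 8, 10),
  ExamScore  = c(55, 60, 70, 80, 85)
)

# Fit a straight line: ExamScore = intercept + slope * StudyHours
fit <- lm(ExamScore ~ StudyHours, data = example_data)
print(coef(fit))

# Base R scatterplot with the fitted line drawn on top
plot(example_data$StudyHours, example_data$ExamScore,
     xlab = "StudyHours", ylab = "ExamScore", pch = 19)
abline(fit, col = "blue")
```

The slope is the estimated change in exam score for each additional hour of study in this made-up sample.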