First Assignment

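This assignment has four sections: descriptive statistics, normality tests, correlational analysis, and visualization.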

```{r}
library(readxl)  # provides read_excel()
DatasetA <- read_excel("/Users/ha113ab/Desktop/datasets/DatasetA.xlsx")
```

We read the dataset from the folder where it was saved, accessing it through its file path.

In the first section, we compute descriptive statistics: here we calculate means and standard deviations.

```{r}
mean(DatasetA$StudyHours)
```

`mean()` finds the average study hours.

```{r}
sd(DatasetA$StudyHours)
```

`sd()` finds how spread out the study hours are.

```{r}
mean(DatasetA$ExamScore)
```

`mean()` finds the average exam score.

```{r}
sd(DatasetA$ExamScore)
```

`sd()` finds how spread out the exam scores are.
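To make these two functions concrete, here is a minimal sketch that reproduces `mean()` and `sd()` by hand. The vector `hours` is hypothetical and is not taken from DatasetA. Note that `sd()` computes the sample standard deviation, which divides by n - 1 rather than n.

```{r}
# Hypothetical study hours for five students (illustration only, not from DatasetA)
hours <- c(2, 4, 4, 6, 9)

n <- length(hours)
manual_mean <- sum(hours) / n                              # same as mean(hours)
manual_sd <- sqrt(sum((hours - manual_mean)^2) / (n - 1))  # same as sd(hours)

all.equal(manual_mean, mean(hours))  # TRUE
all.equal(manual_sd, sd(hours))      # TRUE
```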

Then we displayed the two variables with histograms.

The first histogram shows how many students studied for each range of hours.

```{r}
hist(DatasetA$StudyHours, main = "StudyHours", breaks = 20, col = "orange",
     border = "black", cex.main = 1, cex.axis = 1, cex.lab = 1)
```

The second histogram shows how many students received each range of exam scores.

```{r}
hist(DatasetA$ExamScore, main = "ExamScore", breaks = 20, col = "grey",
     border = "white", cex.main = 1, cex.axis = 1, cex.lab = 1)
```
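When `breaks` is given as a single number, as in the two histograms above, base R treats it as a suggestion and picks "pretty" breakpoints, so the number of bars may not be exactly 20. The sketch below uses simulated values (not DatasetA) to show how the bins actually used can be inspected without drawing the plot.

```{r}
set.seed(1)                              # make the simulated data reproducible
x <- runif(200, min = 0, max = 10)       # simulated values (illustration only)

h <- hist(x, breaks = 20, plot = FALSE)  # compute the bins without drawing

h$breaks                     # breakpoints R actually chose
length(h$counts)             # number of bars actually used
sum(h$counts) == length(x)   # TRUE: every value falls into exactly one bin
```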

In the second section, we run normality tests: here we check whether the data are normally distributed, or bell-shaped.

```{r}
shapiro.test(DatasetA$StudyHours)
```

tests whether study hours follow a normal distribution.

```{r}
shapiro.test(DatasetA$ExamScore)
```

tests whether exam scores follow a normal distribution.
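The output of `shapiro.test()` is usually read through its p-value: a small p-value is evidence against normality. Here is a minimal sketch of that reading, assuming the conventional 0.05 cutoff (the assignment does not state which threshold was used) and using simulated data rather than DatasetA.

```{r}
set.seed(1)                          # make the simulated data reproducible
x <- rnorm(100, mean = 10, sd = 2)   # simulated values (illustration only)

result <- shapiro.test(x)
result$statistic                     # the W statistic
result$p.value                       # the p-value of the test

alpha <- 0.05                        # assumed significance level
if (result$p.value < alpha) {
  message("Evidence against normality")
} else {
  message("No evidence against normality")
}
```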

In the third section, we run a correlational analysis: here we check the relationship between the two variables.

```{r}
cor.test(DatasetA$StudyHours, DatasetA$ExamScore, method = "spearman")
```

checks whether study hours and exam scores are related, using Spearman's rank correlation, which does not assume a straight-line relationship.
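Spearman's method does not need a straight-line relationship because it works on the ranks of the values rather than the values themselves. The sketch below illustrates this with hypothetical numbers (not from DatasetA): the Spearman coefficient equals the ordinary Pearson correlation computed on the ranks.

```{r}
# Hypothetical values for six students (illustration only, not from DatasetA)
hours  <- c(1, 2, 3, 4, 5, 6)
scores <- c(50, 55, 53, 70, 90, 99)

rho_spearman <- cor(hours, scores, method = "spearman")
rho_by_ranks <- cor(rank(hours), rank(scores))  # Pearson correlation on the ranks

all.equal(rho_spearman, rho_by_ranks)  # TRUE
```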

In the fourth and final section, we visualize the relationship between study hours and exam scores.

`ggscatter()` creates a scatterplot with a dot for each student and a trend line.

```{r}
library(ggpubr)  # provides ggscatter()
ggscatter(DatasetA, x = "StudyHours", y = "ExamScore", add = "reg.line",
          xlab = "StudyHours", ylab = "ExamScore")
```
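To see what a trend line of this kind represents, the sketch below fits a straight line with base R's `lm()`. It assumes the `reg.line` option draws an ordinary least-squares line, which is worth confirming in the ggpubr documentation, and it uses a hypothetical data frame `toy` rather than DatasetA.

```{r}
# Hypothetical data lying exactly on a line (illustration only, not from DatasetA)
toy <- data.frame(StudyHours = c(1, 2, 3, 4, 5))
toy$ExamScore <- 40 + 10 * toy$StudyHours

fit <- lm(ExamScore ~ StudyHours, data = toy)  # least-squares straight line
coef(fit)  # intercept and slope of the fitted line
```

The slope is the change in exam score that the fitted line associates with one extra study hour.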

Haileab Bekele

2026-02-03