With the data cleaned, we can move on to exploratory data analysis, starting with summary statistics for the score variables.
Summary Statistics after Cleaning
# Summary statistics for numerical variables
summary_stats_clean <- summary(student_data[, c("total_score", "math score", "reading score", "writing score")])
print(summary_stats_clean)
  total_score      math score     reading score    writing score
 Min.   : 88.0   Min.   : 22.00   Min.   : 28.00   Min.   : 27.00
 1st Qu.:175.0   1st Qu.: 57.00   1st Qu.: 60.00   1st Qu.: 58.00
 Median :206.0   Median : 66.00   Median : 70.00   Median : 69.00
 Mean   :204.3   Mean   : 66.42   Mean   : 69.47   Mean   : 68.38
 3rd Qu.:234.0   3rd Qu.: 77.00   3rd Qu.: 80.00   3rd Qu.: 79.00
 Max.   :300.0   Max.   :100.00   Max.   :100.00   Max.   :100.00
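summary() reports the quartiles, mean, and range but no standard deviation, so a measure of spread can be a useful complement. The sketch below is one possible way to get it with sapply(). It runs on a small made-up data frame (demo_data, a hypothetical stand-in for student_data), and it assumes total_score is the sum of the three subject scores, which the output above is consistent with (the three subject means add up to 204.27, and the maximum total is 300) but does not confirm.

```r
# Hypothetical stand-in for student_data; the values are made up for illustration.
demo_data <- data.frame(
  `math score`    = c(72, 69, 90, 47, 76),
  `reading score` = c(72, 90, 95, 57, 78),
  `writing score` = c(74, 88, 93, 44, 75),
  check.names = FALSE  # keep the spaces in the column names
)

# Assumption: total_score is the sum of the three subject scores.
subject_cols <- c("math score", "reading score", "writing score")
demo_data$total_score <- rowSums(demo_data[, subject_cols])

score_cols <- c("total_score", subject_cols)

# Standard deviation and interquartile range for each score column.
# sapply() returns a matrix with one row per statistic and one column per variable.
spread_stats <- sapply(
  demo_data[, score_cols],
  function(x) c(sd = sd(x), IQR = IQR(x))
)
print(round(spread_stats, 2))
```

To apply the same idea to the real data, replace demo_data with student_data and drop the lines that build the example data frame.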