2024-03-28

Introduction

Background: An experiment was performed to determine whether aspects of a student’s identity, such as race, gender, and family background (specifically, parental education level), have an impact on performance in school. The Student Performance data set allows us to explore this subject by investigating how identity may or may not influence a student’s academic achievement.

Problem definition: How does identity contribute to a student’s academic performance? Can a student’s level of success be predicted from it?

The complete data set and more information about it can be found at the following link:

https://www.kaggle.com/datasets/spscientist/students-performance-in-exams

Summary of Student Performance Data

Description: The first six rows of the data set, showing each student’s demographic fields alongside the math, reading, and writing scores.

##   gender race.ethnicity parental.level.of.education        lunch
## 1 female        group B           bachelor's degree     standard
## 2 female        group C                some college     standard
## 3 female        group B             master's degree     standard
## 4   male        group A          associate's degree free/reduced
## 5   male        group C                some college     standard
## 6 female        group B          associate's degree     standard
##   test.preparation.course math.score reading.score writing.score
## 1                    none         72            72            74
## 2               completed         69            90            88
## 3                    none         90            95            93
## 4                    none         47            57            44
## 5                    none         76            78            75
## 6                    none         71            83            78

Summary of Student Performance Data Cont.

Description: Counts of students by gender, race/ethnicity group, and parental level of education.

## 
## female   male 
##    518    482
## 
## group A group B group C group D group E 
##      89     190     319     262     140
## 
## associate's degree  bachelor's degree        high school    master's degree 
##                222                118                196                 59 
##       some college   some high school 
##                226                179
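Frequency tables of this kind are typically obtained with `table()`. The sketch below uses a small made-up data frame (the name `students` and its values are assumptions, not the real data) to show the calls.

```r
# Made-up rows for illustration; column names follow the printed output above.
students <- data.frame(
  gender = c("female", "male", "female", "female", "male"),
  race.ethnicity = c("group B", "group C", "group C", "group A", "group C"),
  parental.level.of.education = c("some college", "high school", "some college",
                                  "master's degree", "high school")
)

# One frequency table per categorical variable
gender_counts <- table(students$gender)
race_counts <- table(students$race.ethnicity)
education_counts <- table(students$parental.level.of.education)

gender_counts
race_counts
education_counts
```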

Gender Analysis of Math Scores

Gender Analysis of Reading Scores

Gender Analysis of Writing Scores

Race Analysis of Academic Performance

Correlation Heatmap

Group A Analysis of Parental Level of Education

Group B Analysis of Parental Level of Education

Group C Analysis of Parental Level of Education

Group D Analysis of Parental Level of Education

Group E Analysis of Parental Level of Education

Mean Scores of Ethnicity Group

Linear Regression Analysis

Residuals vs Fitted Values Plot. Purpose: Assess linearity and constant variance. Description: Plots the residuals against the fitted values of the model relating the predictors (gender, race/ethnicity, parental education) to math scores. A random scatter of points around y = 0, with no curve or funnel shape, suggests linearity and constant variance.

Normal Q-Q Plot. Purpose: Evaluate normality of residuals. Description: Compares the quantiles of the residuals with those of a normal distribution. If the points align with the diagonal line, the residuals are approximately normally distributed.

Scale-Location Plot. Purpose: Check for consistent variance. Description: Assesses whether the spread of the residuals is consistent across the fitted values. A random scatter with a roughly flat trend indicates homoscedasticity.
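The three diagnostic plots can be drawn directly from a fitted `lm` object. The following is a sketch on simulated data, assuming a model of math score on gender, race/ethnicity, and parental education; the simulated values, the object names `students` and `fit`, and the reduced set of education levels are illustrative assumptions, not the original analysis.

```r
set.seed(1)  # simulated data, for illustration only
n <- 200
students <- data.frame(
  gender = sample(c("female", "male"), n, replace = TRUE),
  race.ethnicity = sample(paste("group", LETTERS[1:5]), n, replace = TRUE),
  parental.level.of.education = sample(
    c("high school", "some college", "bachelor's degree"), n, replace = TRUE)
)
# Simulated response: a baseline, a gender shift, and normal noise
students$math.score <- 60 + 5 * (students$gender == "male") + rnorm(n, sd = 12)

# Linear model with the three identity variables as categorical predictors
fit <- lm(math.score ~ gender + race.ethnicity + parental.level.of.education,
          data = students)
summary(fit)

# plot() on an lm object: which = 1 is Residuals vs Fitted,
# 2 is Normal Q-Q, 3 is Scale-Location
par(mfrow = c(1, 3))
plot(fit, which = 1:3)
```

Because every predictor here is categorical, the fitted values take only a limited number of distinct values, so the points in the first and third plots fall in vertical bands rather than a continuous cloud.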

Residuals vs Fitted Values

Normal Q-Q Plot

Scale-Location Plot

Conclusion

Gender Influence: Females tend to outperform males in reading and writing scores, while males tend to score higher in math.

Race/Ethnicity Influence: There are variations in academic performance across different racial/ethnic groups. On average, some groups tend to score higher than others, suggesting a potential influence of race/ethnicity on academic achievement.

Parental Level of Education Influence: Students whose parents have higher levels of education tend to perform better academically than students whose parents have lower levels of education.
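Group comparisons like those summarized above can be checked by computing mean scores per group. The sketch below does this by gender with `aggregate()`; the data frame `students` and its values are made up for illustration and are not results from the data set. The same call with `race.ethnicity` or `parental.level.of.education` on the right-hand side of the formula would give the other comparisons.

```r
# Made-up scores for illustration only; not taken from the real data.
students <- data.frame(
  gender = c("female", "female", "male", "male"),
  math.score = c(60, 70, 65, 75),
  reading.score = c(80, 70, 60, 70),
  writing.score = c(78, 72, 58, 66)
)

# Mean of each score within each gender
by_gender <- aggregate(
  cbind(math.score, reading.score, writing.score) ~ gender,
  data = students,
  FUN = mean
)
by_gender
```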