Assignment 3: Quality of Teaching against Exam Score

Author

Aaryan

Introduction

I will analyze a data set from Kaggle that looks at which variables may affect a student's academic performance. Each row in this data set is a student, and each column is either a variable that affects the student or background information about the student.

# Load the tidyverse for read_csv(), ggplot(), and the %>% pipe
library(tidyverse)
students_performance <- read_csv("https://myxavier-my.sharepoint.com/:x:/g/personal/bhattaa_xavier_edu/EWUhVnI91BZMmU_d6ReatzsBtyTCUiyPPfsnz_l5ffVP4w?download=1")

Research Question

How do students perform on their exams based on the number of tutoring sessions they attend, and does the type of school they attend also make a difference?

Approach for Analysis

To find out more about this, I need to use these variables: Tutoring Sessions (per month), Exam Score, and School Type. I will plot side-by-side box plots with Tutoring Sessions on the x-axis and Exam Score on the y-axis. I will use the facet_wrap function to see whether there is any difference between students who go to public and private schools.
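Before plotting, it can help to check how many students fall into each combination of school type and number of tutoring sessions, since box plots for sparsely populated groups are harder to interpret. The sketch below is only an illustration: it uses a small made-up data frame (`toy_students`, hypothetical values, not the Kaggle data) and assumes the column names `Tutoring_Sessions`, `School_Type`, and `Exam_Score` used in the plot.

```r
library(dplyr)

# Hypothetical toy data standing in for students_performance
# (made-up values, used only to show the shape of the check)
toy_students <- data.frame(
  Tutoring_Sessions = c(0, 0, 1, 1, 2, 2, 2, 3),
  School_Type = c("Public", "Private", "Public", "Public",
                  "Private", "Public", "Public", "Private"),
  Exam_Score = c(62, 65, 66, 64, 70, 67, 68, 72)
)

# Count the students in each School_Type x Tutoring_Sessions group
group_sizes <- toy_students %>%
  count(School_Type, Tutoring_Sessions, name = "n_students")

print(group_sizes)
```

Replacing `toy_students` with `students_performance` would give the group sizes behind each box in the real plot.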

Results

# Side-by-side box plots of exam score by tutoring sessions, one panel per school type
students_performance %>%
  ggplot(aes(x = as.factor(Tutoring_Sessions), y = Exam_Score)) +
  geom_boxplot() +
  facet_wrap(~School_Type) +
  labs(
    title = "Distribution of Exam Score by Number of Tutoring Sessions\nin Public and Private Schools",
    x = "Number of Tutoring Sessions Taken",
    y = "Exam Score out of 100"
  )

Conclusion

Based on our box plots, we can see an increasing trend: the more tutoring sessions students take, the better they perform. Looking at the different school types, private schools seem to do slightly better than public schools when students take 6 or 7 sessions a month. Although the trend for public schools drops off around 7 and 8 sessions, the side-by-side box plots in general follow the upward trend we expected. Another interesting observation is that public schools have more outliers among students who take only 1-2 sessions or no sessions at all. The box plots support our intuition that the more tutoring sessions a student takes, the better they perform.
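One possible way to back up a visual reading of box plots is to tabulate the median exam score for each group, since the median is the center line of each box. The following is a minimal sketch under stated assumptions: `toy_students` is a hypothetical data frame with made-up values (not the Kaggle data), and the column names are assumed to match those used in the plot.

```r
library(dplyr)

# Hypothetical toy data standing in for students_performance
# (made-up values, used only to demonstrate the summary)
toy_students <- data.frame(
  Tutoring_Sessions = c(0, 0, 1, 1, 2, 2, 2, 3),
  School_Type = c("Public", "Private", "Public", "Public",
                  "Private", "Public", "Public", "Private"),
  Exam_Score = c(62, 65, 66, 64, 70, 67, 68, 72)
)

# Median exam score per school type and tutoring-session count
median_scores <- toy_students %>%
  group_by(School_Type, Tutoring_Sessions) %>%
  summarise(median_score = median(Exam_Score), .groups = "drop") %>%
  arrange(School_Type, Tutoring_Sessions)

print(median_scores)
```

Running the same pipeline on `students_performance` would show whether the medians actually rise with the number of sessions in each school type.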