StudentPerformance Data Set Overview

The StudentPerformance data set is a collection of fictional data on exactly what its name suggests: student performance. It includes information on gender, parental level of education, lunch status, and test prep status. Along with these, it also gives numerical data on each student's math, reading, and writing test scores. The data set was created by Royce Kimmons with the intention of it being used for data science analysis practice.

The figure below shows three relevant pieces of information about the data set. The first is its overall size: adding up the bars of the chart shows that there are exactly 1000 rows. The second is that the numbers of male and female students do not differ substantially. The third is that it is much more common NOT to take a test prep course, regardless of gender.
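As a rough illustration of how the bar heights relate to the row count, here is a minimal Python sketch. The handful of rows and the labels "none" and "completed" are made up for the example and are not taken from the real data set.

```python
from collections import Counter

# Hypothetical rows for illustration only; the real data set has 1000 rows.
# Each row is (gender, test prep status).
rows = [
    ("female", "none"),
    ("female", "completed"),
    ("male", "none"),
    ("male", "none"),
    ("female", "none"),
    ("male", "completed"),
]

# One count per (gender, test prep status) pair, i.e. one bar of the chart.
bar_heights = Counter(rows)

# Adding up every bar recovers the number of rows.
total_rows = sum(bar_heights.values())

# Counts per gender, ignoring test prep status.
gender_counts = Counter(gender for gender, _ in rows)

for (gender, prep), count in sorted(bar_heights.items()):
    print(f"{gender:6s} {prep:9s} {count}")
print("total rows:", total_rows)
```

The same three checks (total size, gender balance, test prep balance) can all be read off the one table of counts.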

 

Figure 1:


 

Effect of Test Prep Course on Scores

The following four charts display the effect of taking a test prep course on individual and average test scores. As one would expect, the test prep course is associated with noticeably higher test scores, by an amount that varies somewhat by subject.

 

Figure 2:

 

Mean Score Without Test Prep: 64.08         Mean Score With Test Prep: 69.7


Figure 3:

 

Mean Score Without Test Prep: 66.53         Mean Score With Test Prep: 73.89


Figure 4:

 

Mean Score Without Test Prep: 64.5         Mean Score With Test Prep: 74.42


Figure 5:

 

Mean Score Without Test Prep: 65.02         Mean Score With Test Prep: 72.67
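To make the size of each difference explicit, the two means reported under each figure can be subtracted. The short Python sketch below uses only the numbers printed above; the variable names are my own.

```python
# Means reported under Figures 2 to 5: (first mean, second mean).
reported_means = {
    "Figure 2": (64.08, 69.7),
    "Figure 3": (66.53, 73.89),
    "Figure 4": (64.5, 74.42),
    "Figure 5": (65.02, 72.67),
}

# Difference between the two reported means, rounded to two decimals.
gains = {
    figure: round(second - first, 2)
    for figure, (first, second) in reported_means.items()
}

for figure, gain in gains.items():
    print(f"{figure}: +{gain:.2f} points")
```

The differences range from 5.62 points (Figure 2) to 9.92 points (Figure 4).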


To judge whether the difference associated with the test prep course would hold up on larger scales, rather than being due to chance in this sample, we must perform a two-sample hypothesis test. The following are the results of said test:

## 
##  Welch Two Sample t-test
## 
## data:  as.integer(none$X2) and as.integer(complete$X2)
## t = -8.6063, df = 790.37, p-value = 1
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  -9.1084     Inf
## sample estimates:
## mean of x mean of y 
##  65.02492  72.67039
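For reference, the t and df values in a Welch two-sample test come from the group means, sample variances, and group sizes. The Python sketch below shows the standard formulas on two small made-up samples; the function name welch_t and the sample values are hypothetical and are not the report's data or code.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Return (t, df) for Welch's two-sample t-test of mean(x) - mean(y)."""
    # Squared standard error contributed by each group.
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical scores for illustration only.
no_prep = [60.0, 62.0, 64.0, 66.0, 68.0]
with_prep = [70.0, 72.0, 74.0, 76.0, 78.0]

t, df = welch_t(no_prep, with_prep)
print(f"t = {t:.4f}, df = {df:.2f}")  # prints: t = -5.0000, df = 8.00
```

As in the output above, t is negative whenever the first group's mean is lower than the second group's.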

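To clarify the output above, I will restate the info in a clearer format:
Null Hypothesis: Students who completed the prep course do not have a higher average score than students who did not.
Alternative Hypothesis: Students who completed the prep course have a higher average score than students who did not.
Level of Significance Chosen: 5%
P-value: Effectively 0. The output reports p = 1 only because the test was run with the alternative "greater" (no-prep mean minus prep mean greater than 0), which is the opposite of the direction of interest. For the matching alternative "less", the p-value is 1 minus the reported value; with t = -8.6063 and df = 790.37 it is far below 0.0001.
Decision: P is much smaller than the level of significance, so reject the null hypothesis.
Conclusion: There is statistically significant evidence that average test scores are higher for students who took a test prep course.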
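The direction of the alternative hypothesis matters a great deal for the p-value. The Python sketch below takes the t statistic from the output above and computes the p-value for each direction. It uses a normal approximation to the t distribution, which is my simplifying assumption (reasonable here because df is about 790), so the tiny values are approximate.

```python
from math import erfc, sqrt

t = -8.6063  # t statistic reported in the output above

# Normal approximation to the t distribution (an assumption; df is about 790).
p_less = 0.5 * erfc(-t / sqrt(2))    # P(T <= t): alternative "less"
p_greater = 0.5 * erfc(t / sqrt(2))  # P(T >= t): alternative "greater"
p_two_sided = 2 * min(p_less, p_greater)

print(f"alternative 'less':    p = {p_less:.3g}")
print(f"alternative 'greater': p = {p_greater:.3g}")
print(f"two-sided:             p = {p_two_sided:.3g}")
```

The "greater" direction gives a p-value that rounds to 1, matching the output, while the "less" and two-sided directions give p-values far below 5%.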


Accounting for Gender

To further our understanding of the data above, I have also chosen to analyze a chart showing the data categorized by gender. The following chart shows that, despite the ratio of males to females being nearly 1 to 1, as shown above, there are some substantial differences. For example, females appear to have far less variance in their data than males. In addition, males appear to do worse on the test on average than females.

Figure 6:

Male Mean Score Without Test Prep: 63.03         Male Mean Score With Test Prep: 70.75
Female Mean Score Without Test Prep: 66.86        Female Mean Score With Test Prep: 74.48
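To compare the genders more directly, the four means reported under Figure 6 can be combined into a per-gender difference and a female-minus-male gap. The Python sketch below uses only those four numbers; the variable names are my own.

```python
# Means reported under Figure 6: (first mean, second mean) for each gender.
reported = {
    "male": (63.03, 70.75),
    "female": (66.86, 74.48),
}

# Difference between the two reported means within each gender.
gain_by_gender = {
    gender: round(second - first, 2)
    for gender, (first, second) in reported.items()
}

# Female mean minus male mean, for the first and second reported means.
gap_first = round(reported["female"][0] - reported["male"][0], 2)
gap_second = round(reported["female"][1] - reported["male"][1], 2)

print("difference by gender:", gain_by_gender)
print("female minus male gap:", gap_first, "and", gap_second)
```

The difference is similar for both genders (7.72 for males, 7.62 for females), and the gap between genders stays at roughly 3.7 to 3.8 points in both columns.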