In 2010 and 2011, all 9th-grade students took part in the Colorado Student Assessment Program (CSAP), which evaluates student performance in reading, writing, and math. Using the data provided by the Colorado Department of Education (CDE), we set out to answer the following two questions: 1. Was there improvement or deterioration in students’ passing rates in each of the three subjects? 2. Was there an association between the math passing rate and the passing rates in the other two subjects, and if so, how accurately can one subject’s passing rate predict another’s? The answers to these questions can provide valuable feedback to educators who may be taking steps to raise CSAP scores.
The data were obtained from the CDE’s public API, which provides school, district, and statewide CSAP scores categorized into the following groups: unsatisfactory, partially proficient, proficient, advanced, and no score. A few preparation steps are needed to make the data usable. First, we eliminate all district and state summary scores; these are informative, but looking at individual schools provides better insight. We also eliminate any school with fewer than 31 students taking the test in either year; requiring at least 31 test-takers in both years keeps each school’s sample size reasonable and limits the influence of outliers. We then separate the three subjects so each school can be assessed on each subject individually. Finally, we take a random sample of 120 schools, yielding a data set of reading, writing, and math scores for each school. From here on, all data referred to will be from our sample of 120 schools.
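A minimal sketch of these preparation steps, assuming the API results have been loaded into a pandas DataFrame. The column names (`level`, `n_2010`, `n_2011`, `subject`, `school_id`) and the file name are hypothetical placeholders, not the CDE API’s actual fields.

```python
import pandas as pd

# Hypothetical column names; the CDE API's actual fields will differ.
df = pd.read_csv("csap_results.csv")

# Drop district and state summary rows; keep individual schools.
schools = df[df["level"] == "school"]

# Require at least 31 test-takers in both 2010 and 2011.
schools = schools[(schools["n_2010"] >= 31) & (schools["n_2011"] >= 31)]

# Separate the three subjects so each can be assessed on its own.
by_subject = {subj: grp for subj, grp in schools.groupby("subject")}

# Draw a reproducible random sample of 120 schools.
sample_ids = schools["school_id"].drop_duplicates().sample(120, random_state=1)
sample = schools[schools["school_id"].isin(sample_ids)]
```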
We’ll start by comparing each school’s passing rate in 2010 to its passing rate in 2011 for each subject. A passing score is one of proficient or advanced, and the number of passing students divided by the total number of students tested that year gives the passing rate. We compare rates across years by subtracting each school’s 2010 passing rate from its 2011 passing rate, so a positive difference indicates improvement. We then use bootstrap sampling to estimate the distribution of the average difference in passing rates; a sketch of the resampling procedure follows. The figure below shows the resulting bootstrap distribution, along with a 95% confidence interval, for writing.
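A minimal sketch of the bootstrap, assuming `diff` is a NumPy array holding the 120 schools’ 2011-minus-2010 passing-rate differences for one subject; the variable names and the seed are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_ci(diff, n_boot=10_000, alpha=0.05):
    """Percentile bootstrap CI for the mean difference in passing rates."""
    means = np.empty(n_boot)
    for i in range(n_boot):
        # Resample the schools with replacement and record the mean difference.
        means[i] = rng.choice(diff, size=diff.size, replace=True).mean()
    return np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# e.g. for writing: bootstrap_ci(writing_diff) -> approximately (0.0184, 0.0466)
```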
We can say with 95% confidence that writing passing rates increased from 2010 to 2011 by between 1.84 and 4.66 percentage points. We perform the same bootstrap sampling for math.
We can say with 95% confidence that math passing rates increased from 2010 to 2011 by between 1.99 and 5.14 percentage points. Finally, we perform the same bootstrap sampling for reading.
We can say with 95% confidence that reading passing rates decreased from 2010 to 2011 by between 1.88 and 4.72 percentage points. Next, we examine the relationship between each school’s math passing rate and its reading and writing passing rates by fitting linear regression models. Below is a plot of the 2010 math and reading passing rates (red) with a line of best fit (blue).
Math and reading have a positive, linear relationship. The blue line has a negative intercept (-0.251), implying that schools tend to perform worse in math than in reading. The line also has a slope less than 1 (0.872), indicating that the gap in performance widens as the reading passing rate increases; the dashed line, with slope 1, marks where the two rates would be equal. The correlation coefficient between reading and math passing rates is .78, indicating a strong relationship. For example, the model predicts that a school with a reading passing rate of .5 will have a math passing rate of .185 (-0.251 + 0.872 × 0.5). We plot the same thing for math and writing.
Similarly, math and writing have a positive, linear relationship. The blue line has a slightly negative intercept (-0.059), implying that schools perform slightly worse in math than in writing, and a slope less than 1 (0.833), indicating that the gap widens as the writing passing rate increases. The correlation coefficient between writing and math passing rates is .84, indicating a very strong relationship. For example, the model predicts that a school with a writing passing rate of .5 will have a math passing rate of about .356. We now use these fitted lines to predict the 2011 results: we apply the lines fit on the 2010 data to the 2011 reading and writing rates and judge the accuracy of the predicted math rates by computing the root mean squared error (RMSE). Predicting math from reading gives an RMSE of .133, meaning the predictions are typically off by about 13.3 percentage points; predicting math from writing gives an RMSE of .103, meaning the predictions are typically off by about 10.3 percentage points.
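The fits and RMSE figures above can be reproduced along these lines, assuming `reading_2010`, `math_2010`, `reading_2011`, and `math_2011` are NumPy arrays of per-school passing rates; `np.polyfit` here stands in for whatever fitting routine was actually used.

```python
import numpy as np

# Fit the 2010 line of best fit: math passing rate as a function of reading.
slope, intercept = np.polyfit(reading_2010, math_2010, deg=1)  # ~0.872, ~-0.251
r = np.corrcoef(reading_2010, math_2010)[0, 1]                 # ~0.78

# Worked example: predicted math passing rate at a 0.5 reading rate.
pred_at_half = intercept + slope * 0.5                         # ~0.185

# Apply the 2010 line to the 2011 reading rates and score it with RMSE.
pred_2011 = intercept + slope * reading_2011
rmse = np.sqrt(np.mean((math_2011 - pred_2011) ** 2))          # ~0.133
```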
Given the results above, we can draw several conclusions. From the 95% confidence intervals, math passing rates increased by between 1.99 and 5.14 percentage points, writing passing rates increased by between 1.84 and 4.66 percentage points, and reading passing rates decreased by between 1.88 and 4.72 percentage points; because none of these intervals contains zero, the changes are unlikely to be due to chance alone. It is also essential to note the correlation between subjects. We found strong correlations between math and writing and between math and reading, meaning that a school that performs well in math is likely to perform well in writing and reading as well. Testing our predictions against the 2011 data, we found them accurate to within about 10.3 percentage points when predicting math from writing and about 13.3 percentage points when predicting math from reading. For example, a school with a .5 passing rate in reading is predicted to have a .185 passing rate in math, with an error of plus or minus .133; similarly, a school with a .5 passing rate in writing is predicted to have a .356 passing rate in math, with an error of plus or minus .103. With data from more years, we could make better predictions about schools’ results across subjects and over time. The more data provided, the more insight we have; given only two years’ worth of data, our results could be atypical and might not predict results in later years.
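For concreteness, the writing-based prediction and its error band work out as follows. The rounded coefficients reported above are used here, so the point prediction differs from .356 in the third decimal; .356 presumably reflects the unrounded fit.

```python
# Predict a school's math passing rate from a 0.5 writing passing rate.
slope, intercept, rmse = 0.833, -0.059, 0.103   # rounded values from the fit
pred = intercept + slope * 0.5                  # 0.3575 with rounded inputs
low, high = pred - rmse, pred + rmse            # roughly (0.255, 0.461)
```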
There was a genuine improvement in passing rates on the math and writing exams between 2010 and 2011, and a decrease in the passing rate on the reading exam.
There was an association between a school’s math passing rate and its passing rates on the reading and writing exams: math passing rates could be predicted from writing rates to within about 10.3 percentage points and from reading rates to within about 13.3 percentage points.
Schools’ passing rates are likely to be correlated across subjects, and this should be taken into account in later exams. This information could give educators a better idea of which areas to focus on to improve results.