Final Project: Analyzing DreamBox Data and End of Grade Math Scores
ECI 586 Intro to Learning Analytics
Author
Delaney Burns
Published
December 6, 2024
Prepare
Purpose of the Data Product and Research Questions
This data product examines the relationship between student engagement with DreamBox Math and their performance on math end-of-grade (EOG) district assessments and STAR math assessments by Renaissance Learning within an elementary school in Wake County Public Schools (WCPSS). DreamBox Math, provided by WCPSS to its elementary schools, is a comprehensive digital K–8 math program designed to deliver an engaging and rigorous learning experience. According to Discovery Education, DreamBox offers a standards-based curriculum with over 2,000 interactive lessons aimed at fostering active learning, critical thinking, and problem-solving skills. The program personalizes instruction by adapting in real-time to each student’s learning strategies, adjusting hints, pacing, difficulty levels, and lesson sequences to meet individual needs. By providing tailored support, DreamBox helps students build conceptual understanding and develop essential math skills.
WCPSS elementary schools leverage DreamBox to supplement math instruction and provide students with a resource to enhance their math performance. Analyzing the relationship between DreamBox usage and EOG math scores is critical for assessing the program’s effectiveness and justifying its continued use. This study aims to address the following research questions:
How does the completion of math standards in DreamBox correlate with end-of-grade math performance?
Do certain groups of students, based on gender, benefit more from DreamBox Math?
Data Sources and Variables
Data sources were all anonymized prior to this analysis. The data sources include:
DreamBox Math Usage Data: Provides metrics on standards completed by grade. Students have access to DreamBox starting in Kindergarten. The program has lessons that match the NC State Standards for Mathematics. When a student shows proficiency on a lesson, DreamBox marks that standard as mastered and advances the student through the course of study. Teachers can assign lessons that correspond to content currently being taught in class.
End-of-Grade Math Assessment Data: Captures student performance and proficiency levels on two different types of assessments: North Carolina State End of Grade Test (EOG) and the STAR Math Assessment by Renaissance Learning.
Variables used to address the research questions include:
DreamBox usage metrics including standards completed by grade level.
End-of-grade assessment scores to measure proficiency and growth.
Beginning-of-year (BOY) assessment scores including STAR math scores from the beginning of year to provide baseline data.
Demographic data including gender and grade level.
Wrangle
1. Load packages needed to work with the data set.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidygraph)
Attaching package: 'tidygraph'
The following object is masked from 'package:stats':
filter
Below, I cleaned the data necessary for the analysis. During this process I removed instances in which data was missing. For example, some students were not at the school during the beginning of the year, so they do not have documented STAR math data for the beginning of year or end of year within the system. I also created two new variables. The first variable (good_EOG_score) is to document if the student passed the End of Grade test. The second variable (good_STAR_score) is to document if the student passed the End of Year STAR math assessment.
# Cleaning data to remove any rows with missing data
dataset_clean <- na.omit(dataset2[, c("raw_score_NC_EOG", "average_db_usage", "gender", "star_math_ss_boy", "star_math_ss_end_of_year", "star_math_end_of_year_score", "star_math_growth")])
dataset_clean
# A tibble: 149 × 7
raw_score_NC_EOG average_db_usage gender star_math_ss_boy
<dbl> <dbl> <chr> <dbl>
1 533 0.524 F 842
2 553 0.913 M 985
3 531 0.43 M 840
4 536 0.490 F 863
5 538 0.715 F 842
6 562 0.963 F 1007
7 530 0.359 M 810
8 542 0.572 F 942
9 555 0.717 F 959
10 561 0.889 F 995
# ℹ 139 more rows
# ℹ 3 more variables: star_math_ss_end_of_year <dbl>,
# star_math_end_of_year_score <chr>, star_math_growth <chr>
# Adding a new variable categorizing whether the EOG score is a passing score (545)
dataset_clean <- dataset_clean |>
  mutate(good_EOG_score = if_else(raw_score_NC_EOG > 544, 1, 0),
         good_EOG_score = as.factor(good_EOG_score))
# Adding a new variable categorizing whether the STAR math score is a passing score
dataset_clean <- dataset_clean |>
  mutate(good_STAR_score = if_else(star_math_end_of_year_score == "P", 1, 0),
         good_STAR_score = as.factor(good_STAR_score))
dataset_clean
# A tibble: 149 × 9
raw_score_NC_EOG average_db_usage gender star_math_ss_boy
<dbl> <dbl> <chr> <dbl>
1 533 0.524 F 842
2 553 0.913 M 985
3 531 0.43 M 840
4 536 0.490 F 863
5 538 0.715 F 842
6 562 0.963 F 1007
7 530 0.359 M 810
8 542 0.572 F 942
9 555 0.717 F 959
10 561 0.889 F 995
# ℹ 139 more rows
# ℹ 5 more variables: star_math_ss_end_of_year <dbl>,
# star_math_end_of_year_score <chr>, star_math_growth <chr>,
# good_EOG_score <fct>, good_STAR_score <fct>
Analyze
Histogram of End of Grade Math Scores in 2024
The graph below shows the spread of NC EOG Math scores at the given elementary school. The test was taken in June of 2024. The blue line represents the average score for the school and the green line represents a passing score on the Math EOG which is a 545 (EOG Mathematics Achievement Level Ranges and Descriptors, 2022).
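The code that produced this histogram is not shown in the report. A minimal sketch of how such a plot could be drawn is given here, assuming simulated scores in place of dataset_clean. The data frame `sim`, the bin width, and the fill color are assumptions for illustration only.

```r
library(ggplot2)

# Hypothetical stand-in for dataset_clean: simulated scores, not the school's data
set.seed(2024)
sim <- data.frame(raw_score_NC_EOG = round(rnorm(149, mean = 546, sd = 10)))

eog_hist <- ggplot(sim, aes(x = raw_score_NC_EOG)) +
  geom_histogram(binwidth = 2, color = "white", fill = "grey60") +
  # Blue dashed line: average score in the (simulated) data
  geom_vline(xintercept = mean(sim$raw_score_NC_EOG),
             color = "blue", linetype = "dashed", linewidth = 1) +
  # Green solid line: passing score of 545 described in the text
  geom_vline(xintercept = 545, color = "green", linetype = "solid", linewidth = 1) +
  xlab("Math End of Grade Score") +
  ylab("Number of Students")

eog_hist
```

With the real data, `sim` would be replaced by dataset_clean and the rest of the call could stay the same.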
The histogram below shows the fraction of DreamBox Standards completed by 2024. For example, if a student is in fourth grade and has completed 50% of the standards from grades K-4 they receive a value of “.5”. The graph below shows the spread of those fractions across the school. The blue line represents the average usage of DreamBox at the school.
dataset_clean |>
  ggplot(aes(x = average_db_usage)) +
  geom_histogram(color = "green", fill = "green", binwidth = .05) +
  xlab("Fraction of DreamBox Standards Completed") +
  ylab("Number of Students") +
  geom_vline(aes(xintercept = mean(average_db_usage)), color = "blue", linetype = "dashed", linewidth = 1)
DreamBox Usage Compared to Math EOG Scores
Scatterplot comparing DreamBox Usage to End of Grade Math Scores
ggplot(dataset_clean, aes(x = average_db_usage, y = raw_score_NC_EOG, color = gender)) +
  geom_point() +
  ylab("Math End of Grade Score") +
  xlab("Fraction of DreamBox Standards Completed by the Students") +
  geom_hline(aes(yintercept = 545), color = "green", linetype = "solid", linewidth = 1)
The scatterplot illustrates the relationship between the fraction of DreamBox standards completed by students and their Math End-of-Grade (EOG) scores, with the green horizontal line representing the passing score for the EOG. Students who complete at least 50% of DreamBox standards (usage above 0.5) are more likely to score above the passing threshold, while those with usage below 25% are at significant risk of failing. This suggests a clear threshold effect, where moderate to high DreamBox engagement substantially increases the likelihood of passing the EOG. Additionally, the data shows no visible differences in performance between male and female students, reinforcing the finding that gender is not a significant predictor of passing the EOG. Despite the overall trend, some variability at higher DreamBox usage levels indicates that other factors, such as baseline math ability or instructional quality, may also play a role in EOG performance. These insights highlight the importance of targeting interventions for students with low DreamBox engagement to help them meet or exceed the passing threshold.
Logistic Regression Model for Predicting Passing EOG Scores
# Logistic Regression Model for EOG Scores with DreamBox
m1 <- glm(good_EOG_score ~ average_db_usage + gender + star_math_ss_boy, data = dataset_clean, family = "binomial")
summary(m1)
Call:
glm(formula = good_EOG_score ~ average_db_usage + gender + star_math_ss_boy,
family = "binomial", data = dataset_clean)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -48.765662 9.717980 -5.018 5.22e-07 ***
average_db_usage 9.401824 2.439565 3.854 0.000116 ***
genderM -0.084371 0.599590 -0.141 0.888095
star_math_ss_boy 0.044251 0.009959 4.443 8.87e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 205.045 on 148 degrees of freedom
Residual deviance: 73.638 on 145 degrees of freedom
AIC: 81.638
Number of Fisher Scoring iterations: 7
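The estimates above are on the log-odds scale, which is hard to read directly. As a rough aid, the sketch below converts the reported coefficients into odds ratios. The step sizes (a 0.1 increase in usage, a 10-point increase in the BOY score) are choices made for illustration, not part of the original analysis.

```r
# Estimates copied from the EOG model summary
coefs <- c(average_db_usage = 9.401824,
           genderM          = -0.084371,
           star_math_ss_boy = 0.044251)

# Odds ratio for a 0.1 increase in the fraction of standards completed
or_usage_tenth <- unname(exp(coefs["average_db_usage"] * 0.1))

# Odds ratio for a 10-point increase in the BOY STAR scaled score
or_boy_10pts <- unname(exp(coefs["star_math_ss_boy"] * 10))

# Odds ratio for male relative to female students
or_gender <- unname(exp(coefs["genderM"]))

round(c(usage_per_0.1 = or_usage_tenth,
        boy_per_10pts = or_boy_10pts,
        male_vs_female = or_gender), 2)
```

Under these assumptions, each additional 0.1 in the fraction of standards completed multiplies the odds of passing by roughly 2.56, and each additional 10 BOY points by roughly 1.56, holding the other predictors fixed. The gender odds ratio is close to 1 (about 0.92).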
Predicting Proficient EOG Scores with DreamBox Usage
dataset_clean$predicted_prob <- predict(m1, type = "response")
# Plot the clean dataset using a scatterplot
ggplot(dataset_clean, aes(x = average_db_usage, y = predicted_prob)) +
  geom_point(alpha = 0.5) + # Scatterplot of probabilities
  geom_smooth(method = "glm", method.args = list(family = "binomial"), se = TRUE) + # Logistic curve
  labs(x = "Fraction of DreamBox Usage", y = "Predicted Probability of Good EOG Score") +
  theme_light()
`geom_smooth()` using formula = 'y ~ x'
Warning in eval(family$initialize): non-integer #successes in a binomial glm!
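The warning most likely appears because geom_smooth() refits a binomial model to the predicted probabilities, which are not 0/1 values. One possible alternative, sketched below, is to compute the curve from the fitted model itself by predicting over a grid of usage values while holding the other predictors fixed. The data frame `sim`, the simulation coefficients, and the choice to fix gender at "F" and the BOY score at its mean are all assumptions for illustration.

```r
# Hypothetical stand-in for dataset_clean: simulated values, not the school's data
set.seed(586)
n <- 149
sim <- data.frame(
  average_db_usage = runif(n, 0.1, 1),
  gender           = sample(c("F", "M"), n, replace = TRUE),
  star_math_ss_boy = round(rnorm(n, mean = 900, sd = 70))
)
# Simulated pass/fail outcome that depends on usage and the baseline score
log_odds <- -30 + 5 * sim$average_db_usage + 0.03 * sim$star_math_ss_boy
sim$good_EOG_score <- factor(rbinom(n, 1, plogis(log_odds)))

# Same model formula as in the report, fitted to the simulated data
fit <- glm(good_EOG_score ~ average_db_usage + gender + star_math_ss_boy,
           data = sim, family = "binomial")

# Grid of usage values with the other predictors held fixed
grid <- data.frame(
  average_db_usage = seq(0, 1, by = 0.05),
  gender           = "F",
  star_math_ss_boy = mean(sim$star_math_ss_boy)
)
grid$predicted_prob <- predict(fit, newdata = grid, type = "response")

head(grid)
```

The resulting `grid` can be drawn with geom_line() on top of the scatterplot. Because no second model is fitted to probabilities, the non-integer warning should not arise from this step.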
Insights from Visual Model of DreamBox Usage and EOG Scores
Positive Relationship:
The upward-sloping curve confirms a strong, positive relationship between DreamBox usage and the likelihood of achieving a good EOG score.
Interpretation: As DreamBox usage increases, the predicted probability of success on the EOG also increases.
Threshold Effect:
Between 0.25 and 0.75, the curve rises steeply, indicating that this range of DreamBox usage has the greatest impact on increasing the probability of achieving a good EOG score.
Beyond 0.75, the curve begins to flatten, suggesting diminishing returns at very high levels of DreamBox usage.
Low Usage, Low Probability:
For students with very low DreamBox usage (below 0.25), the predicted probability of achieving a good score is close to 0.
High Usage, High Probability:
Students with high DreamBox usage (above 0.75) have predicted probabilities of a good EOG score approaching 1.
Confidence Levels:
The gray confidence interval is widest at the extremes (very low or very high usage), reflecting greater uncertainty where fewer data points may exist.
In the middle range of usage (around 0.5), the confidence interval narrows, suggesting stronger reliability in predictions.
DreamBox Usage Compared to End of Year STAR Math Scores
Scatterplot comparing DreamBox Usage to STAR Math Scores
ggplot(dataset_clean, aes(x = average_db_usage, y = star_math_ss_end_of_year, color = gender)) +
  geom_point() +
  ylab("STAR Math Scaled Score for the End of Year") +
  xlab("Fraction of DreamBox Standards Completed by the Students")
This scatterplot shows the relationship between the fraction of DreamBox standards completed and STAR Math scaled scores at the end of the year, with data points distinguished by gender. Students with higher DreamBox usage (above 0.5) tend to achieve higher STAR Math scores, though the relationship is more gradual compared to Math EOG scores. The absence of a clear threshold suggests incremental benefits from increased DreamBox engagement, but other factors may also play a role in determining STAR Math outcomes. Additionally, male and female students appear evenly distributed, indicating that gender does not significantly impact performance.
Logistic Regression Model for STAR Math End of Year Scores with DreamBox Usage
# Logistic Regression Model for STAR Math End of Year Scores with DreamBox
m2 <- glm(good_STAR_score ~ average_db_usage + gender + star_math_ss_boy, data = dataset_clean, family = "binomial")
summary(m2)
Call:
glm(formula = good_STAR_score ~ average_db_usage + gender + star_math_ss_boy,
family = "binomial", data = dataset_clean)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -41.152677 7.573018 -5.434 5.51e-08 ***
average_db_usage 2.774035 1.520249 1.825 0.068 .
genderM -0.454043 0.502719 -0.903 0.366
star_math_ss_boy 0.042130 0.008222 5.124 2.99e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 205.42 on 148 degrees of freedom
Residual deviance: 102.18 on 145 degrees of freedom
AIC: 110.18
Number of Fisher Scoring iterations: 6
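To make the borderline p-value for DreamBox usage easier to read, the sketch below computes an approximate 95% Wald confidence interval from the reported estimate and standard error. This is a normal-approximation interval chosen for illustration. It is not something the report itself computed, and a profile-likelihood interval from the fitted model could differ somewhat.

```r
# Estimate and standard error for average_db_usage, copied from the STAR model summary
est <- 2.774035
se  <- 1.520249

# Approximate 95% Wald interval on the log-odds scale
z  <- qnorm(0.975)
ci <- est + c(-1, 1) * z * se

round(ci, 2)       # interval on the log-odds scale
round(exp(ci), 2)  # same interval expressed as odds ratios
```

The lower bound falls slightly below zero on the log-odds scale, which is consistent with a p-value just above 0.05. The interval is also wide, so the size of the association with STAR Math outcomes is estimated imprecisely.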
Predicted Probability of Good STAR Math Score by DreamBox Usage
dataset_clean$predicted_prob2 <- predict(m2, type = "response")
# Plot the clean dataset using a scatterplot
ggplot(dataset_clean, aes(x = average_db_usage, y = predicted_prob2)) +
  geom_point(alpha = 0.5) + # Scatterplot of probabilities
  geom_smooth(method = "glm", method.args = list(family = "binomial"), se = TRUE) + # Logistic curve
  labs(x = "Average DreamBox Usage", y = "Predicted Probability of Good EOY STAR Math Score") +
  theme_light()
`geom_smooth()` using formula = 'y ~ x'
Warning in eval(family$initialize): non-integer #successes in a binomial glm!
Insights from Visual Model of DreamBox Usage and End of Year STAR Math Scores
Positive Relationship:
There is a clear upward trend: as DreamBox usage increases, the predicted probability of achieving a good STAR Math score also increases.
Gradual Increase:
The curve rises gradually, indicating a more moderate impact of DreamBox usage on STAR Math scores compared to other models (e.g., EOG predictions).
No Sharp Plateau:
Unlike the EOG model, this curve does not clearly level off at higher usage. This suggests that additional DreamBox usage continues to have a positive, though smaller, impact on STAR Math outcomes.
Confidence Levels:
The predictions are less certain for students with very low or very high DreamBox usage, as indicated by the wider confidence interval at these extremes. This could be due to fewer data points in these ranges.
Comparison of Two Models
Comparison from Regression Models
DreamBox Usage:
EOG Scores: DreamBox usage is a highly significant predictor (estimate 9.40, p < 0.001). This suggests DreamBox may be more closely aligned with the skills assessed in EOG tests.
STAR Math Scores: DreamBox usage is only marginally significant (estimate 2.77, p = 0.068). This suggests DreamBox usage is less strongly related to the skills assessed in STAR Math.
Baseline Math Ability (BOY STAR Math Scores):
BOY scores are highly significant predictors in both models, with similar coefficients (0.044 for EOG and 0.042 for STAR Math). This reinforces that starting math ability strongly influences both EOG and STAR Math outcomes.
Gender:
Gender is not significant in either model, suggesting that male and female students are equally likely to achieve good scores when other factors (e.g., DreamBox usage, BOY scores) are controlled.
Model Fit:
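The EOG model fits the data better (it has a lower AIC, 81.64 versus 110.18, and a greater deviance reduction), indicating that the predictors, including DreamBox usage, explain more variability in EOG outcomes than in STAR Math outcomes. Because the two models predict different outcomes, this comparison is descriptive rather than a formal test.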
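The deviance reduction mentioned under Model Fit can be worked out from the two model summaries. The sketch below uses the reported null and residual deviances to compute the drop in deviance and McFadden's pseudo R-squared for each model. McFadden's measure is an added summary chosen for illustration, not one reported in the original analysis.

```r
# Null and residual deviances copied from the two model summaries
fit_stats <- data.frame(
  model     = c("EOG", "STAR Math"),
  null_dev  = c(205.045, 205.42),
  resid_dev = c(73.638, 102.18)
)

# Drop in deviance relative to the intercept-only model
fit_stats$deviance_drop <- fit_stats$null_dev - fit_stats$resid_dev

# McFadden's pseudo R-squared: 1 - residual deviance / null deviance
# (equivalent to the log-likelihood form for 0/1 outcomes)
fit_stats$mcfadden_r2 <- 1 - fit_stats$resid_dev / fit_stats$null_dev

fit_stats$mcfadden_r2 <- round(fit_stats$mcfadden_r2, 3)
fit_stats
```

The values come out to about 0.641 for the EOG model and 0.503 for the STAR Math model, matching the observation that the EOG model accounts for more of the variation in its outcome.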
Comparison of Graphs
EOG Scores:
The curve shows a steep increase in probabilities with moderate DreamBox usage (0.25–0.75).
Interpretation: DreamBox usage is strongly associated with better EOG performance but shows a point of diminishing returns.
STAR Math Scores:
The curve rises gradually, with no sharp increases or plateaus.
No distinct threshold: Additional DreamBox usage continues to provide incremental benefits, but the overall impact is weaker.
Interpretation: DreamBox usage is less strongly tied to STAR Math performance and may need to be supplemented with other interventions.
Communication
Key Findings and Insights
DreamBox Usage and NC EOG Scores:
DreamBox engagement significantly correlates with higher NC EOG math scores. Students with greater engagement in DreamBox (completing a higher percentage of DreamBox standards) are more likely to achieve good scores.
There is a threshold effect: Moderate DreamBox usage (30–60% of standards completed) is associated with the most significant gains, with diminishing returns at higher levels.
DreamBox Usage and STAR Math Scores:
DreamBox usage does not significantly predict STAR Math performance at the 0.05 level (p = 0.068). The weaker alignment may indicate that STAR Math assesses skills less directly linked to DreamBox content.
BOY Scores as a Predictor:
Beginning-of-year (BOY) STAR Math scores are the strongest predictor of success for both NC EOG and STAR Math outcomes. Students with higher initial scores are more likely to perform well, regardless of DreamBox usage. This makes sense, as these students most likely have a stronger foundation in math skills.
Gender:
Gender does not significantly impact performance on either assessment.
Suggested Actions
Increase Targeted DreamBox Usage:
Focus interventions on students with low DreamBox engagement (<30%), particularly in alignment with NC EOG standards.
Encourage quality usage within the optimal range (30–60% of standards completed) to maximize effectiveness.
Supplement STAR Math Preparation:
Pair DreamBox with other tools or instructional methods that better align with adaptive skills measured by STAR Math assessments.
Early Identification and Support:
Use BOY STAR Math scores to identify at-risk students and provide targeted interventions early in the school year.
Professional Development:
Train teachers to integrate DreamBox effectively into their lesson plans and to balance it with other resources.
Limitations and Ethical/Legal Considerations
Student Privacy:
All data was anonymized to protect individual student identities in compliance with FERPA regulations.
Sample Representation:
The data set is specific to one elementary school, which may limit generalizability to other contexts. Also, if given the opportunity, I would include more demographic information in the analysis. However, due to time constraints and lack of access to information, I could only include gender and grade level. Future analyses could include other demographic information such as race, student classifications (special education, English language learner, 504 plan status, and others), and socioeconomic status.
Alignment of Assessments:
Differences in content alignment between DreamBox and STAR Math assessments may reduce the apparent impact of DreamBox on STAR Math outcomes.
Acknowledgments:
Data was provided by the elementary school and derived from DreamBox usage logs and assessment results. Additionally, information regarding DreamBox and assessments was derived from literature that is cited below.
Summary of Results
The completion of math standards in DreamBox is positively correlated with end-of-grade math performance, with students who complete more standards generally achieving higher scores, particularly on the NC EOG. Higher DreamBox usage is significantly associated with a greater likelihood of meeting or exceeding the EOG proficiency threshold, which supports its value in math instruction. Regarding gender, the analysis shows no significant differences in outcomes between male and female students, suggesting that DreamBox Math provides equitable support to students regardless of gender. Because the data are observational and come from a single school, these results indicate association rather than proof that the program causes the gains.
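The models in this report include gender as a main effect only. One way to ask more directly whether one gender benefits more from DreamBox is to add a usage-by-gender interaction and compare the two models. The sketch below shows the idea on simulated data. The data frame `sim`, the simulation coefficients, and the model names `m_main` and `m_int` are assumptions for illustration and were not part of the original analysis.

```r
# Hypothetical stand-in for dataset_clean: simulated values, not the school's data
set.seed(586)
n <- 149
sim <- data.frame(
  average_db_usage = runif(n, 0.1, 1),
  gender           = sample(c("F", "M"), n, replace = TRUE),
  star_math_ss_boy = round(rnorm(n, mean = 900, sd = 70))
)
log_odds <- -30 + 5 * sim$average_db_usage + 0.03 * sim$star_math_ss_boy
sim$good_EOG_score <- rbinom(n, 1, plogis(log_odds))

# Main-effects model, mirroring the formula used in the report
m_main <- glm(good_EOG_score ~ average_db_usage + gender + star_math_ss_boy,
              data = sim, family = "binomial")

# Same model plus a usage-by-gender interaction
m_int <- glm(good_EOG_score ~ average_db_usage * gender + star_math_ss_boy,
             data = sim, family = "binomial")

# Likelihood-ratio test: does the interaction improve the fit?
anova(m_main, m_int, test = "Chisq")
```

A small p-value in the comparison would suggest that the association between usage and passing differs by gender. With 149 students, such a test has limited power, so a non-significant result would not rule out a difference.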