This dataset contains results from a randomized controlled trial evaluating the impact of a school food assistance program (breakfast, lunch, and snacks) on student academic performance and behavioural engagement. The study was conducted across schools in the Toronto District School Board.
Important note on randomization: The original study design proposed cluster randomization at the school level, with entire schools assigned to treatment or control conditions. However, to simplify the analysis for this course, the simulated dataset uses individual-level randomization, where each student was independently assigned to treatment or control. You should analyze this data accordingly (i.e., you do not need to account for clustering in your analysis).
| Variable | Type | Range | Description |
|---|---|---|---|
| `student_id` | Integer | 1–800 | Unique student identifier |
| `grade` | Integer | 3, 6, 9, or 10 | Student’s grade level |
| `treatment` | Binary | 0 or 1 | Treatment assignment. 1 = received the food assistance program (breakfast, lunch, and snacks provided daily); 0 = control (no food program) |
| `eqao_score` | Continuous | 0–100 | Student’s EQAO standardized test score (percentage). This is the primary academic outcome. |
| `behaviour_term1` | Continuous | 1.0–5.0 | Teacher-rated behavioural engagement score at the beginning of the school year (Term 1). Measured on a Likert-type scale where 1 = very disengaged and 5 = very engaged. |
| `behaviour_term2` | Continuous | 1.0–5.0 | Teacher-rated behavioural engagement score at the end of the school year (Term 2). Same scale as `behaviour_term1`. Note: some values are missing. |
| `socioeconomic_index` | Continuous | 1.0–5.0 | Composite index of household socioeconomic status. Higher values indicate higher socioeconomic status. Derived from parental income, education, and neighbourhood indicators. |
| `baseline_nourishment` | Continuous | 1.0–5.0 | Self-reported baseline nourishment adequacy at the start of the study, before intervention. Higher values indicate better baseline food security and nutritional intake. |
| `attendance_rate` | Continuous | 0–100 | Percentage of school days attended during the study period. |
| `home_experience` | Continuous | 1.0–5.0 | Composite rating of the student’s home learning environment, including parental involvement, access to learning materials, and study space. Higher values indicate a more supportive home environment. |
| `requested_snack` | Binary | 0 or 1 | Whether the student requested additional snacks beyond the standard program offerings. Only applicable to students in the treatment group (treatment = 1). Values are missing (NA) for control group students, as they were not part of the food program. |

`behaviour_term2` has approximately 8% missing values. These represent cases where the end-of-year teacher rating was not completed (e.g., teacher turnover, student transfers mid-year, or incomplete forms).

`requested_snack` is NA for all control group students by design, as this variable only applies to participants in the food program.

```{r}
library(readr)
library(dplyr)  # provides %>% and filter(), used in the chunks below

# Load the dataset and preview the first rows
data <- read_csv("group5_data.csv")
head(data)
```
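
The missing-data pattern described above can be checked directly once the file is loaded. The following is a minimal sketch in base R, not part of the original materials: `missing_by_arm()` is a hypothetical helper, and `toy` is a made-up data frame used only so the block runs on its own.

```r
# Hypothetical helper (not from the original materials): counts and shares
# of missing values for the chosen columns, separately for each treatment arm.
missing_by_arm <- function(df, cols, arm = "treatment") {
  rows <- lapply(split(df, df[[arm]]), function(g) {
    data.frame(
      arm          = g[[arm]][1],
      variable     = cols,
      n_missing    = vapply(cols, function(v) sum(is.na(g[[v]])),
                            integer(1), USE.NAMES = FALSE),
      prop_missing = vapply(cols, function(v) mean(is.na(g[[v]])),
                            numeric(1), USE.NAMES = FALSE)
    )
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

# Toy data with made-up values, for illustration only (NOT the study data).
toy <- data.frame(
  treatment       = c(0, 0, 1, 1),
  behaviour_term2 = c(3.5, NA, 4.0, 2.5),
  requested_snack = c(NA, NA, 1, 0)
)

missing_summary <- missing_by_arm(toy, c("behaviour_term2", "requested_snack"))
print(missing_summary)
```

With the real dataset, calling `missing_by_arm(data, c("behaviour_term2", "requested_snack"))` should show `requested_snack` entirely missing in the control arm and not missing in the treatment arm, if the data match the description above.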
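
```{r}
# Students in the control group (no food program)
control_data <- data %>%
  filter(treatment == 0)
control_data
```

```{r}
# Students who received the food assistance program
treatment_data <- data %>%
  filter(treatment == 1)
treatment_data
```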
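
Because students in this simulated dataset were randomized individually, one reasonable starting point is an unadjusted comparison of mean `eqao_score` between the two arms, with no clustering correction. The sketch below is an illustration under that assumption, not a prescribed analysis: `toy_scores`, its sample size, and its means are made-up values chosen only so the block runs standalone.

```r
# Made-up example data so the block runs on its own (NOT the study data).
set.seed(1)
toy_scores <- data.frame(
  treatment  = rep(c(0, 1), each = 50),
  eqao_score = c(rnorm(50, mean = 65, sd = 10),
                 rnorm(50, mean = 70, sd = 10))
)

# Welch two-sample t-test (t.test() does not assume equal variances by default).
welch <- t.test(eqao_score ~ treatment, data = toy_scores)

# The same comparison as a regression: the coefficient on `treatment`
# is the difference in mean scores (treatment minus control).
fit <- lm(eqao_score ~ treatment, data = toy_scores)

# Difference in means computed by hand, for comparison with the model.
diff_means <- mean(toy_scores$eqao_score[toy_scores$treatment == 1]) -
  mean(toy_scores$eqao_score[toy_scores$treatment == 0])

print(welch)
print(coef(summary(fit)))
```

To apply this to the real data, replace `toy_scores` with `data`. The regression form extends naturally if you later want to add baseline covariates such as `socioeconomic_index` or `baseline_nourishment`.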