Local Authority School Readiness Analysis

Context

This analysis supports a project to identify and spread good practice in improving school readiness. The aim is to identify areas in Greater Manchester that are performing well, and then to carry out case studies in those areas to find out what lies behind their good performance.

Outline

The approach here is to identify areas (at local authority and ward levels) that have higher rates of school readiness than would be expected from their levels of deprivation. The aim would be to do some case studies in these areas to try to find out what they are doing that is making the difference.

Aims and objectives

The aim of this project is to identify areas that are performing better than would be expected based on deprivation alone.

The objectives that support this aim are:

Get school readiness and deprivation data
Merge into a single data set
Plot and fit regression models to identify areas with big positive residuals in Greater Manchester.

1. Getting the data

1.1 School readiness data

School readiness data are available from Public Health England’s (PHE) Fingertips database. Here I will use the fingertipsR package to get the data.

# Load packages
library(fingertipsR)
library(tidyverse)
library(knitr)
library(cowplot)

# Identify school readiness indicators from fingertips
inds <- indicators()
kable(inds %>% filter(grepl("([Ss]chool readiness)|(ASQ)", IndicatorName)) %>%
               distinct(IndicatorID, .keep_all = TRUE) %>%
               select(IndicatorID, IndicatorName))

IndicatorID	IndicatorName
92543	Proportion of children aged 2-2½yrs receiving ASQ-3 as part of the Healthy Child Programme or integrated review
90631	School readiness: Good level of development at age 5
90632	School readiness: Good level of development at age 5 with free school meal status
90633	School readiness: Year 1 pupils achieving the expected level in the phonics screening check
90634	School readiness: Year 1 pupils with free school meal status achieving the expected level in the phonics screening check

The table identifies five possible indicators. One measures child development at 2-2.5 years, but this is a process measure (it counts those receiving the assessment, not how they were assessed). Two measure child development at age 5 (one considering all children, one considering those eligible for free school meals). Two consider performance on phonics specifically (again, one considering all children, one considering those eligible for free school meals). All indicators are rates (percentages) of children meeting a defined target.

For simplicity, I will not use the phonics indicators or the ASQ-3 indicator. The remaining two indicators should help to identify areas that are performing well at supporting child development at 5 years, and that are minimising inequalities in child development at that age. The data are available for boys and girls separately, but for simplicity I will not analyse them separately by gender.

# Extract indicators 90631 and 90632 (school readiness at age 5: all children and FSM)
sr_ids <- c(90631, 90632)
readiness <- fingertips_data(IndicatorID = sr_ids) %>%
             filter(AreaType == "County & UA",
                    Sex == "Persons") %>%
             select(IndicatorName,
                    AreaName,
                    AreaCode,
                    ParentName,
                    Timeperiod,
                    Denominator,
                    Value) %>%
             gather(key = "Key1",
                    value = "Val",
                    Denominator, Value) %>%
             unite(Key2, IndicatorName, Key1) %>%
             spread(Key2, Val) 

# Fix the long names
names(readiness)[5:8] <- c("n_all",
                           "school_readiness",
                           "n_fsm",
                           "school_readiness_fsm")
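
The positional rename above relies on the reshaped columns coming out in a particular order. A minimal sketch of a name-based alternative is shown below. It uses a toy data frame with made-up values and a hypothetical helper, `short_name()`; neither is part of the original analysis.

```r
# Column names in the form produced by unite(): "<IndicatorName>_<Denominator|Value>"
long_names <- c(
  "School readiness: Good level of development at age 5_Denominator",
  "School readiness: Good level of development at age 5_Value",
  "School readiness: Good level of development at age 5 with free school meal status_Denominator",
  "School readiness: Good level of development at age 5 with free school meal status_Value"
)

# Toy data with made-up values; columns deliberately placed in a shuffled order
toy <- data.frame(AreaName = c("Area A", "Area B"),
                  v1 = c(400, 900), v2 = c(3000, 5000),
                  v3 = c(55, 58), v4 = c(70, 65),
                  stringsAsFactors = FALSE)
names(toy)[2:5] <- long_names[c(3, 1, 4, 2)]

# Hypothetical helper: derive the short name from the content of the long name
short_name <- function(x) {
  fsm   <- grepl("free school meal", x, fixed = TRUE)
  denom <- grepl("_Denominator$", x)
  ifelse(denom,
         ifelse(fsm, "n_fsm", "n_all"),
         ifelse(fsm, "school_readiness_fsm", "school_readiness"))
}

# Rename only the school readiness columns, whatever order they arrive in
is_sr <- grepl("^School readiness", names(toy))
names(toy)[is_sr] <- short_name(names(toy)[is_sr])
names(toy)
```

Matching on the name content means the short names stay correct even if the column order changes between package versions or locales.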

1.2 Deprivation data

There are a number of choices to be made about how to adjust for deprivation:

Whether to use the overarching English Indices of Multiple Deprivation (IMD) score/rank or to use the sub-domains of deprivation (which give a richer picture of the type of deprivation facing an area);
Whether to use the scores or ranks.

For simplicity, I will use the overall IMD score in the first instance (as it’s available via fingertips).

# Get the indicator numbers for the IMD scores
imd_ind <- inds %>% filter(grepl("Deprivation score", IndicatorName)) %>% 
                    distinct(IndicatorID)

dep <- fingertips_data(IndicatorID = 91872) %>%
       filter(AreaType == "County & UA") %>%
       select(IndicatorName,
              AreaName,
              AreaCode,
              Value) %>%
       spread(IndicatorName, Value) %>%
       rename(imd_2015 = `Deprivation score (IMD 2015)`)

# Merge with school readiness data
sr_data <- merge(readiness, dep,
                 by = c("AreaName", "AreaCode"))

# Flag the ten Greater Manchester (GM) local authorities
gm <- c("Bolton", "Bury", "Manchester", "Oldham", "Rochdale", "Salford",
        "Stockport", "Tameside", "Trafford", "Wigan")

sr_data$gm <- sr_data$AreaName %in% gm
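
Note that `merge()` performs an inner join by default, so any area missing from either table drops out without warning. The sketch below, which uses made-up area codes and values rather than real Fingertips data, shows one way to check for this:

```r
# Toy inputs with made-up codes and values (not real Fingertips data)
readiness_toy <- data.frame(
  AreaName = c("Bolton", "Bury", "Area X"),
  AreaCode = c("C001", "C002", "C003"),
  school_readiness = c(66, 70, 72),
  stringsAsFactors = FALSE
)
dep_toy <- data.frame(
  AreaName = c("Bolton", "Bury", "Area Y"),
  AreaCode = c("C001", "C002", "C004"),
  imd_2015 = c(28, 22, 15),
  stringsAsFactors = FALSE
)

# Inner join: only areas present in both tables survive
merged_toy <- merge(readiness_toy, dep_toy, by = c("AreaName", "AreaCode"))

# Areas that were lost from each side of the join
dropped_readiness <- setdiff(readiness_toy$AreaCode, dep_toy$AreaCode)
dropped_dep       <- setdiff(dep_toy$AreaCode, readiness_toy$AreaCode)

# Same style of membership flag as used for Greater Manchester
merged_toy$gm <- merged_toy$AreaName %in% c("Bolton", "Bury")
```

If either `dropped_` vector is non-empty, it is worth checking whether the missing areas reflect boundary changes or differences in area naming before going further.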

2. Plotting the data

The next step is to visualise the relationship between IMD score and the two outcome variables. For simplicity I will use the most recent year for which data are available (2017/18). I’ll also size the points by the number of children measured in that year.
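
The plotting code hard-codes the most recent period. As a rough sketch, assuming the period labels all follow the same "YYYY/YY" pattern, the latest period could instead be picked programmatically. The toy data here is made up for illustration:

```r
# Toy data with made-up values; Timeperiod uses the same "YYYY/YY" labels
toy <- data.frame(
  AreaName = rep(c("Area A", "Area B"), each = 3),
  Timeperiod = rep(c("2015/16", "2016/17", "2017/18"), times = 2),
  school_readiness = c(64, 66, 68, 70, 71, 72),
  stringsAsFactors = FALSE
)

# Labels in a consistent "YYYY/YY" form sort correctly as plain strings
latest_period <- max(toy$Timeperiod)

# Keep only rows for the latest period
latest <- toy[toy$Timeperiod == latest_period, ]
```

This avoids having to update the filter in several places when a new year of data is published.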

# Plot school readiness against IMD score, weighted by number of children
g1 <- ggplot(data = filter(sr_data,
                           Timeperiod == "2017/18"),
             aes(x = imd_2015,
                 y = school_readiness,
                 colour = gm,
                 size = n_all)) +
      geom_point() +
      theme_bw() +
      scale_colour_brewer(palette = "Set2") +
      scale_size_continuous(range = c(0.5, 4)) +
      guides(colour = "none") +
      labs(title = "School readiness and deprivation",
           subtitle = "School readiness declines with\nincreasing deprivation...",
           x = "IMD (2015) score",
           y = "% 'good' development at 5 years",
           size = "n",
           caption = "Data source: PHE\nSchool readiness data for 2017/18\nIMD data is IMD 2015")

# Repeat the plot for children eligible for free school meals.
g2 <- ggplot(data = filter(sr_data,
                           Timeperiod == "2017/18"),
             aes(x = imd_2015,
                 y = school_readiness_fsm,
                 colour = gm,
                 size = n_fsm)) +
      geom_point() +
      theme_bw() +
      scale_colour_brewer(palette = "Set2") +
      scale_size_continuous(range = c(0.5, 4)) +
      guides(colour = "none") +
      labs(title = "School readiness and deprivation",
           subtitle = "...but school readiness among those\neligible for free school meals increases...",
           x = "IMD (2015) score",
           y = "% 'good' development at 5 years",
           size = "n",
           caption = "Data source: PHE\nSchool readiness data for 2017/18\nIMD data is IMD 2015")

plot_grid(g1, g2, labels = "AUTO")

The plots suggest that:

Overall child development at 5 years declines as deprivation increases (plot A)…
…But among those eligible for free school meals, child development appears to improve with increasing deprivation (although the relationship appears weaker; plot B)
Greater Manchester local authorities generally perform worse than expected based on their IMD scores. The only exception is Trafford.

Across the North West, there are areas that perform better than expected based on deprivation alone (Trafford, Blackpool, and Knowsley).

3. Fitting regression models

In the first instance, I will use a simple bivariate linear model describing the relationship between school readiness and IMD scores. As a composite index, the IMD adjusts for a range of factors. While this is not ideal, it will suffice to crudely identify areas that may be doing something right. I will also restrict the analysis to 2017/18 data (which avoids the need to adjust for year fixed effects and deal with correlated error terms). I’ll also weight the analysis by the number of children measured in each area.

# School readiness and IMD score
m1 <- lm(school_readiness ~ imd_2015,
         weights = n_all,
         data = filter(sr_data,
                       Timeperiod == "2017/18"))

summary(m1)

## 
## Call:
## lm(formula = school_readiness ~ imd_2015, data = filter(sr_data, 
##     Timeperiod == "2017/18"), weights = n_all)
## 
## Weighted Residuals:
##     Min      1Q  Median      3Q     Max 
## -468.08  -93.08    2.95   82.57  491.80 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 77.25969    0.58945  131.07   <2e-16 ***
## imd_2015    -0.25865    0.02503  -10.34   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 163 on 149 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.4176, Adjusted R-squared:  0.4137 
## F-statistic: 106.8 on 1 and 149 DF,  p-value: < 2.2e-16

# School readiness among those eligible for free school meals and IMD score
m2 <- lm(school_readiness_fsm ~ imd_2015,
         weights = n_fsm,
         data = filter(sr_data,
                       Timeperiod == "2017/18"))

summary(m2)

## 
## Call:
## lm(formula = school_readiness_fsm ~ imd_2015, data = filter(sr_data, 
##     Timeperiod == "2017/18"), weights = n_fsm)
## 
## Weighted Residuals:
##     Min      1Q  Median      3Q     Max 
## -364.49  -92.04    1.66   75.89  346.26 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 51.49560    1.29421  39.789  < 2e-16 ***
## imd_2015     0.20435    0.04958   4.122 6.24e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 120.7 on 148 degrees of freedom
##   (2 observations deleted due to missingness)
## Multiple R-squared:  0.103,  Adjusted R-squared:  0.09691 
## F-statistic: 16.99 on 1 and 148 DF,  p-value: 6.241e-05
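
The "Weighted Residuals" and "Residual standard error" in these summaries are on a much larger scale than the percentage-point outcome. This is because, for a weighted `lm()` fit, each raw residual is multiplied by the square root of its weight (here, the number of children). A short sketch on synthetic data (made-up values, not the Fingertips data) illustrates the relationship:

```r
set.seed(42)
n_areas <- 30

# Synthetic areas: made-up deprivation scores, cohort sizes and outcomes
toy <- data.frame(
  imd = runif(n_areas, 5, 45),
  n   = round(runif(n_areas, 1000, 10000))
)
toy$sr <- 77 - 0.25 * toy$imd + rnorm(n_areas, sd = 2)

fit <- lm(sr ~ imd, weights = n, data = toy)

raw_res      <- residuals(fit)           # observed minus fitted, in % points
weighted_res <- weighted.residuals(fit)  # the quantity summary() tabulates

# Weighted residual = sqrt(weight) * raw residual
same_residuals <- all.equal(unname(weighted_res),
                            unname(sqrt(toy$n) * raw_res))

# The residual standard error is on the same weighted scale
sigma_w    <- sqrt(sum(toy$n * raw_res^2) / df.residual(fit))
same_sigma <- all.equal(sigma_w, summary(fit)$sigma)
```

For judging how far an individual local authority sits from the line, the raw residuals (in percentage points) are therefore easier to interpret than the weighted ones.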

In both cases, the relationship between deprivation and school readiness is statistically significant. The relationship between overall child development at 5 years old and deprivation is negative (school readiness declines with increasing deprivation) and moderately strong (IMD score explains around 42% of the variance in child development at 5 years old). Among those eligible for free school meals, the relationship between child development at 5 years old and deprivation is positive (school readiness increases with increasing deprivation) and relatively weak (IMD score explains only about 10% of the variance in child development at 5 years old).

The positive association between deprivation and child development among those eligible for free school meals is unexpected. This might be a form of ecological fallacy, i.e. in more deprived local authorities the children who are eligible for free school meals might differ as a group from those eligible for free school meals in less deprived areas. Alternatively, it could be because schools in more deprived areas perform better for this group of children (possibly by having more experience supporting children in low income families, or by recruiting staff with a greater commitment to supporting children from deprived backgrounds).

It is worth re-plotting the data to illustrate how local authorities are performing relative to what would be expected given their level of deprivation.

# Plot the data with regression lines
g1 <- g1 +
      geom_abline(intercept = m1$coefficients[1],
                  slope = m1$coefficients[2],
                  colour = "red")

g2 <- g2 +
      geom_abline(intercept = m2$coefficients[1],
                  slope = m2$coefficients[2],
                  colour = "red") 

plot_grid(g1, g2, labels = "AUTO")

From this we can see that within Greater Manchester, only Trafford outperforms what would be expected given its level of deprivation. Manchester performs about as well as would be expected. We can get a table of residuals that shows how each local authority performs relative to expectations based on deprivation. I’ll standardise them while I’m at it.

# Get a table of residuals in school readiness - all children
filter(sr_data,
       Timeperiod == "2017/18") %>%
  filter(!is.na(school_readiness)) %>%
  mutate(Residual = m1$residuals/sd(m1$residuals)) %>%
  filter(ParentName == "North West region") %>%
  select(AreaName, Residual) %>%
  arrange(desc(Residual))

# Get a table of residuals in school readiness - children eligible for fsm
filter(sr_data,
       Timeperiod == "2017/18") %>%
  filter(!is.na(school_readiness_fsm)) %>%
  mutate(Residual_fsm = m2$residuals/sd(m2$residuals)) %>%
  filter(ParentName == "North West region") %>%
  select(AreaName, Residual_fsm) %>%
  arrange(desc(Residual_fsm))
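
The tables above assume that the rows left after `filter(!is.na(...))` line up exactly with the residuals stored in the model. That holds only if the outcome is the sole source of missing values. One possible way to keep residuals aligned with their rows is to fit with `na.action = na.exclude`, sketched here on made-up data (the area labels and values are illustrative only):

```r
# Toy data with made-up values; one row is missing the predictor, one the outcome
toy <- data.frame(
  area = paste("Area", 1:8),
  imd  = c(10, 15, 20, NA, 30, 35, 40, 45),
  n    = c(3000, 2500, 4000, 3500, 5000, 2000, 4500, 6000),
  sr   = c(75, 73, NA, 70, 69, 71, 66, 64),
  stringsAsFactors = FALSE
)

# na.exclude drops incomplete rows when fitting but pads residuals() with NA,
# so the result has one entry per row of the original data
fit <- lm(sr ~ imd, weights = n, data = toy, na.action = na.exclude)

toy$residual     <- residuals(fit)
toy$std_residual <- toy$residual / sd(toy$residual, na.rm = TRUE)

# Rank areas from best to worst relative to expectation, dropping NA rows
ranked <- toy[order(toy$std_residual, decreasing = TRUE, na.last = NA),
              c("area", "std_residual")]
ranked
```

Because the residuals are attached to the full data frame before any filtering, later steps such as restricting to one region cannot misalign areas and residuals.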

In terms of school readiness among children eligible for free school meals, only Manchester and Salford perform better than expected on the basis of deprivation scores. Notably, Trafford, which performs well on overall school readiness, appears to perform poorly when it comes to children eligible for free school meals.

Discussion and conclusions

Based on this preliminary analysis, it appears that only one local authority in Greater Manchester (Trafford) performs above what would be expected from its level of deprivation, and that area appears to perform poorly for children who are eligible for free school meals. Within the North West, Blackpool and Knowsley, and to a lesser extent Warrington, also outperform expectations.

There are a number of caveats that must be borne in mind:

This is an ecological analysis, so treat with caution.
It may be that there are smaller areas within Greater Manchester that do outperform expectations - i.e. local authorities are too big an area to identify good practice.

Ideally, I’d like to repeat the analysis using smaller areas within local authorities in a multilevel model.