1 Intro

This report documents explorations of the new Kinedu milestone norming survey, a large cross-sectional sample of parents collected in Mexico. These data are useful both for the Kinedu app and for general research on development.

Kinedu Goals

Are there any milestones that are nonlinear (perhaps indicating milestones that are not very diagnostic of development)?
At what month does each milestone show up? This is useful for serving milestones and categories appropriately.
Can we get the same amount of information from a smaller number of questions? What questions can be dropped?

Scientific Goals

Examine socioeconomic effects. Do parent reports of different milestones vary across socioeconomic status (indexed by maternal education)?
Examine correlations between milestones. Can we find developmental correlations via correlations between milestones?

knitr::opts_chunk$set(echo=FALSE, warning=FALSE, cache=TRUE, message=FALSE, 
                      sanitize = TRUE)

There are 323 milestones and 799 children.

2 Descriptives

We begin with some descriptions of the dataset. The age distribution is fairly even across months, with some underrepresentation at the low end.

The maternal education distribution is heavily skewed towards college and higher, so we can’t learn much about socioeconomic effects here. Note that I have rebinned education into a small number of bins, such that “Carrera” and “Maestría/Doctorado” are grouped as “College and Above,” and “Carrera técnica,” “Comenzó universidad” and “Preparatoria” are grouped as “Secondary.”
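The rebinning described above can be sketched as a simple lookup. This is only an illustration: the names `education_bins` and `rebin_education` are hypothetical, and only the labels named in the text are mapped, so any other label (including whatever falls below secondary) comes back as `NA` here.

```r
# Hypothetical lookup from raw survey labels to the coarser bins described
# in the text. Labels not listed in the report are left unmapped (NA).
education_bins <- c(
  "Carrera"             = "College and Above",
  "Maestría/Doctorado"  = "College and Above",
  "Carrera técnica"     = "Secondary",
  "Comenzó universidad" = "Secondary",
  "Preparatoria"        = "Secondary"
)

rebin_education <- function(raw_labels) {
  # as.character() guards against factors, which would otherwise index by code
  unname(education_bins[as.character(raw_labels)])
}

rebinned <- rebin_education(c("Carrera", "Preparatoria", "Maestría/Doctorado"))
table(rebinned)
```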

3 Growth Curves

We next turn to an analysis of individual children’s total milestones achieved. This analysis yields classic growth curves including percentiles. It’s clear from looking at these curves that 1) overall the survey is working well, but 2) there is some overreporting by parents of quite young infants. This second observation refers to the group of points well above the percentile curves. These parents seem to be misinterpreting the task.
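For readers who want to reproduce something like these curves, here is a minimal sketch that computes raw empirical percentiles of total milestones within each month of age. It is not necessarily how the plotted curves were made (those may well be smoothed); `age_months`, `total`, and the chosen percentiles are all assumptions for illustration.

```r
# Empirical percentiles of total milestones within each month of age.
# `age_months` and `total` are hypothetical names, not the report's.
percentiles_by_age <- function(age_months, total,
                               probs = c(0.10, 0.25, 0.50, 0.75, 0.90)) {
  by_age <- split(total, age_months)
  # One row of quantiles per age group; rbind keeps a matrix in all cases
  out <- do.call(rbind, lapply(by_age, quantile, probs = probs, names = FALSE))
  colnames(out) <- paste0("p", probs * 100)
  data.frame(age_months = as.numeric(names(by_age)), out, row.names = NULL)
}

# Toy data, for illustration only.
toy <- data.frame(age_months = c(1, 1, 1, 2, 2, 2),
                  total      = c(2, 4, 6, 10, 12, 20))
pct <- percentiles_by_age(toy$age_months, toy$total)
pct
```

Children whose totals sit far above the top percentile for their age would be the overreporting candidates mentioned above.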

Next, we break this down by maternal education. Note that we have almost no data on the “below secondary” category where we might expect more differences, so for clarity I’ve removed that group.

The percentile range appears more compressed for the secondary-educated group, but that is likely because we have much less data for that group.

We can also compare the medians for each group to see whether the groups differ in central tendency. This measure is more robust across sample sizes, and it shows no clear differences.

4 Milestone Analysis

In this section we focus on individual milestones and their growth pattern.

4.1 Age of Acquisition

We start by defining age of acquisition. Age of acquisition (AoA) is the first month in which a milestone is predicted to be observed in more than 50% of cases. We use robust regression for fitting acquisition curves. Here the red line shows the age of acquisition. In this case (milestone: Emite sonidos de consonantes (por ejemplo, “m”)), it’s 4 months.
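As a rough sketch of this definition, the code below fits a curve of milestone achievement on age and returns the first whole month at which the predicted probability exceeds 0.5. Note the assumptions: ordinary logistic regression (`glm`) stands in for the robust regression used in the report, and the function name, the argument names, and the 0 to 24 month grid are all hypothetical.

```r
# Age of acquisition: first whole month with predicted probability > 0.5.
# Assumption: plain logistic regression replaces the report's robust fit.
estimate_aoa <- function(age_months, achieved, months = 0:24) {
  fit <- glm(achieved ~ age_months, family = binomial)
  pred <- predict(fit, newdata = data.frame(age_months = months),
                  type = "response")
  above <- months[pred > 0.5]
  if (length(above) == 0) NA_integer_ else min(above)
}

# Toy data, for illustration only: 4 children at each age from 1 to 8 months,
# with 0, 0, 1, 1, 3, 3, 4, 4 of them showing the milestone.
toy_age <- rep(1:8, each = 4)
toy_achieved <- unlist(Map(function(k) c(rep(1, k), rep(0, 4 - k)),
                           c(0, 0, 1, 1, 3, 3, 4, 4)))
toy_aoa <- estimate_aoa(toy_age, toy_achieved)
toy_aoa
```

The toy data are symmetric around 4.5 months, so the fitted curve crosses 50% between months 4 and 5 and the function returns 5.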

Now let’s see what the distribution of AoAs looks like across milestones. It looks like a good number of milestones are reported by more than half of parents of one-month-olds. This may reflect an over-representation of true newborn milestones (e.g., reflexes) or unclear milestone questions that everyone answers correctly.

Here’s a table of milestones that have AoAs less than 1 month. It seems clear that some of these, e.g. “Presta atención a conversaciones,” are just not capturing much.

4.2 AoA and Milestone Trajectories

Now let’s look at sample milestone trajectories. (Note that I’m going to have to truncate a lot of milestone names to get them to fit.) Here are a few.

Let’s look at milestones’ age slope. This tells us about how much they are changing. We can use this as one way to weed out “bad” milestones – they have late AoAs but low age slopes.
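One way to make this criterion concrete is sketched below. Treat all of it as assumption: the age slope is taken to be the age coefficient of a plain logistic regression (log-odds per month), and the function names, column names, and both cutoffs are placeholders rather than values used in this report.

```r
# Age slope: coefficient on age from a logistic fit (log-odds per month).
# Assumption: a stand-in for however the report's slopes are computed.
age_slope <- function(age_months, achieved) {
  fit <- glm(achieved ~ age_months, family = binomial)
  unname(coef(fit)["age_months"])
}

# Flag milestones with a late AoA but a shallow slope.
# Both cutoffs are illustrative placeholders.
flag_weak_milestones <- function(summary_df, min_aoa = 6, max_slope = 0.2) {
  keep <- which(summary_df$aoa >= min_aoa & summary_df$slope <= max_slope)
  summary_df[keep, ]
}

# Toy per-milestone summary, for illustration only.
toy_summary <- data.frame(
  milestone = c("A", "B", "C"),
  aoa       = c(1, 12, 14),
  slope     = c(0.1, 0.9, 0.05)
)
flagged <- flag_weak_milestones(toy_summary)
flagged
```

In the toy table only milestone C is flagged: A is shallow but early, and B is late but steep.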

Here are some milestones with especially low age slopes. On a first look, they appear to be mostly milestones that are about very early infancy.

Let’s filter out the ones from early infancy (selecting those with higher AoA) and take a closer look.

The milestones that don’t perform well on this criterion look like they are 1) ones where someone could reasonably disagree about whether a child does that or 2) ones where many children may never do the particular behavior (“Camina con las piernas muy separadas y avanza meciéndose de lado a lado (como pingüino)”), so the asymptote is low. These may be candidates for deletion.

4.3 Milestones By Category

Next, let’s look at the coherence of the milestone categories.

4.4 Categories Split by SES

Let’s take a quick look at which categories are learned earlier or later depending on socio-economic status (maternal education). It’s hard to see many differences here, although there are some suggestive differences with the lowest group (probably spurious due to small sample).

5 Correlation Grouping of Milestones

In our next analysis, we group milestones that are correlated with one another. We do this using the partial Spearman’s correlation (correcting for age). We expect milestones to be grouped within category but between-category grouping could be interesting.
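For a single pair of milestones, a partial Spearman correlation can be sketched in base R as follows: rank both milestones and age, regress each milestone's ranks on the age ranks, and correlate the residuals. The function name and the toy columns (`age`, `m1`, `m2`) are hypothetical, and the actual analysis may use a package routine instead.

```r
# Partial Spearman correlation between x and y, controlling for z:
# rank all three, residualize the ranks of x and y on the ranks of z,
# then take the Pearson correlation of the residuals.
partial_spearman <- function(x, y, z) {
  rx <- rank(x)
  ry <- rank(y)
  rz <- rank(z)
  cor(resid(lm(rx ~ rz)), resid(lm(ry ~ rz)))
}

# Toy data, for illustration only: two 0/1 milestones and age in months.
toy <- data.frame(
  age = 1:10,
  m1  = c(0, 0, 0, 1, 0, 1, 1, 0, 1, 1),
  m2  = c(0, 0, 1, 0, 0, 1, 1, 1, 1, 1)
)
rho_partial <- partial_spearman(toy$m1, toy$m2, toy$age)
rho_partial
```

Applying this to every pair of milestones is what makes the full analysis slow, since the number of pairs grows with the square of the number of milestones.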

Note that this analysis is a bit computationally intensive, so we cache the results offline.

The full correlation matrix is pretty but not that informative, so it’s not shown here.

As expected, average milestone-to-milestone correlation is higher for milestones in the same category (red). But it’s not always overwhelming.
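The within- versus between-category comparison can be sketched like this, given a milestone-by-milestone correlation matrix and a category label for each milestone. The names `cor_mat` and `category`, and the toy category labels, are assumptions for illustration.

```r
# Mean correlation for same-category vs different-category milestone pairs.
# `cor_mat` is a square milestone-by-milestone matrix; `category` gives the
# category of each row/column. Both names are hypothetical.
mean_cor_by_pair_type <- function(cor_mat, category) {
  same  <- outer(category, category, "==")
  upper <- upper.tri(cor_mat)  # count each pair once and skip the diagonal
  c(within  = mean(cor_mat[upper & same]),
    between = mean(cor_mat[upper & !same]))
}

# Toy correlation matrix, for illustration only.
toy_cor <- matrix(c(1.0, 0.6, 0.1,
                    0.6, 1.0, 0.2,
                    0.1, 0.2, 1.0), nrow = 3, byrow = TRUE)
pair_means <- mean_cor_by_pair_type(toy_cor, c("motor", "motor", "language"))
pair_means
```

In the toy matrix the single within-category pair has a correlation of 0.6, and the two between-category pairs average 0.15.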

Here are a few of the correlations that are high between category. This isn’t a great representation of this pattern, unfortunately.

A table is easier to read. Many of these are motor milestones, but there are a few neat correlations (e.g., sitting and object manipulation).

6 Psychometric Analysis

In this section we examine the psychometric reliability of categories and items, namely, 1) how well the categories hang together as constructs, and 2) how well the items are reflective of the categories.

The assumptions are that categories that are not reflective of a single construct may need to be revised, and items that are not reflective of the category may be problematic.

Both of these assumptions are subject to potential revision. First, a category (e.g. “socialization”) might simply not reflect a construct with a single developmental timecourse and might instead combine a variety of related constructs. Second, given this notion of category, a particular item that isn’t informative about the category may simply be different (but still informative about something else).

Nevertheless, in terms of test design, these models may be the best we can do, so if items are not performing well, we should probably look into them more deeply.

First, we look at scale reliabilities for each category.
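As one illustration of a scale reliability, here is a base-R sketch of Cronbach's alpha for the items in a single category. This is an assumption on my part: the text does not say which reliability statistic is computed, and the function name and toy matrix are hypothetical.

```r
# Cronbach's alpha for a children-by-items matrix of 0/1 responses.
# Assumption: alpha is used as a generic reliability index here; the
# report does not state which statistic it uses.
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)
  item_vars <- apply(items, 2, var)   # variance of each item
  total_var <- var(rowSums(items))    # variance of the summed scale
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Toy responses, for illustration only: 6 children by 3 items.
toy_items <- cbind(
  item1 = c(0, 0, 1, 1, 1, 1),
  item2 = c(0, 0, 0, 1, 1, 1),
  item3 = c(0, 1, 0, 0, 1, 1)
)
alpha_toy <- cronbach_alpha(toy_items)
alpha_toy
```

Running this per category gives one number per category, with higher values for categories whose items rise and fall together across children.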

Next, we examine the top (Walking, standing) and bottom (development of feelings, socialization) categories.

We then examine items in the “Socialización” category. This category mixes things like “laughs,” “pays attention to faces,” and “looks at the person who is talking” (which are relatively early) with much later ones like “identifies the feelings of people in images.”

Some of these may be useful milestones, but socialization as a category doesn’t really work for putting them together.

Kinedu Milestone Reports

Mike Frank

Rendered 2017-03-10