1 Filter data

1.0.1 Nigeria 2013 DHS dataset as downloaded

## [1] 38948

1.0.2 Drop if not within ages 18-44

## filter: removed 8,501 rows (22%), 30,447 rows remaining
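
The filter messages in this section report the rows removed and remaining at each step. A minimal base R sketch of that pattern follows, using toy data rather than the DHS file. The helper `filter_report` and the column `age` are hypothetical; in the DHS recode the age variable is assumed to be v012.

```r
# Hypothetical helper that mimics the "filter: removed ..." messages.
# `keep` is a logical vector; NA is treated as "do not keep".
filter_report <- function(df, keep) {
  keep <- !is.na(keep) & keep
  removed <- sum(!keep)
  message(sprintf("filter: removed %s rows (%.0f%%), %s rows remaining",
                  format(removed, big.mark = ","),
                  100 * removed / nrow(df),
                  format(sum(keep), big.mark = ",")))
  df[keep, , drop = FALSE]
}

# Toy data; `age` stands in for the DHS age variable (assumed to be v012).
toy <- data.frame(id = 1:6, age = c(15, 18, 30, 44, 45, 49))
adults <- filter_report(toy, toy$age >= 18 & toy$age <= 44)
```

Each later criterion would be applied the same way to the result of the previous step, so the "rows remaining" of one step is the starting count of the next.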

1.0.3 Drop if currently pregnant

## filter: removed 4,205 rows (14%), 26,242 rows remaining

1.0.4 Drop if not partnered

## filter: removed 6,943 rows (26%), 19,299 rows remaining

1.0.5 Drop if doesn’t live with partner

## filter: removed 2,126 rows (11%), 17,173 rows remaining

1.0.6 Drop if hasn’t had sex in last month

## filter: removed 3,738 rows (22%), 13,435 rows remaining

1.0.7 Drop if using contraception

## filter: removed 2,786 rows (21%), 10,649 rows remaining

1.0.8 Drop if not nulliparous

## filter: removed 9,793 rows (92%), 856 rows remaining

1.0.9 Drop if had hysterectomy or is menopausal

I believe there is an inconsistency here: the table says that none are dropped, but some women meet the criterion for dropping. I assume this means that this criterion was not used; perhaps the “Not followed” in the cell below this one (for “Drop if never menstruated”) was meant to refer to this cell instead?

## mutate: new variable 'in menopause/had hysterectomy' with 2 unique values and 0% NA
## group_by: one grouping variable (in menopause/had hysterectomy)
## summarise: now 2 rows and 2 columns, ungrouped
## # A tibble: 2 x 2
##   `in menopause/had hysterectomy`  freq
##   <lgl>                           <int>
## 1 FALSE                             854
## 2 TRUE                                2

In order to keep the case counts consistent with the table, I do not filter based on v215.
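
The tabulation above can be sketched as follows on toy data. This is an illustration only: the coding of v215 is an assumption (994 taken to mean "in menopause/had hysterectomy", 996 "never menstruated"), and the toy values are invented.

```r
# Toy values of v215 (time since last menstrual period).
# ASSUMPTION: 994 = "in menopause/had hysterectomy", 996 = "never menstruated".
toy <- data.frame(v215 = c(101, 994, 203, 996, 102, 994))

# Flag the cases that would meet the dropping criterion, without dropping them.
toy$in_menopause <- toy$v215 == 994

# Count cases per flag value, as in the summary above.
freq <- as.data.frame(table(in_menopause = toy$in_menopause),
                      responseName = "freq")
print(freq)
```

Flagging rather than filtering keeps the case counts unchanged while still showing how many women the criterion would have removed.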

1.0.10 Drop if never menstruated

There is an inconsistency here: the table says this was not followed, but these women appear to be dropped in the table to meet subsequent case counts.

## filter: removed 19 rows (2%), 837 rows remaining

1.0.11 Drop if v530 flagged an inconsistency in information reported on recency of sexual activity

## filter: no rows removed

1.0.12 Drop if currently postpartum amenorrheic

## filter: no rows removed

1.0.13 Drop if not married only once

## filter: removed 81 rows (10%), 756 rows remaining

2 Compute current durations

## mutate: changed 756 values (100%) of 'vcal_1' (0 new NA)

We now create the three values necessary for computing the current duration:

  • Calendar length (Thoma: used when there are no events in the calendar and cohabitation began before the calendar)
  • Months since first cohabitation (interview cmc minus cohabitation cmc)
  • Months since last relevant event in reproductive calendar (position of the first nonzero character in the reproductive calendar, minus 1)

## mutate: new variable 'calendar_length' with 6 unique values and 0% NA
##         new variable 'first_cohab_months' with 186 unique values and 0% NA
##         new variable 'last_event' with 41 unique values and 84% NA
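
A minimal sketch of these three computations on toy records is given below. The column names `interview_cmc`, `cohab_cmc`, and `cal` are hypothetical stand-ins (in the DHS recode they are assumed to correspond to v008, v509, and the trimmed vcal_1), and the calendar strings are shortened for readability.

```r
# Toy records. ASSUMPTIONS: interview_cmc and cohab_cmc are century-month
# codes; cal is the trimmed calendar string with the interview month first
# and "0" meaning "no event".
toy <- data.frame(
  interview_cmc = c(1360, 1360, 1360),
  cohab_cmc     = c(1358, 1200, 1340),
  cal           = c("000000", "000000", "00T000"),
  stringsAsFactors = FALSE
)

toy$calendar_length    <- nchar(toy$cal)
toy$first_cohab_months <- toy$interview_cmc - toy$cohab_cmc

# Position of the first non-"0" character, minus 1; NA when the calendar
# holds no events at all.
first_event    <- as.integer(regexpr("[^0]", toy$cal))
toy$last_event <- ifelse(first_event > 0, first_event - 1L, NA_integer_)
```

Calendars without any event yield `NA` for `last_event`, which is consistent with the high share of missing values reported for that variable.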

We now have what we need to compute the current duration. We take the minimum of first_cohab_months (cohabitation), last_event (calendar events), and calendar_length (start of calendar).

## mutate: new variable 'current_duration_months' with 68 unique values and 0% NA

It is worth breaking down the data by which event provided the current duration. There are very few ties (2); in case of ties, “Cohabitation” will be indicated.

## mutate: new variable 'last_event_type' with 8 unique values and 0% NA
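
The minimum and the tie rule can be sketched as follows on toy values. The column `event_code` is hypothetical (the calendar character found at `last_event`), and the handling of a tie between a calendar event and the calendar start is my assumption, since only the cohabitation tie rule is stated above.

```r
toy <- data.frame(
  first_cohab_months = c(2, 160, 20, 2),
  last_event         = c(NA, NA, 2, 2),
  calendar_length    = c(6, 6, 6, 6),
  event_code         = c(NA, NA, "T", "T"),  # hypothetical calendar code
  stringsAsFactors = FALSE
)

# Row-wise minimum, ignoring a missing last_event.
toy$current_duration_months <- pmin(toy$first_cohab_months, toy$last_event,
                                    toy$calendar_length, na.rm = TRUE)

# Cohabitation is checked first so that it wins ties.
toy$last_event_type <- ifelse(
  toy$current_duration_months == toy$first_cohab_months, "Cohab",
  ifelse(!is.na(toy$last_event) &
           toy$current_duration_months == toy$last_event,
         toy$event_code, "Cal start")
)
```

The fourth toy record is a tie between cohabitation and a calendar event, and it is labelled "Cohab" because that comparison is made first.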

Now we compute the Polis current duration, with the three-month “lag”.

## mutate: new variable 'current_duration_polis' with 68 unique values and 0% NA

If we were to eliminate the cases whose current duration is less than 3 months, which the lag excludes, how many would be removed?

## filter: removed 116 rows (15%), 640 rows remaining
## [1] 640
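
A sketch of the lag step on toy durations follows. It assumes that the lag is applied by subtracting 3 months and that cases with a negative result are the ones eliminated; this reading is consistent with the counts above but is not confirmed by the source.

```r
# ASSUMPTION: the three-month lag is applied by subtracting 3 from the
# current duration; cases with a negative result are eliminated.
lag_months <- 3
cd <- c(0, 1, 2, 3, 10, 47)   # toy current durations in months

cd_polis <- cd - lag_months
kept     <- cd_polis[cd_polis >= 0]
n_eliminated <- length(cd) - length(kept)
```

In this toy vector the durations 0, 1, and 2 fall below the lag, so three cases are eliminated and three are kept.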

Note that this disagrees with the WHO table, which lists 89 removed and 667 remaining. We removed 27 more than the WHO. Why might this be?

We examine the 756 cases to see if there are any groups that add up to 89.

##                 CD value
## Initiation event   0   1   2  ≥3
##        Cal start   0   0   0 174
##        Cohab      21  38  30 391
##        1           0   0   0   2
##        5           0   0   0   1
##        8           0   0   0   1
##        M           0   0   0   1
##        T           4  10  12  70
##        W           0   1   0   0

There are 89 cases whose current duration values are less than 3 (and hence would be eliminated by Polis et al.’s “lag”) and whose last relevant event was a cohabitation. Are these the ones eliminated? But then why not the 26 terminations (T) and the one case using “Other traditional methods” (W), which also have values below 3?
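
The arithmetic behind these counts can be checked by re-entering the table above as a matrix. The counts are copied from the table; only the object names are mine.

```r
counts <- matrix(
  c(  0,  0,  0, 174,
     21, 38, 30, 391,
      0,  0,  0,   2,
      0,  0,  0,   1,
      0,  0,  0,   1,
      0,  0,  0,   1,
      4, 10, 12,  70,
      0,  1,  0,   0),
  ncol = 4, byrow = TRUE,
  dimnames = list(event = c("Cal start", "Cohab", "1", "5", "8", "M", "T", "W"),
                  cd    = c("0", "1", "2", ">=3"))
)

# Cases per initiation event with a current duration below the 3-month lag.
under_lag <- rowSums(counts[, c("0", "1", "2")])

sum(counts)                          # 756 cases in total
under_lag[c("Cohab", "T", "W")]      # 89, 26, 1
sum(under_lag)                       # 116 removed by our filter
sum(under_lag) - under_lag["Cohab"]  # 27, the excess over the WHO table
```

The 27 extra removals are exactly the 26 terminations plus the single W case, which supports the conjecture that only the cohabitation cases were eliminated in the WHO table.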

There also seems to be a slight issue with the computation of the CD values, or at least with the visualization (maybe the histogram breaks?). Our histogram, shown below, has the same basic features as slide 52 in Keiding (2019) but is not identical:

I suspect slight differences in computation of the current duration.
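
To illustrate how the choice of breaks alone could change the picture, here is a small sketch with invented integer durations. With integer-valued data and integer breaks, every value sits on a bin edge, so whether bins are closed on the right or on the left moves cases between bars. This shows one possible cause only; it is not a diagnosis of the actual discrepancy.

```r
x <- c(0, 1, 1, 2, 3, 3, 3)   # toy integer current durations
breaks <- 0:4

# Default in hist(): bins closed on the right, (a, b], lowest edge included.
right_closed <- hist(x, breaks = breaks, right = TRUE,  plot = FALSE)$counts

# Bins closed on the left, [a, b), highest edge included.
left_closed  <- hist(x, breaks = breaks, right = FALSE, plot = FALSE)$counts

rbind(right_closed, left_closed)
```

The same seven values give bar heights 3, 1, 3, 0 with right-closed bins and 1, 2, 1, 3 with left-closed bins.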

3 Table of cases for visual checking

We will rename columns for compactness/clarity.


This document was compiled under R version 3.6.2 (2019-12-12).