Intracluster correlation and design effect: How do we calculate them, and what do they look like?
Illustrative example using Demographic and Health Surveys
Simple random sampling (SRS) is rarely feasible, and cluster sampling is an approach that balances real-world feasibility against the precision we want from an unbiased sample. While operationally and statistically smart, a downside of this approach is that:
- our sampled units tend to resemble each other within each cluster (i.e., the underlying similarity of individuals within a cluster, measured by the intracluster (intraclass) correlation, ICC, is greater than zero),
- the sample captures less of the population's variability than an SRS of the same size would, and
- we lose precision in our survey estimates.
For example, even if we sample 300 students through cluster sampling, our estimates may be only as precise as those from an SRS of 100 students. This relative inefficiency of a cluster sample compared to an SRS (equivalently, the factor by which the cluster sample size must be inflated to give the same precision as an SRS) is the design effect.
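To make the 300-versus-100 example concrete, the design effect can be written as a ratio of variances, and the effective sample size follows from it. The third expression is the commonly used approximation linking design effect to ICC when clusters are of roughly equal size. The symbols (n, n_eff, m, ρ, p-hat) are notation introduced here for illustration, not taken from the DHS documentation.

```latex
\mathrm{DEFF}
  = \frac{\operatorname{Var}_{\text{cluster}}(\hat{p})}{\operatorname{Var}_{\text{SRS}}(\hat{p})},
\qquad
n_{\text{eff}} = \frac{n}{\mathrm{DEFF}},
\qquad
\mathrm{DEFF} \approx 1 + (m - 1)\,\rho

% n      : number of individuals actually sampled
% n_eff  : effective sample size (size of an SRS with the same precision)
% m      : average number of sampled individuals per cluster
% \rho   : intracluster correlation (ICC)

% Student example: n = 300 and n_eff = 100, so
\mathrm{DEFF} = \frac{n}{n_{\text{eff}}} = \frac{300}{100} = 3
```

Read this way, a design effect of 3 means the cluster sample needs three times as many students as an SRS to reach the same precision.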
Knowing the ICC and design effect matters both to those who have conducted a survey and to those designing a new one:
- To understand the effective sample size (and, thus, the actual precision achieved) in completed surveys, and
- To estimate the design effect and determine the sample size for a new survey, like in this example.
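Both uses can be sketched with a few lines of arithmetic, written in Stata to match the rest of the code in this post. This is only a rough illustration: every input value is a hypothetical placeholder rather than an estimate from any survey, the scalar names are made up for this example, and the formula DEFF ≈ 1 + (m − 1) × ICC is the common approximation for clusters of similar size, which a real design may not satisfy.

```stata
* All inputs are hypothetical placeholders, for illustration only
* Assumed intracluster correlation
scalar icc_hyp = 0.05
* Assumed number of women interviewed per cluster
scalar m_hyp = 25
* Assumed sample size of a completed survey
scalar n_obs = 10000
* Assumed prevalence and desired margin of error (95% confidence) for a new survey
scalar p_hyp = 0.5
scalar moe_hyp = 0.05

* Approximate design effect from ICC and cluster size
* scalar() makes sure Stata reads the scalar, not a variable with a similar name
scalar deff_hyp = 1 + (scalar(m_hyp) - 1)*scalar(icc_hyp)

* Completed survey: effective sample size
scalar n_eff = scalar(n_obs)/scalar(deff_hyp)

* New survey: sample size under SRS, then inflated by the design effect
scalar n_srs = (invnormal(0.975)^2)*scalar(p_hyp)*(1 - scalar(p_hyp))/(scalar(moe_hyp)^2)
scalar n_clu = ceil(scalar(n_srs)*scalar(deff_hyp))

display "Approximate DEFF:       " %5.2f scalar(deff_hyp)
display "Effective sample size:  " %9.0f scalar(n_eff)
display "Required sample size:   " %9.0f scalar(n_clu)
```

With these placeholder inputs the approximate design effect is 2.2, so a completed survey of 10,000 women carries roughly the precision of an SRS of about 4,545, and an SRS requirement of about 384 grows to 846 under cluster sampling.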
Using data from Demographic and Health Surveys (DHS), this markdown demonstrates how to calculate ICC and design effect, and shows the distribution of both across multiple countries.
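First, we need the following variables (standard DHS recode names, matching those used in the code below):
- v313: current contraceptive use by method type (3 = modern method)
- v005: women's individual sample weight (stored with six implied decimal places)
- v021: primary sampling unit (cluster)
- v022: sample strata for sampling errors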
In Stata (sorry, until I figure out the correct R code; I’d welcome any suggestions), open an individual recode file (aka the women’s data file) from your study country and run the following code. Figures 1 and 2 show the estimated ICC and design effect, respectively, for the modern contraceptive prevalence rate (MCPR) in the Mali DHS 2018.
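* 1. Create the primary indicator: 1 if using a modern method (v313 == 3), else 0
gen mcpr = (v313 == 3)
* 2. Declare the survey design: PSU (v021), strata (v022), and sampling weight
gen wt = v005/1000000
svyset v021 [pw=wt], strata(v022) singleunit(centered)
* 3. Calculate ICC (one-way ANOVA with the cluster as the grouping variable)
loneway mcpr v021 [aw=wt]
* 4. Calculate DEFF from the design-based estimate of the proportion
svy: proportion mcpr
estat effects, deff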
Figure 1. Estimated ICC for modern contraceptive use among women 15-49 years of age, Mali 2018 DHS
Figure 2. Estimated design effect for modern contraceptive use among women 15-49 years of age, Mali 2018 DHS
Using the latest surveys from 60 countries, ICC and DEFF for MCPR were estimated.