Markdown Document for the purposes of analysing BaYaka child data
from GGIR, with additional comparisons with MCS
and NHANES.
This document was last edited on 2023-11-17 using R version 4.3.1
Here we are reading in data from data cleaning, reducing the data
frame size to core variables, introducing packages, and specify styling
prompts for ggplot and gt
Packages used in this markdown are:
c("quantreg","gamlss","tidyr",
"gtsummary","broom","broom.mixed",
"ggplot2","dplyr","forcats",
"foreign","tableone","kableExtra","psych",
"trackdown","sjPlot","broom.mixed",
"geomtextpath","ggmap","ggplot2","tmap",
"leaflet","ggalt","grid","ggpp","gridExtra",
"cowplot","haven","data.table","stringr",
"readxl","xlsx","gtools","naniar","sjmisc",
"sjstats","sjlabelled","knitr","gt","gtExtras",
"lme4","emojifont","huxtable","jtools","multcomp"
)
Styling features for the later figures
default_colour_pal <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#009E73", "#CC79A7", "#CC79A7")
alternative_colour_pal <- c("#009E73","#CC79A7")
ggtheme <- theme_minimal()+
The BaYaka’s physical activity data was collected during fieldwork in 2018 and 2022 in the Likouala region of Congo’s Ndoki swamp forest. The 2018 fieldwork took place in July and August, when the main subsistence activities involved caterpillar and honey collecting, hunting, collecting wild yams and plants and occasional fishing. The 2022 fieldwork took place during fishing season in February when BaYaka move to deeper parts of the forest (leaving their camps by mud roads) and establish camps close to river streams to engage in fishing.
Accelerometer recordings were processed using the package ‘GGIR’. The maximum recording length was set to 7 days from the start of recording. While no restriction on excluding the first or last day of study was stipulated, this was included de facto by excluding all incomplete days. Data was calculated in 24-hour windows from midnight to midnight, based on West Africa time (UTC +1 hour).
GGIR is run twice, once for day level summaries, a
second for hour level summaries.
The GGIR command for day level output is as follows:
datadir= "G:.."
outputdir= "G:..."
mode= 1:5
studyname="BaYaka Child PA"
f0 = 1
f1 = X #How files in the folder specified above
metadatadir= "G:..."
g.shell.GGIR(#-------------------------------
# General parameters
#-------------------------------
mode=mode,
datadir=datadir,
outputdir=outputdir,
studyname=studyname,
metadatadir = metadatadir,
f0=f0,
f1=f1,
overwrite = TRUE,
do.imp=TRUE,
idloc=2,#Is the ID stored in the ID columns? 1 if true,2 if is filename
print.filename=FALSE,
storefolderstructure = FALSE,
#-------------------------------
# Part 1 parameters:
#-------------------------------
windowsizes = c(5,900,3600),
do.cal=TRUE, # Calibrate?
do.enmo = TRUE, #Use ENMO as metric?
do.anglez=TRUE,
chunksize=1,
printsummary=TRUE,
do.parallel=TRUE,
daylimit=FALSE,
#-------------------------------
# Part 2 parameters:
#-------------------------------
strategy = 1, #use the values in hrs.del (1 is default)
ndayswindow=7, #Maximum study length
hrs.del.start = 0, #how many hours to ignore at the start
hrs.del.end = 0, #how many hours to ignore at the end
maxdur = 7, # How many days after start did the study definitely end
includedaycrit = 16, #Only include days that have atleast 16hrs of recording (16 is default)
#includnenightcrit = 1,
L5M5window = c(0,24), #Period of time to calculate when the least active hours are (as default)
M5L5res = c(1,5), #Same but most active hours (as default)
winhr = 5, #Size of window to be used in L5m5 analyses (5 is default)
qlevels = c(c(1380/1440),c(1410/1440)),
qwindow=c(0,24), #Window over which all values are calculated 0,24 means over a full day
ilevels = c(seq(0,400,by=50),8000), # Thresholds at which acceleration is calculated
mvpathreshold =c(100), #Cutoffs for MVPA calculation
#-------------------------------
# Part 3 parameters:
#-------------------------------
timethreshold= c(10), #Time Thresholds at which 'sustained inactivity' are calculated
anglethreshold=5, #Amount of movement in degrees allowed to still be considered inactive
ignorenonwear = FALSE, #Ignore periods of non-wear?
##-------------------------------
## Part 4 parameters:
##-------------------------------
loglocation= c(), #If you have a sleep log include the location - if not leave blank like this
excludefirstlast = FALSE, #Exclude first and last night?
includenightcrit = 16, #Maximum number of hours that can be considered sleep in one bout
def.noc.sleep = c(), # time window in which sustained sleep will be considered. If blank GGIR will use the least active hours to centre
relyonsleeplog = FALSE, #Use a sleep log to define start of sleep? False means GGIR will calculate instead
##-------------------------------
## Part 5 parameters:
##-------------------------------
## Key functions: Merging physical activity with sleep analyses
threshold.lig = c(50), #Threshold for Light activity
threshold.mod = c(100), #Threshold for moderate activity
threshold.vig = c(400), #Threshold for vigorous activity
boutcriter = 0.8, #what percentage of MVPA bout needs to be above threshold. 0.8 is standard
boutcriter.in = 0.9, #Same but for inactive
boutcriter.lig = 0.8, #same for light
boutcriter.mvpa = 0.8, #for MVPA
boutdur.in = c(10,30), #Length of bouts of inactivity
boutdur.lig = c(1,10), #lenth of bouts for light
boutdur.mvpa = c(1,5,10), #lenth of bouts for MVPA
timewindow = "MM", #Time window for averaging? WW = wake to wake. MM = midnight to midnight (or both)
excludefirstlast.part5 = FALSE,
do.sibreport = TRUE,
#dayborder = 10,
#-----------------------------------
# Report generation
#-------------------------------
do.report=c(2,4,5)) #What sections to build reports for? (1 & 3 can't report)
## End
To obtain hour level summaries, the qwindow command is
edited to the following:
qwindow=seq(0,24,1)
This is namely an issue for the reports generated in sections 2 and 4
of GGIR
file2$filename <- paste(file2$filename,".RData",sep = "")
file2day$filename <- paste(file2day$filename,".RData",sep = "")
file2hour$filename <- paste(file2hour$filename,".RData",sep = "")
file2hourday$filename <- paste(file2hourday$filename,".RData",sep = "")
file4$filename <- paste(file4$filename,".RData",sep = "")
mergedfile <- merge(file2,file4,by= "filename",all = TRUE)
mergedfile <- merge(mergedfile,file5,by= "filename",all = TRUE)
mergedfileday <- merge(file2day,file4day,by=c("filename","weekday"),all = TRUE)
mergedfileday <- merge(mergedfileday,file5day,by=c("filename","weekday"),all = TRUE)
Don’t need to do this as there is only 1 file location
accelerometery <- mergedfile
accelerometery_by_day <- mergedfileday
accelerometery_by_hour <- file2hour
accelerometery_by_hour_by_day <- file2hourday
Data frame is subset to reduce working memory
bayakachild <- subset(bayakachild, select = c("filename",
"device_sn",
"name.x",
"gender",
"age",
... )
names(demographic) <- tolower(names(demographic))
demographic$gender <- ifelse(demographic$sex=="m","boy",
ifelse(demographic$sex=="f","girl",NA))
demographic <- subset(demographic,!is.na(demographic$`watch id`))
demographic$device_sn <- demographic$`watch id`
demographic <- subset(demographic,select=c("name",
"age",
"gender",
"device_sn"))
demographic$age <- as.numeric(demographic$age)
accelerometery_by_day <- separate(accelerometery_by_day,filename,c("1","2","device_sn","4"),sep = "_")
accelerometery_by_hour$device_sn <- paste("0",accelerometery_by_hour$device_sn,sep = "")
accelerometery_by_hour_by_day <- separate(accelerometery_by_hour_by_day,filename,c("1","2","device_sn","4"),sep = "_")
accelerometery$device_sn <- paste("0",accelerometery$device_sn,sep = "")
bayakachild <- merge(demographic,accelerometery,by= "device_sn",all = TRUE)
bayakachild_day <- merge(demographic,accelerometery_by_day,by= "device_sn",all = TRUE)
bayakachild_hour <- merge(demographic,accelerometery_by_hour,by= "device_sn",all = TRUE)
bayakachild_hour_day <- merge(demographic,accelerometery_by_hour_by_day,by= "device_sn",all = TRUE)
names(bayakachild) <- tolower(names(bayakachild))
names(bayakachild_day) <- tolower(names(bayakachild_day))
names(bayakachild_hour) <- tolower(names(bayakachild_hour))
names(bayakachild_hour_day) <- tolower(names(bayakachild_hour_day))
NAbayakachild %>%
replace_with_na_all(condition = ~.x %in% c("NA"))
bayakachild_day %>%
replace_with_na_all(condition = ~.x %in% c("NA"))
bayakachild_hour %>%
replace_with_na_all(condition = ~.x %in% c("NA"))
bayakachild_hour_day %>%
replace_with_na_all(condition = ~.x %in% c("NA"))
i <- sapply(bayakachild, is.character)
bayakachild[i] <- lapply(bayakachild[i], as.factor)
remove(i)
i <- sapply(bayakachild_day, is.character)
bayakachild_day[i] <- lapply(bayakachild_day[i], as.factor)
remove(i)
i <- sapply(bayakachild_hour, is.character)
bayakachild_hour[i] <- lapply(bayakachild_hour[i], as.factor)
remove(i)
i <- sapply(bayakachild_hour_day, is.character)
bayakachild_hour_day[i] <- lapply(bayakachild_hour_day[i], as.factor)
remove(i)
bayakachild_hour_day$sedentary <- bayakachild_hour_day$x.0.50._enmo_mg
bayakachild_hour_day$light <- bayakachild_hour_day$x.50.100._enmo_mg
bayakachild_hour_day$mvpa <- bayakachild_hour_day$x.100.150._enmo_mg +
bayakachild_hour_day$x.150.200._enmo_mg +
bayakachild_hour_day$x.200.250._enmo_mg+
bayakachild_hour_day$x.250.300._enmo_mg+
bayakachild_hour_day$x.300.350._enmo_mg+
bayakachild_hour_day$x.350.400._enmo_mg+
bayakachild_hour_day$x.400.8e.03._enmo_mg
bayakachild_hour_day <- subset(bayakachild_hour_day,select = c("device_sn","name","age","gender","id","n_valid_hours","n_hours","n_valid_hours_in_window", "n_hours_in_window","weekday","measurementday","qwindow_timestamps","qwindow_name", "mean_enmo_mg", "sedentary","light","mvpa"))
bayakachild$sedentaryp2 <- bayakachild$ad_.0.50._enmo_mg_0.24hr
bayakachild$sedentaryp5 <- bayakachild$dur_spt_wake_in_min_pla+bayakachild$dur_day_total_in_min_pla+bayakachild$dur_spt_sleep_min_pla
bayakachild$sedentary <- bayakachild$dur_spt_wake_in_min_pla+bayakachild$dur_day_total_in_min_pla+bayakachild$dur_spt_sleep_min_pla
bayakachild$sedentary_wakeful <- bayakachild$dur_spt_wake_in_min_pla+bayakachild$dur_day_total_in_min_pla
bayakachild$sleep <- bayakachild$sleepdurationinspt_ad_t10a5_mn+bayakachild$sleepdurationinspt_ad_t10a5_mn
bayakachild$sleep_night <- bayakachild$sleepdurationinspt_ad_t10a5_mn
bayakachild$sleep_day <- bayakachild$sleepdurationinspt_ad_t10a5_mn
bayakachild$sedentary_wakeful <- bayakachild$sedentary - bayakachild$sleep
bayakachild$lightp2 <- bayakachild$ad_.50.100._enmo_mg_0.24hr
bayakachild$lightp5 <- bayakachild$dur_spt_wake_lig_min_pla+bayakachild$dur_day_total_lig_min_pla
plot(bayakachild$lightp2~bayakachild$lightp5)
bayakachild$light <- bayakachild$dur_spt_wake_lig_min_pla+bayakachild$dur_day_total_lig_min_pla
bayakachild$moderatep2 <- bayakachild$ad_.100.150._enmo_mg_0.24hr+
bayakachild$ad_.150.200._enmo_mg_0.24hr+
bayakachild$ad_.200.250._enmo_mg_0.24hr+
bayakachild$ad_.250.300._enmo_mg_0.24hr+
bayakachild$ad_.300.350._enmo_mg_0.24hr+
bayakachild$ad_.350.400._enmo_mg_0.24hr
bayakachild$moderatep5 <- bayakachild$dur_spt_wake_mod_min_pla+bayakachild$dur_day_total_mod_min_pla
plot(bayakachild$moderatep2~bayakachild$moderatep5)
bayakachild$moderate <- bayakachild$dur_spt_wake_mod_min_pla+bayakachild$dur_day_total_mod_min_pla
bayakachild$vigorousp2 <- bayakachild$ad_.400.8e.03._enmo_mg_0.24hr
bayakachild$vigorousp5 <- bayakachild$dur_spt_wake_vig_min_pla+bayakachild$dur_day_total_vig_min_pla
plot(bayakachild$vigorousp2~bayakachild$vigorousp5)
bayakachild$vigorous <- bayakachild$dur_spt_wake_vig_min_pla+bayakachild$dur_day_total_vig_min_pla
bayakachild$mvpa <- bayakachild$moderate+bayakachild$vigorous
bayakachild$mean_acc <- bayakachild$ad_mean_enmo_mg_0.24hr
bayakachild$p2total <- bayakachild$vigorousp2+bayakachild$moderatep2+bayakachild$lightp2+bayakachild$sedentaryp2
bayakachild$p5total <- bayakachild$vigorousp5+bayakachild$moderatep5+bayakachild$lightp5+bayakachild$sedentaryp5
plot(bayakachild$p2total ~ bayakachild$p5total)
bayakachild <- bayakachild[complete.cases(bayakachild[,c("sleep","sedentary","light","moderate","vigorous")]),]
columns_to_gather <- c("sedentary_wakeful","light","moderate","vigorous")
bayakachild_long <- gather(bayakachild,intensity,time,columns_to_gather)
remove(columns_to_gather)
Creating a time variable mainly for use in plotting. Renaming the time window from the two adjoining hours, to the half hour mark between the two of them
columns_to_gather<-c("ad_mean_enmo_mg_0.1hr",
"ad_mean_enmo_mg_1.2hr",
...)
bayakachild_hour_long <- gather(bayakachild_hour,time,mean_enmo,columns_to_gather)
remove(columns_to_gather)
bayakachild_hour_long$time2 <- ifelse(
bayakachild_hour_long$time=="ad_mean_enmo_mg_0.1hr","0:30",
ifelse(...))
bayakachild_hour_long$time2 <- as.POSIXct(bayakachild_hour_long$time2,format="%H:%M")
columns_to_gather<-c("ad_mvpa_e5s_t100_enmo_0.1hr",
"ad_mvpa_e5s_t100_enmo_1.2hr",
...)
bayakachild_hour_long_mvpa <- gather(bayakachild_hour,time,mvpa,columns_to_gather)
remove(columns_to_gather)
bayakachild_hour_long_mvpa$time2 <- ifelse(
bayakachild_hour_long_mvpa$time=="ad_mvpa_e5s_t100_enmo_0.1hr","0:30",
ifelse(...))
bayakachild_hour_long_mvpa$time2 <- as.POSIXct(bayakachild_hour_long_mvpa$time2,format="%R")
bayakachild_hour_day$time <- ifelse(
bayakachild_hour_day$qwindow_timestamps=="00:00-01:00","00:30:00",ifelse(
...,NA
))
bayakachild_hour_day$time <- as.POSIXct(bayakachild_hour_day$time,format="%R")
While analysis is mixed effect (1 row per person per day) for plots we need 1 row per person. The below script was used to calculate mean for each child for each variable of interest.
bayakachild_day_collapsed <- subset(bayakachild_day,bayakachild_day$n.valid.hours == 24)
bayakachild_day_collapsed<-
bayakachild_day_collapsed%>%
group_by(device_sn)%>%
summarise( name=first(name),
sleeponset.x=mean(sleeponset.x,na.rm=T),
wakeup.x=mean(wakeup.x,na.rm=T),
sptduration=mean(sptduration,na.rm=T),
fraction_night_invalid=mean(fraction_night_invalid,na.rm=T),
sleepdurationinspt=mean(sleepdurationinspt,na.rm=T),
duration_sib_wakinghours=mean(duration_sib_wakinghours,na.rm=T),
number_sib_wakinghours=mean(number_sib_wakinghours,na.rm=T),
duration_sib_wakinghours_atleast15min=mean(duration_sib_wakinghours_atleast15min,na.rm=T),
dur_spt_sleep_min=mean(dur_spt_sleep_min,na.rm=T),
dur_day_spt_min=mean(dur_day_spt_min,na.rm=T),
sleep_efficiency=mean(sleep_efficiency,na.rm=T))
bayakachild <- merge(bayakachild,bayakachild_day_collapsed,by= "device_sn",all = TRUE)
Introduce Cleaned MCS dataset
mcs <- read.csv("mcs_cleaned.csv")
Collapse MCS to 1 row per person
mcs_merge = mcs %>%
group_by(id) %>%
summarise(id = first(id)...)
names(pa_day) <- tolower(names(pa_day))
names(pa_hour) <- tolower(names(pa_hour))
names(demographics) <- tolower(names(demographics))
pa_day <- subset(pa_day,select = c(seqn, paxdaywd, paxvmd, paxmtsd))
pa_hour <- subset(pa_hour, select = c(seqn, paxtmh, paxmtsh, paxdaywh,paxdayh,paxssnhp))
demographics <- subset(demographics, select = c(seqn, riagendr,ridageyr,ridreth1,ridexagm))
pa_day <- subset(pa_day,pa_day$paxvmd == 1440)
pa_day <- subset(pa_day,select = -c(paxvmd))
pa_by_day <- pa_day
pa_by_day$day <- ifelse(pa_by_day$paxdaywd==1,"sunday",
ifelse(pa_by_day$paxdaywd==2,"monday",
ifelse(pa_by_day$paxdaywd==3,"tuesday",
ifelse(pa_by_day$paxdaywd==4,"wednesday",
ifelse(pa_by_day$paxdaywd==5,"thursday",
ifelse(pa_by_day$paxdaywd==6,"friday",
ifelse(pa_by_day$paxdaywd==7,"saturday",NA
)))))))
pa_hour_2 <- pa_hour
pa_hour_2$day <- ifelse(pa_hour_2$paxdaywh==1,"sunday",
ifelse(pa_hour_2$paxdaywh==2,"monday",
ifelse(pa_hour_2$paxdaywh==3,"tuesday",
ifelse(pa_hour_2$paxdaywh==4,"wednesday",
ifelse(pa_hour_2$paxdaywh==5,"thursday",
ifelse(pa_hour_2$paxdaywh==6,"friday",
ifelse(pa_hour_2$paxdaywh==7,"saturday",NA
)))))))
pa_by_day$mims_min <- pa_by_day$paxmtsd/1440
pa_day <- spread(pa_day, key = "paxdaywd", value = "paxmtsd")
pa_day$mean_mims <- rowMeans(pa_day[ , c(2:8)], na.rm=TRUE)
pa_day$mims_min <- pa_day$mean_mims/1440
pa_day <- subset(pa_day, select=c(seqn,mean_mims,mims_min))
pa_hour <- subset(pa_hour,pa_hour$paxtmh==60)
pa_hour <- subset(pa_hour,select = -c(paxtmh))
pa_hour <- subset(pa_hour, paxdayh!=1 )
pa_hour <- subset(pa_hour, paxdayh!=9 )
pa_hour_wd <- subset(pa_hour,paxdaywh !=1 & paxdaywh !=7)
pa_hour <- pa_hour %>%
group_by(seqn) %>%
mutate(rank = order(order(paxssnhp,paxdayh, decreasing=FALSE))) %>%
mutate(hour = rep(seq_len(24), each = 1, length.out = n()))
pa_hour <- pa_hour %>%
group_by(seqn,hour) %>%
summarize(mean_mims = mean(paxmtsh))
pa_hour_wd <- pa_hour_wd %>%
group_by(seqn) %>%
mutate(rank = order(order(paxssnhp,paxdayh, decreasing=FALSE))) %>%
mutate(hour = rep(seq_len(24), each = 1, length.out = n()))
pa_hour_wd <- pa_hour_wd %>%
group_by(seqn,hour) %>%
summarize(mean_mims = mean(paxmtsh))
pa_hour_2 <- subset(pa_hour_2, paxdayh!=1 )
pa_hour_2 <- subset(pa_hour_2, paxdayh!=9 )
pa_hour_2 <- pa_hour_2 %>%
group_by(seqn) %>%
mutate(rank = order(order(paxssnhp,paxdayh, decreasing=FALSE))) %>%
mutate(hour = rep(seq_len(24), each = 1, length.out = n()))
pa_hour_2 <- subset(pa_hour_2,pa_hour_2$paxtmh==60)
pa_hour_2 <- subset(pa_hour_2,!is.na(pa_hour_2$paxmtsh))
nhanes <- merge(demographics,pa_day,by= "seqn",all = TRUE)
nhanes_day <- merge(demographics,pa_by_day,by= "seqn",all = TRUE)
nhanes_hour <- merge(demographics,pa_hour,by= "seqn",all = TRUE)
nhanes_hour_wd <- merge(demographics,pa_hour_wd,by= "seqn",all = TRUE)
nhanes_hour_2 <- merge(demographics,pa_hour_2,by= "seqn",all = TRUE)
nhanes_day <- nhanes_day %>%
group_by(seqn) %>%
fill(seqn, .direction = "updown") %>%
fill(riagendr, .direction = "updown") %>%
fill(ridageyr, .direction = "updown") %>%
fill(ridreth1, .direction = "updown") %>%
fill(ridexagm, .direction = "updown")
nhanes <- subset(nhanes, ridageyr <=18 & ridageyr >=3)
nhanes_day <- subset(nhanes_day, ridageyr <=18 & ridageyr >=3)
nhanes_hour <- subset(nhanes_hour, ridageyr <=18 & ridageyr >=3)
nhanes_hour_wd <- subset(nhanes_hour_wd, ridageyr <=18 & ridageyr >=3)
nhanes_hour_2 <- subset(nhanes_hour_2, ridageyr <=18 & ridageyr >=5)
nhanes_hournhanes_hour2 <- nhanes_hour
nhanes_hour$hour <- as.character(nhanes_hour$hour)
nhanes_hour$hour <- paste(nhanes_hour$hour,":00",sep="")
nhanes_hour$hour <- as.POSIXct(nhanes_hour$hour,format="%H:%M",origin = "2013-01-01")
nhanes_hour_wd$hour <- as.character(nhanes_hour_wd$hour)
nhanes_hour_wd$hour <- paste(nhanes_hour_wd$hour,":00",sep="")
nhanes_hour_wd$hour <- as.POSIXct(nhanes_hour_wd$hour,format="%H:%M",origin = "2013-01-01")
nhanes_hour_2$hour <- as.character(nhanes_hour_2$hour)
nhanes_hour_2$hour <- paste(nhanes_hour_2$hour,":00",sep="")
nhanes_hour_2$timestamp <- case_when(nhanes_hour_2$day == "tuesday" ~ as.POSIXct(nhanes_hour_2$hour,format="%H:%M",origin = "2013-01-01"),
nhanes_hour_2$day == "wednesday" ~ as.POSIXct(nhanes_hour_2$hour,format="%H:%M",origin = "2013-01-02"),
...))
nhanes_hour_2_grouped <- nhanes_hour_2 %>%
group_by(seqn,day) %>%
summarise(
age = ridexagm/12,
gender = ifelse(riagendr==1,"boy",
ifelse(riagendr==2,"girl",NA)),
ethnicity = ridreth1,
timestamp = timestamp,
paxmtsh = paxmtsh,
`0-3am` = mean(paxmtsh[hour=="1:00"|hour=="2:00"|hour=="3:00"],na.rm = T),
`4-6am` = mean(paxmtsh[hour=="4:00"|hour=="5:00"|hour=="6:00"],na.rm = T),
...
)
nhanes_hour$mims_min <- nhanes_hour$mean_mims/60
nhanes_hour_2$mims_min <- nhanes_hour_2$paxmtsh/60
nhanes_hour_wd$mims_min <- nhanes_hour_wd$mean_mims/60
interview <- read_dta("mcs6_cm_interview.dta")
accelerometer <- read_dta("mcs6_cm_accelerometer_derived.dta")
derived <- read_dta("mcs6_cm_derived.dta")
Set names to lower case
names(interview) <- tolower(names(interview))
names(derived) <- tolower(names(derived))
names(accelerometer) <- tolower(names(accelerometer))
interview <- subset(interview,select = c("mcsid",
"fcnum00",
"fchtcm00",
"fcwtcm00",
"fcbfpc00",
"fcintrob"))
derived<-subset(derived,select=c("mcsid",
"fcnum00",
"fccsex00",
"fccdbm00",
"fccdby00",
"fccage00",
"fdce0600",
"fcmcs6ag",
"fcbmin6"))
accelerometer <- subset(accelerometer,select = c( "mcsid",
"fcnum00",
"fcaccmonth",
"fcaccweekday",
"fcacc_n_valid_hrs",
"fcacc_mean_acc_24h",
"fcacc_mean_acc_5am_9am",
"fcacc_m5_hour_start",
"fcacc_m5_mean_acc_mg_24h",
"fcacc_l5_hour_start",
"fcacc_l5_mean_acc_mg_24h",
"fcacc_mvpa_mean_acc_e5sec_100mg",
"fcacc_mvpa_mean_acc_e1min_100mg",
"fcacc_mvpa_e5s_b10m80_t100_enmo"))
MCS uses two identifiers, $MCSID identifies the
household, with $FCNUM00 identifying the individual within
that household. Where recruitment included twins, there are 2
individuals for a particular household. In this sample there are twins.
As such, household number and individual ID are merged to make a unique
identifier for all subsequent analysis
interview$id <- paste(interview$mcsid,interview$fcnum00,sep="")
derived$id <- paste(derived$mcsid,derived$fcnum00,sep="")
accelerometer$id <- paste(accelerometer$mcsid,accelerometer$fcnum00,sep="")
$mcsid’s and $fcnum00’s so that they don’t
become .x , .y …derived<-subset(derived,select=c("id",
"fccsex00",
"fccdbm00",
"fccdby00",
"fccage00",
"fdce0600",
"fcmcs6ag",
"fcbmin6"))
accelerometer <- subset(accelerometer,select = c( "id",
"fcaccmonth",
"fcaccweekday",
"fcacc_n_valid_hrs",
"fcacc_mean_acc_24h",
"fcacc_mean_acc_5am_9am",
"fcacc_m5_hour_start",
"fcacc_m5_mean_acc_mg_24h",
"fcacc_l5_hour_start",
"fcacc_l5_mean_acc_mg_24h",
"fcacc_mvpa_mean_acc_e5sec_100mg",
"fcacc_mvpa_mean_acc_e1min_100mg",
"fcacc_mvpa_e5s_b10m80_t100_enmo"))
Accelerometer contains one row per day per person, so these values are averaged to give a mean value per person over all contributing days
mcs <- merge(interview,derived,by= "id",all = TRUE)
accelerometer <- subset(accelerometer,fcacc_n_valid_hrs > 16)
accel <- accelerometer %>%
group_by(id) %>%
summarise(fcaccmonth = mean(fcaccmonth,na.rm=T),
fcaccweekday = mean(fcaccweekday,na.rm=T),
fcacc_n_valid_hrs = mean(fcacc_n_valid_hrs,na.rm=T),
fcacc_mean_acc_24h = mean(fcacc_mean_acc_24h,na.rm=T),
fcacc_mean_acc_5am_9am = mean(fcacc_mean_acc_5am_9am,na.rm=T),
fcacc_m5_hour_start = mean(fcacc_m5_hour_start,na.rm=T),
fcacc_m5_mean_acc_mg_24h = mean(fcacc_m5_mean_acc_mg_24h,na.rm=T),
fcacc_l5_hour_start = mean(fcacc_l5_hour_start,na.rm=T),
fcacc_l5_mean_acc_mg_24h = mean(fcacc_l5_mean_acc_mg_24h,na.rm=T),
fcacc_mvpa_mean_acc_e5sec_100 = mean(fcacc_mvpa_mean_acc_e5sec_100mg,na.rm=T),
fcacc_mvpa_mean_acc_e1min_100 = mean(fcacc_mvpa_mean_acc_e1min_100mg,na.rm=T)
)
mcs <- merge(mcs,accelerometer,by= "id",all = TRUE)
mcs <- mcs %>%
group_by(id) %>%
fill(mcsid , .direction = "updown") %>%
fill(...)
mcs <- subset(mcs,!is.na(fcacc_n_valid_hrs))
rm(accel,accelerometer,derived,interview)
mcs$gender <- ifelse(mcs$fccsex00==1,"boy",
ifelse(mcs$fccsex00==2,"girl",
ifelse(mcs$fccsex00==-1,NA,NA)))
mcs <- subset(mcs,select = -c(fccsex00))
NAsNA is -1
Collapse to one row per person
mcs_merge = mcs %>%
group_by(id) %>%
summarise(id = first(id)...)
Summary statistics for
| Overall | boy | girl | |
|---|---|---|---|
| n | 51 | 29 | 22 |
| age (mean (SD)) | 10.90 (4.35) | 10.20 (4.60) | 11.81 (3.93) |
| year = 2022 (%) | 23 (45.1) | 12 (41.4) | 11 (50.0) |
| mean_acc (mean (SD)) | 46.60 (13.23) | 48.12 (15.45) | 44.61 (9.55) |
| mvpa (mean (SD)) | 199.95 (70.94) | 206.24 (75.14) | 191.66 (65.79) |
| sedentary (mean (SD)) | 1066.63 (89.69) | 1071.05 (99.89) | 1060.79 (76.06) |
| wakeful_sed (mean (SD)) | 585.87 (129.17) | 583.07 (133.14) | 589.55 (126.74) |
| light (mean (SD)) | 182.74 (38.38) | 174.24 (38.69) | 193.94 (35.78) |
| moderate (mean (SD)) | 179.90 (60.45) | 183.45 (64.96) | 175.21 (55.06) |
| vigorous (mean (SD)) | 19.36 (14.82) | 21.64 (17.10) | 16.36 (10.79) |
| Overall | boy | girl | |
|---|---|---|---|
| n | 3308 | 1699 | 1609 |
| age (mean (SD)) | 10.65 (4.55) | 10.55 (4.55) | 10.76 (4.56) |
| ridreth1 (%) | |||
| 1 | 729 (22.0) | 350 (20.6) | 379 (23.6) |
| 2 | 333 (10.1) | 176 (10.4) | 157 ( 9.8) |
| 3 | 854 (25.8) | 462 (27.2) | 392 (24.4) |
| 4 | 844 (25.5) | 441 (26.0) | 403 (25.0) |
| 5 | 548 (16.6) | 270 (15.9) | 278 (17.3) |
| mean_mims (mean (SD)) | 14711.23 (6051.93) | 14563.69 (6220.36) | 14860.63 (5874.98) |
*NHANES data and documentation is available here
| Overall | boy | girl | |
|---|---|---|---|
| n | 4509 | 2169 | 2340 |
| age (mean (SD)) | 14.25 (0.34) | 14.26 (0.34) | 14.24 (0.33) |
| height (mean (SD)) | 163.86 (8.07) | 166.68 (8.70) | 161.23 (6.41) |
| weight (mean (SD)) | 57.55 (12.80) | 57.90 (13.40) | 57.22 (12.18) |
| bmi (mean (SD)) | 21.34 (4.07) | 20.70 (3.84) | 21.95 (4.19) |
| bodyfat (mean (SD)) | 21.91 (9.13) | 16.46 (7.89) | 27.09 (6.97) |
| season (%) | |||
| autumn | 961 (22.0) | 472 (22.6) | 489 (21.5) |
| spring | 1410 (32.3) | 671 (32.1) | 739 (32.5) |
| summer | 1385 (31.7) | 651 (31.1) | 734 (32.3) |
| winter | 608 (13.9) | 297 (14.2) | 311 (13.7) |
| ethnicity (%) | |||
| white | 3714 (83.1) | 1785 (83.0) | 1929 (83.1) |
| black or black british | 116 ( 2.6) | 53 ( 2.5) | 63 ( 2.7) |
| indian | 112 ( 2.5) | 60 ( 2.8) | 52 ( 2.2) |
| pakistani or bangladeshi | 251 ( 5.6) | 109 ( 5.1) | 142 ( 6.1) |
| mixed | 181 ( 4.0) | 97 ( 4.5) | 84 ( 3.6) |
| other | 97 ( 2.2) | 46 ( 2.1) | 51 ( 2.2) |
| ad_mean_enmo_mg_0.24hr (mean (SD)) | 33.95 (13.25) | 36.75 (15.32) | 31.36 (10.33) |
| mvpa (mean (SD)) | 128.23 (54.62) | 133.70 (59.58) | 123.16 (49.06) |
*MCS data available via UK data service
Mean ENMO appears to increase by approximately 1mg ENMO for each additional year of age. Sample size is small, but implications are an increase with age. s put forward in Rowlands et al., 2020 , 1mg ENMO should be the minimum clinically important difference between inactive individuals, with each 1mg equivalent to an additional 500 steps per day (in adults), or a 5 minute brisk walk.
| mean_acc | mean_acc | |
|---|---|---|
| Predictors | Estimates | Estimates |
| (Intercept) |
37.30 *** (27.41 – 47.19) |
37.69 *** (25.91 – 49.48) |
| age |
0.87 * (0.10 – 1.64) |
0.83 (-0.14 – 1.80) |
| gender [girl] |
-4.51 (-11.14 – 2.12) |
-5.65 (-25.42 – 14.12) |
| year [2018] |
4.58 (-1.97 – 11.13) |
4.55 (-2.10 – 11.20) |
| age × gender [girl] |
0.10 (-1.53 – 1.73) |
|
| Random Effects | ||
| σ2 | 82.13 | 82.03 |
| τ00 | 95.31 device_sn | 98.64 device_sn |
| ICC | 0.54 | 0.55 |
| N | 51 device_sn | 51 device_sn |
| Observations | 150 | 150 |
| Marginal R2 / Conditional R2 | 0.099 / 0.583 | 0.097 / 0.590 |
|
||
MVPA appears to increase by a volume of 11 minutes a day per additional year of age.
| mvpa | mvpa | |
|---|---|---|
| Predictors | Estimates | Estimates |
| (Intercept) |
119.54 *** (72.18 – 166.90) |
125.31 *** (69.11 – 181.50) |
| age |
7.94 *** (4.22 – 11.65) |
7.41 ** (2.78 – 12.04) |
| gender [girl] |
-24.06 (-56.07 – 7.96) |
-41.25 (-136.70 – 54.20) |
| year [2018] |
9.53 (-22.96 – 42.02) |
8.94 (-23.83 – 41.70) |
| age × gender [girl] |
1.50 (-6.34 – 9.34) |
|
| Random Effects | ||
| σ2 | 3231.56 | 3247.39 |
| τ00 | 1685.27 device_sn | 1704.61 device_sn |
| ICC | 0.34 | 0.34 |
| N | 51 device_sn | 51 device_sn |
| Observations | 147 | 147 |
| Marginal R2 / Conditional R2 | 0.193 / 0.469 | 0.192 / 0.470 |
|
||
The volume of sedentary activity appears to fall with increasing age
| sedentary | sedentary | |
|---|---|---|
| Predictors | Estimates | Estimates |
| (Intercept) |
1167.92 *** (1104.93 – 1230.91) |
1163.49 *** (1088.61 – 1238.37) |
| age |
-9.94 *** (-14.85 – -5.03) |
-9.53 ** (-15.67 – -3.38) |
| gender [girl] |
7.59 (-34.76 – 49.93) |
20.45 (-105.94 – 146.84) |
| year [2018] |
-6.66 (-48.68 – 35.36) |
-6.27 (-48.84 – 36.30) |
| age × gender [girl] |
-1.14 (-11.56 – 9.28) |
|
| Random Effects | ||
| σ2 | 4090.25 | 4094.81 |
| τ00 | 3577.49 device_sn | 3677.25 device_sn |
| ICC | 0.47 | 0.47 |
| N | 51 device_sn | 51 device_sn |
| Observations | 148 | 148 |
| Marginal R2 / Conditional R2 | 0.186 / 0.566 | 0.183 / 0.570 |
|
||
| wakeful_sed | wakeful_sed | |
|---|---|---|
| Predictors | Estimates | Estimates |
| (Intercept) |
647.75 *** (546.18 – 749.31) |
601.90 *** (486.38 – 717.42) |
| age |
-5.37 (-13.39 – 2.65) |
-1.23 (-10.77 – 8.32) |
| gender [girl] |
30.00 (-39.00 – 98.99) |
176.04 (-23.32 – 375.41) |
| year [2018] |
-27.00 (-99.55 – 45.54) |
-21.78 (-92.85 – 49.30) |
| age × gender [girl] |
-12.63 (-28.98 – 3.72) |
|
| Random Effects | ||
| σ2 | 21763.38 | 22115.31 |
| τ00 | 5575.15 device_sn | 4579.81 device_sn |
| ICC | 0.20 | 0.17 |
| N | 51 device_sn | 51 device_sn |
| Observations | 145 | 145 |
| Marginal R2 / Conditional R2 | 0.027 / 0.225 | 0.050 / 0.213 |
|
||
Marginal changes in light activity. As such, transition in mean ENMO with age seems to be a result of increased time in MVPA exchanged with time spent sedentary.
| light | light | |
|---|---|---|
| Predictors | Estimates | Estimates |
| (Intercept) |
146.20 *** (116.78 – 175.62) |
148.29 *** (113.26 – 183.32) |
| age |
2.52 * (0.23 – 4.81) |
2.32 (-0.55 – 5.20) |
| gender [girl] |
14.28 (-5.49 – 34.05) |
8.07 (-50.56 – 66.70) |
| year [2018] |
5.47 (-14.10 – 25.03) |
5.29 (-14.55 – 25.14) |
| age × gender [girl] |
0.55 (-4.29 – 5.39) |
|
| Random Effects | ||
| σ2 | 699.95 | 699.82 |
| τ00 | 847.31 device_sn | 872.91 device_sn |
| ICC | 0.55 | 0.56 |
| N | 51 device_sn | 51 device_sn |
| Observations | 145 | 145 |
| Marginal R2 / Conditional R2 | 0.108 / 0.597 | 0.107 / 0.603 |
|
||
| mean_acc | mean_acc | mvpa | mvpa | light | light | sedentary | sedentary | wakeful_sed | wakeful_sed | |
|---|---|---|---|---|---|---|---|---|---|---|
| Predictors | Estimates | Estimates | Estimates | Estimates | Estimates | Estimates | Estimates | Estimates | Estimates | Estimates |
| (Intercept) |
37.3 *** (27.4 – 47.2) |
37.7 *** (25.9 – 49.5) |
119.5 *** (72.2 – 166.9) |
125.3 *** (69.1 – 181.5) |
146.2 *** (116.8 – 175.6) |
148.3 *** (113.3 – 183.3) |
1167.9 *** (1104.9 – 1230.9) |
1163.5 *** (1088.6 – 1238.4) |
647.7 *** (546.2 – 749.3) |
601.9 *** (486.4 – 717.4) |
| age |
0.9 * (0.1 – 1.6) |
0.8 (-0.1 – 1.8) |
7.9 *** (4.2 – 11.7) |
7.4 ** (2.8 – 12.0) |
2.5 * (0.2 – 4.8) |
2.3 (-0.6 – 5.2) |
-9.9 *** (-14.8 – -5.0) |
-9.5 ** (-15.7 – -3.4) |
-5.4 (-13.4 – 2.7) |
-1.2 (-10.8 – 8.3) |
| gender [girl] |
-4.5 (-11.1 – 2.1) |
-5.6 (-25.4 – 14.1) |
-24.1 (-56.1 – 8.0) |
-41.2 (-136.7 – 54.2) |
14.3 (-5.5 – 34.1) |
8.1 (-50.6 – 66.7) |
7.6 (-34.8 – 49.9) |
20.5 (-105.9 – 146.8) |
30.0 (-39.0 – 99.0) |
176.0 (-23.3 – 375.4) |
| year [2018] |
4.6 (-2.0 – 11.1) |
4.5 (-2.1 – 11.2) |
9.5 (-23.0 – 42.0) |
8.9 (-23.8 – 41.7) |
5.5 (-14.1 – 25.0) |
5.3 (-14.6 – 25.1) |
-6.7 (-48.7 – 35.4) |
-6.3 (-48.8 – 36.3) |
-27.0 (-99.5 – 45.5) |
-21.8 (-92.9 – 49.3) |
| age × gender [girl] |
0.1 (-1.5 – 1.7) |
1.5 (-6.3 – 9.3) |
0.6 (-4.3 – 5.4) |
-1.1 (-11.6 – 9.3) |
-12.6 (-29.0 – 3.7) |
|||||
| Random Effects | ||||||||||
| σ2 | 82.13 | 82.03 | 3231.56 | 3247.39 | 699.95 | 699.82 | 4090.25 | 4094.81 | 21763.38 | 22115.31 |
| τ00 | 95.31 device_sn | 98.64 device_sn | 1685.27 device_sn | 1704.61 device_sn | 847.31 device_sn | 872.91 device_sn | 3577.49 device_sn | 3677.25 device_sn | 5575.15 device_sn | 4579.81 device_sn |
| ICC | 0.54 | 0.55 | 0.34 | 0.34 | 0.55 | 0.56 | 0.47 | 0.47 | 0.20 | 0.17 |
| N | 51 device_sn | 51 device_sn | 51 device_sn | 51 device_sn | 51 device_sn | 51 device_sn | 51 device_sn | 51 device_sn | 51 device_sn | 51 device_sn |
| Observations | 150 | 150 | 147 | 147 | 145 | 145 | 148 | 148 | 145 | 145 |
| Marginal R2 / Conditional R2 | 0.099 / 0.583 | 0.097 / 0.590 | 0.193 / 0.469 | 0.192 / 0.470 | 0.108 / 0.597 | 0.107 / 0.603 | 0.186 / 0.566 | 0.183 / 0.570 | 0.027 / 0.225 | 0.050 / 0.213 |
|
||||||||||
Trend in ages appear to be diverging, with BaYaka level/increasing with age compared to a decrease in NHANES. Scales are how differing, so going to z-score and re-plot to examine differences between the populations.
Using Z-Scores, For the BaYaka Pa is lowest through late childhood (ages 5-11) and increases into adulthood. Conversely, in the US/NHANES sample, PA is highest during early childhood , peaking at ages 6-7, before decreasing through to adulthood, at which point it starts to plateau.
Age is associated with an increasing PA score but only when the
sample is from the BaYaka, with a low
intercept that has a positive function with age. NHANES starts higher at
younger (marked by a positive estimate for NHANES), but decreases
from there across childhood, marked by a negative estimate for
age:populationNHANES
| pa_z | |
|---|---|
| Predictors | Estimates |
| (Intercept) |
0.07 (-0.17 – 0.31) |
| gender [girl] |
0.05 (-0.01 – 0.11) |
| age z |
0.22 (-0.01 – 0.46) |
| population NHANES |
-0.06 (-0.30 – 0.18) |
|
age z × population NHANES |
-0.51 *** (-0.75 – -0.28) |
| Random Effects | |
| σ2 | 0.35 |
| τ00 id | 0.57 |
| ICC | 0.62 |
| N id | 2771 |
| Observations | 14129 |
| Marginal R2 / Conditional R2 | 0.086 / 0.649 |
|
|
To analyse variation within days, only thoes with from the 2022 field visit who had multiple days of recording were included.
For each individual in the 2022 data, their mean acceleration at each hour averaged across the 5 days of recording is presented as a single point, with colour corresponding to gender. The mean acceleration was calculated for each hour interval and presented at the half hour (i.e., the mean acceleration between 09:00 and 10:00 is presented at 09:30). Presented light phases were taken from the nearest town on the first day of recording. The dashed horizontal lines represent the mean time of wake and sleep onset across the sample.
Sunrise in Pokola on 13/2/22 was at 06:06, with morning twilight beginning at 05:45. Mean wake-up time in this sample was 05:49 with the average onset of sleep at 21:46 as detected by the device
Mixed effect models were also used to analyse differences in activity across the day. Mean accelerations were averaged over 3-hour windows, for each day of recording for each child. The effect of time of day (in 3-hour windows) was modelled against the mean acceleration in that window, with the fixed covariates of age and gender included. To account for the nested clustering of results within days and within individuals, random effects of day and individual ID were included to reflect this nesting of groups.
Plot here is similar to previous, but here each 3hour block is presented separately to display differences between groups of times. As before, mean ENMO rises through the morning, with the highest values observed before 12, then remains high through the afternoon before decreasing into the afternoon
Device estimated (a) wake up and (b) sleep onset times across individuals. Dashed red line represents the estimate from linear model. The algorithm defines the longest sustained inactive bout within each day as sleep, allowing for the sleep bout to be disrupted for up to one hour before. Sleep is classified by GGIR as any block of 10 minutes, in which the average movement of the device is less than 5o in any axis over each 5 second epoch. From this the estimated onset and end of sleep windows can be calculated as the final and first timepoint in each day in which an individual was not in a device defined bout of sleep Y-axes are matched in length to 9 hours each to aid interpretation.
The standard deviation in wakeup time was NA compared to NA
To assess the variation between days in the BaYaka and NHANES sample, intercept only mixed effect models are employed.
Violin plot of variation within Individuals
In comparing ICCs between studies, values can vary between 0 and 1, with lower values indicating greater variation within an individual. a score of 1 would indicate that all observations for an individual were equal to their means.
Using intercept only mixed effect models on standardised mean accelerations amongst BaYaka children with five days of recording (N=23) to match the length of recording in NHANES, a lower intra-class correlation (ICC) (0.51) (Left Hand Column) was observed for BaYaka children than US children (0.65).| pa_z | pa_z | |
|---|---|---|
| Predictors | Estimates | Estimates |
| (Intercept) |
-0.00 (-0.32 – 0.32) |
0.04 * (0.01 – 0.07) |
| Random Effects | ||
| σ2 | 0.51 | 0.35 |
| τ00 | 0.51 device_sn | 0.65 id |
| ICC | 0.50 | 0.65 |
| N | 23 device_sn | 2720 id |
| Observations | 115 | 13979 |
| Marginal R2 / Conditional R2 | 0.000 / 0.497 | 0.000 / 0.648 |
|
||