# Load required libraries
library(tidyverse)
library(readxl)
library(knitr)
library(kableExtra)
library(gtsummary)
library(patchwork)
## Data loaded successfully!
## Total participants: 3859
## Group distribution:
## 
## Excluded Included     <NA> 
##     3016      843        0

Overview

This report assesses the representativeness of the selected Inflammatory Cytokines cohort relative to the overall cohort (hereafter referred to as the “Inflammatory Cytokines” and “overall” cohorts).

Tables and plots summarise the frequency and distribution of each variable.

Representativeness is assessed against the demographic/baseline variable set outlined in the “Variable Category” column of the data dictionary sheet in the data set.

Summary of Results

We find the distributions for most “primary” variables broadly similar between the Inflammatory Cytokines and overall cohorts, with some notable differences:

Primary Variables:

  • Maternal age at birth differs significantly (p=0.003), with the Inflammatory Cytokines cohort being slightly older (mean 32.3 vs 31.8 years)
  • Infant birth characteristics show significant differences:
    • Infant weight is significantly lower in the Inflammatory Cytokines cohort (mean 3321g vs 3369g, p=0.016)
    • Infant length is significantly shorter (mean 50.0cm vs 50.3cm, p=0.004)
  • Ethnic origin shows a significant difference (p<0.001), though overall distributions remain similar
  • Twin births are significantly over-represented in the Inflammatory Cytokines cohort (4.7% vs 3.2%, p=0.006)
  • Other demographic variables (child sex, maternal BMI, diabetes, mental health diagnoses, asthma) are well-balanced

Sample Availability:

  • The Inflammatory Cytokines cohort shows significantly higher engagement across virtually all questionnaire measures (all p<0.001)
    • 61.1% vs 48.1% completed ASQ at 4 months (p<0.001)
    • 84.8% vs 65.6% completed ASQ at 1 year (p<0.001)
    • 52.9% vs 30.4% completed ASQ at 3 years (p<0.001)
    • 13.4% vs 6.4% completed ASQ at 5 years (p<0.001)

Key Differences:

  • Child age differs significantly, with Inflammatory Cytokines children being older (mean 4.5 vs 3.8 years, p<0.001)
  • DASS mental health scores show mixed patterns:
    • Fewer “Normal” domains at 18 weeks in the Inflammatory Cytokines cohort (p=0.001)
    • The more detailed anxiety and stress categories at 18 weeks show a better profile in the Inflammatory Cytokines cohort (both p<0.001)
    • 36-week scores show some differences in anxiety patterns (p=0.032)

Clinical Outcomes:

  • BMI at 1 year is significantly lower in the Inflammatory Cytokines cohort (16.8 vs 17.0, p=0.023)
  • BMI at 3 years is significantly lower (15.7 vs 15.9, p=0.016)
  • SPT results at 3 years show some differences in food and airborne allergies (p=0.039 and p=0.020 respectively)

Data & Methods

  • DEIDENTIFIED Full Participant list Perron - Final dataset July 2025.xlsx
    • Contains data and data dictionary.
  • After some cleaning and the assignment of variable names, we get the following dimensions (rows, columns):
dim(dat)
## [1] 3859   88

We assess the similarity of variables between the Inflammatory Cytokines cohort (N = 843) and the overall cohort (N = 3859) using:

  1. Summary tabulations
  2. Distributional plots
  3. Simple statistical tests
    • Note: the p-values presented should be interpreted with a grain of salt. With large sample sizes, differences too small to be meaningful in practice often return “significant” p-values.
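
To make the caveat about large samples concrete, the following minimal sketch reports a standardised mean difference alongside the p-value. It is an illustration only: the helper `smd()` is hypothetical and not part of the report's code, and the data are simulated rather than drawn from the study data set.

```r
# Hypothetical helper (not from the report): standardised mean difference,
# i.e. the difference in means divided by the pooled standard deviation.
smd <- function(x, y) {
  x <- x[!is.na(x)]
  y <- y[!is.na(y)]
  pooled_sd <- sqrt(((length(x) - 1) * var(x) + (length(y) - 1) * var(y)) /
                      (length(x) + length(y) - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Simulated data only: two large groups whose means differ by a small amount.
set.seed(42)
a <- rnorm(5000, mean = 50.0, sd = 2.5)
b <- rnorm(5000, mean = 50.3, sd = 2.5)

p_value <- wilcox.test(a, b)$p.value  # same test family as used in the report
effect  <- smd(a, b)                  # size of the difference in SD units

print(c(p_value = p_value, smd = effect))
```

Reading the two numbers together helps separate "statistically detectable" from "large enough to matter"; which threshold counts as meaningful is a judgment for the study team.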

Variable Breakdown

There are ~79 candidate variables that can be used to assess the similarity between the Inflammatory Cytokines and overall cohorts. These fall broadly into “sample availability”, “child/maternal demographic”, and “child/maternal characteristics” groups.

First, we select a handful of candidate variables to get an overall “snapshot” of the similarity between the sub-cohort and the overall cohort. For example, we want to ensure the sub-cohort is not entirely female, born in a single year, of a single ethnic origin, etc.

“Primary” variables (amended to include the 19 variables outlined)

  • Gender of child
  • Maternal age at birth
  • Maternal pre-pregnancy weight
  • Maternal pre-pregnancy height
  • Maternal pre-pregnancy BMI
  • Infant weight
  • Infant length
  • Infant BMI at birth
  • Infant ethnic origin
  • Indigenous status of baby
  • Vaginal or C-section birth
  • Maternal gestational diabetes status
  • Maternal Type 2 Diabetes
  • Maternal mental health diagnosis (Depression, Anxiety disorder, Bipolar, Schizophrenia, OCD, Anorexia Nervosa, Specific Phobias, Behavioural Disorders)
    • Individual disorder breakdown: each mental health condition is analysed separately, so participants with multiple conditions are counted in each relevant category
    • Any mental health diagnosis: Overall indicator of any mental health condition
    • Depression: Depressive disorders
    • Anxiety: Anxiety disorders
    • Bipolar: Bipolar affective disorder
    • OCD: Obsessive-compulsive disorder (includes various spellings)
    • Anorexia: Anorexia nervosa
    • Behavioural: Behavioural disorders
  • Maternal Asthma
  • Number of 18wk DASS domains with “Severe” or “Extremely Severe”
  • Number of 18wk DASS domains with “Normal”
  • Number of 36wk DASS domains with “Severe” or “Extremely Severe”
  • Number of 36wk DASS domains with “Normal”

Sample availability variables

  • Availability of maternal/child urine/blood/stool samples (20 weeks, 2 months, 6 months, 12 months, 3 years)
  • ASQ completion (4 month, 9 month, 1 year, 3 year, 5 year)
  • Early Connors assigned and completed
  • REDCap questionnaires assigned and completed
  • Availability of MNS data
  • Total questionnaires completed

Outcome variables

  • 1 year child wheeze
  • 1 year BMI
  • 1 year Ferritin results
  • 1 year count of positive SPT wheals
  • 1 year any positive food SPT wheals
  • 1 year any positive airborne/enviro SPT wheals
  • 3 year count of positive SPT wheals
  • 3 year any positive food SPT wheals
  • 3 year any positive airborne/enviro SPT wheals
  • 3 year BMI
  • 3 year wheeze
  • 3 year asthma
  • 3 year ferritin
  • 5 year BMI
  • 5 year asthma
  • 5 year Ferritin
  • 5 year any positive food SPT wheals
  • 5 year any positive airborne/enviro SPT wheals
  • 3 year count of Connors domains equal to, or above 65
  • 3 year count of other clinical indicators parent reported as “3” highest

Remaining variables

  • Variables not contained in the primary, sample availability, or outcome variable sets.

1) Primary Variables

Broadly, the distribution of the “primary” variable set is similar between the Inflammatory Cytokines and overall cohorts.

Notes

  • Child sex is well-balanced between groups (48.9% vs 47.9% female, p=0.554)
  • Maternal age at birth differs significantly, with the Inflammatory Cytokines cohort being older (mean 32.3 vs 31.8 years, p=0.003)
  • Maternal pre-pregnancy characteristics (weight, height, BMI) are well-balanced between groups (all p>0.08)
  • Infant birth characteristics show significant differences:
    • Infant weight is significantly lower in the Inflammatory Cytokines cohort (3321g vs 3369g, p=0.016)
    • Infant length is significantly shorter (50.0cm vs 50.3cm, p=0.004)
    • Infant BMI shows no significant difference (p=0.508)
  • Ethnic origin shows a significant difference (p<0.001), with some variation in category distributions
  • Indigenous status is well-balanced (p=1.000)
  • Birth type, maternal gestational diabetes, and Type 2 diabetes proportions are well-balanced (all p>0.1)
  • Mental health diagnosis patterns:
    • Any mental health diagnosis is well-balanced between groups (14.9% vs 14.5%, p=0.744)
    • Individual disorders show similar distributions across both cohorts:
    • Depression: 8.4% vs 7.8% (p=0.511)
    • Anxiety: 10.8% vs 10.9% (p=0.932)
    • Bipolar, OCD, anorexia, and behavioural disorders are rare in both groups
  • Maternal asthma is well-balanced (9.0% vs 8.1%, p=0.244)
  • DASS mental health scores show complex patterns:
    • 18-week severe domains: Fewer in Inflammatory Cytokines cohort (p=0.004)
    • 18-week normal domains: Fewer in Inflammatory Cytokines cohort (p=0.001)
    • 36-week scores: No significant differences in domain counts
    • Detailed DASS categories: Show better anxiety (p<0.001) and stress (p<0.001) profiles at 18 weeks, with some differences in anxiety at 36 weeks (p=0.032)

# Primary variables (continuous and categorical; dispatched by type in the loop below)
primary_continuous <- list(
  list("gender_child", "Child sex assigned at birth"),
  list("maternal_age_birth", "Maternal Age (Birth)"),
  list("maternal_prepreg_weight", "Maternal pre-pregnancy weight"),
  list("maternal_prepreg_height", "Maternal Pre-pregnancy height"),
  list("maternal_prepreg_bmi_calc", "Maternal Pre-pregnancy BMI (Calc.)"),
  list("infant_weight", "Infant Weight"),
  list("infant_length", "Infant Length at birth"),
  list("infant_bmi_calc", "Infant BMI at birth (Calc.)"),
  list("ethnic_origin", "Ethnic Origin"),
  list("indigenous_status", "Indigenous Status of Baby"),
  list("birth_type_derived", "Vaginal or C section birth (Deriv.)"),
  list("maternal_gest_diabetes_derived", "Maternal Gestational Diabetes? (Deriv.)"),
  list("maternal_diabetes_t2_derived", "Maternal Type 2 Diabetes? (Deriv.)"),
  list("mh_any", "Any Maternal Mental Health Diagnosis"),
  list("mh_depression", "Maternal Depression"),
  list("mh_anxiety", "Maternal Anxiety Disorder"),
  list("mh_bipolar", "Maternal Bipolar Disorder"),
  list("mh_ocd", "Maternal OCD"),
  list("mh_anorexia", "Maternal Anorexia"),
  list("mh_behavioural", "Maternal Behavioural Disorders"),
  list("maternal_asthma_derived", "Maternal Asthma? (Deriv.)"),
  list("dass21_18w_severe_count", "Number of 18wk DASS domains with \"Severe\" or \"Extremely Severe\""),
  list("dass21_18w_normal_count", "Number of 18wk DASS domains with \"Normal\""),
  list("dass21_36w_severe_count", "Number of 36wk DASS domains with \"Severe\" or \"Extremely Severe\""),
  list("dass21_36w_normal_count", "Number of 36wk DASS domains with \"Normal\"")
)

for(var_info in primary_continuous) {
  var_name <- var_info[[1]]
  title <- var_info[[2]]
  
  if(var_name %in% numeric_vars) {
    analyze_continuous(dat, var_name, title)
  } else {
    analyze_categorical(dat, var_name, title)
  }
}
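
The helpers `analyze_continuous()` and `analyze_categorical()`, and the vector `numeric_vars`, are not shown in this section. As a rough guide to what the continuous helper might do, here is a minimal sketch. Everything in it is an assumption for illustration: the function name `analyze_continuous_sketch()`, the logical grouping column `included`, and the choice to run the Wilcoxon test on included versus not-included participants are hypothetical and may differ from the report's actual implementation.

```r
# Hypothetical sketch, NOT the report's helper. Assumes `dat` has a logical
# column (default name "included") flagging sub-cohort membership.
analyze_continuous_sketch <- function(dat, var_name, title,
                                      group_col = "included") {
  x <- dat[[var_name]]
  g <- dat[[group_col]]

  # One summary row: N, mean, range, and quartiles of the non-missing values.
  summarise_one <- function(v) {
    v <- v[!is.na(v)]
    q <- stats::quantile(v, c(0.25, 0.5, 0.75), names = FALSE)
    data.frame(N = length(v), Mean = mean(v), Min = min(v), Max = max(v),
               Q1 = q[1], Median = q[2], Q3 = q[3])
  }

  tab <- rbind(summarise_one(x[g]), summarise_one(x))
  rownames(tab) <- c("Included", "Overall")

  # Assumption: the test compares the two non-overlapping groups.
  test <- stats::wilcox.test(x[g], x[!g], exact = FALSE)

  list(title = title,
       table = tab,
       p_value = test$p.value,
       unknown = c(Included = sum(is.na(x[g])), Overall = sum(is.na(x))))
}

# Usage on simulated data (not study data).
set.seed(1)
toy <- data.frame(
  included = rep(c(TRUE, FALSE), times = c(50, 150)),
  age = c(rnorm(50, mean = 32, sd = 4), rnorm(150, mean = 31, sd = 4))
)
toy$age[c(3, 60)] <- NA
res <- analyze_continuous_sketch(toy, "age", "Maternal Age (Birth)")
print(res$table)
```

The sketch returns its results as a list rather than printing formatted tables, which keeps it easy to check; the report's own helpers evidently render headed tables with an "Unknown" line underneath.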

Child sex assigned at birth

P-value: 0.554 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Female 412 (48.9%) 1849 (47.9%)
Male 431 (51.1%) 2010 (52.1%)
Total 843 (100.0%) 3859 (100.0%)

Maternal Age (Birth)

P-value: 0.003 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 843 32.3 (17, 49) 32 (30, 35)
Overall 3859 31.8 (17, 50) 32 (29, 35)

Unknown: Included = 0 , Overall = 0

Maternal pre-pregnancy weight

P-value: 0.849 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 800 70.4 (42, 129) 68 (60, 78)
Overall 3633 70.5 (38, 134) 68 (60, 79)

Unknown: Included = 43 , Overall = 226

Maternal Pre-pregnancy height

P-value: 0.089 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 809 1.7 (1.5, 1.9) 1.7 (1.6, 1.7)
Overall 3698 1.7 (1.4, 1.9) 1.6 (1.6, 1.7)

Unknown: Included = 34 , Overall = 161

Maternal Pre-pregnancy BMI (Calc.)

P-value: 0.610 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 796 25.6 (15.4, 45.2) 24.5 (21.8, 28.4)
Overall 3623 25.7 (14.7, 47.3) 24.7 (21.9, 28.6)

Unknown: Included = 47 , Overall = 236

Infant Weight

P-value: 0.016 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 768 3321 (1475, 5100) 3340 (3050, 3624.5)
Overall 3578 3369.2 (1095, 5410) 3390 (3062.8, 3695)

Unknown: Included = 75 , Overall = 281

Infant Length at birth

P-value: 0.004 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 768 50 (34, 60) 50 (49, 52)
Overall 3574 50.3 (31, 60) 50 (49, 52)

Unknown: Included = 75 , Overall = 285

Infant BMI at birth (Calc.)

P-value: 0.508 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 768 13.2 (7.6, 30.8) 13.2 (12.3, 14.1)
Overall 3574 13.3 (7.6, 32.3) 13.2 (12.3, 14.2)

Unknown: Included = 75 , Overall = 285

Ethnic Origin

P-value: <0.001 (Fisher’s exact test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 78 (9.3%) 285 (7.4%)
1 662 (78.5%) 2951 (76.5%)
3 34 (4%) 177 (4.6%)
4 8 (0.9%) 116 (3%)
5 2 (0.2%) 18 (0.5%)
7 1 (0.1%) 12 (0.3%)
8 58 (6.9%) 288 (7.5%)
10 0 (0%) 8 (0.2%)
6 0 (0%) 4 (0.1%)
Total 843 (100.0%) 3859 (100.0%)

Indigenous Status of Baby

P-value: 1.000 (Fisher’s exact test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 76 (9%) 283 (7.3%)
1 2 (0.2%) 12 (0.3%)
4 765 (90.7%) 3563 (92.3%)
2 0 (0%) 1 (0%)
Total 843 (100.0%) 3859 (100.0%)

Vaginal or C section birth (Deriv.)

P-value: 0.104 (Fisher’s exact test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 75 (8.9%) 281 (7.3%)
Caesarean Elective 229 (27.2%) 964 (25%)
Caesarean Emergency 159 (18.9%) 804 (20.8%)
Vaginal 380 (45.1%) 1810 (46.9%)
Total 843 (100.0%) 3859 (100.0%)

Maternal Gestational Diabetes? (Deriv.)

P-value: 0.433 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 75 (8.9%) 281 (7.3%)
FALSE 708 (84%) 3271 (84.8%)
TRUE 60 (7.1%) 307 (8%)
Total 843 (100.0%) 3859 (100.0%)

Maternal Type 2 Diabetes? (Deriv.)

P-value: 0.615 (Fisher’s exact test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 75 (8.9%) 281 (7.3%)
FALSE 766 (90.9%) 3572 (92.6%)
TRUE 2 (0.2%) 6 (0.2%)
Total 843 (100.0%) 3859 (100.0%)

Any Maternal Mental Health Diagnosis

P-value: 0.744 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
FALSE 717 (85.1%) 3298 (85.5%)
TRUE 126 (14.9%) 561 (14.5%)
Total 843 (100.0%) 3859 (100.0%)

Maternal Depression

P-value: 0.511 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
FALSE 772 (91.6%) 3557 (92.2%)
TRUE 71 (8.4%) 302 (7.8%)
Total 843 (100.0%) 3859 (100.0%)

Maternal Anxiety Disorder

P-value: 0.932 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
FALSE 752 (89.2%) 3437 (89.1%)
TRUE 91 (10.8%) 422 (10.9%)
Total 843 (100.0%) 3859 (100.0%)

Maternal Bipolar Disorder

P-value: 0.747 (Fisher’s exact test)
Characteristic Included (N = 843) Overall (N = 3859)
FALSE 841 (99.8%) 3845 (99.6%)
TRUE 2 (0.2%) 14 (0.4%)
Total 843 (100.0%) 3859 (100.0%)

Maternal OCD

P-value: 0.583 (Fisher’s exact test)
Characteristic Included (N = 843) Overall (N = 3859)
FALSE 843 (100%) 3855 (99.9%)
TRUE 0 (0%) 4 (0.1%)
Total 843 (100.0%) 3859 (100.0%)

Maternal Anorexia

P-value: 0.542 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
FALSE 838 (99.4%) 3843 (99.6%)
TRUE 5 (0.6%) 16 (0.4%)
Total 843 (100.0%) 3859 (100.0%)

Maternal Behavioural Disorders

P-value: 0.701 (Fisher’s exact test)
Characteristic Included (N = 843) Overall (N = 3859)
FALSE 842 (99.9%) 3849 (99.7%)
TRUE 1 (0.1%) 10 (0.3%)
Total 843 (100.0%) 3859 (100.0%)

Maternal Asthma? (Deriv.)

P-value: 0.244 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 75 (8.9%) 281 (7.3%)
FALSE 692 (82.1%) 3264 (84.6%)
TRUE 76 (9%) 314 (8.1%)
Total 843 (100.0%) 3859 (100.0%)

Number of 18wk DASS domains with “Severe” or “Extremely Severe”

P-value: 0.004 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 733 0.1 (0, 3) 0 (0, 0)
Overall 2781 0.1 (0, 3) 0 (0, 0)

Unknown: Included = 110 , Overall = 1078

Number of 18wk DASS domains with “Normal”

P-value: 0.001 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 733 2.4 (0, 3) 3 (2, 3)
Overall 2781 2.6 (0, 3) 3 (2, 3)

Unknown: Included = 110 , Overall = 1078

Number of 36wk DASS domains with “Severe” or “Extremely Severe”

P-value: 0.376 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 400 0.1 (0, 3) 0 (0, 0)
Overall 1570 0.1 (0, 3) 0 (0, 0)

Unknown: Included = 443 , Overall = 2289

Number of 36wk DASS domains with “Normal”

P-value: 0.720 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 400 2.6 (0, 3) 3 (3, 3)
Overall 1570 2.6 (0, 3) 3 (3, 3)

Unknown: Included = 443 , Overall = 2289

2) Sample Availability

In general, sample availability and engagement in the Inflammatory Cytokines cohort are substantially higher than in the overall cohort, with significant differences across virtually all measures.

Key findings:

  • MNS data availability shows a slight difference (91.1% vs 92.7%, p=0.049)
  • All questionnaire measures show significantly higher engagement (all p<0.001):
    • ASQ questionnaires assigned: mean 5.9 vs 5.1 (p<0.001)
    • ASQ questionnaires completed: mean 3.6 vs 2.6 (p<0.001)
    • Total questionnaires completed: mean 9.3 vs 6.3 (p<0.001)
  • ASQ completion at specific time points:
    • 4 months: 61.1% vs 48.1% completed (p<0.001)
    • 9 months: 58.4% vs 44.5% completed (p<0.001)
    • 1 year: 84.8% vs 65.6% completed (p<0.001)
    • 3 years: 52.9% vs 30.4% completed (p<0.001)
    • 5 years: 13.4% vs 6.4% completed (p<0.001)
  • Paediatrician reviews prompted by the ASQ differ significantly at the early time points (4 months p=0.007, 9 months p=0.009, 1 year p<0.001) but not at 3 years (p=0.148) or 5 years (p=0.931)
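
Because the overall cohort (N = 3859) contains the Inflammatory Cytokines cohort (N = 843), the "overall" percentages are partly driven by the included participants themselves. The sketch below derives the counts for the excluded participants (N = 3016) by subtraction, using the ASQ 4 Month Completed counts from this report, and runs a two-sample test of proportions on the non-overlapping groups. Whether the report's own p-values compare included with excluded or with overall is not stated here, so treat this as one possible reading rather than a reproduction of the reported test.

```r
# Counts copied from the report: group sizes from the loading output and
# "TRUE" counts from the ASQ 4 Month Completed table.
n_included   <- 843
n_overall    <- 3859
yes_included <- 515
yes_overall  <- 1855

# Overall = Included + Excluded, so the excluded counts follow by subtraction.
n_excluded   <- n_overall - n_included
yes_excluded <- yes_overall - yes_included

pct <- function(k, n) round(100 * k / n, 1)
completion <- c(Included = pct(yes_included, n_included),
                Excluded = pct(yes_excluded, n_excluded),
                Overall  = pct(yes_overall, n_overall))
print(completion)

# Two-sample test of proportions on the two non-overlapping groups.
test <- prop.test(x = c(yes_included, yes_excluded),
                  n = c(n_included, n_excluded))
print(test$p.value)
```

Comparing included with excluded participants gives a wider gap than included with overall, since the overall figure is pulled toward the included group.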

# Sample availability variables
sample_vars <- list(
  list("mns_data_available", "MNS Data Available?"),
  list("asq_assigned", "ASQ Questionnaires Assigned"),
  list("asq_completed", "ASQ Questionnaires Completed"),
  list("early_connors_assigned", "Early Connors Assigned"),
  list("early_connors_completed", "Early Connors Completed"),
  list("aes_assigned", "AES Questionnaires Assigned"),
  list("aes_completed", "AES Questionnaires Completed"),
  list("redcap_assigned", "RedCap Questionnaires Assigned"),
  list("redcap_completed", "RedCap Questionnaires completed"),
  list("questionnaires_total_completed", "Total Questionnaires Completed"),
  list("asq_4m_completed", "ASQ 4 Month Completed"),
  list("asq_4m_paed_review", "ASQ 4 Month Review with Paediatrician"),
  list("asq_9m_completed", "ASQ 9 Month Completed"),
  list("asq_9m_paed_review", "ASQ 9 Month Review with Paediatrician"),
  list("asq_1yr_completed", "ASQ 1 Year Completed"),
  list("asq_1yr_paed_review", "ASQ 1 Year Review with Paediatrician"),
  list("asq_3yr_completed", "ASQ 3 Year Completed"),
  list("asq_3yr_paed_review", "ASQ 3 Year Review with Paediatrician"),
  list("asq_5yr_completed", "ASQ 5 Year Completed"),
  list("asq_5yr_paed_review", "ASQ 5 Year Review with Paediatrician"),
  list("asq_paed_review_count", "Number of times ASQ has prompted review with PAED")
)

for(var_info in sample_vars) {
  var_name <- var_info[[1]]
  title <- var_info[[2]]
  
  if(var_name %in% numeric_vars) {
    analyze_continuous(dat, var_name, title)
  } else {
    analyze_categorical(dat, var_name, title)
  }
}

MNS Data Available?

P-value: 0.049 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
FALSE 75 (8.9%) 281 (7.3%)
TRUE 768 (91.1%) 3578 (92.7%)
Total 843 (100.0%) 3859 (100.0%)

ASQ Questionnaires Assigned

P-value: <0.001 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 842 5.9 (2, 8) 6 (4, 8)
Overall 3591 5.1 (0, 8) 5 (4, 6)

Unknown: Included = 1 , Overall = 268

ASQ Questionnaires Completed

P-value: <0.001 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 842 3.6 (0, 9) 4 (2, 5)
Overall 3591 2.6 (0, 9) 2 (1, 4)

Unknown: Included = 1 , Overall = 268

Early Connors Assigned

P-value: <0.001 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 842 1.1 (0, 2) 1 (1, 2)
Overall 3591 0.8 (0, 2) 1 (0, 1)

Unknown: Included = 1 , Overall = 268

Early Connors Completed

P-value: <0.001 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 842 0.6 (0, 2) 1 (0, 1)
Overall 3591 0.4 (0, 2) 0 (0, 1)

Unknown: Included = 1 , Overall = 268

AES Questionnaires Assigned

P-value: <0.001 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 842 3.8 (1, 5) 4 (3, 4)
Overall 3591 3.4 (0, 5) 3 (3, 4)

Unknown: Included = 1 , Overall = 268

AES Questionnaires Completed

P-value: <0.001 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 842 1.5 (0, 5) 1 (1, 2)
Overall 3591 1.1 (0, 5) 1 (0, 2)

Unknown: Included = 1 , Overall = 268

RedCap Questionnaires Assigned

P-value: <0.001 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 842 5.8 (1, 7) 6 (5, 7)
Overall 3591 5.1 (1, 7) 5 (4, 7)

Unknown: Included = 1 , Overall = 268

RedCap Questionnaires completed

P-value: <0.001 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 842 3.5 (0, 7) 4 (2, 5)
Overall 3591 2.8 (0, 7) 3 (1, 4)

Unknown: Included = 1 , Overall = 268

Total Questionnaires Completed

P-value: <0.001 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 843 9.3 (0, 22) 9 (6, 13)
Overall 3859 6.3 (0, 22) 6 (2, 10)

Unknown: Included = 0 , Overall = 0

ASQ 4 Month Completed

P-value: <0.001 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
FALSE 328 (38.9%) 2004 (51.9%)
TRUE 515 (61.1%) 1855 (48.1%)
Total 843 (100.0%) 3859 (100.0%)

ASQ 4 Month Review with Paediatrician

P-value: 0.007 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 328 (38.9%) 2004 (51.9%)
FALSE 272 (32.3%) 885 (22.9%)
TRUE 243 (28.8%) 970 (25.1%)
Total 843 (100.0%) 3859 (100.0%)

ASQ 9 Month Completed

P-value: <0.001 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
FALSE 351 (41.6%) 2141 (55.5%)
TRUE 492 (58.4%) 1718 (44.5%)
Total 843 (100.0%) 3859 (100.0%)

ASQ 9 Month Review with Paediatrician

P-value: 0.009 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 351 (41.6%) 2141 (55.5%)
FALSE 256 (30.4%) 807 (20.9%)
TRUE 236 (28%) 911 (23.6%)
Total 843 (100.0%) 3859 (100.0%)

ASQ 1 Year Completed

P-value: <0.001 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
FALSE 128 (15.2%) 1327 (34.4%)
TRUE 715 (84.8%) 2532 (65.6%)
Total 843 (100.0%) 3859 (100.0%)

ASQ 1 Year Review with Paediatrician

P-value: <0.001 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 128 (15.2%) 1327 (34.4%)
FALSE 403 (47.8%) 1293 (33.5%)
TRUE 312 (37%) 1239 (32.1%)
Total 843 (100.0%) 3859 (100.0%)

ASQ 3 Year Completed

P-value: <0.001 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
FALSE 397 (47.1%) 2684 (69.6%)
TRUE 446 (52.9%) 1175 (30.4%)
Total 843 (100.0%) 3859 (100.0%)

ASQ 3 Year Review with Paediatrician

P-value: 0.148 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 397 (47.1%) 2684 (69.6%)
FALSE 183 (21.7%) 450 (11.7%)
TRUE 263 (31.2%) 725 (18.8%)
Total 843 (100.0%) 3859 (100.0%)

ASQ 5 Year Completed

P-value: <0.001 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
FALSE 730 (86.6%) 3611 (93.6%)
TRUE 113 (13.4%) 248 (6.4%)
Total 843 (100.0%) 3859 (100.0%)

ASQ 5 Year Review with Paediatrician

P-value: 0.931 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 730 (86.6%) 3611 (93.6%)
FALSE 32 (3.8%) 72 (1.9%)
TRUE 81 (9.6%) 176 (4.6%)
Total 843 (100.0%) 3859 (100.0%)

Number of times ASQ has prompted review with PAED

P-value: 0.032 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 797 1.4 (0, 4) 1 (1, 2)
Overall 3014 1.3 (0, 4) 1 (1, 2)

Unknown: Included = 46 , Overall = 845

3) Outcome Variables

Notes

  • Child wheeze at 1 year: Well-balanced between groups (p=0.927). The raw percentages (19.2% vs 14.0%) differ mainly because far fewer responses are missing in the Inflammatory Cytokines cohort (10.6% vs 34.1% unknown)
  • BMI measurements show consistently lower values in the Inflammatory Cytokines cohort:
    • 1-year BMI is significantly lower (16.8 vs 17.0, p=0.023)
    • 3-year BMI is significantly lower (15.7 vs 15.9, p=0.016)
    • 5-year BMI shows no significant difference (15.6 vs 15.8, p=0.122)
  • Wheeze and asthma outcomes: Generally well-balanced between groups at 3 and 5 years
  • Skin prick test (SPT) results show some significant differences:
    • 1-year results: Similar patterns for food and airborne allergies
    • 3-year results: Significant differences in food (p=0.039) and airborne (p=0.020) allergies, with similar overall positive counts (p=0.058)
    • 5-year results: Well-balanced between groups
  • Ferritin levels: Well-balanced at both reported time points:
    • 1 year: 31.4 vs 32.1 (p=0.631)
    • 3 years: 2.8 vs 3.0 (p=0.380)
  • Behavioural assessments:
    • Connors domain scores show no significant differences (p=0.144)
  • Follow-up participation: Shows much higher data availability in the Inflammatory Cytokines cohort
  • Overall assessment: Most clinical outcome variables demonstrate good representativeness, with the main differences being consistently lower BMI values and some specific allergy testing differences at 3 years, alongside much higher study engagement
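
The wheeze percentages quoted in the notes use all participants as the denominator, including those with unknown responses. As a small worked example, the sketch below recomputes the "Yes" proportion among participants with a known answer, using the No/Yes counts from the 1-year wheeze table in this report. The object names are illustrative only.

```r
# No/Yes counts copied from the 1-year wheeze table (Unknown rows left out).
wheeze <- matrix(c(592, 162,
                   2002, 542),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(group = c("Included", "Overall"),
                                 answer = c("No", "Yes")))

# Percentage answering "Yes" among those with a known response.
known_n       <- rowSums(wheeze)
pct_yes_known <- round(100 * wheeze[, "Yes"] / known_n, 1)

print(known_n)        # Included 754, Overall 2544
print(pct_yes_known)  # Included 21.5, Overall 21.3
```

On this basis the two groups are nearly identical, which is consistent with the non-significant p-value despite the gap in the raw percentages.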

# Outcome variables
outcome_vars <- list(
  list("child_wheeze_1yr", "Has your child ever had a wheezed at 1 Year?"),
  list("bmi_1yr_calc", "BMI at 1 Year (Calc.)"),
  list("ferritin_1yr", "Ferritin Results at 1 Year"),
  list("spt_positive_count_1yr", "Count of positive SPT wheals(>= 3MM WHEAL) at 1 Year"),
  list("spt_food_positive_1yr", "Any positive Food SPT wheals (>=3mm) at 1 Year"),
  list("spt_airborne_positive_1yr", "Any positive airborne/enviro SPT wheals (>=3mm) at 1 Year"),
  list("spt_positive_count_3yr", "Count of positive SPT wheals(>= 3MM WHEAL) at 3 Years"),
  list("spt_food_positive_3yr", "Any positive Food SPT wheals (>=3mm) at 3 Years"),
  list("spt_airborne_positive_3yr", "Any positive airborne/enviro SPT wheals (>=3mm) at 3 Years"),
  list("bmi_3yr_calc", "BMI at 3 Years (Calc.)"),
  list("wheeze_3yr", "3 year wheeze"),
  list("asthma_3yr", "3 year asthma"),
  list("followup_3yr", "3 year follow-up"),
  list("bmi_5yr_calc", "BMI at 5 Years (Calc.)"),
  list("asthma_5yr", "5 year asthma"),
  list("ferritin_3yr", "Ferritin_3yr"),
  list("spt_positive_count_5yr", "Count of positive SPT wheals (>=3mm) at 5 Years"),
  list("spt_food_positive_5yr", "Any positive Food SPT wheals (>=3mm) at 5 Years"),
  list("spt_airborne_positive_5yr", "Any positive airborne/enviro SPT wheals (>=3mm) at 5 Years"),
  list("connors_domains_above65_3yr", "Count of Connors domains equal to, or above 65 at 3 Years"),
  list("clinical_indicators_highest_3yr", "Count of other clinical indicators parent reported as \"3\" highest at 3 Years")
)

for(var_info in outcome_vars) {
  var_name <- var_info[[1]]
  title <- var_info[[2]]
  
  if(var_name %in% numeric_vars) {
    analyze_continuous(dat, var_name, title)
  } else {
    analyze_categorical(dat, var_name, title)
  }
}

Has your child ever had a wheezed at 1 Year?

P-value: 0.927 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 89 (10.6%) 1315 (34.1%)
No 592 (70.2%) 2002 (51.9%)
Yes 162 (19.2%) 542 (14%)
Total 843 (100.0%) 3859 (100.0%)

BMI at 1 Year (Calc.)

P-value: 0.023 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 597 16.8 (13.2, 21.6) 16.7 (15.9, 17.8)
Overall 1832 17 (10.2, 26.1) 16.9 (16, 18)

Unknown: Included = 246 , Overall = 2027

Ferritin Results at 1 Year

P-value: 0.631 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 226 31.4 (5, 367) 25 (17, 37.8)
Overall 542 32.1 (5, 871) 25 (16, 37)

Unknown: Included = 617 , Overall = 3317

Count of positive SPT wheals(>= 3MM WHEAL) at 1 Year

P-value: 0.526 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 759 0.1 (0, 5) 0 (0, 0)
Overall 2383 0.2 (0, 8) 0 (0, 0)

Unknown: Included = 84 , Overall = 1476

Any positive Food SPT wheals (>=3mm) at 1 Year

P-value: 0.768 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 84 (10%) 1476 (38.2%)
FALSE 692 (82.1%) 2165 (56.1%)
TRUE 67 (7.9%) 218 (5.6%)
Total 843 (100.0%) 3859 (100.0%)

Any positive airborne/enviro SPT wheals (>=3mm) at 1 Year

P-value: 0.573 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 84 (10%) 1476 (38.2%)
FALSE 743 (88.1%) 2325 (60.2%)
TRUE 16 (1.9%) 58 (1.5%)
Total 843 (100.0%) 3859 (100.0%)

Count of positive SPT wheals(>= 3MM WHEAL) at 3 Years

P-value: 0.058 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 390 0.2 (0, 3) 0 (0, 0)
Overall 1034 0.2 (0, 5) 0 (0, 0)

Unknown: Included = 453 , Overall = 2825

Any positive Food SPT wheals (>=3mm) at 3 Years

P-value: 0.039 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 453 (53.7%) 2825 (73.2%)
FALSE 381 (45.2%) 992 (25.7%)
TRUE 9 (1.1%) 42 (1.1%)
Total 843 (100.0%) 3859 (100.0%)

Any positive airborne/enviro SPT wheals (>=3mm) at 3 Years

P-value: 0.020 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 453 (53.7%) 2825 (73.2%)
FALSE 346 (41%) 882 (22.9%)
TRUE 44 (5.2%) 152 (3.9%)
Total 843 (100.0%) 3859 (100.0%)

BMI at 3 Years (Calc.)

P-value: 0.016 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 370 15.7 (12.1, 26.6) 15.6 (14.8, 16.5)
Overall 1021 15.9 (11.3, 27.3) 15.8 (15, 16.7)

Unknown: Included = 473 , Overall = 2838

3 year wheeze

P-value: 0.637 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 385 (45.7%) 2597 (67.3%)
FALSE 357 (42.3%) 973 (25.2%)
TRUE 101 (12%) 289 (7.5%)
Total 843 (100.0%) 3859 (100.0%)

3 year asthma

P-value: 0.954 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 386 (45.8%) 2603 (67.5%)
FALSE 448 (53.1%) 1233 (32%)
TRUE 9 (1.1%) 23 (0.6%)
Total 843 (100.0%) 3859 (100.0%)

3 year follow-up

P-value: 0.982 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 270 22.4 (5, 119) 18.5 (14, 26)
Overall 610 22.1 (5, 175) 19 (14, 26)

Unknown: Included = 573 , Overall = 3249

BMI at 5 Years (Calc.)

P-value: 0.122 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 206 15.6 (11.5, 26.4) 15.5 (14.7, 16.3)
Overall 466 15.8 (11.5, 26.4) 15.6 (14.9, 16.5)

Unknown: Included = 637 , Overall = 3393

5 year asthma

P-value: 1.000 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 623 (73.9%) 3374 (87.4%)
FALSE 209 (24.8%) 461 (11.9%)
TRUE 11 (1.3%) 24 (0.6%)
Total 843 (100.0%) 3859 (100.0%)

Ferritin_3yr

P-value: 0.380 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 411 2.8 (0, 19) 1 (0, 4)
Overall 1062 3 (0, 19) 1 (0, 5)

Unknown: Included = 432 , Overall = 2797

Count of positive SPT wheals (>=3mm) at 5 Years

P-value: 0.993 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 201 0.9 (0, 7) 0 (0, 2)
Overall 414 1 (0, 10) 0 (0, 2)

Unknown: Included = 642 , Overall = 3445

Any positive Food SPT wheals (>=3mm) at 5 Years

P-value: 0.960 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 642 (76.2%) 3445 (89.3%)
FALSE 189 (22.4%) 388 (10.1%)
TRUE 12 (1.4%) 26 (0.7%)
Total 843 (100.0%) 3859 (100.0%)

Any positive airborne/enviro SPT wheals (>=3mm) at 5 Years

P-value: 0.899 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 642 (76.2%) 3445 (89.3%)
FALSE 129 (15.3%) 268 (6.9%)
TRUE 72 (8.5%) 146 (3.8%)
Total 843 (100.0%) 3859 (100.0%)

Count of Connors domains equal to, or above 65 at 3 Years

P-value: 0.144 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 411 0.5 (0, 7) 0 (0, 1)
Overall 1062 0.4 (0, 7) 0 (0, 1)

Unknown: Included = 432 , Overall = 2797

Count of other clinical indicators parent reported as “3” highest at 3 Years

Data not available

4) Remaining Variables

All remaining variables not contained in (1), (2) or (3).

Key differences identified in remaining variables:

  • Twin births are significantly over-represented in the Inflammatory Cytokines cohort (4.7% vs 3.2%, p=0.006)
  • Current child age differs significantly, with Inflammatory Cytokines children being older (mean 4.5 vs 3.8 years, p<0.001)
  • The number of previous pregnancies trends higher in the Inflammatory Cytokines cohort, though not significantly (mean 1.3 vs 1.2, p=0.061)
  • Age at Peapod assessment differs significantly (mean 6.2 vs 5.1 days, p=0.003)
  • DASS individual category scores show several significant differences:
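    • 18-week anxiety: Category distributions differ (p<0.001). Every non-missing category is more frequent in the Inflammatory Cytokines cohort because far fewer of its responses are Unknown (13% vs 27.9%); among respondents, a smaller share scored Normal (73.7% vs 80.0%)
    • 18-week stress: Category distributions differ (p<0.001), with the same pattern; among respondents, a smaller share scored Normal (83.1% vs 88.0%)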
    • 36-week anxiety: Category distributions differ (p=0.029)
  • Previous pregnancies parity and BMI at Peapod show no significant differences
  • Other DASS depression measures at both time points show no significant differences

The over-representation of twins, the older child age, and the DASS differences at 18 weeks suggest that the Inflammatory Cytokines cohort is drawn from families with longer study engagement and possibly more complex pregnancies. The DASS differences vary by assessment time point and should be read alongside the much lower proportion of Unknown responses in this cohort.
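Because the proportion of Unknown responses differs so much between the two groups, the DASS category percentages are easier to compare once Unknown is excluded. The following is a minimal sketch of that calculation, using counts copied from the DASS21 Anxiety 18 Week table in this report. The object names (`anxiety_18w`, `respondents`, `pct_respondents`) are illustrative and are not part of the original analysis code.

```r
# Counts copied from the "DASS21 Anxiety 18 Week" table in this report.
anxiety_18w <- matrix(
  c(110, 1078,
     20,   57,
     89,  302,
     58,  130,
    540, 2224,
     26,   68),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("Unknown", "Extremely Severe", "Mild", "Moderate", "Normal", "Severe"),
    c("Included", "Overall")
  )
)

# Drop the Unknown row so that percentages are out of respondents only.
respondents <- anxiety_18w[rownames(anxiety_18w) != "Unknown", ]

# Column-wise percentages (margin = 2), rounded to one decimal place.
pct_respondents <- round(100 * prop.table(respondents, margin = 2), 1)
print(pct_respondents)
```

The same steps could be applied to any of the other DASS tables by swapping in the corresponding counts.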

# Remaining variables
remaining_vars <- list(
  list("singleton_twin", "Singleton or Twin"),
  list("current_age_march2024", "Current age of child (as of March 2024)"),
  list("previous_pregnancies", "Previous Pregnancies"),
  list("previous_pregnancies_parity", "Previous Pregnancies Parity"),
  list("bmi_peapod_calc", "BMI at Peapod (Calc.)"),
  list("age_days_peapod_calc", "Age (days) at Peapod (Calc.)"),
  list("dass21_18w_depression", "DASS21 Depression 18 Week"),
  list("dass21_18w_anxiety", "DASS21 Anxiety 18 Week"),
  list("dass21_18w_stress", "DASS21 Stress 18 Week"),
  list("dass21_36w_depression", "DASS21 Depression 36 Week"),
  list("dass21_36w_anxiety", "DASS21 Anxiety 36 Week"),
  list("dass21_36w_stress", "DASS21 Stress 36 Week")
)

for(var_info in remaining_vars) {
  var_name <- var_info[[1]]
  title <- var_info[[2]]
  
  if(var_name %in% numeric_vars) {
    analyze_continuous(dat, var_name, title)
  } else {
    analyze_categorical(dat, var_name, title)
  }
}

Singleton or Twin

P-value: 0.006 (Chi-squared test)
Characteristic Included (N = 843) Overall (N = 3859)
Singleton 803 (95.3%) 3735 (96.8%)
Twins 40 (4.7%) 124 (3.2%)
Total 843 (100.0%) 3859 (100.0%)
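The table shows Included next to Overall, but Overall contains the Included participants, so the two columns are not independent samples. One possible reading, which is an assumption and is not confirmed by the code shown in this report, is that the chi-squared test compares Included against Excluded. The sketch below rebuilds that 2x2 table from the counts above, taking Excluded as Overall minus Included. The object names are illustrative.

```r
# Counts copied from the "Singleton or Twin" table in this report.
included <- c(Singleton = 803, Twins = 40)
overall  <- c(Singleton = 3735, Twins = 124)

# Assumption: the Excluded group is Overall minus Included.
excluded <- overall - included

# 2x2 contingency table: rows are groups, columns are birth type.
twin_table <- rbind(Included = included, Excluded = excluded)
print(twin_table)

# chisq.test() applies Yates' continuity correction to 2x2 tables by default.
twin_test <- chisq.test(twin_table)
print(round(twin_test$p.value, 3))
```

Under this assumption the p-value rounds to the reported 0.006, which is consistent with, though not proof of, an Included-versus-Excluded comparison.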

Current age of child (as of March 2024)

P-value: <0.001 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 843 4.5 (1, 7.3) 4.6 (3.1, 5.8)
Overall 3859 3.8 (0.6, 7.3) 3.7 (2.4, 5.2)

Unknown: Included = 0 , Overall = 0

Previous Pregnancies

P-value: 0.061 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 766 1.3 (0, 10) 1 (0, 2)
Overall 3576 1.2 (0, 14) 1 (0, 2)

Unknown: Included = 77 , Overall = 283

Previous Pregnancies Parity

P-value: 0.176 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 766 0.7 (0, 5) 1 (0, 1)
Overall 3570 0.7 (0, 6) 0 (0, 1)

Unknown: Included = 77 , Overall = 289

BMI at Peapod (Calc.)

P-value: 0.347 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 621 12.8 (8.5, 19.2) 12.7 (11.9, 13.5)
Overall 2210 12.8 (8.5, 28.1) 12.8 (11.8, 13.8)

Unknown: Included = 222 , Overall = 1649

Age (days) at Peapod (Calc.)

P-value: 0.003 (Wilcoxon rank sum test)
Characteristic N Mean (Min, Max) Median (Q1, Q3)
Included 621 6.2 (0, 74) 2 (1, 4)
Overall 2210 5.1 (0, 82) 2 (1, 4)

Unknown: Included = 222 , Overall = 1649

DASS21 Depression 18 Week

P-value: 0.626 (Fisher’s exact test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 110 (13%) 1078 (27.9%)
Extremely Severe 8 (0.9%) 24 (0.6%)
Mild 47 (5.6%) 153 (4%)
Moderate 31 (3.7%) 120 (3.1%)
Normal 640 (75.9%) 2460 (63.7%)
Severe 7 (0.8%) 24 (0.6%)
Total 843 (100.0%) 3859 (100.0%)

DASS21 Anxiety 18 Week

P-value: <0.001 (Fisher’s exact test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 110 (13%) 1078 (27.9%)
Extremely Severe 20 (2.4%) 57 (1.5%)
Mild 89 (10.6%) 302 (7.8%)
Moderate 58 (6.9%) 130 (3.4%)
Normal 540 (64.1%) 2224 (57.6%)
Severe 26 (3.1%) 68 (1.8%)
Total 843 (100.0%) 3859 (100.0%)

DASS21 Stress 18 Week

P-value: <0.001 (Fisher’s exact test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 110 (13%) 1078 (27.9%)
Extremely Severe 10 (1.2%) 23 (0.6%)
Mild 44 (5.2%) 155 (4%)
Moderate 44 (5.2%) 101 (2.6%)
Normal 609 (72.2%) 2447 (63.4%)
Severe 26 (3.1%) 55 (1.4%)
Total 843 (100.0%) 3859 (100.0%)

DASS21 Depression 36 Week

P-value: 0.676 (Fisher’s exact test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 443 (52.6%) 2289 (59.3%)
Extremely Severe 1 (0.1%) 13 (0.3%)
Mild 23 (2.7%) 86 (2.2%)
Moderate 11 (1.3%) 48 (1.2%)
Normal 361 (42.8%) 1409 (36.5%)
Severe 4 (0.5%) 14 (0.4%)
Total 843 (100.0%) 3859 (100.0%)

DASS21 Anxiety 36 Week

P-value: 0.029 (Fisher’s exact test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 443 (52.6%) 2289 (59.3%)
Extremely Severe 10 (1.2%) 32 (0.8%)
Mild 26 (3.1%) 130 (3.4%)
Moderate 29 (3.4%) 72 (1.9%)
Normal 328 (38.9%) 1310 (33.9%)
Severe 7 (0.8%) 26 (0.7%)
Total 843 (100.0%) 3859 (100.0%)

DASS21 Stress 36 Week

P-value: 0.200 (Fisher’s exact test)
Characteristic Included (N = 843) Overall (N = 3859)
Unknown 443 (52.6%) 2289 (59.3%)
Extremely Severe 1 (0.1%) 12 (0.3%)
Mild 17 (2%) 71 (1.8%)
Moderate 14 (1.7%) 53 (1.4%)
Normal 355 (42.1%) 1403 (36.4%)
Severe 13 (1.5%) 31 (0.8%)
Total 843 (100.0%) 3859 (100.0%)

Plotting

# Create distribution plots for key continuous variables
plot_vars <- c("maternal_age_birth", "infant_weight", "bmi_1yr_calc", "bmi_3yr_calc", 
               "bmi_5yr_calc", "ferritin_1yr", "ferritin_3yr", "dass21_18w_normal_count", 
               "dass21_36w_normal_count")

for(var in plot_vars) {
  if(var %in% names(dat) && is.numeric(dat[[var]]) && !all(is.na(dat[[var]]))) {
    
    # Create a more descriptive title
    plot_title <- case_when(
      var == "maternal_age_birth" ~ "Maternal Age at Birth",
      var == "infant_weight" ~ "Infant Weight at Birth",
      var == "bmi_1yr_calc" ~ "BMI at 1 Year",
      var == "bmi_3yr_calc" ~ "BMI at 3 Years", 
      var == "bmi_5yr_calc" ~ "BMI at 5 Years",
      var == "ferritin_1yr" ~ "Ferritin Levels at 1 Year",
      var == "ferritin_3yr" ~ "Ferritin Levels at 3 Years",
      var == "dass21_18w_normal_count" ~ "DASS Normal Domains at 18 Weeks",
      var == "dass21_36w_normal_count" ~ "DASS Normal Domains at 36 Weeks",
      TRUE ~ str_replace_all(var, "_", " ")
    )
    
    # Create data for three-panel plot
    excluded_data <- dat %>%
      filter(!is.na(.data[[var]]) & inflammatory_cytokines_group == "Excluded") %>%
      mutate(group_type = "Excluded")
    
    included_data <- dat %>%
      filter(!is.na(.data[[var]]) & inflammatory_cytokines_group == "Included") %>%
      mutate(group_type = "Included")
    
    overall_data <- dat %>%
      filter(!is.na(.data[[var]])) %>%
      mutate(group_type = "Overall")
    
    # Combine all three datasets
    combined_data <- bind_rows(excluded_data, included_data, overall_data) %>%
      mutate(
        group_type = factor(group_type, levels = c("Excluded", "Included", "Overall"))
      )
    
    p <- combined_data %>%
      ggplot(aes(x = .data[[var]])) +
      geom_histogram(aes(fill = group_type), alpha = 0.7, bins = 30) +
      facet_wrap(~group_type, scales = "free_y", ncol = 3) +
      theme_minimal() +
      labs(
        title = paste("Distribution of", plot_title),
        x = plot_title,
        y = "Count"
      ) +
      theme(
        legend.position = "none",
        plot.title = element_text(size = 14, face = "bold"),
        strip.text = element_text(size = 12, face = "bold")
      ) +
      scale_fill_manual(values = c("Excluded" = "#E74C3C", "Included" = "#3498DB", "Overall" = "#2C3E50"))
    
    print(p)
  }
}

# Create a comparison plot for ferritin levels across time points
ferritin_data <- dat %>%
  select(inflammatory_cytokines_group, ferritin_1yr, ferritin_3yr) %>%
  pivot_longer(cols = c(ferritin_1yr, ferritin_3yr), 
               names_to = "time_point", 
               values_to = "ferritin_level") %>%
  filter(!is.na(ferritin_level)) %>%
  mutate(
    time_point = case_when(
      time_point == "ferritin_1yr" ~ "1 Year",
      time_point == "ferritin_3yr" ~ "3 Years",
      TRUE ~ time_point
    ),
    time_point = factor(time_point, levels = c("1 Year", "3 Years"))
  )

if(nrow(ferritin_data) > 0) {
  p_ferritin <- ferritin_data %>%
    ggplot(aes(x = ferritin_level, fill = inflammatory_cytokines_group)) +
    geom_histogram(alpha = 0.7, position = "identity", bins = 25) +
    facet_grid(inflammatory_cytokines_group ~ time_point, scales = "free") +
    theme_minimal() +
    labs(
      title = "Ferritin Levels Comparison Across Time Points",
      x = "Ferritin Level",
      y = "Count"
    ) +
    theme(
      legend.position = "none",
      plot.title = element_text(size = 14, face = "bold"),
      strip.text = element_text(size = 11, face = "bold")
    ) +
    scale_fill_manual(values = c("Excluded" = "#E74C3C", "Included" = "#3498DB"))
  
  print(p_ferritin)
}

# Create a comparison plot for DASS normal domains across time points
dass_data <- dat %>%
  select(inflammatory_cytokines_group, dass21_18w_normal_count, dass21_36w_normal_count) %>%
  pivot_longer(cols = c(dass21_18w_normal_count, dass21_36w_normal_count), 
               names_to = "time_point", 
               values_to = "normal_count") %>%
  filter(!is.na(normal_count)) %>%
  mutate(
    time_point = case_when(
      time_point == "dass21_18w_normal_count" ~ "18 Weeks",
      time_point == "dass21_36w_normal_count" ~ "36 Weeks", 
      TRUE ~ time_point
    )
  )

if(nrow(dass_data) > 0) {
  p_dass <- dass_data %>%
    ggplot(aes(x = normal_count, fill = inflammatory_cytokines_group)) +
    geom_bar(alpha = 0.7, position = "dodge") +
    facet_grid(inflammatory_cytokines_group ~ time_point) +
    theme_minimal() +
    labs(
      title = "DASS Normal Domains Comparison Across Time Points",
      x = "Number of Normal DASS Domains (0-3)",
      y = "Count"
    ) +
    theme(
      legend.position = "none",
      plot.title = element_text(size = 14, face = "bold"),
      strip.text = element_text(size = 11, face = "bold")
    ) +
    scale_fill_manual(values = c("Excluded" = "#E74C3C", "Included" = "#3498DB")) +
    scale_x_continuous(breaks = 0:3)
  
  print(p_dass)
}

# Create a BMI trajectory comparison plot
bmi_data <- dat %>%
  select(inflammatory_cytokines_group, bmi_1yr_calc, bmi_3yr_calc, bmi_5yr_calc) %>%
  pivot_longer(cols = c(bmi_1yr_calc, bmi_3yr_calc, bmi_5yr_calc), 
               names_to = "time_point", 
               values_to = "bmi") %>%
  filter(!is.na(bmi)) %>%
  mutate(
    time_point = case_when(
      time_point == "bmi_1yr_calc" ~ "1 Year",
      time_point == "bmi_3yr_calc" ~ "3 Years",
      time_point == "bmi_5yr_calc" ~ "5 Years",
      TRUE ~ time_point
    ),
    time_point = factor(time_point, levels = c("1 Year", "3 Years", "5 Years"))
  )

if(nrow(bmi_data) > 0) {
  p_bmi <- bmi_data %>%
    ggplot(aes(x = bmi, fill = inflammatory_cytokines_group)) +
    geom_histogram(alpha = 0.7, position = "identity", bins = 25) +
    facet_grid(inflammatory_cytokines_group ~ time_point, scales = "free") +
    theme_minimal() +
    labs(
      title = "BMI Comparison Across Time Points",
      x = "BMI",
      y = "Count"
    ) +
    theme(
      legend.position = "none",
      plot.title = element_text(size = 14, face = "bold"),
      strip.text = element_text(size = 11, face = "bold")
    ) +
    scale_fill_manual(values = c("Excluded" = "#E74C3C", "Included" = "#3498DB"))
  
  print(p_bmi)
}

Reproducible Research Information

This document was prepared using the software R, via the RStudio IDE, and was written in RMarkdown.

sessionInfo()
## R version 4.3.1 (2023-06-16 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19045)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=English_Australia.utf8  LC_CTYPE=English_Australia.utf8   
## [3] LC_MONETARY=English_Australia.utf8 LC_NUMERIC=C                      
## [5] LC_TIME=English_Australia.utf8    
## 
## time zone: Australia/Perth
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] patchwork_1.2.0  gtsummary_2.3.0  kableExtra_1.4.0 knitr_1.48      
##  [5] readxl_1.4.3     lubridate_1.9.3  forcats_1.0.0    stringr_1.5.1   
##  [9] dplyr_1.1.4      purrr_1.0.2      readr_2.1.5      tidyr_1.3.1     
## [13] tibble_3.2.1     ggplot2_3.5.1    tidyverse_2.0.0 
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.9        utf8_1.2.4        generics_0.1.3    xml2_1.3.6       
##  [5] stringi_1.8.4     hms_1.1.3         digest_0.6.37     magrittr_2.0.3   
##  [9] evaluate_0.24.0   grid_4.3.1        timechange_0.3.0  fastmap_1.2.0    
## [13] cellranger_1.1.0  jsonlite_1.8.8    fansi_1.0.6       viridisLite_0.4.2
## [17] scales_1.3.0      jquerylib_0.1.4   cli_3.6.3         rlang_1.1.4      
## [21] munsell_0.5.1     withr_3.0.1       cachem_1.1.0      yaml_2.3.10      
## [25] tools_4.3.1       tzdb_0.4.0        colorspace_2.1-1  vctrs_0.6.5      
## [29] R6_2.5.1          lifecycle_1.0.4   pkgconfig_2.0.3   pillar_1.9.0     
## [33] bslib_0.8.0       gtable_0.3.5      glue_1.8.0        systemfonts_1.2.2
## [37] highr_0.11        xfun_0.52         tidyselect_1.2.1  rstudioapi_0.16.0
## [41] farver_2.1.2      htmltools_0.5.8.1 labeling_0.4.3    rmarkdown_2.28   
## [45] svglite_2.1.3     compiler_4.3.1