# Processing the raw data
library(dplyr)     # mutate(), across(), case_when(), filter()
library(stringr)   # str_detect(), str_replace(), str_remove()
library(lubridate) # dmy(), ymd()
library(janitor)   # clean_names()

df_clean <- df %>%
  clean_names() %>%
  mutate(
    # 1. Convert Excel serial dates and fix typos
    date_onset = case_when(
      # Five-digit strings are Excel serial day counts (origin 1899-12-30)
      str_detect(date_of_onset_of_fever, "^[0-9]{5}$") ~
        as.Date(as.numeric(date_of_onset_of_fever), origin = "1899-12-30"),
      # Year mistyped as 1012 instead of 2012 (day-month-year order)
      str_detect(date_of_onset_of_fever, "-1012$") ~
        dmy(str_replace(date_of_onset_of_fever, "-1012$", "-2012")),
      # All other rows: interview date in year-month-day order, " UTC" suffix stripped
      TRUE ~ ymd(str_remove(date_of_interview, " UTC"))
    ),
    # 2. Standardize age
    age_years = as.numeric(age),
    # right = FALSE gives left-closed bins [0,5), [5,18), ... so that the bins
    # match the labels (age 5 falls in "5-17", age 0 in "<5")
    age_group = cut(age_years,
                    breaks = c(0, 5, 18, 50, 65, Inf),
                    labels = c("<5", "5-17", "18-49", "50-64", "65+"),
                    right = FALSE),
    # 3. Outcome variable
    severity = factor(type_of_case, levels = c("ILI", "SARI")),
    # 4. Comorbidity logic (expanded to include pregnancy, asthma, cancer)
    # Numeric 0/1 copies are created for the sum; any value other than "Yes"
    # (including missing) counts as 0
    across(c(diabetes, hiv_aids, heart_disease, asthma, cancer, pregnancy),
           ~case_when(. == "Yes" ~ 1, TRUE ~ 0), .names = "{.col}_num"),
    # 5. Comorbidity count (sum of the numeric versions)
    comorbid_count = diabetes_num + hiv_aids_num + heart_disease_num +
      asthma_num + cancer_num + pregnancy_num,
    comorbid_factor = factor(case_when(
      comorbid_count == 0 ~ "0",
      comorbid_count == 1 ~ "1",
      comorbid_count >= 2 ~ "2+",
      TRUE ~ "0"
    ), levels = c("0", "1", "2+"))
  ) %>%
  # Keep influenza-positive cases only
  filter(influenza_type %in% c("Flu A", "Flu B", "Flu A&B"))
# 6. Standardizing subtypes
df_clean <- df_clean %>%
  mutate(
    # Normalize case and whitespace before matching
    influenza_sub_type = str_trim(str_to_upper(influenza_sub_type)),
    # Map spelling variants to one label each; unmatched values pass through
    influenza_sub_type = case_match(
      influenza_sub_type,
      c("B VICTORIA", "B/VICTORIA") ~ "B/Victoria",
      c("NOT SUBTYPED", "NOT_SUBTYPED") ~ "Not Subtyped",
      c("A/H1", "2009 A/H1N1", "A/H1N1") ~ "A(H1N1)pdm09",
      "A/H3" ~ "A/H3N2",
      "ALL NEGATIVE" ~ "Negative",
      .default = influenza_sub_type
    ),
    influenza_sub_type = factor(influenza_sub_type)
  )
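
The date and age steps are easy to get subtly wrong, so a quick check on a few boundary values can help. The sketch below uses base R only; the raw values (`"40909"`, `"15-03-1012"`) and the object names are hypothetical and not taken from the dataset. It shows how an Excel serial number maps to a date with the `1899-12-30` origin, and how `cut()` bins boundary ages with its default right-closed intervals compared with `right = FALSE`.

```r
# Step 1: an Excel serial date is a day count from 1899-12-30.
serial <- "40909"                                    # hypothetical raw value
onset  <- as.Date(as.numeric(serial), origin = "1899-12-30")

# A year typed as 1012 is repaired before parsing (day-month-year assumed).
typo  <- "15-03-1012"                                # hypothetical raw value
fixed <- as.Date(sub("-1012$", "-2012", typo), format = "%d-%m-%Y")

# Step 2: compare the two interval conventions on boundary ages.
ages   <- c(0, 4, 5, 17, 18, 64, 65)
breaks <- c(0, 5, 18, 50, 65, Inf)
labels <- c("<5", "5-17", "18-49", "50-64", "65+")

right_closed <- cut(ages, breaks = breaks, labels = labels)                 # (0,5], (5,18], ...
left_closed  <- cut(ages, breaks = breaks, labels = labels, right = FALSE)  # [0,5), [5,18), ...

print(format(onset))
print(format(fixed))
print(data.frame(ages, right_closed, left_closed))
```

With the default intervals, age 0 is not assigned to any bin and age 5 lands in "<5"; with `right = FALSE` the bins agree with their labels.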
| Patient Characteristic | Overall N = 432¹ | ILI N = 362¹ | SARI N = 70¹ | p-value² |
|---|---|---|---|---|
| Age Group (Years) | | | | 0.7 |
| &nbsp;&nbsp;&nbsp;&nbsp;<5 | 254 (59%) | 211 (59%) | 43 (61%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;5-17 | 72 (17%) | 59 (16%) | 13 (19%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;18-49 | 79 (18%) | 70 (19%) | 9 (13%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;50-64 | 16 (3.7%) | 13 (3.6%) | 3 (4.3%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;65+ | 9 (2.1%) | 7 (1.9%) | 2 (2.9%) | |
| Sex | | | | >0.9 |
| &nbsp;&nbsp;&nbsp;&nbsp;Female | 223 (52%) | 187 (52%) | 36 (51%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;Male | 208 (48%) | 174 (48%) | 34 (49%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;Missing | 1 (0.2%) | 1 (0.3%) | 0 (0%) | |
| Diabetes Mellitus | | | | 0.4 |
| &nbsp;&nbsp;&nbsp;&nbsp;Missing | 13 (3.0%) | 10 (2.8%) | 3 (4.3%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;No | 415 (96%) | 349 (96%) | 66 (94%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;Yes | 4 (0.9%) | 3 (0.8%) | 1 (1.4%) | |
| HIV/AIDS Positive | | | | >0.9 |
| &nbsp;&nbsp;&nbsp;&nbsp;Missing | 3 (0.7%) | 3 (0.8%) | 0 (0%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;No | 419 (97%) | 350 (97%) | 69 (99%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;Yes | 10 (2.3%) | 9 (2.5%) | 1 (1.4%) | |
| Chronic Heart Disease | 2 (0.5%) | 2 (0.6%) | 0 (0%) | >0.9 |
| Asthma | 5 (1.2%) | 3 (0.8%) | 2 (2.9%) | 0.2 |
| Cancer/Malignancy | 1 (0.2%) | 1 (0.3%) | 0 (0%) | >0.9 |
| Pregnancy Status | | | | 0.7 |
| &nbsp;&nbsp;&nbsp;&nbsp;Missing | 12 (2.8%) | 11 (3.0%) | 1 (1.4%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;No | 420 (97%) | 351 (97%) | 69 (99%) | |
| Number of Comorbidities | | | | 0.8 |
| &nbsp;&nbsp;&nbsp;&nbsp;0 | 410 (95%) | 344 (95%) | 66 (94%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;1 | 22 (5.1%) | 18 (5.0%) | 4 (5.7%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;2+ | 0 (0%) | 0 (0%) | 0 (0%) | |

¹ n (%); ² Fisher’s exact test

A total of 432 laboratory-confirmed influenza cases were included in the analysis, comprising 362 (83.8%) patients with Influenza-Like Illness (ILI) and 70 (16.2%) with Severe Acute Respiratory Infection (SARI). The demographic and clinical profiles of these patients are summarized in Table 1.
The study population was predominantly pediatric, with children under five years of age accounting for 58.8% (n=254) of the total cohort. The proportion of children under five was slightly higher among SARI cases (61.4%) compared to ILI cases (58.3%), though this difference was not statistically significant (p=0.7). Sex distribution was nearly equal, with females representing 51.6% (n=223) of all cases. No significant association was found between sex and disease severity (p>0.9).
The overall prevalence of recorded chronic comorbidities was low in this cohort, with 94.9% (n=410) of patients having no documented underlying conditions. Among the specific conditions investigated:

- HIV/AIDS was the most frequent comorbidity, present in 2.3% (n=10) of the population, followed by Asthma (1.2%, n=5) and Diabetes Mellitus (0.9%, n=4).
- Asthma showed the largest relative difference between groups, being present in 2.9% of SARI cases compared to 0.8% of ILI cases, although this difference was not statistically significant (p=0.2).
- Diabetes Mellitus was recorded in 1.4% of SARI cases and 0.8% of ILI cases (p=0.4).
- Pregnancy, Chronic Heart Disease, and Cancer/Malignancy were rare, each occurring in ≤0.5% of the total cohort.

Fisher’s exact test showed that neither any single comorbidity nor the cumulative number of comorbidities (p=0.8) was significantly associated with SARI in this sample.
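
As an illustration of how one of these p-values arises, the asthma comparison can be recomputed from the counts reported in Table 1 alone. This is a minimal base R sketch, assuming a 2x2 layout (3 of 362 ILI patients and 2 of 70 SARI patients with asthma) and the default two-sided `fisher.test()`; the object names are illustrative.

```r
# 2x2 table of asthma status by severity, counts taken from Table 1.
asthma_tab <- matrix(
  c(3, 362 - 3,    # ILI:  with asthma, without asthma
    2, 70 - 2),    # SARI: with asthma, without asthma
  nrow = 2, byrow = TRUE,
  dimnames = list(severity = c("ILI", "SARI"), asthma = c("Yes", "No"))
)

# Two-sided Fisher's exact test on the 2x2 table.
res <- fisher.test(asthma_tab)

print(asthma_tab)
print(round(res$p.value, 1))
```

The p-value rounds to 0.2, in line with the value reported for Asthma. With only five asthma cases in total, the test has little power to detect a difference of this size.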
| Symptom Reported | ILI N = 362¹ | SARI N = 70¹ | p-value² |
|---|---|---|---|
| Fever | | | 0.029 |
| &nbsp;&nbsp;&nbsp;&nbsp;Missing | 46 (13%) | 15 (21%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;No | 141 (39%) | 17 (24%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;Yes | 175 (48%) | 38 (54%) | |
| Cough | | | 0.2 |
| &nbsp;&nbsp;&nbsp;&nbsp;Missing | 0 (0%) | 1 (1.4%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;No | 10 (2.8%) | 1 (1.4%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;Yes | 352 (97%) | 68 (97%) | |
| Sore Throat | | | 0.2 |
| &nbsp;&nbsp;&nbsp;&nbsp;Missing | 3 (0.8%) | 1 (1.4%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;No | 288 (80%) | 49 (70%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;Yes | 71 (20%) | 20 (29%) | |
| Shortness of Breath | | | <0.001 |
| &nbsp;&nbsp;&nbsp;&nbsp;Missing | 3 (0.8%) | 0 (0%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;No | 326 (90%) | 37 (53%) | |
| &nbsp;&nbsp;&nbsp;&nbsp;Yes | 33 (9.1%) | 33 (47%) | |

¹ n (%); ² Fisher’s exact test

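
The tables in this section share one layout: counts with percentages per severity group and a Fisher's exact test p-value. The code that produced them is not shown, so the following is only a sketch of how a table of this shape could be built with the gtsummary package. The toy data, column names, and labels are hypothetical stand-ins for `df_clean`.

```r
library(dplyr)
library(gtsummary)

# Hypothetical toy data standing in for df_clean.
set.seed(1)
toy <- tibble(
  severity  = factor(sample(c("ILI", "SARI"), 200, replace = TRUE,
                            prob = c(0.84, 0.16)),
                     levels = c("ILI", "SARI")),
  age_group = factor(sample(c("<5", "5-17", "18-49"), 200, replace = TRUE),
                     levels = c("<5", "5-17", "18-49")),
  asthma    = sample(c("No", "Yes"), 200, replace = TRUE, prob = c(0.9, 0.1))
)

tbl <- toy %>%
  tbl_summary(
    by = severity,                                  # one column per severity group
    label = list(age_group ~ "Age Group (Years)",
                 asthma ~ "Asthma")
  ) %>%
  add_overall() %>%                                 # prepend an Overall column
  add_p(test = all_categorical() ~ "fisher.test")   # Fisher's exact test per row

tbl
```

In this sketch a No/Yes variable such as `asthma` is summarized on a single row, while a variable with other levels gets one row per level, which is the same pattern seen in the tables above.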
| Subtype | N = 432¹ |
|---|---|
| Influenza Subtype | |
| &nbsp;&nbsp;&nbsp;&nbsp;A(H1N1)pdm09 | 70 (16%) |
| &nbsp;&nbsp;&nbsp;&nbsp;A/H3N2 | 13 (3.0%) |
| &nbsp;&nbsp;&nbsp;&nbsp;B/Victoria | 33 (7.6%) |
| &nbsp;&nbsp;&nbsp;&nbsp;Negative | 1 (0.2%) |
| &nbsp;&nbsp;&nbsp;&nbsp;Not Subtyped | 315 (73%) |

¹ n (%)

This chart shows how the number of comorbidities a patient has relates to the proportion of cases classified as SARI.