Background: Cardiovascular disease is an important source of morbidity among people living with HIV (PLWH), particularly as survival improves with antiretroviral therapy (ART). This analysis examines whether ART duration and ARV regimen type are associated with elevated NT-proBNP, a biomarker of cardiac stress.
Methods: This cross-sectional analysis included PLWH aged 40 years and older receiving care in Almaty, Kazakhstan. Elevated NT-proBNP was defined as NT-proBNP ≥125 pg/mL. Logistic regression models estimated crude and adjusted associations between ART duration and elevated NT-proBNP.
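Results: In the unadjusted model, each additional year of ART was associated with modestly higher odds of elevated NT-proBNP, although the confidence interval included 1 (OR 1.07; 95% CI 0.99, 1.15; p = 0.094). After adjustment for age, hypertension, ARV regimen type, sex, BMI, and smoking status, the estimate was attenuated (OR 1.05; 95% CI 0.97, 1.14; p = 0.23). Age was the only covariate significantly associated with elevated NT-proBNP (OR 1.08 per year; 95% CI 1.03, 1.13; p = 0.003).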
Conclusion: In this sample, ART duration was not independently associated with elevated NT-proBNP after adjustment. Findings suggest that aging-related cardiovascular risk may be more important than cumulative ART duration in this population.
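# Packages used in this analysis
library(readxl)     # read_excel()
library(dplyr)      # %>%, mutate(), case_when(), filter(), bind_rows()
library(ggplot2)    # figures
library(knitr)      # kable()
library(gtsummary)  # tbl_regression()
library(broom)      # tidy()
hiv_raw <- read_excel("C:/Users/userp/OneDrive/Рабочий стол/HSTA553/Project EPI 553/nursultan.xlsx")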
n_start <- nrow(hiv_raw)
n_start
## [1] 150
# Recode helper function for ARV regimen groups
classify_regimen <- function(x) {
x <- tolower(as.character(x))
has_pi <- grepl("резолста|rezolsta|калетра|kaletra|lopinavir|дарунавир|darun", x)
has_insti <- grepl("теград|триград|dtg|долутегравир|dolutegravir|тивикай|tivicay|триумек|triumeq", x)
has_nnrti <- grepl("efv|эфавиренз|атрипла|atripla|тенмифа|комплера|complera|rilp|рилпивирин", x)
n_classes <- sum(c(has_pi, has_insti, has_nnrti))
if (n_classes > 1) {
"Other/mixed"
} else if (has_pi) {
"PI-based"
} else if (has_insti) {
"INSTI-based"
} else if (has_nnrti) {
"NNRTI-based"
} else {
"Other/mixed"
}
}
hiv <- hiv_raw %>%
mutate(
bmi = Weight / (Height / 100)^2,
elevated_ntprobnp = ifelse(proBNP >= 125, 1, 0),
sex_label = factor(ifelse(sex == 1, "Male", "Female"),
levels = c("Female", "Male")),
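# Note: smokeday %in% c(1, 2) is FALSE when smokeday is NA, so participants with a missing smokeday are coded 0 (Non-current)
current_smoker = ifelse(smoke100 == 1 & smokeday %in% c(1, 2), 1, 0),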
smoker_label = factor(ifelse(current_smoker == 1, "Current", "Non-current"),
levels = c("Non-current", "Current")),
htn_label = factor(ifelse(HTN == 1, "Yes", "No"),
levels = c("No", "Yes")),
regimen_group4 = vapply(ART, classify_regimen, character(1), USE.NAMES = FALSE),
regimen_group3 = case_when(
regimen_group4 %in% c("PI-based", "Other/mixed") ~ "PI/Other",
TRUE ~ regimen_group4
),
regimen_group3 = factor(regimen_group3,
levels = c("INSTI-based", "NNRTI-based", "PI/Other")),
regimen_group4 = factor(regimen_group4,
levels = c("INSTI-based", "NNRTI-based", "PI-based", "Other/mixed")),
art_duration_cat = case_when(
ARTyrs < 5 ~ "<5 years",
ARTyrs >= 5 & ARTyrs < 10 ~ "5-9 years",  # < 10 so a non-integer duration such as 9.5 is not left as NA
ARTyrs >= 10 ~ ">=10 years",
TRUE ~ NA_character_
),
art_duration_cat = factor(art_duration_cat,
levels = c("<5 years", "5-9 years", ">=10 years"))
) %>%
filter(age >= 40)
n_final <- nrow(hiv)
n_excluded_age <- n_start - n_final
analytic_sample <- data.frame(
Step = c("Raw dataset", "Excluded age <40 years", "Final analytic sample"),
N = c(n_start, n_excluded_age, n_final)
)
kable(analytic_sample)
| Step | N |
|---|---|
| Raw dataset | 150 |
| Excluded age <40 years | 0 |
| Final analytic sample | 150 |
The dataset was imported from a single Excel worksheet. The final analytic sample included 150 participants aged 40 years and older.
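The primary outcome was elevated NT-proBNP, defined as NT-proBNP ≥125 pg/mL. The primary exposure was ART duration in years, modeled as a continuous variable. ARV regimen type was classified into INSTI-based, NNRTI-based, PI-based, and Other/mixed groups for descriptive purposes; for regression analysis, the small PI-based (n = 7) and Other/mixed (n = 4) groups were combined into a single PI/Other group.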
Covariates included age, sex, BMI, hypertension, and current smoking status. These variables were selected because they are clinically relevant cardiovascular risk factors and potential confounders of the ART duration and NT-proBNP relationship.
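The regimen classification described above can be illustrated on a few example strings. The sketch below is a simplified, vectorized variant of the classifier with abbreviated search patterns; the function name `classify_regimen_vec` and the example regimen strings are hypothetical and are not taken from the dataset.

```r
# Simplified, vectorized sketch of the regimen classifier (abbreviated patterns)
classify_regimen_vec <- function(x) {
  x <- tolower(as.character(x))
  has_pi    <- grepl("kaletra|lopinavir|darun|rezolsta", x)
  has_insti <- grepl("dtg|dolutegravir|tivicay|triumeq", x)
  has_nnrti <- grepl("efv|atripla|complera|rilp", x)
  n_classes <- has_pi + has_insti + has_nnrti

  # Default covers both "no class matched" and "more than one class matched"
  out <- rep("Other/mixed", length(x))
  out[n_classes == 1 & has_pi]    <- "PI-based"
  out[n_classes == 1 & has_insti] <- "INSTI-based"
  out[n_classes == 1 & has_nnrti] <- "NNRTI-based"
  out
}

# Made-up regimen strings, for illustration only
examples <- c("TDF/FTC/DTG", "Atripla", "Kaletra + DTG", "AZT/3TC")
result <- classify_regimen_vec(examples)

# Collapse to the three-level grouping used in the regression model
grouped3 <- ifelse(result %in% c("PI-based", "Other/mixed"), "PI/Other", result)
data.frame(regimen = examples, group4 = result, group3 = grouped3)
```

A regimen that matches more than one drug class, or none, falls into Other/mixed, which is why that group is heterogeneous.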
key_vars <- c("proBNP", "ARTyrs", "ART", "age", "sex", "Height", "Weight", "HTN", "smoke100", "smokeday")
missing_table <- data.frame(
Variable = key_vars,
Missing_n = sapply(key_vars, function(v) sum(is.na(hiv[[v]]))),
Missing_pct = sapply(key_vars, function(v) mean(is.na(hiv[[v]])) * 100)
)
# row.names = FALSE drops the row names that sapply() attaches, which otherwise print as a duplicate first column
kable(missing_table, digits = 1, row.names = FALSE)
| Variable | Missing_n | Missing_pct |
|---|---|---|
| proBNP | 0 | 0.0 |
| ARTyrs | 0 | 0.0 |
| ART | 0 | 0.0 |
| age | 0 | 0.0 |
| sex | 0 | 0.0 |
| Height | 0 | 0.0 |
| Weight | 0 | 0.0 |
| HTN | 0 | 0.0 |
| smoke100 | 0 | 0.0 |
| smokeday | 31 | 20.7 |
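Because smokeday is missing for 31 participants, it may help to see how the `current_smoker` recode treats a missing value. The sketch below uses made-up values, not study data. The `strict` variant is a hypothetical alternative, shown only for contrast, and is not what the analysis used.

```r
# Toy values for illustration only
smoke100 <- c(1, 1, 0, 1)
smokeday <- c(1, NA, NA, 3)

# Recode as written in the analysis: %in% returns FALSE for NA,
# so a missing smokeday ends up coded 0 (Non-current)
as_is <- ifelse(smoke100 == 1 & smokeday %in% c(1, 2), 1, 0)

# Hypothetical alternative: keep NA when an ever-smoker has no smokeday value
strict <- ifelse(smoke100 == 1 & is.na(smokeday), NA, as_is)

data.frame(smoke100, smokeday, as_is, strict)
```

If the missing smokeday values arise from a questionnaire skip pattern (participants with smoke100 not equal to 1 were never asked), the two codings agree; that is an assumption worth checking against the raw data.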
table1 <- bind_rows(
data.frame(Characteristic = "Age, years", Summary = sprintf("%.1f (%.1f)", mean(hiv$age, na.rm = TRUE), sd(hiv$age, na.rm = TRUE))),
data.frame(Characteristic = "BMI, kg/m^2", Summary = sprintf("%.1f (%.1f)", mean(hiv$bmi, na.rm = TRUE), sd(hiv$bmi, na.rm = TRUE))),
data.frame(Characteristic = "NT-proBNP, pg/mL", Summary = sprintf("%.1f [%.1f, %.1f]",
median(hiv$proBNP, na.rm = TRUE),
quantile(hiv$proBNP, 0.25, na.rm = TRUE),
quantile(hiv$proBNP, 0.75, na.rm = TRUE))),
data.frame(Characteristic = "Elevated NT-proBNP: No", Summary = sprintf("%d (%.1f%%)",
sum(hiv$elevated_ntprobnp == 0, na.rm = TRUE),
mean(hiv$elevated_ntprobnp == 0, na.rm = TRUE) * 100)),
data.frame(Characteristic = "Elevated NT-proBNP: Yes", Summary = sprintf("%d (%.1f%%)",
sum(hiv$elevated_ntprobnp == 1, na.rm = TRUE),
mean(hiv$elevated_ntprobnp == 1, na.rm = TRUE) * 100)),
data.frame(Characteristic = "Sex: Male", Summary = sprintf("%d (%.1f%%)",
sum(hiv$sex_label == "Male", na.rm = TRUE),
mean(hiv$sex_label == "Male", na.rm = TRUE) * 100)),
data.frame(Characteristic = "Sex: Female", Summary = sprintf("%d (%.1f%%)",
sum(hiv$sex_label == "Female", na.rm = TRUE),
mean(hiv$sex_label == "Female", na.rm = TRUE) * 100)),
data.frame(Characteristic = "Current smoker: Current", Summary = sprintf("%d (%.1f%%)",
sum(hiv$smoker_label == "Current", na.rm = TRUE),
mean(hiv$smoker_label == "Current", na.rm = TRUE) * 100)),
data.frame(Characteristic = "Current smoker: Non-current", Summary = sprintf("%d (%.1f%%)",
sum(hiv$smoker_label == "Non-current", na.rm = TRUE),
mean(hiv$smoker_label == "Non-current", na.rm = TRUE) * 100)),
data.frame(Characteristic = "Hypertension: No", Summary = sprintf("%d (%.1f%%)",
sum(hiv$htn_label == "No", na.rm = TRUE),
mean(hiv$htn_label == "No", na.rm = TRUE) * 100)),
data.frame(Characteristic = "Hypertension: Yes", Summary = sprintf("%d (%.1f%%)",
sum(hiv$htn_label == "Yes", na.rm = TRUE),
mean(hiv$htn_label == "Yes", na.rm = TRUE) * 100)),
data.frame(Characteristic = "ARV regimen type: INSTI-based", Summary = sprintf("%d (%.1f%%)",
sum(hiv$regimen_group4 == "INSTI-based", na.rm = TRUE),
mean(hiv$regimen_group4 == "INSTI-based", na.rm = TRUE) * 100)),
data.frame(Characteristic = "ARV regimen type: NNRTI-based", Summary = sprintf("%d (%.1f%%)",
sum(hiv$regimen_group4 == "NNRTI-based", na.rm = TRUE),
mean(hiv$regimen_group4 == "NNRTI-based", na.rm = TRUE) * 100)),
data.frame(Characteristic = "ARV regimen type: PI-based", Summary = sprintf("%d (%.1f%%)",
sum(hiv$regimen_group4 == "PI-based", na.rm = TRUE),
mean(hiv$regimen_group4 == "PI-based", na.rm = TRUE) * 100)),
data.frame(Characteristic = "ARV regimen type: Other/mixed", Summary = sprintf("%d (%.1f%%)",
sum(hiv$regimen_group4 == "Other/mixed", na.rm = TRUE),
mean(hiv$regimen_group4 == "Other/mixed", na.rm = TRUE) * 100)),
data.frame(Characteristic = "ART duration: <5 years", Summary = sprintf("%d (%.1f%%)",
sum(hiv$art_duration_cat == "<5 years", na.rm = TRUE),
mean(hiv$art_duration_cat == "<5 years", na.rm = TRUE) * 100)),
data.frame(Characteristic = "ART duration: 5-9 years", Summary = sprintf("%d (%.1f%%)",
sum(hiv$art_duration_cat == "5-9 years", na.rm = TRUE),
mean(hiv$art_duration_cat == "5-9 years", na.rm = TRUE) * 100)),
data.frame(Characteristic = "ART duration: >=10 years", Summary = sprintf("%d (%.1f%%)",
sum(hiv$art_duration_cat == ">=10 years", na.rm = TRUE),
mean(hiv$art_duration_cat == ">=10 years", na.rm = TRUE) * 100))
)
kable(table1, col.names = c("Characteristic", paste0("Overall (n = ", n_final, ")")))
| Characteristic | Overall (n = 150) |
|---|---|
| Age, years | 50.6 (7.9) |
| BMI, kg/m^2 | 23.8 (3.6) |
| NT-proBNP, pg/mL | 90.5 [46.2, 190.2] |
| Elevated NT-proBNP: No | 95 (63.3%) |
| Elevated NT-proBNP: Yes | 55 (36.7%) |
| Sex: Male | 82 (54.7%) |
| Sex: Female | 68 (45.3%) |
| Current smoker: Current | 111 (74.0%) |
| Current smoker: Non-current | 39 (26.0%) |
| Hypertension: No | 128 (85.3%) |
| Hypertension: Yes | 22 (14.7%) |
| ARV regimen type: INSTI-based | 76 (50.7%) |
| ARV regimen type: NNRTI-based | 63 (42.0%) |
| ARV regimen type: PI-based | 7 (4.7%) |
| ARV regimen type: Other/mixed | 4 (2.7%) |
| ART duration: <5 years | 56 (37.3%) |
| ART duration: 5-9 years | 59 (39.3%) |
| ART duration: >=10 years | 35 (23.3%) |
Note: Continuous variables are reported as mean (SD), except NT-proBNP, which is reported as median [IQR] because of right skew.
ggplot(hiv, aes(x = proBNP)) +
geom_histogram(bins = 30, color = "black", fill = "steelblue", alpha = 0.8) +
labs(
title = "Figure 1. Distribution of NT-proBNP",
x = "NT-proBNP (pg/mL)",
y = "Frequency"
) +
theme_minimal()
NT-proBNP values were right-skewed, with most observations concentrated at lower values and a smaller number of high values.
ggplot(hiv, aes(x = ARTyrs, y = proBNP)) +
geom_point(alpha = 0.7) +
geom_smooth(method = "lm", se = TRUE) +
labs(
title = "Figure 2. NT-proBNP by ART Duration",
x = "ART duration (years)",
y = "NT-proBNP (pg/mL)"
) +
theme_minimal()
The scatterplot suggests a weak positive pattern between ART duration and NT-proBNP, but there is substantial variability.
ggplot(hiv, aes(x = regimen_group4, y = proBNP)) +
geom_boxplot() +
labs(
title = "Figure 3. NT-proBNP Across ARV Regimen Groups",
x = "ARV regimen group",
y = "NT-proBNP (pg/mL)"
) +
theme_minimal()
NT-proBNP distributions overlapped across regimen groups. The PI-based (n = 7) and Other/mixed (n = 4) groups were small, so apparent differences involving these groups should be interpreted cautiously.
A logistic regression model was used because the outcome was binary: elevated NT-proBNP versus not elevated. Model 1 was unadjusted and included ART duration only. Model 2 adjusted for age, hypertension, ARV regimen group, sex, BMI, and smoking status.
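Written out, the two models described above take roughly the following form. The coefficient symbols and indicator names are notation introduced here for illustration; the reference categories (no hypertension, INSTI-based regimen, female, non-current smoker) follow the factor levels set in the data preparation step.

```latex
\begin{align}
\text{Model 1:}\quad
\operatorname{logit}\Pr(Y_i = 1) &= \beta_0 + \beta_1\,\mathrm{ARTyrs}_i \\
\text{Model 2:}\quad
\operatorname{logit}\Pr(Y_i = 1) &= \beta_0 + \beta_1\,\mathrm{ARTyrs}_i + \beta_2\,\mathrm{age}_i
  + \beta_3\,\mathrm{HTN}_i + \beta_4\,\mathrm{NNRTI}_i + \beta_5\,\mathrm{PI/Other}_i \nonumber \\
 &\quad + \beta_6\,\mathrm{Male}_i + \beta_7\,\mathrm{BMI}_i + \beta_8\,\mathrm{Current}_i
\end{align}
```

Here $Y_i = 1$ indicates NT-proBNP $\geq 125$ pg/mL, $\operatorname{logit}(p) = \log\{p/(1-p)\}$, and the reported odds ratio for each term is $\exp(\beta_j)$.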
model1 <- glm(
elevated_ntprobnp ~ ARTyrs,
family = binomial(link = "logit"),
data = hiv
)
model2 <- glm(
elevated_ntprobnp ~ ARTyrs + age + htn_label + regimen_group3 + sex_label + bmi + smoker_label,
family = binomial(link = "logit"),
data = hiv
)
tbl_regression(model1, exponentiate = TRUE) %>%
modify_header(label = "**Term**") %>%
modify_caption("**Model 1. Unadjusted logistic regression**")
| Term | OR | 95% CI | p-value |
|---|---|---|---|
| ARTyrs | 1.07 | 0.99, 1.15 | 0.094 |

Abbreviations: CI = Confidence Interval, OR = Odds Ratio
tbl_regression(model2, exponentiate = TRUE) %>%
modify_header(label = "**Term**") %>%
modify_caption("**Model 2. Adjusted logistic regression**")
| Term | OR | 95% CI | p-value |
|---|---|---|---|
| ARTyrs | 1.05 | 0.97, 1.14 | 0.2 |
| age | 1.08 | 1.03, 1.13 | 0.003 |
| htn_label | |||
| No | — | — | |
| Yes | 0.88 | 0.29, 2.56 | 0.8 |
| regimen_group3 | |||
| INSTI-based | — | — | |
| NNRTI-based | 0.95 | 0.45, 1.99 | 0.9 |
| PI/Other | 2.01 | 0.48, 8.69 | 0.3 |
| sex_label | |||
| Female | — | — | |
| Male | 1.01 | 0.46, 2.24 | >0.9 |
| bmi | 0.98 | 0.88, 1.08 | 0.7 |
| smoker_label | |||
| Non-current | — | — | |
| Current | 0.72 | 0.30, 1.70 | 0.4 |

Abbreviations: CI = Confidence Interval, OR = Odds Ratio
model1_or <- tidy(model1, exponentiate = TRUE, conf.int = TRUE)
model2_or <- tidy(model2, exponentiate = TRUE, conf.int = TRUE)
model1_or %>% kable(digits = 3)
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | 0.380 | 0.307 | -3.153 | 0.002 | 0.202 | 0.679 |
| ARTyrs | 1.066 | 0.038 | 1.673 | 0.094 | 0.992 | 1.154 |
model2_or %>% kable(digits = 3)
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | 0.021 | 1.774 | -2.176 | 0.030 | 0.001 | 0.644 |
| ARTyrs | 1.049 | 0.040 | 1.211 | 0.226 | 0.972 | 1.139 |
| age | 1.076 | 0.025 | 2.948 | 0.003 | 1.026 | 1.133 |
| htn_labelYes | 0.884 | 0.552 | -0.224 | 0.823 | 0.287 | 2.559 |
| regimen_group3NNRTI-based | 0.948 | 0.378 | -0.141 | 0.888 | 0.449 | 1.987 |
| regimen_group3PI/Other | 2.009 | 0.728 | 0.958 | 0.338 | 0.476 | 8.686 |
| sex_labelMale | 1.013 | 0.402 | 0.032 | 0.975 | 0.459 | 2.237 |
| bmi | 0.978 | 0.052 | -0.431 | 0.666 | 0.880 | 1.081 |
| smoker_labelCurrent | 0.716 | 0.437 | -0.765 | 0.444 | 0.303 | 1.700 |
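One detail of the tidy() tables above is easy to misread: the estimate and confidence limits are on the odds-ratio scale, while std.error and statistic remain on the log-odds scale. The sketch below uses the rounded Model 1 values for ARTyrs, so it reproduces the table only approximately; the variable names are illustrative.

```r
# Rounded values copied from the Model 1 tidy() table (ARTyrs row)
or_hat <- 1.066   # exponentiated coefficient (odds ratio)
se_log <- 0.038   # standard error on the log-odds scale

# Back-transform to the log-odds scale, then form the Wald z statistic
beta_hat <- log(or_hat)
z_stat   <- beta_hat / se_log
p_value  <- 2 * pnorm(-abs(z_stat))

# Wald 95% confidence interval, exponentiated to the odds-ratio scale
wald_ci <- exp(beta_hat + c(-1, 1) * qnorm(0.975) * se_log)

round(c(z = z_stat, p = p_value), 3)
round(wald_ci, 3)
```

The Wald interval computed this way is close to, but not identical to, the conf.low and conf.high columns. That is consistent with those limits coming from a profile-likelihood method rather than a Wald approximation, though the output itself does not state which method was used.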
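In the unadjusted model, each additional year of ART duration was associated with about 7% higher odds of elevated NT-proBNP, although the confidence interval included 1 (OR 1.07; 95% CI 0.99, 1.15; p = 0.094). After adjustment for age, hypertension, ARV regimen, sex, BMI, and smoking, the ART duration estimate was attenuated (OR 1.05; 95% CI 0.97, 1.14; p = 0.23). Age was the only covariate significantly associated with elevated NT-proBNP in the adjusted model: each additional year of age was associated with about 8% higher odds (OR 1.08; 95% CI 1.03, 1.13; p = 0.003).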
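Per-year odds ratios can look small even when the cumulative effect over a clinically meaningful span is not. As a rough sketch, the adjusted estimates can be rescaled to a multi-year increment by raising the odds ratio to that power. The 5-year and 10-year increments are illustrative choices, and the calculation assumes the effect is linear on the log-odds scale.

```r
# Rounded adjusted odds ratios per one-year increase (Model 2)
or_per_year <- c(ARTyrs = 1.049, age = 1.076)

# Increment of interest for each predictor, in years (illustrative choices)
increment <- c(ARTyrs = 5, age = 10)

# exp(k * beta) equals OR^k for a k-unit increase
or_rescaled <- or_per_year^increment
round(or_rescaled, 2)
```

On this scale, the adjusted odds ratio is roughly 1.27 per 5 additional years of ART and roughly 2.08 per 10 additional years of age. Both are approximate because the inputs are rounded.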
Longer ART duration showed a modest positive, but not statistically significant, association with elevated NT-proBNP in the unadjusted model, and the estimate moved closer to the null after adjustment. Age was the only statistically significant predictor of elevated NT-proBNP. These results suggest that aging-related cardiovascular risk may be more important than ART duration alone in this sample of PLWH aged 40 years and older.
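The size of the attenuation mentioned above can be put on a common scale by comparing the crude and adjusted coefficients as log odds ratios. This is only a back-of-the-envelope sketch using the rounded estimates from the two models; the variable names are illustrative.

```r
# Rounded odds ratios for ARTyrs from Model 1 (crude) and Model 2 (adjusted)
or_crude    <- 1.066
or_adjusted <- 1.049

# Percent change in the log odds ratio after adjustment
pct_change_log_or <- 100 * (log(or_adjusted) - log(or_crude)) / log(or_crude)
round(pct_change_log_or, 1)
```

With these inputs the log odds ratio shrinks by about a quarter after adjustment. Because both estimates are rounded and their confidence intervals are wide, the figure should be read as approximate.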