#Loading libraries

library(MASS)
library(grid)
library(table1)
## 
## Attaching package: 'table1'
## The following objects are masked from 'package:base':
## 
##     units, units<-
library(haven)
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.3.2
## Warning: package 'ggplot2' was built under R version 4.3.2
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.5.0     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ✖ dplyr::select() masks MASS::select()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(magrittr)
## 
## Attaching package: 'magrittr'
## 
## The following object is masked from 'package:purrr':
## 
##     set_names
## 
## The following object is masked from 'package:tidyr':
## 
##     extract
library(ggfortify)
## Warning: package 'ggfortify' was built under R version 4.3.2
library(png)
library(dplyr)
library(ggplot2)
library(caret)
## Warning: package 'caret' was built under R version 4.3.3
## Loading required package: lattice
## 
## Attaching package: 'caret'
## 
## The following object is masked from 'package:purrr':
## 
##     lift
library(fontawesome)
## Warning: package 'fontawesome' was built under R version 4.3.3
library(tinytex)

Association between sexual risk-taking behaviors and the intent to seek HIV/AIDS testing and prevention services among New York City residents

Research Question: How do sexual risk-taking behaviors influence the intent to seek HIV/AIDS testing and prevention services among New York City residents?

Data Source and Population of Interest

The data for this study were obtained from the 2020 New York City Community Health Survey (NYC CHS), an annual telephone survey conducted by the Division of Epidemiology, Bureau of Epidemiology Services.1 The NYC CHS provides robust data on the health of New Yorkers, including neighborhood and citywide estimates for a broad range of chronic diseases and behavioral risk factors. These include HIV/AIDS, sexual risk-taking behavior, and alcohol consumption, which are the variables of interest in this study.2,3 The population of interest comprises adult residents of New York City.1 Eligible respondents were adults aged 18 years and older living in non-group quarters who had a cellular telephone or lived in a household with a landline telephone.1,4 The CHS uses a stratified random sample to produce neighborhood and citywide estimates.1 Neighborhoods are defined using the United Hospital Fund’s (UHF) neighborhood designation, which assigns neighborhoods based on the respondent’s ZIP code.3,5 To avoid small sample sizes for the CHS estimates, the UHF neighborhoods are collapsed into 34 UHF groups.1

Sampling and Data Collection

The NYC CHS used a computer-assisted telephone interviewing (CATI) system to collect the survey data.1,4 The sampling frame was constructed from a list of telephone numbers provided by a commercial vendor.1,4 Once a household agreed to participate, one adult was randomly selected from that household to complete the interview.1

Study Variables

The outcome variable for this study is HIV testing (hiv12months20), which records whether the individual was tested for HIV in the past 12 months.2 The predictor and exposure variables are: birth sex (birthsex), the biological sex of the individual at birth; number of sex partners (sexpartner), the number of male and female sex partners the individual had in the past 12 months; condom use (condom20), whether the participant used a condom the last time they had sex; and heavy drinking (heavydrink20), defined as more than 2 drinks per day for men or more than 1 drink per day for women.6

Sample Characteristics

The sample from the NYC CHS 2020 may not fully represent the entire population of New York City because of potential non-response bias, coverage bias, and a survey administration method that requires a working landline or cellular phone.1,3

Method Section

Statistical Analysis

In this study, I hypothesized that having multiple sexual partners and not using condoms for HIV prevention increase a person’s intent to seek an HIV test among residents of New York City.

Modeling technique

Logistic regression was used because the outcome variable is binary. The model assesses the association between sexual risk-taking behaviors and the intent to seek HIV/AIDS testing and prevention services among New York City residents. It estimates how number of sex partners, birth sex, condom use, and heavy drinking influence the likelihood of having been tested for HIV (yes) compared with not having been tested (no), with each predictor adjusted for the others. Model fit was assessed to evaluate how well the model explains the variation in HIV testing behavior. Statistical significance was set at p < 0.025 to preserve the global type 1 error rate. Odds ratios and the corresponding 95% confidence intervals were reported as estimates of the association between each predictor and the outcome. RStudio was used for data analysis and model fitting. The logistic regression was fitted with the glm() function, and the packages used are listed in the accompanying code.7
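For reference, the logistic regression described above can be written in standard notation. This is a generic sketch rather than the exact fitted specification: the symbols p_i, β_j, and x_ij are introduced here for illustration and do not appear in the original analysis.

```latex
\[
\log\!\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij},
\qquad
\mathrm{OR}_j = e^{\beta_j}
\]
```

Here p_i is the probability that respondent i falls in the modeled outcome category, x_ij are the (dummy-coded) predictors, and each odds ratio is the exponentiated coefficient for that predictor.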

Loading the dataset

chs2020_data <- read_sas("chs2020_public (1).sas7bdat")

Selecting the required variables

# dplyr::select is called explicitly because MASS also exports select()
chs2020_Sorted <- chs2020_data %>%
  dplyr::select(hiv12months20, sexpartner, condom20, heavydrink20, birthsex)
frequency <- table(chs2020_Sorted$sexpartner)
print(frequency)
## 
##    1    2    3    4 
## 2686 4263  374  597

Cleaning the data

# Recode the numeric survey codes into labelled factors
chs2020_Clean <- chs2020_Sorted %>%
  mutate(birthsex = recode_factor(birthsex,
                                  `1` = "Male",
                                  `2` = "Female")) %>%
  mutate(condom20 = recode_factor(condom20,
                                  `1` = "Yes",
                                  `2` = "No")) %>%
  mutate(hiv12months20 = recode_factor(hiv12months20,
                                       `1` = "Yes",
                                       `2` = "No")) %>%
  mutate(heavydrink20 = recode_factor(heavydrink20,
                                      `1` = "Yes",
                                      `2` = "No")) %>%
  # Codes 1 and 2 are collapsed into a single "None or One" category
  mutate(sexpartner = recode_factor(sexpartner,
                                    `1` = "None or One",
                                    `2` = "None or One",
                                    `3` = "Two",
                                    `4` = "Three or more")) %>%
  # Keep complete cases only
  drop_na()
frequency <-table(chs2020_Clean$sexpartner)
print(frequency)
## 
##   None or One           Two Three or more 
##          4063           365           565

Tabulating the variables

my_table <- table1(
  ~ birthsex + sexpartner + condom20 + hiv12months20 + heavydrink20,
  data = chs2020_Clean
)

Result Section

Table 1 summarizes the 4,993 survey respondents retained after data cleaning and highlights key behavioral and demographic factors. Overall, 35.4% reported HIV testing in the last 12 months, while 64.6% did not. Most respondents (81.4%) reported none or one sexual partner, and 11.3% reported three or more. Condom use was low: 70.2% reported not using a condom the last time they had sex. The heavy drinking variable had a mean of 1.92 on its 1 = Yes, 2 = No coding, which indicates that most respondents were not classified as heavy drinkers. The distribution of sex at birth was nearly even, with a slight male majority.

Table 1: Sample Characteristics

Creating Table 2: Predictor variables stratified by HIV testing

library(tableone)
## Warning: package 'tableone' was built under R version 4.3.2
vars <- c("birthsex", "sexpartner", "condom20", "heavydrink20")
# Stored as table2 so the object name does not clash with the table1() function
table2 <- CreateTableOne(vars = vars, strata = "hiv12months20", data = chs2020_Clean, test = TRUE)

Visualizing the number of sexual partners by HIV testing status

# aes() replaces the deprecated aes_string(); geom_bar() has no `bins` argument
ggplot(data = chs2020_Clean) +
  geom_bar(aes(x = sexpartner)) +
  facet_wrap(~ hiv12months20)

Figure 1: Bar graph showing HIV testing by Number of sexual partners.

Fitting the model

model <- glm(hiv12months20 ~ sexpartner * condom20 + heavydrink20 + birthsex,
             family = binomial(link = "logit"),
             data = chs2020_Clean)
summary(model)
## 
## Call:
## glm(formula = hiv12months20 ~ sexpartner * condom20 + heavydrink20 + 
##     birthsex, family = binomial(link = "logit"), data = chs2020_Clean)
## 
## Coefficients:
##                                    Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                         0.93022    0.13520   6.880 5.97e-12 ***
## sexpartnerTwo                      -0.61934    0.15938  -3.886 0.000102 ***
## sexpartnerThree or more            -0.93982    0.13510  -6.956 3.49e-12 ***
## condom20No                          0.38378    0.07789   4.927 8.34e-07 ***
## heavydrink20No                     -0.09408    0.11422  -0.824 0.410136    
## birthsexFemale                     -0.60116    0.06318  -9.516  < 2e-16 ***
## sexpartnerTwo:condom20No           -0.18767    0.22647  -0.829 0.407278    
## sexpartnerThree or more:condom20No -0.50537    0.18981  -2.663 0.007756 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 6489.2  on 4992  degrees of freedom
## Residual deviance: 6201.6  on 4985  degrees of freedom
## AIC: 6217.6
## 
## Number of Fisher Scoring iterations: 4

Results of logistic regression model on HIV testing in NYC CHS 2020

The analysis investigated how factors related to sexual behavior influenced the likelihood of getting an HIV test among New York City residents in 2020. The data were obtained from the NYC CHS survey (N = 4,993). A logistic regression was performed to predict the likelihood of having an HIV test in the past year based on number of sexual partners, condom use, heavy drinking (alcohol consumption), and biological sex.
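One detail worth checking when reading the coefficients is which outcome category glm() actually models. For a factor response, R treats the first factor level as "failure" and models the odds of the remaining level. The sketch below uses a small hypothetical data frame (toy, with made-up values) and base R's factor() in place of recode_factor() to keep it dependency-free; both produce levels in the order "Yes", "No" for this coding.

```r
# Hypothetical toy data, not the survey data: 1 = Yes, 2 = No as in the recode step
toy <- data.frame(
  hiv = c(1, 2, 2, 1, 2, 2, 1, 2),
  sex = c(1, 1, 2, 2, 1, 2, 2, 1)
)
toy$hiv <- factor(toy$hiv, levels = c(1, 2), labels = c("Yes", "No"))

# "Yes" is the first level, so glm() treats it as "failure"
levels(toy$hiv)

# This fit therefore models the odds of "No"
fit_no <- glm(hiv ~ sex, family = binomial, data = toy)

# Putting "No" first makes the fit model the odds of "Yes" instead
toy$hiv_yes <- relevel(toy$hiv, ref = "No")
fit_yes <- glm(hiv_yes ~ sex, family = binomial, data = toy)

# Same information, opposite signs on every coefficient
coef(fit_no)
coef(fit_yes)
```

If the same level order applies to hiv12months20, a negative coefficient in the summary corresponds to lower odds of not being tested, and an odds ratio below 1 corresponds to higher odds of testing.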

Testing out the association of HIV testing (response) and the predictors.

# Exponentiated coefficients (odds ratios) with 95% confidence intervals
broom::tidy(model, exponentiate = TRUE, conf.int = TRUE) %>%
  select(-statistic)

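Because hiv12months20 was recoded with "Yes" as its first factor level, glm() models the odds of the second level, "No" (not tested in the past 12 months), so an odds ratio below 1 indicates higher odds of having been tested. The model also includes a sexpartner by condom20 interaction, so the partner estimates apply to participants who used a condom at last sex, and the condom estimate applies to participants with none or one partner.

Compared with participants reporting none or one partner, those reporting two partners (95% CI [0.39-0.74], p < 0.001) and three or more partners (95% CI [0.30-0.50], p < 0.001) had lower odds of not being tested, that is, higher odds of HIV testing. Participants who reported not using a condom during the last sexual encounter had higher odds of not being tested than those who reported using a condom (95% CI [1.26-1.71], p < 0.001). Female sex was associated with higher odds of HIV testing compared with male sex (95% CI [0.48-0.62] for not being tested, p < 0.001). Heavy drinking was not significantly associated with HIV testing behavior (95% CI [0.73-1.14], p = 0.41). The interaction between three or more partners and no condom use was significant at the 0.025 threshold (p = 0.008), whereas the interaction for two partners was not (p = 0.41).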
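The confidence intervals quoted above can be approximately reproduced from the coefficient table. The sketch below copies three estimates and standard errors from the summary(model) output and computes Wald intervals. This is an approximation under the assumption of normality on the log-odds scale; broom::tidy() may use a different interval method, so its values can differ slightly. The object names est, se, and or_table are introduced here for illustration.

```r
# Estimates and standard errors copied from the summary(model) output
est <- c(condom20No = 0.38378, birthsexFemale = -0.60116, heavydrink20No = -0.09408)
se  <- c(condom20No = 0.07789, birthsexFemale = 0.06318, heavydrink20No = 0.11422)

# Critical value for a two-sided 95% Wald interval
z <- qnorm(0.975)

# Exponentiate the log-odds scale to obtain odds ratios and interval bounds
or_table <- data.frame(
  odds_ratio = exp(est),
  ci_lower   = exp(est - z * se),
  ci_upper   = exp(est + z * se)
)
round(or_table, 2)
```

An interval that excludes 1 corresponds to a coefficient that differs from zero at the 5% level, which is why the heavy drinking interval, which spans 1, goes with a non-significant p-value.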

# Checking for multicollinearity
car::vif(model)
## there are higher-order terms (interactions) in this model
## consider setting type = 'predictor'; see ?vif
##                         GVIF Df GVIF^(1/(2*Df))
## sexpartner          4.206667  2        1.432137
## condom20            1.429976  1        1.195816
## heavydrink20        1.024423  1        1.012138
## birthsex            1.069123  1        1.033984
## sexpartner:condom20 3.977649  2        1.412234

Because of the cross-sectional study design, causality cannot be established, and because the data come from a self-reported survey, they may be subject to non-response and reporting biases. Another limitation is that the model does not account for other factors that may influence HIV testing behavior. Further research could explore the reasons behind the observed associations between sexual behavior, condom use, and HIV testing, and could include data on access to testing services and other social determinants of health, such as level of education, to provide a more comprehensive picture.

Overall, these findings suggest that sexual behavior patterns, condom use, and sex at birth are associated with HIV testing rates in New York City. These findings can inform public health efforts to promote and increase HIV testing among high-risk populations.

References

  1. Community Health Survey Methodology - NYC Health. Accessed February 29, 2024. https://www.nyc.gov/site/doh/data/data-sets/community-health-survey-methodology.page
  2. chs2020-codebook.pdf. Accessed February 29, 2024. https://www.nyc.gov/assets/doh/downloads/pdf/episrv/chs2020-codebook.pdf
  3. Community Health Survey Public Use Data - NYC Health. Accessed February 29, 2024. https://www.nyc.gov/site/doh/data/data-sets/community-health-survey-public-use-data.page
  4. Computer-assisted telephone interviewing. In: Wikipedia. ; 2021. Accessed February 29, 2024. https://en.wikipedia.org/w/index.php?title=Computer-assisted_telephone_interviewing&oldid=1033845941
  5. uhf_map_100604.pdf. Accessed February 29, 2024. https://www.nyc.gov/assets/doh/downloads/pdf/survey/uhf_map_100604.pdf
  6. chs2020survey.pdf. Accessed February 29, 2024. https://www.nyc.gov/assets/doh/downloads/pdf/episrv/chs2020survey.pdf
  7. The Comprehensive R Archive Network. Accessed April 29, 2024. https://cran.r-project.org/