R Markdown

To-do list:

  1. Replace 0 with NA_real_ in the mean columns (check)
  2. Summarize duod_1 and duod_2 as the mean of the two trials (check)
  3. Same for ileum1, ileum2, colo1, colo2, measuring ohms / cm2 for the TEER measurement: colo_teer obs1, obs2 (check)
  4. Trials are in ohms; REDCap does the calculation that turns them into TEER
  5. Goal: one final TEER value per gut segment per patient
  6. Compare duodenal vs ileal vs colonic TEER (check) with ANOVA or a model. Confounders: presence of portal HTN (variable phtn), history of HE, use of lactulose (lactulose_4wks) (check)
  7. Compare to healthy volunteer TEER (later, none yet)
  8. Table 1 demographics: age, sex, race, gender, n/percent EGD/colon, percentages for etiology of cirrhosis, hx varices, hx ascites, PPI, probiotic use, MELD score/SD
  9. study_id: one per positive screen, not all enrolled. If outcome == 0, enrolled; or if enrolled_date exists, then enrolled.
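
Item 9 could be sketched as a single flag column. This is a minimal sketch on made-up rows, assuming `outcome` is coded 0 for enrolled and that the enrollment date is the `date_enrolled` column listed in the column specification below (the to-do item calls it `enrolled_date`). The `enrolled` column name is hypothetical.

```r
library(dplyr)

# Made-up screening rows for illustration only, not study data
screened <- tibble(
  study_id      = 1:4,
  outcome       = c(0, 1, NA, 2),
  date_enrolled = as.Date(c("2020-11-02", NA, "2020-12-15", NA))
)

# Enrolled if outcome is 0 OR an enrollment date exists.
# coalesce() turns the NA from a missing outcome into FALSE.
screened_flagged <- screened %>%
  mutate(enrolled = coalesce(outcome == 0, FALSE) | !is.na(date_enrolled))

# Keep only the enrolled patients
enrolled_only <- filter(screened_flagged, enrolled)
```

Filtering on a flag like this before building Table 1 would keep screened-but-not-enrolled rows out of the denominators.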

Read in and clean data

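library(tidyverse)   # read_csv(), dplyr verbs, tidyr reshaping
library(janitor)     # tabyl()
library(gtsummary)   # tbl_summary(), tbl_regression()
library(flextable)   # flextable()

endo_data <- read_csv("BloomEndoscopyStudy_DATA_2021-02-25_0904.csv")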
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   .default = col_double(),
##   appt_dte = col_date(format = ""),
##   not_approached_note = col_character(),
##   declined_note = col_character(),
##   date_enrolled = col_date(format = ""),
##   other_prep = col_character(),
##   edg_result_note = col_character(),
##   colo_notes = col_character(),
##   other_cirrh_dx = col_character(),
##   hcv_load = col_character(),
##   hbv_load = col_character(),
##   comorbid_med_conditions = col_character(),
##   med_list = col_character()
## )
## ℹ Use `spec()` for the full column specifications.
endo_data %>% 
  # To-do item 1: replace 0 with NA in every trial mean column
  mutate(across(c(d1_mean, d2_mean, ileum1_mean, ileum2_mean,
                  colo1_mean, colo2_mean),
                ~ na_if(.x, 0))) %>%
  rowwise() %>% 
  # To-do items 2-3: one TEER value per segment, averaging the two trials.
  # na.rm must be an argument of mean(), not an element of c().
  mutate(duod_teer_ohmcm2 = mean(c(d1_mean, d2_mean), na.rm = TRUE),
         ileum_teer_ohmcm2 = mean(c(ileum1_mean, ileum2_mean), na.rm = TRUE),
         colo_teer_ohmcm2 = mean(c(colo1_mean, colo2_mean), na.rm = TRUE)) %>%
  # Drop the rowwise grouping so later steps work on the whole table
  ungroup() ->
endo_data_clean
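
One detail worth checking in the averaging step is where `na.rm = TRUE` sits. The sketch below uses toy values (not study data) to show how `mean()` behaves when the argument is passed to `mean()` itself and when it ends up inside `c()`.

```r
# Toy trial values for illustration only, not study data
trial1 <- 10
trial2 <- NA_real_

# na.rm passed to mean(): the missing trial is ignored
ok <- mean(c(trial1, trial2), na.rm = TRUE)

# na.rm placed inside c(): TRUE is coerced to 1 and becomes a third value,
# and the NA is no longer removed
misplaced <- mean(c(trial1, trial2, na.rm = TRUE))

# With two observed trials, the misplaced version silently averages in the 1
skewed <- mean(c(10, 20, na.rm = TRUE))

# If both trials are missing, mean(..., na.rm = TRUE) returns NaN,
# which is.na() also treats as missing
both_missing <- mean(c(NA_real_, NA_real_), na.rm = TRUE)
```

Because `is.na(NaN)` is `TRUE`, a later `filter(!is.na(...))` drops segments where neither trial was recorded.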

Now to build a table 1 with gtsummary

endo_data_clean %>% 
  select(age, sex, hx_varices, phtn, ascites, he, meld) %>%
  tbl_summary()
## Warning: The `.dots` argument of `group_by()` is deprecated as of dplyr 1.0.0.
| Characteristic | N = 59¹ |
|---|---|
| age | 59 (55, 68) |
| Unknown | 43 |
| sex | 10 (62%) |
| Unknown | 43 |
| hx_varices | 11 (69%) |
| Unknown | 43 |
| phtn | 13 (81%) |
| Unknown | 43 |
| ascites | 8 (50%) |
| Unknown | 43 |
| he | 3 (19%) |
| Unknown | 43 |
| meld | |
| 6 | 1 (6.2%) |
| 7 | 6 (38%) |
| 8 | 2 (12%) |
| 10 | 2 (12%) |
| 11 | 3 (19%) |
| 12 | 1 (6.2%) |
| 141 | 1 (6.2%) |
| Unknown | 43 |

¹ Median (IQR); n (%)
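
In the table above, `meld` is tabulated value by value, which is how `tbl_summary()` handles numeric columns with only a few distinct values by default. To-do item 8 asks for MELD as a score with SD, so one option is to declare the type and statistic explicitly. This is a minimal sketch on made-up values; `toy` and `toy_table` are hypothetical names.

```r
library(gtsummary)

# Made-up values for illustration only, not study data
toy <- data.frame(
  age  = c(52, 61, 58, 70, 66, 59),
  meld = c(7, 7, 8, 10, 11, 12)
)

toy_table <- tbl_summary(
  toy,
  type      = list(meld ~ "continuous"),     # stop meld being tabulated by value
  statistic = list(meld ~ "{mean} ({sd})")   # report mean (SD) for meld
)
```

If the 43 unknowns are screened patients who never enrolled (to-do item 9), restricting the data to enrolled patients before calling `tbl_summary()` would remove those rows from the table.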

Initial ANOVA for TEER by Intestinal Segment

Pretty small N (mostly duodenum), so not expecting much.

endo_data_clean %>% 
  select(study_id, phtn, he, probiotics_4wks, lactulose_4wks, contains("ohmcm2")) %>% 
  pivot_longer(cols = contains("ohmcm2"), 
               names_to = "location",
               values_to = "teer_ohmcm2") %>% 
  separate(location, sep = "_",
           into = c("location", "teer", "units")) %>% 
  select(study_id, phtn, he, probiotics_4wks, lactulose_4wks, location, teer_ohmcm2) %>% 
  filter(!is.na(teer_ohmcm2)) ->
location_teer
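
The reshape above first pivots and then splits the column name with `separate()`. As a hedged alternative, `pivot_longer()` can extract the segment name in one step through `names_pattern`. The sketch uses made-up values; `toy_clean` and `toy_long` are hypothetical names.

```r
library(dplyr)
library(tidyr)

# Made-up per-patient segment means for illustration only, not study data
toy_clean <- tibble(
  study_id          = c(1, 2),
  duod_teer_ohmcm2  = c(20.5, 18.0),
  ileum_teer_ohmcm2 = c(NaN, 12.5),
  colo_teer_ohmcm2  = c(14.0, NaN)
)

toy_long <- toy_clean %>%
  pivot_longer(cols = ends_with("_teer_ohmcm2"),
               names_to = "location",
               # keep only the text before "_teer_ohmcm2" as the segment name
               names_pattern = "(.*)_teer_ohmcm2",
               values_to = "teer_ohmcm2") %>%
  filter(!is.na(teer_ohmcm2))   # is.na() is TRUE for both NA and NaN
```

Either approach ends with one row per patient per measured segment.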

location_teer %>% 
  tabyl(location) %>% 
  flextable::flextable()

Lots of duodenum.

On to modeling!

location_teer %>% 
  lm(formula = teer_ohmcm2 ~ location, data = .) ->
teer_model

print("colon is reference category")
## [1] "colon is reference category"
teer_model %>% 
  broom::tidy() %>% 
  flextable()
# with gtsummary::tbl_regression
teer_model %>% 
  tbl_regression()
| Characteristic | Beta | 95% CI¹ | p-value |
|---|---|---|---|
| location | | | |
| colo | | | |
| duod | 5.1 | 0.34, 9.8 | 0.038 |
| ileum | -2.8 | -8.9, 3.4 | 0.4 |

¹ CI = Confidence Interval

teer_model %>% 
  broom::glance() %>% 
  flextable()
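
The section title mentions ANOVA, while the coefficient table compares each segment with the colon. A minimal sketch, on made-up data, of getting the omnibus F test from the same kind of `lm()` fit with base R's `anova()`, and of switching the reference segment with `relevel()`. All object names here are hypothetical.

```r
# Made-up long-format data for illustration only, not study data
toy_teer <- data.frame(
  location    = rep(c("colo", "duod", "ileum"), each = 4),
  teer_ohmcm2 = c(12, 14, 13, 15,  18, 20, 19, 21,  10, 11, 12, 13)
)

toy_model <- lm(teer_ohmcm2 ~ location, data = toy_teer)

# Omnibus F test: do the segment means differ at all?
toy_anova <- anova(toy_model)

# Character predictors become factors with alphabetical levels,
# so "colo" is the reference; relevel() makes "duod" the reference instead
toy_teer$location <- relevel(factor(toy_teer$location), ref = "duod")
toy_model_duod <- lm(teer_ohmcm2 ~ location, data = toy_teer)
```

Changing the reference level does not change the model fit, only which pairwise contrasts appear as coefficients.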

Now for a more complex model with confounders (the candidate confounders are commented out for now, so only study_id is added)

location_teer %>% 
  lm(formula = teer_ohmcm2 ~  location + 
       # phtn + 
       # he + 
       # probiotics_4wks + 
       # lactulose_4wks + 
        study_id +
       NULL,  
     data = .) ->
teer_model

print("colon is reference category")
## [1] "colon is reference category"
teer_model %>% 
  broom::tidy() %>% 
  flextable()
teer_model %>% 
  broom::glance() %>% 
  flextable()
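
Since `study_id` is read in as a number, the model above fits a single slope across ID values rather than a per-patient effect. One possible alternative, not the approach used above, is to wrap it in `factor()` so each patient gets their own intercept shift. A minimal sketch on made-up data, with hypothetical object names:

```r
# Made-up long-format data for 3 patients, not study data
toy_teer <- data.frame(
  study_id    = rep(c(101, 102, 103), each = 3),
  location    = rep(c("colo", "duod", "ileum"), times = 3),
  teer_ohmcm2 = c(12, 18, 10,  14, 21, 11,  13, 19, 12)
)

# Numeric study_id: one slope, as if TEER changed linearly with the ID number
numeric_id_model <- lm(teer_ohmcm2 ~ location + study_id, data = toy_teer)

# factor(study_id): one intercept shift per patient
patient_model <- lm(teer_ohmcm2 ~ location + factor(study_id), data = toy_teer)

length(coef(numeric_id_model))  # 4: intercept, 2 segments, 1 slope
length(coef(patient_model))     # 5: intercept, 2 segments, 2 patient shifts
```

Patient-level variables such as `phtn` or `he` are constant within a patient, so they cannot be estimated alongside per-patient fixed effects. A mixed model with a random intercept per patient is the usual way to keep both, but it is not shown here.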