Creating the Sample

library(nsqipPEHR)
library(magrittr)

Defining the sample population

Question

Do 30-day complication rates significantly differ between paraesophageal hernia repair (PEHR), Heller myotomy, and/or fundoplication patients who are discharged on post-op day (POD) #1 vs. POD #0?

\(H_0\): There is no difference in the composite outcome of 30-day morbidity between patients undergoing the aforementioned procedures discharged on POD #1 and POD #0.

Sampling population

The population to whom we would like our results to be generalizable:

Otherwise healthy patients undergoing routine elective paraesophageal hernia repair, Heller myotomy, or fundoplication.

Inclusion criteria

Primary procedures

Build SQL Query

Selecting the correct CPT codes

First, I select the qualifying CPT codes. We want only patients undergoing the procedures listed above. A patient with additional CPT codes is excluded unless every additional code is itself one of the four qualifying codes.
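The all-or-nothing rule above can be sketched in base R. This is only an illustration on toy data, not the SQL used for the real query: the table `cpt_long`, its column names, and the codes `A1` to `A4` are hypothetical placeholders, not real CPT codes.

```r
# Hypothetical long-format table: one row per (case, CPT code) pair.
cpt_long <- data.frame(
  caseid = c(1, 1, 2, 3, 3, 4),
  cpt    = c("A1", "A2", "A1", "A3", "Z9", "Z9"),
  stringsAsFactors = FALSE
)

# Placeholder stand-ins for the four qualifying codes (assumption).
allowed <- c("A1", "A2", "A3", "A4")

# A case qualifies only if every one of its codes is in the allowed set.
all_allowed  <- tapply(cpt_long$cpt %in% allowed, cpt_long$caseid, all)
eligible_ids <- as.numeric(names(all_allowed)[all_allowed])

print(eligible_ids)
```

Case 3 carries one qualifying code and one non-qualifying code, so it is dropped along with case 4.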

Join to main table

Because this analysis does not require any targeted procedural data, I simply join the selected cases to the main table.
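As a rough sketch of what that join does, here is an inner join on toy data using base R's `merge()`. The data frames `eligible` and `main` and their contents are hypothetical; the actual join is performed against the database.

```r
# Hypothetical main table and list of eligible case IDs.
main     <- data.frame(caseid = c(1, 2, 3, 4), age = c(54, 61, 47, 70))
eligible <- data.frame(caseid = c(1, 2))

# merge() performs an inner join by default: only cases present in both
# tables survive, so ineligible cases fall out of the main table.
joined <- merge(eligible, main, by = "caseid")

print(joined)
```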

Enacting inclusion criteria

In order to produce our sample population, I will enforce the inclusion criteria below:

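# Keep only cases meeting every criterion; %<>% (magrittr) assigns the result back to df
df %<>%
  dplyr::filter(tothlos < 2,                  # discharged on POD #0 or POD #1
                asaclas < 4,
                fnstatus2 == "Independent",
                wndclas < 3,
                electsurg,                    # elective cases only
                !emergncy,                    # no emergency cases
                !ventilat)                    # not ventilator dependent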
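One detail worth keeping in mind: `dplyr::filter()` combines comma-separated conditions with a logical AND and drops rows where the combined condition is `NA`, so cases with missing values in a criterion can silently leave the sample. The base R sketch below mimics that behavior on toy data (the values are invented; only the column names come from the data set).

```r
# Toy data: the last row has a missing length of stay.
toy <- data.frame(
  tothlos  = c(0, 1, 3, 1, NA),
  emergncy = c(FALSE, FALSE, FALSE, TRUE, FALSE)
)

# Combine criteria with AND; the last element is NA, not FALSE.
keep <- toy$tothlos < 2 & !toy$emergncy

# which() keeps only TRUE positions, so NA rows are dropped, as filter() does.
included <- toy[which(keep), ]

n_missing <- sum(is.na(keep))  # rows excluded only because a criterion was NA
print(included)
```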

Creating predictor variables

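A few predictor variables need recoding before they are usable: BMI is derived from height and weight, and missing values in insulin, when_dyspnea, and type_prsepis are replaced with an explicit FALSE or "None".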

# Derive BMI and make missing values explicit; %<>% assigns the result back to df
df %<>%
  dplyr::mutate(bmi = nsqipr::bmi(height, weight),
                insulin = tidyr::replace_na(insulin, FALSE),
                when_dyspnea = tidyr::replace_na(when_dyspnea, "None"),
                type_prsepis = tidyr::replace_na(type_prsepis, "None"))
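For readers without the helper package, the two transformations can be approximated in base R. This is a sketch under stated assumptions, not the implementation of `nsqipr::bmi()`: `bmi_imperial()` and `fill_na()` are hypothetical helpers, and the BMI formula assumes height in inches and weight in pounds.

```r
# Hypothetical helper: BMI from imperial units (assumes inches and pounds).
bmi_imperial <- function(height_in, weight_lb) {
  703 * weight_lb / height_in^2
}

# Hypothetical helper: replace missing values with an explicit default,
# similar in spirit to tidyr::replace_na() for a single vector.
fill_na <- function(x, value) {
  x[is.na(x)] <- value
  x
}

insulin      <- fill_na(c(TRUE, NA, FALSE), FALSE)
when_dyspnea <- fill_na(c("At rest", NA), "None")
bmi          <- bmi_imperial(70, 175)
```

Recoding missing values as FALSE or "None" treats an unrecorded value as the absence of the condition, which is itself an assumption about how the fields were collected.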

Creating outcome variables

Because our primary outcome is a composite 30-day morbidity rate, we need a principled way to combine individual complications into a single measure. The simplest option is probably the Clavien-Dindo classification. Each patient is categorized as “no morbidity”, “minor morbidity”, “major morbidity”, or “death”, and is assigned the highest category for which their outcomes qualify. In the code below, grades 1 and 2 count as minor, grades 3 and 4 as major, and grade 5 as death.

# Grade each patient, then collapse grades into None / Minor / Major / Death
df %<>%
  dplyr::mutate(dindo = nsqipr::dindo(.),
                dindo_class = forcats::fct_recode(factor(dindo), `None` = "0", `Minor` = "1", 
                                                  `Minor` = "2", `Major` = "3", `Major` = "4", 
                                                  `Death` = "5"))
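The "highest qualifying category" rule can be sketched in base R as a row-wise maximum over graded complications. This is only an illustration and not how `nsqipr::dindo()` is implemented: the toy data, the `died` column, and the grade assigned to each complication are assumptions made for the example.

```r
# Toy complication flags, one row per patient (`died` is a hypothetical column).
events <- data.frame(
  supinfec = c(FALSE, TRUE,  TRUE,  FALSE),
  reintub  = c(FALSE, FALSE, TRUE,  FALSE),
  died     = c(FALSE, FALSE, FALSE, TRUE)
)

# Illustrative grade per complication (assumed mapping, not the package's).
grades <- c(supinfec = 1, reintub = 4, died = 5)

# Multiply each flag by its grade, then take the row-wise maximum so each
# patient receives the highest grade they qualify for (0 = no complication).
graded <- Map(function(flag, g) flag * g, events, grades[names(events)])
dindo  <- do.call(pmax, unname(graded))

# Collapse grades into the four reporting classes: 0, 1-2, 3-4, 5.
dindo_class <- cut(dindo, breaks = c(-Inf, 0, 2, 4, Inf),
                   labels = c("None", "Minor", "Major", "Death"))
```

The third toy patient has both a grade 1 and a grade 4 event and is therefore classed as major.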

Selecting complete cases

Because matching requires complete data, we keep only the variables we need and then retain only the records with no missing values in any of them.

# Keep only the analysis variables, then drop any record with a missing value
df %<>%
  dplyr::select(caseid:sex, inout:age, electsurg, anesthes, surgspec:smoke, dyspnea, 
                when_dyspnea, fnstatus2:ascites, hxchf, hypermed, renafail, dialysis,
                discancr:transfus, prsepis, type_prsepis, emergncy:asaclas, optime, 
                tothlos, supinfec, wndinfd, orgspcssi, dehis, oupneumo, reintub, pulembol,
                failwean, renainsf, oprenafl, urninfec, cnscva, cdarrest, cdmi, othbleed, 
                othdvt, othsysep, othseshock, othcdiff, returnor, dindo, dindo_class) %>%
  dplyr::filter(complete.cases(.))
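Because complete-case selection can discard many records, it may be worth counting how many are lost. The sketch below uses base R's `complete.cases()` on invented toy data; only the column names are borrowed from the data set.

```r
# Toy data with one missing value in each of two records.
toy <- data.frame(
  age    = c(50, NA, 63),
  optime = c(120, 95, NA),
  sex    = c("male", "female", "female")
)

# complete.cases() returns TRUE for rows with no missing values in any column.
keep      <- complete.cases(toy)
complete  <- toy[keep, ]
n_dropped <- sum(!keep)

print(n_dropped)
```

Selecting variables before filtering matters: a missing value in a column that is not needed for the analysis would otherwise remove an otherwise usable record.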

This leaves us with 10,127 patients in our analysis.