Exploratory Data Analysis

Clinical question

Do patients with a BMI of 25-30 have better outcomes after outpatient sigmoid colectomy for diverticular disease than do patients with a BMI of 35-40?

Patient population

SQL Query

My queries will pull from NSQIP 2005-2019.

In order to generate the main dataframe, I have constructed a query to pull only those patients that have undergone one of the above four procedures without any concomitant procedures and only with a single CPT code (a few records had multiple of these four CPT codes):

select *
from 
(
    select caseid, trim(col, 'cpt_') cpt 
    from
    (
        select caseid, 
        cpt_44145, cpt_44146, cpt_44207, cpt_44208
        from
        (
            select caseid,
                countif(cpt = "44145") cpt_44145,
                countif(cpt = "44146") cpt_44146,
                countif(cpt = "44207") cpt_44207,
                countif(cpt = "44208") cpt_44208,
            from nsqip.puf_cpt
            group by caseid
            having not logical_or(cpt not in ("44145", "44146", "44207", "44208"))
        )
        where (cpt_44145 + cpt_44146 + cpt_44207 + cpt_44208) = 1
    )
    unpivot (cpt for col in (cpt_44145, cpt_44146, cpt_44207, cpt_44208))
    where cpt !=0    
) cpt

join (
    select *
    from nsqip.puf
) puf
using(caseid)

join (
    select *
    from nsqip.col
) col
using(caseid)

left join (
    select distinct(caseid), TRUE as readmission from nsqip.puf_readmission
    where readmission is true
) readm
using(caseid)

left join (
    select distinct(caseid), TRUE as reoperation from nsqip.puf_reoperation
    where reoperation is true
) reop
using(caseid)

I will build a companion dataframe for readmissions:

select *
from 
(
    select caseid, trim(col, 'cpt_') cpt 
    from
    (
        select caseid, 
        cpt_44145, cpt_44146, cpt_44207, cpt_44208
        from
        (
            select caseid,
                countif(cpt = "44145") cpt_44145,
                countif(cpt = "44146") cpt_44146,
                countif(cpt = "44207") cpt_44207,
                countif(cpt = "44208") cpt_44208,
            from nsqip.puf_cpt
            group by caseid
            having not logical_or(cpt not in ("44145", "44146", "44207", "44208"))
        )
        where (cpt_44145 + cpt_44146 + cpt_44207 + cpt_44208) = 1
    )
    unpivot (cpt for col in (cpt_44145, cpt_44146, cpt_44207, cpt_44208))
    where cpt !=0    
) cpt

join nsqip.puf_readmission
using(caseid)
where readmission is true

and for reoperations:

select *
from
(
    select caseid, trim(col, 'cpt_') cpt 
    from
    (
        select caseid, 
        cpt_44145, cpt_44146, cpt_44207, cpt_44208
        from
        (
            select caseid,
                countif(cpt = "44145") cpt_44145,
                countif(cpt = "44146") cpt_44146,
                countif(cpt = "44207") cpt_44207,
                countif(cpt = "44208") cpt_44208,
            from nsqip.puf_cpt
            group by caseid
            having not logical_or(cpt not in ("44145", "44146", "44207", "44208"))
        )
        where (cpt_44145 + cpt_44146 + cpt_44207 + cpt_44208) = 1
    )
    unpivot (cpt for col in (cpt_44145, cpt_44146, cpt_44207, cpt_44208))
    where cpt !=0    
) cpt

join nsqip.puf_reoperation
using(caseid)
where reoperation is true

Excluding patients

We start with a total of 17217 patients.

cpt n
44145 4087
44146 1074
44207 10903
44208 1153

Indication

Now we can start culling away patients via exlusion criteria. The first will be indication - we only want patients with an indication of chronic diverticular disease. I will rely strictly on the field col_indication. Additional fields that I may consider using would be col_indication_icd, podiag, or podiag10. But these are ICD codes; I think using the indication field from the colorectal subset is cleaner.

col_indication n
Colon cancer 8076
Chronic diverticular disease 4062
Other 1704
Acute diverticulitis 1497
Non-malignant polyp 706
Colon cancer w/ obstruction 519
Volvulus 386
Chrohn’s disease 113
Bleeding 78
Ulcerative colitis 55
Enterocolitis 11
NA 10

A total of 4062 patients have chronic diverticular disease. This exclusion criteria eliminates 13155 patients.

Approach

The next exclusion criteria will be non-laparoscopic or robotic approach. This will rely on the variable col_approach. We can easily eliminate all records where col_approach is not equal to “Laparoscopic” or “Robotic”.

col_approach n
Laparoscopic 2704
Robotic 761
Open 552
SILS 21
Hybrid 12
Other MIS 6
Endoscopic 4
NOTES 2

A total of 3465 patients undergo laparoscopic or robotic surgery. This exclusion criteria eliminates 597 patients.

Open assist

The next exclusion criteria will remove all patients with a documented open assist technique because there is no ability to document a conversion to open in NSQIP when open-assist is selected. This will use the col_open_assist variable.

col_open_assist n
FALSE 2146
TRUE 1319

A total of 2146 patients do not have an open assist documented. This exclusion criteria eliminates 1319 patients.

Discordance of approach

The next exclusion criteria will remove patients with some kind of discordance in their recorded approach.

We are including CPT codes for open approaches with the assumption that if the col_approach is laparoscopic, this should indicate a conversion to open. This assumption will be easy to validate with the col_unplanned_conversion variable`.

cpt col_unplanned_conversion n
44145 TRUE 193
44145 FALSE 55
44146 TRUE 18
44146 FALSE 3

A total of 211 patients have an open CPT with a laparoscopic or robotic approach and a conversion to open documented. A discordant documented approach eliminates 58 patients.

Elective surgery

The next exclusion criteria will be non-elective surgery. There are multiple fields that contain relevant data - electsurg, emergncy, and col_emergent. I will utilize all of these fields to exclude patients. I will require that electsurg is TRUE while emergncy is FALSE. col_emergent is a factor - I will require that it is equal to NA. Lets take a look at a breakdown of all three variables:

electsurg n
FALSE 101
TRUE 1987
emergncy n
FALSE 2077
TRUE 11
col_emergent n
Obstruction 2
Other 2
Perforation 6
NA 2078

It helps to look at all combinations of these three variables to see why we need to consider all three in our determination of what is non-elective surgery.

electsurg emergncy col_emergent n
TRUE FALSE FALSE 1986
FALSE FALSE FALSE 91
FALSE TRUE TRUE 9
FALSE TRUE FALSE 1
TRUE TRUE TRUE 1

A total of 1986 patients undergo elective surgery. This exclusion criteria eliminates 102 patients.

Outpatient

The next exclusion criteria will be inpatient surgery. Only a single variable is relevant to this criteria - inout.

inout n
FALSE 7
TRUE 1979

As the table makes clear, only 7 patients undergo outpatient surgery. Therefore, we can not use this exclusion critiera.

Admitted from home

The next exclusion criteria will be not admitted from home. Only a single variable is relevant to this criteria - transt.

transt n
Admitted from home 1975
Other 5
Acute care hospital 4
Chronic care facility 1
Outside emergency department 1

A total of 1975 patients are admitted from home. This exclusion criteria eliminates 11 patients.

General Surgery

The next exclusion criteria will be surgical specialty other than general surgery. Only a single variable is relevant - surgspec.

surgspec n
General surgery 1960
Vascular 14
Orthopedics 1

A total of 1960 patients are operated on by a general surgeon. This exclusion criteria eliminates 15 patients.

Co-morbidities

There are various co-morbidities and pre-operative conditions that we will also use to exclude patients. These are:

#>   diabetes        ventilat        ascites          smoke        
#>  Mode :logical   Mode :logical   Mode :logical   Mode :logical  
#>  FALSE:1755      FALSE:1960      FALSE:1960      FALSE:1638     
#>  TRUE :205                                       TRUE :322      
#>     etoh           hxchf            hxmi           hxcopd       
#>  Mode :logical   Mode :logical   Mode :logical   Mode :logical  
#>  FALSE:1         FALSE:1958      FALSE:1         FALSE:1913     
#>  NA's :1959      TRUE :2         NA's :1959      TRUE :47       
#>    esovar         hypermed        discancr         wndinf        type_prsepis 
#>  Mode :logical   Mode :logical   Mode :logical   Mode :logical   Sepsis:   2  
#>  FALSE:1         FALSE:1153      FALSE:1955      FALSE:1946      SIRS  :   9  
#>  NA's :1959      TRUE :807       TRUE :5         TRUE :14        NA's  :1949  
#>                fnstatus2   
#>  Independent        :1929  
#>  Partially dependent:   7  
#>  NA's               :  24

A total of 860 patients do not have any of these excluding co-morbid conditions. This exclusion criteria eliminates 1100 patients.

#>   diabetes        ventilat        ascites          smoke           etoh        
#>  Mode :logical   Mode :logical   Mode :logical   Mode :logical   Mode:logical  
#>  FALSE:860       FALSE:860       FALSE:860       FALSE:860       NA's:860      
#>    hxchf           hxmi           hxcopd         esovar         hypermed      
#>  Mode :logical   Mode:logical   Mode :logical   Mode:logical   Mode :logical  
#>  FALSE:860       NA's:860       FALSE:860       NA's:860       FALSE:860      
#>   discancr         wndinf        type_prsepis       fnstatus2  
#>  Mode :logical   Mode :logical   NA's:860     Independent:860  
#>  FALSE:860       FALSE:860

Incomplete data

We need to exclude patients that are missing required data:

Height missing n
FALSE 856
TRUE 4
Weight missing n
FALSE 857
TRUE 3

A total of 856 patients do not have incomplete data. This exclusion criteria eliminates 4 patients.

BMI

We must now eliminate patients with BMIs outside of the desired ranges (25-30, 35-40). This will use the variable bmi.

In range n
FALSE 487
TRUE 369

A total of 369 patients are within the desired BMI range. This exclusion criteria eliminates 487 patients.

Remaining patients

We can now look at how many patients we have in each arm of the study:

group n
BMI 25-30 303
BMI 35-40 66
group cpt n
BMI 25-30 44145 14
BMI 25-30 44207 286
BMI 25-30 44208 3
BMI 35-40 44145 11
BMI 35-40 44146 2
BMI 35-40 44207 53