Removing patients with inadequate data
There are some variables for which we will not accept a missing value: age, height, weight, and pan_approach.
Removing patients with a missing value in any of these leaves 6217 patients.
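A minimal sketch of this step on a toy data frame, assuming dplyr is loaded. The toy_df data and its values are made up for illustration; only the four column names come from the text above.

```r
library(dplyr)

# Hypothetical toy data; only the column names come from the text
toy_df <- data.frame(
  caseid       = 1:5,
  age          = c(54, NA, 61, 70, 48),
  height       = c(170, 165, NA, 180, 175),
  weight       = c(80, 72, 90, 85, NA),
  pan_approach = c("Open", "Robotic", "Open", NA, "Laparoscopic")
)

# Keep only rows where none of the required variables is missing
toy_clean <- toy_df %>%
  filter(
    !is.na(age),
    !is.na(height),
    !is.na(weight),
    !is.na(pan_approach)
  )

# Number of patients remaining after the filter
nrow(toy_clean)
```

Only the first toy row survives, because every other row is missing exactly one of the four required fields.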
Applying inclusion criteria
We can now apply the inclusion criteria above. First, I will create a helper function called or_na. For a few variables, we assume that a missing value is equivalent to the value that satisfies the inclusion criterion. This assumption preserves our sample size but is a potential point of criticism, so be aware of which variables are subject to it. These variables are:
- Surgical specialty is general surgery (surgspec)
- Pre-operative ventilator use (ventilat)
- Disseminated cancer (discancr)
- Pre-operative wound infection (wndinf)
- Pre-operative sepsis (prsepis)
- Elective surgery (electsurg)
- Emergent surgery (emergncy)
- M-stage (pan_mstage)
- Reconstruction (pan_reconstruction)
- Gastroduodenostomy (pan_gastduo)
- Open assist (pan_open_assist)
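The definition of or_na is not reproduced in this section. Judging from how it is called in the filter (a variable, followed by a condition on that variable), one plausible minimal sketch, offered as an assumption rather than the author's actual code, is:

```r
# Assumed definition: keep a row if the variable is missing OR the condition holds
or_na <- function(x, condition) {
  is.na(x) | condition
}

# Toy check with a logical variable that includes a missing value
ventilat <- c(TRUE, FALSE, NA)
or_na(ventilat, !ventilat)
#> [1] FALSE  TRUE  TRUE
```

Without such a wrapper, filter() drops any row whose condition evaluates to NA, so a patient with a missing value for one of these variables would be excluded rather than retained.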
The criteria are applied in a single filter() call, with or_na wrapping each of the variables listed above:
unmatch_df <- unmatch_df %>%
  filter(
    pan_approach %in% c("Laparoscopic", "Robotic", "Open"), # Only laparoscopic, robotic, or open approaches
    inout, # Inpatients
    transt == "Admitted from home", # Admitted from home
    asaclas < 5, # ASA class < 5
    wndclas < 4, # Wound class < 4
    optime > 20, # Operative time > 20 minutes
    or_na(surgspec, surgspec == "General surgery"), # General surgery
    or_na(electsurg, electsurg), # Elective surgery
    or_na(emergncy, !emergncy), # Non-emergent surgery
    or_na(ventilat, !ventilat), # No pre-operative ventilator
    or_na(discancr, !discancr), # No disseminated cancer
    or_na(wndinf, !wndinf), # No pre-operative wound infection
    or_na(prsepis, !prsepis), # No pre-operative sepsis
    or_na(pan_gastduo, pan_gastduo == "Not performed"), # No gastroduodenostomy
    or_na(pan_reconstruction, pan_reconstruction == "Not performed"), # No reconstruction
    or_na(pan_mstage, pan_mstage == "M0/Mx"), # No metastatic disease
    or_na(pan_open_assist, !pan_open_assist) # No open assist
  )
After applying these inclusion criteria, we are left with 4414 patients.
Selecting for only complete cases
Matching requires that no record has a missing field. First, we select only the variables needed for the match or the post-match analysis, and then keep only the complete cases:
unmatch_df <- unmatch_df %>%
  select(
    caseid, group, sex, age, bmi, diabetes, smoke, fnstatus2, hxcopd, ascites,
    hxchf, hypermed, dialysis, steroid, wtloss, bleeddis, asaclas, pan_chemo,
    pan_radio, pan_drains, pan_benign_histologic, pan_tstage, pan_nstage,
    supinfec, wndinfd, orgspcssi, othdvt, pulembol, reintub, failwean, renainsf,
    oprenafl, pan_fistula, dindo, dindo_class, dindo_major, dindo_death, log_amy,
    pod, reoperation, readmission, pan_delgastric, tothlos, wndclas, pan_spleen
  ) %>%
  filter(complete.cases(.))
After selecting for complete cases, we are left with 3940 patients.
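Because complete-case filtering drops any row with a missing field, it can help to see which variables account for the loss before filtering. A small base R sketch on made-up data follows. The column names are borrowed from the select() call above, and the values are hypothetical.

```r
# Hypothetical toy data; values are invented for illustration
toy_df <- data.frame(
  age     = c(54, 61, NA, 70),
  bmi     = c(27.1, NA, 31.0, NA),
  log_amy = c(6.2, 5.9, 7.4, 4.8)
)

# Number of missing values per column, largest first
missing_per_column <- sort(colSums(is.na(toy_df)), decreasing = TRUE)
missing_per_column

# Number of rows that would survive complete-case filtering
n_complete <- sum(complete.cases(toy_df))
n_complete
```

In this toy example bmi is the main source of dropped rows, which is the kind of pattern worth checking for on the real data before accepting the reduced sample.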
A summary of our pre-match data looks like:
#>  caseid           group          sex            age
#>  Min.   :3113861  Mode :logical  Mode :logical  Min.   :18.00
#>  1st Qu.:4062299  FALSE:1962     FALSE:2217     1st Qu.:53.00
#>  Median :6436328  TRUE :1978     TRUE :1723     Median :64.00
#>  Mean   :6119729                                Mean   :61.57
#>  3rd Qu.:7791588                                3rd Qu.:72.00
#>  Max.   :9284181                                Max.   :90.00
#>  bmi            diabetes       smoke          fnstatus2
#>  Min.   :13.62  Mode :logical  Mode :logical  Independent:3919
#>  1st Qu.:24.37  FALSE:2947     FALSE:3263     Dependent  :  21
#>  Median :28.03  TRUE :993      TRUE :677
#>  Mean   :28.84
#>  3rd Qu.:32.28
#>  Max.   :66.75
#>  hxcopd         ascites        hxchf          hypermed
#>  Mode :logical  Mode :logical  Mode :logical  Mode :logical
#>  FALSE:3787     FALSE:3936     FALSE:3926     FALSE:1893
#>  TRUE :153      TRUE :4        TRUE :14       TRUE :2047
#>
#>
#>
#>  dialysis       steroid        wtloss         bleeddis       asaclas
#>  Mode :logical  Mode :logical  Mode :logical  Mode :logical  1:  45
#>  FALSE:3919     FALSE:3807     FALSE:3685     FALSE:3833     2:1230
#>  TRUE :21       TRUE :133      TRUE :255      TRUE :107      3:2505
#>                                                              4: 160
#>
#>
#>  pan_chemo      pan_radio      pan_drains     pan_benign_histologic
#>  Mode :logical  Mode :logical  Mode:logical   Mode :logical
#>  FALSE:3485     FALSE:3708     TRUE:3940      FALSE:2279
#>  TRUE :455      TRUE :232                     TRUE :1661
#>
#>
#>
#>  pan_tstage     pan_nstage     supinfec       wndinfd
#>  Mode :logical  Mode :logical  Mode :logical  Mode :logical
#>  FALSE:1793     FALSE:3183     FALSE:3838     FALSE:3914
#>  TRUE :2147     TRUE :757      TRUE :102      TRUE :26
#>
#>
#>
#>  orgspcssi      othdvt         pulembol       reintub
#>  Mode :logical  Mode :logical  Mode :logical  Mode :logical
#>  FALSE:3513     FALSE:3845     FALSE:3878     FALSE:3876
#>  TRUE :427      TRUE :95       TRUE :62       TRUE :64
#>
#>
#>
#>  failwean       renainsf       oprenafl       pan_fistula
#>  Mode :logical  Mode :logical  Mode :logical  Mode :logical
#>  FALSE:3883     FALSE:3931     FALSE:3925     FALSE:3468
#>  TRUE :57       TRUE :9        TRUE :15       TRUE :472
#>
#>
#>
#>  dindo           dindo_class  dindo_major    dindo_death    log_amy
#>  Min.   :0.0000  None :3553   Mode :logical  Mode :logical  Min.   : 0.000
#>  1st Qu.:0.0000  Major: 363   FALSE:3577     FALSE:3916     1st Qu.: 4.443
#>  Median :0.0000  Death:  24   TRUE :363      TRUE :24       Median : 6.377
#>  Mean   :0.8056                                             Mean   : 6.347
#>  3rd Qu.:2.0000                                             3rd Qu.: 8.154
#>  Max.   :5.0000                                             Max.   :11.513
#>  pod            reoperation    readmission    pan_delgastric
#>  Min.   : 0.00  Mode :logical  Mode :logical  Mode :logical
#>  1st Qu.: 1.00  FALSE:3825     FALSE:3236     FALSE:3788
#>  Median : 3.00  TRUE :115      TRUE :704      TRUE :152
#>  Mean   : 3.92
#>  3rd Qu.: 4.00
#>  Max.   :30.00
#>  tothlos         wndclas  pan_spleen
#>  Min.   : 0.000  1: 455   Mode :logical
#>  1st Qu.: 4.000  2:3208   FALSE:3742
#>  Median : 5.000  3: 277   TRUE :198
#>  Mean   : 6.457
#>  3rd Qu.: 7.000
#>  Max.   :81.000