Updated cohort A and cohort B for final manuscript

This reads in the NSQIP data for all shunt cases and all peds cases from 2019. Next, we find all the unique cases of shunt placement and merge them with the record data. Then, we create our final working dataframe, which contains only the factors of interest.


Everything above this line is executed locally on the desktop. The code below uses the uploaded CSV that was created on the desktop.

Pull out the columns of interest.

COHORT A

NSQIP Cohort A:
1) patient received first permanent shunt
2) patient has a recorded hydrocephalus etiology
3) patient has a recorded entry for premature birth (excl UNKNOWN)
4) patient did not die in first 30 days

1676 meet the criteria

cohortA <- df19[which(df19$PED_SHNT_1STPERM == "Yes"),]
cohortA <- cohortA[which(!cohortA$PREM_BIRTH == "Unknown"),]
cohortA <- cohortA[which(!cohortA$DEATH30YN == "Yes"),]

df <- cohortA
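
The three filters applied in the code above (first permanent shunt, known prematurity status, alive at 30 days) can also be written as a single dplyr filter. The sketch below runs on a small invented data frame, not on df19; the CASEID column and all values are assumptions made for illustration.

```r
library(dplyr)

# Toy stand-in for df19 (invented values; CASEID is a hypothetical identifier)
toy <- data.frame(
  CASEID           = 1:5,
  PED_SHNT_1STPERM = c("Yes", "Yes", "No", "Yes", "Yes"),
  PREM_BIRTH       = c("No", "Unknown", "No",
                       "25-26 completed weeks gestation", "No"),
  DEATH30YN        = c("No", "No", "No", "No", "Yes"),
  stringsAsFactors = FALSE
)

toy_cohort <- toy %>%
  filter(
    PED_SHNT_1STPERM == "Yes",  # received first permanent shunt
    PREM_BIRTH != "Unknown",    # prematurity status is recorded
    DEATH30YN != "Yes"          # did not die in the first 30 days
  )

toy_cohort$CASEID
```

Like which(), filter() drops rows where a condition evaluates to NA, so the two forms should select the same rows.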

Here, we replace a gestational age of “-99” with “39” if the baby was not born premature. When we initially did this analysis, the ages of patients with a reported GESTATIONALAGE_BIRTH were strongly skewed premature, because this field was generally only reported for patients who were born premature. To remedy this, we imputed the missing ages: we found the median gestational age among patients with a recorded full-term birth and assigned that age to every birth marked full term in PREM_BIRTH that had no recorded age. The median age was found to be 39 weeks.

#first find the median gestational age of existing full-term births, used to impute missing values
median(df$GESTATIONALAGE_BIRTH[df$GESTATIONALAGE_BIRTH>36])
## [1] 39
#because the median age of a full-term birth in the existing data is 39, we assign 39 to all full-term births with no listed GA at birth
df$GESTATIONALAGE_BIRTH[df$PREM_BIRTH=="No" & df$GESTATIONALAGE_BIRTH==-99] <- 39
df$GESTATIONALAGE_BIRTH[df$GESTATIONALAGE_BIRTH==-99] <- NA
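
Once the remaining -99 values are NA, a condition such as `!x == -99` evaluates to NA for those rows, and both which() and dplyr::filter() drop rows whose condition is NA. The later -99 filters on GESTATIONALAGE_BIRTH therefore act as missing-value filters. A small sketch on invented values (not NSQIP data):

```r
# Toy gestational ages after recoding; NA marks a missing value (invented data)
ga_birth <- c(38, 39, NA, 31)

cond <- !ga_birth == -99               # TRUE TRUE NA TRUE
kept_which <- which(cond)              # which() skips the NA position
kept_isna  <- which(!is.na(ga_birth))  # explicit equivalent

kept_which
kept_isna
```

Writing the condition as `!is.na(GESTATIONALAGE_BIRTH)` states the intent directly and should select the same rows.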

Add category for chronological age in weeks (CHRONAGE_SURGERY) by dividing age at surgery by 7 and rounding to the nearest whole number.

Calculate gestational age at surgery by adding gestational age at birth plus chronological age at surgery.

df$CHRONAGE_SURGERY <- round(df$AGE_DAYS/7,0)
df$GESTATIONALAGE_SURGERY2 <-  df$GESTATIONALAGE_BIRTH+df$CHRONAGE_SURGERY
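
A worked example of the two derived ages, using invented values rather than NSQIP records:

```r
# Toy inputs (assumed values for illustration)
age_days <- c(10, 112, 200)   # chronological age at surgery, in days
ga_birth <- c(39, 33, 25)     # gestational age at birth, in weeks

chron_weeks <- round(age_days / 7, 0)  # 1, 16, 29
ga_surgery  <- ga_birth + chron_weeks  # 40, 49, 54

data.frame(age_days, ga_birth, chron_weeks, ga_surgery)
```

Note that a missing gestational age at birth (NA) carries through to the gestational age at surgery.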

-Hispanic ethnicity
-All races
-All etiologies (everything that is not a UNC category is grouped as other)
-UNC categories are: Congenital hydrocephalus; Intraventricular hemorrhage (IVH) of prematurity without other cause; Neoplastic; Post infectious; Post traumatic; Spina bifida with Chiari malformation / myelomeningocele; Syndromic, not otherwise specified

-ONLY include prior procedure in the preemie IVH table
-Premature birth is YES/NO (less than 37 weeks)

descriptive stats

#select the descriptive columns by position, then collapse categories for the summary table
df[,c(3:5,9,13,14,8)] %>%
  #collapse the gestational-week bands into a single premature Yes/No flag
  mutate(PREM_BIRTH = ifelse(PREM_BIRTH=="No", "No", "Yes")) %>%
  #group the less common etiologies under one label
  mutate(PED_SHNT_HC_ETIOL = ifelse(PED_SHNT_HC_ETIOL %in% c("Vascular anomaly", "Syndromic, not otherwise specified", "Intracranial cyst", "Post hemorrhagic from other vascular lesion", "Other - insufficient information to classify type"), "Syndromic, not otherwise specified", PED_SHNT_HC_ETIOL)) %>%
  tbl_summary() 
Characteristic N = 1,676¹
SEX
    Female 748 (45%)
    Male 928 (55%)
RACE
    American Indian or Alaska Native 8 (0.5%)
    Asian 43 (2.6%)
    Black or African American 344 (21%)
    Native Hawaiian or Other Pacific Islander 4 (0.2%)
    Unknown/Not Reported 299 (18%)
    White 978 (58%)
ETHNICITY_HISPANIC
    No 1,222 (73%)
    NULL 166 (9.9%)
    Yes 288 (17%)
GESTATIONALAGE_BIRTH 38.0 (32.0, 39.0)
    Unknown 80
PED_SHNT_HC_ETIOL
    Congenital hydrocephalus 520 (31%)
    Intraventricular hemorrhage (IVH) of prematurity without other cause 372 (22%)
    Neoplastic 183 (11%)
    Post infectious 73 (4.4%)
    Post traumatic 39 (2.3%)
    Spina bifida with Chiari malformation / myelomeningocele 117 (7.0%)
    Syndromic, not otherwise specified 372 (22%)
PED_SHNT_PRIOR_HCPROC 591 (35%)
PREM_BIRTH 730 (44%)
¹ n (%); Median (IQR)
df %>% 
  filter(!GESTATIONALAGE_BIRTH == -99) %>%
  summarise(GA_birth = median(GESTATIONALAGE_BIRTH), CA_surgery = median(CHRONAGE_SURGERY), GA_surgery = median(GESTATIONALAGE_SURGERY2))
##   GA_birth CA_surgery GA_surgery
## 1       38         16         49
#most frequent etiologies
df %>% 
  group_by(PED_SHNT_HC_ETIOL) %>% 
  dplyr::summarise(n = n()) %>%
  arrange(desc(n))
## # A tibble: 11 × 2
##    PED_SHNT_HC_ETIOL                                                        n
##    <chr>                                                                <int>
##  1 Congenital hydrocephalus                                               520
##  2 Intraventricular hemorrhage (IVH) of prematurity without other cause   372
##  3 Neoplastic                                                             183
##  4 Other - insufficient information to classify type                      149
##  5 Spina bifida with Chiari malformation / myelomeningocele               117
##  6 Intracranial cyst                                                      109
##  7 Post infectious                                                         73
##  8 Post hemorrhagic from other vascular lesion                             58
##  9 Post traumatic                                                          39
## 10 Syndromic, not otherwise specified                                      37
## 11 Vascular anomaly                                                        19
520/nrow(df)*100
## [1] 31.02625
372/nrow(df)*100
## [1] 22.1957
183/nrow(df)*100
## [1] 10.91885
117/nrow(df)*100 #spina bifida
## [1] 6.980907

race breakdown for cohort A

df %>% 
  group_by(RACE) %>% 
  dplyr::summarise(n = n()) %>%
  arrange(desc(n))%>%
  mutate(percentage = n / sum(n)*100)
## # A tibble: 6 × 3
##   RACE                                          n percentage
##   <chr>                                     <int>      <dbl>
## 1 White                                       978     58.4  
## 2 Black or African American                   344     20.5  
## 3 Unknown/Not Reported                        299     17.8  
## 4 Asian                                        43      2.57 
## 5 American Indian or Alaska Native              8      0.477
## 6 Native Hawaiian or Other Pacific Islander     4      0.239
#premature births
df %>% 
  group_by(PREM_BIRTH) %>% 
  dplyr::summarise(n = n()) %>%
  arrange(desc(n))%>%
  mutate(percentage = n / sum(n)*100)
## # A tibble: 9 × 3
##   PREM_BIRTH                                 n percentage
##   <chr>                                  <int>      <dbl>
## 1 No                                       946      56.4 
## 2 35-36 completed weeks gestation          156       9.31
## 3 25-26 completed weeks gestation          125       7.46
## 4 33-34 completed weeks gestation          103       6.15
## 5 27-28 completed weeks gestation           99       5.91
## 6 31-32 completed weeks gestation           65       3.88
## 7 29-30 completed weeks gestation           64       3.82
## 8 24 completed weeks gestation              62       3.70
## 9 Less than 24 completed weeks gestation    56       3.34
print("Percentage of patients born premature:")
## [1] "Percentage of patients born premature:"
nrow(cohortA[which(!cohortA$PREM_BIRTH == "No"),])/nrow(cohortA)*100
## [1] 43.55609
#shunt failure within 30 days
df %>% 
  filter(!PED_SHNT_INTRE1_DAYS == -99) %>%
  summarise(med_days_to_fail = median(PED_SHNT_INTRE1_DAYS))
##   med_days_to_fail
## 1               17
#median time to failure is 17 days for the 152 cases (9.07%) that failed within 30 days
nrow(cohortA[which(!cohortA$PED_SHNT_INTRE1_DAYS == -99),])
## [1] 152
nrow(cohortA[which(!cohortA$PED_SHNT_INTRE1_DAYS == -99),])/nrow(cohortA)*100
## [1] 9.069212
#Causes of shunt failure
df %>% 
  filter(!PED_SHNT_INTRE1_RSN == "NULL") %>%
  group_by(PED_SHNT_INTRE1_RSN) %>% 
  dplyr::summarise(n = n()) %>%
  arrange(desc(n))
## # A tibble: 7 × 2
##   PED_SHNT_INTRE1_RSN                              n
##   <chr>                                        <int>
## 1 Shunt infection                                 48
## 2 Wound disruption or CSF leak                    36
## 3 Mechanical issues                               21
## 4 Other                                           17
## 5 Proximal catheter obstruction or malposition    15
## 6 Distal catheter obstruction or malposition      11
## 7 Loculated ventricle                              6
#Most common reported causes: Shunt infection (n=48), Wound Disruption or CSF Leak (n=36), and Mechanical issues (n=21). 

48/152*100
## [1] 31.57895
36/152*100
## [1] 23.68421
21/152*100
## [1] 13.81579
#Cases for which we have gestational age at surgery (1596)
df %>% 
  filter(!GESTATIONALAGE_BIRTH == -99) %>%
  dplyr::summarise(n = n()) 
##      n
## 1 1596
#cases that had their surgery BEFORE 40 weeks of gestational age (407)
df_before40 <- df %>% 
  filter(!GESTATIONALAGE_BIRTH == -99) %>%
  filter(GESTATIONALAGE_SURGERY2 < 40) 



#cases that had their surgery AT OR AFTER 40 weeks of gestational age (1189)
df_after40 <- df %>% 
  filter(!GESTATIONALAGE_BIRTH == -99) %>%
  filter(GESTATIONALAGE_SURGERY2 > 39) 

NSQIP Cohort A: Number of patients who had a shunt placed at or after 40 weeks GA: 1189
-Number born prematurely (<37 weeks, i.e. fewer than 259 days): 346. Failure rate in this population: 10.7% (n=37)
-Number born full term (37 weeks or more, i.e. 259+ days): 843. Failure rate in this population: 7.6% (n=64)

#surgery was after 40 weeks GA AND born premature
df_after40  %>% 
  filter(GESTATIONALAGE_BIRTH <37)%>%
  dplyr::summarise(n = n()) 
##     n
## 1 346
after40_prem_fail <- df_after40  %>% 
  filter(GESTATIONALAGE_BIRTH <37)%>%
  filter(!PED_SHNT_INTRE1 == "None")%>%
  dplyr::summarise(n = n()) 

after40_prem_fail/346*100
##          n
## 1 10.69364
#surgery was after 40 weeks GA AND born full term
df_after40  %>% 
  filter(GESTATIONALAGE_BIRTH >36)%>%
  dplyr::summarise(n = n()) 
##     n
## 1 843
after40_full_fail <- df_after40  %>% 
  filter(GESTATIONALAGE_BIRTH >36)%>%
  filter(!PED_SHNT_INTRE1 == "None")%>%
  dplyr::summarise(n = n()) 

after40_full_fail/843*100
##          n
## 1 7.591934

Failures for under/over 40 wks groups

#47 failures in the before40 weeks op group
df_before40 %>% 
  filter(!PED_SHNT_INTRE1_DAYS == -99) %>%
  dplyr::summarise(n = n()) 
##    n
## 1 47
#100 failures in the after40 weeks op group
df_after40 %>% 
  filter(!PED_SHNT_INTRE1_DAYS == -99) %>%
  dplyr::summarise(n = n()) 
##     n
## 1 100
#Before 40 has a 12% failure; after 40 has an 8% failure
47/407*100;100/1189*100
## [1] 11.54791
## [1] 8.410429
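
The rates above are typed in from the printed counts. As a sketch, the same rates can be computed directly from the data frames, so they update if the cohort changes. The example runs on invented toy rows; failure_rate is a hypothetical helper, and it assumes, as the code above does, that PED_SHNT_INTRE1_DAYS == -99 means no reintervention was recorded.

```r
# Hypothetical helper: percentage of rows with a recorded reintervention day
failure_rate <- function(d) {
  mean(d$PED_SHNT_INTRE1_DAYS != -99) * 100
}

# Toy data (invented values, not NSQIP records)
toy <- data.frame(
  GESTATIONALAGE_SURGERY2 = c(36, 38, 39, 41, 45, 52, 60, 44),
  PED_SHNT_INTRE1_DAYS    = c(12, -99, -99, -99, 20, -99, -99, -99)
)

rate_before40 <- failure_rate(toy[toy$GESTATIONALAGE_SURGERY2 < 40, ])   # 1 of 3 rows
rate_after40  <- failure_rate(toy[toy$GESTATIONALAGE_SURGERY2 >= 40, ])  # 1 of 5 rows

c(before40 = rate_before40, after40 = rate_after40)
```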
#Median of 17 days to failure in before40 group
df_before40 %>% 
  filter(!PED_SHNT_INTRE1_DAYS == -99) %>%
  summarise(med_days_to_fail = median(PED_SHNT_INTRE1_DAYS))
##   med_days_to_fail
## 1               17
#Median of 17 days to failure in after40 group

df_after40 %>% 
  filter(!PED_SHNT_INTRE1_DAYS == -99) %>%
  summarise(med_days_to_fail = median(PED_SHNT_INTRE1_DAYS))
##   med_days_to_fail
## 1               17

make new df of failed and non failed cases

#cases that had their shunt FAIL in the first 30 days (n=147)
df_fail<- df %>% 
  filter(!GESTATIONALAGE_BIRTH == -99) %>%
  filter(!PED_SHNT_INTRE1_DAYS == -99) 

#cases that did not have a shunt failure in the first 30 days (n=1449)
df_ok <- df %>% 
  filter(!GESTATIONALAGE_BIRTH == -99) %>%
  filter(PED_SHNT_INTRE1_DAYS == -99) 

Fisher's exact test of whether prematurity predicts shunt failure

#create the prematurity category first, so that the tables below have something to count
df_ok$prem_cat <- ifelse(df_ok$PREM_BIRTH=="No","Not Prem", "Premature")
df_fail$prem_cat <- ifelse(df_fail$PREM_BIRTH=="No","Not Prem", "Premature")
table(df_ok$prem_cat)
table(df_fail$prem_cat)
577/(577+872)
## [1] 0.3982057
74/(74+73)
## [1] 0.5034014
#<40 week surgery: y = 47, n = 360
#>40 week surgery: y = 100, n = 1089
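
The counts above are enough to run the Fisher's exact tests themselves. The sketch below builds the two 2x2 tables by hand and passes them to fisher.test() from base R. It assumes that 74 and 577 are the premature counts in the failure and non-failure groups, and that in the comments above y is the number of failures and n the number of non-failures; the table and variable names are ours.

```r
# Prematurity vs. 30-day shunt failure (counts from the proportions above;
# assumption: 74 and 577 are the premature counts)
prem_tab <- matrix(
  c(74, 73,
    577, 872),
  nrow = 2, byrow = TRUE,
  dimnames = list(outcome = c("Failed", "Did not fail"),
                  birth   = c("Premature", "Not Prem"))
)
prem_test <- fisher.test(prem_tab)

# Surgery timing vs. 30-day shunt failure (counts from the comments above;
# assumption: y = failed, n = did not fail)
timing_tab <- matrix(
  c(47, 360,
    100, 1089),
  nrow = 2, byrow = TRUE,
  dimnames = list(surgery = c("<40 weeks", ">=40 weeks"),
                  outcome = c("Failed", "Did not fail"))
)
timing_test <- fisher.test(timing_tab)

prem_test$p.value
timing_test$p.value
```

Building the tables with table(prem_cat, failed) on a combined data frame would avoid retyping counts; the hand-built matrices are used here only to keep the sketch self-contained.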

Wilcoxon rank-sum tests of age between the failure and non-failure groups.

wilcox.test(df_fail$GESTATIONALAGE_BIRTH,df_ok$GESTATIONALAGE_BIRTH)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  df_fail$GESTATIONALAGE_BIRTH and df_ok$GESTATIONALAGE_BIRTH
## W = 98240, p-value = 0.1136
## alternative hypothesis: true location shift is not equal to 0
wilcox.test(df_fail$GESTATIONALAGE_SURGERY2,df_ok$GESTATIONALAGE_SURGERY2)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  df_fail$GESTATIONALAGE_SURGERY2 and df_ok$GESTATIONALAGE_SURGERY2
## W = 89724, p-value = 0.00162
## alternative hypothesis: true location shift is not equal to 0
wilcox.test(df_fail$CHRONAGE_SURGERY,df_ok$CHRONAGE_SURGERY)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  df_fail$CHRONAGE_SURGERY and df_ok$CHRONAGE_SURGERY
## W = 89136, p-value = 0.001104
## alternative hypothesis: true location shift is not equal to 0

Cohort B

For this analysis we need the patients whose hydrocephalus etiology is Intraventricular Hemorrhage only AND who were Born Prematurely (which I am calling NSQIP Cohort B). To identify this group in the NSQIP dataset: they have “Intraventricular hemorrhage (IVH) of prematurity without other cause” in PED_SHNT_HC_ETIOL AND were born at <37 weeks gestational age, i.e. any premature entry in the PREM_BIRTH variable.

#unique(df19$PED_SHNT_HC_ETIOL)

IVH <- df[which(df$PED_SHNT_HC_ETIOL == "Intraventricular hemorrhage (IVH) of prematurity without other cause"),]
cat("Number of patients with Intraventricular hemorrhage (IVH) of prematurity without other cause:",nrow(IVH)) #372
## Number of patients with Intraventricular hemorrhage (IVH) of prematurity without other cause: 372

Within IVH subset, how many are actually prematurely born: Within IVH category, # of patients with (PREM_BIRTH) variable any input other than “no” or “unknown” = ________

cohortB <- IVH[which(!IVH$PREM_BIRTH=="No"),]
#cohortB <- IVH[which(IVH$GESTATIONALAGE_BIRTH <37),]
cohortB <- cohortB[which(!cohortB$GESTATIONALAGE_BIRTH == -99),]
cat("\nnumber of patients prematurely born within IVH subset:",nrow(cohortB)) #321
## 
## number of patients prematurely born within IVH subset: 321
df <- cohortB

#list of the different prem_birth categories and frequencies
df %>% 
  group_by(PREM_BIRTH) %>% 
  dplyr::summarise(n = n()) %>%
  arrange(desc(n))
## # A tibble: 8 × 2
##   PREM_BIRTH                                 n
##   <chr>                                  <int>
## 1 25-26 completed weeks gestation           83
## 2 27-28 completed weeks gestation           57
## 3 24 completed weeks gestation              51
## 4 Less than 24 completed weeks gestation    42
## 5 35-36 completed weeks gestation           27
## 6 29-30 completed weeks gestation           26
## 7 31-32 completed weeks gestation           19
## 8 33-34 completed weeks gestation           16

Now we have NSQIP Cohort B (patients with both IVH and were born prematurely)

Within NSQIP Cohort B: how many had shunt implant surgery before 40 weeks gestational age (this is that calculated variable): ________ how many had shunt implant surgery after 40 weeks gestational age (this is that calculated variable): ________
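
One thing to watch when splitting on a threshold like this: in R, unary `!` has lower precedence than the comparison operators, so `!x < 40` is parsed as `!(x < 40)`, which selects values at or above 40. A small check on invented values (ga_surgery is a toy vector, not NSQIP data):

```r
ga_surgery <- c(36, 39, 40, 45)   # toy gestational ages at surgery, in weeks

negated <- !ga_surgery < 40       # parsed as !(ga_surgery < 40)
plain   <- ga_surgery < 40

before40 <- ga_surgery[ga_surgery < 40]    # 36 39
after40  <- ga_surgery[ga_surgery >= 40]   # 40 45

negated
plain
```

For a "before 40 weeks" group the condition therefore needs to be written without the negation.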

#IVH cases that had their surgery BEFORE 40 weeks of gestational age 
#(no negation here: !x < 40 parses as !(x < 40), which would select x >= 40)
ivh_before40 <- df %>% 
  filter(GESTATIONALAGE_SURGERY2 < 40) 

cat("number of IVH patients with shunt implant surgery before 40 weeks gestational age:",nrow(ivh_before40))
## number of IVH patients with shunt implant surgery before 40 weeks gestational age: 164
#IVH cases that had their surgery AT OR AFTER 40 weeks of gestational age 
ivh_after40 <- df %>% 
  filter(GESTATIONALAGE_SURGERY2 > 39) 

cat("\nnumber of IVH patients with shunt implant surgery at or after 40 weeks gestational age:",nrow(ivh_after40))
## 
## number of IVH patients with shunt implant surgery at or after 40 weeks gestational age: 157

Split NSQIP Cohort B into Before40Weeks and After40Weeks

In Before40Weeks gestational age surgery group: How many had a prior hydrocephalus procedure: ______ Variable “PED_SHNT_PRIOR_HCPROC” (yes, no), we want “yes”

ivh_hydro <- ivh_before40[which(ivh_before40$PED_SHNT_PRIOR_HCPROC == "Yes"),]
cat("\nnumber of IVH patients w surgery before 40 wks with prior hydro procedure:",nrow(ivh_hydro))
## 
## number of IVH patients w surgery before 40 wks with prior hydro procedure: 106

How many had a shunt failure: ______ Variable “PED_SHNT_INTRE1”, any entry other than “none”

ivh_fail<- ivh_before40[which(!ivh_before40$PED_SHNT_INTRE1_RSN == "NULL"),]
cat("\nnumber of IVH patients w surgery before 40 wks with shunt failure:",nrow(ivh_fail))
## 
## number of IVH patients w surgery before 40 wks with shunt failure: 17

In after 40 weeks gestational age surgery group: How many had a prior hydrocephalus procedure: ______

ivh_hydro <- ivh_after40[which(ivh_after40$PED_SHNT_PRIOR_HCPROC == "Yes"),]
cat("\nnumber of IVH patients w surgery after 40 wks with prev hydro procedure:",nrow(ivh_hydro))
## 
## number of IVH patients w surgery after 40 wks with prev hydro procedure: 102

How many had a shunt failure: ______ Variable “PED_SHNT_INTRE1”, any entry other than “none”

ivh_fail<- ivh_after40[which(!ivh_after40$PED_SHNT_INTRE1_RSN == "NULL"),]
cat("\nnumber of IVH patients w surgery after 40 wks with shunt failure:",nrow(ivh_fail))
## 
## number of IVH patients w surgery after 40 wks with shunt failure: 18
#calculated differently, from PED_SHNT_INTRE1 instead of the failure reason
ivh_fail2 <- ivh_after40[which(!ivh_after40$PED_SHNT_INTRE1 == "None"),]
cat("\nnumber of IVH patients w surgery after 40 wks with shunt failure:",nrow(ivh_fail2))
## 
## number of IVH patients w surgery after 40 wks with shunt failure: 18

number of IVH patients with shunt implant surgery before 40 weeks gestational age: number of IVH patients w surgery before 40 wks with shunt failure:

cat("Percentage of shunt failures in IVH patients w surgery BEFORE 40 weeks:")
## Percentage of shunt failures in IVH patients w surgery BEFORE 40 weeks:

number of IVH patients with shunt implant surgery at or after 40 weeks gestational age: number of IVH patients w surgery after 40 wks with shunt failure:

cat("\nPercentage of shunt failures in IVH patients w surgery AT/AFTER 40 weeks:")
## 
## Percentage of shunt failures in IVH patients w surgery AT/AFTER 40 weeks: