MH readmits

Read in cleaned data (39179 rows)

df1.input.data <- read_csv(here("results",
                                "code output", 
                  "2018-12-18_mh-readmissions-dataset-for-classification.csv"))

library(lubridate)

# some more cleaning: 
df2.cleaned.data <- 
      df1.input.data %>% 
      
      select(denom_PHN, 
             Gender, 
             AdmissionAge, 
             LHAName, 
             PostalCode, 
             GPCode, 
             AdjustedAdmissionDate, 
             denom_Institution, 
             denom_AdmissionDate, 
             denom_DischargeDate, 
             AdmissionFacilityLongName, 
             denom_nursingUnitCode, 
             AdmissionNursingUnit, 
             AdmittingDrService, 
             AdmissionAttendingDoctorService, 
             AdmissionCategoryCode, 
             AdmissionCategoryGroup, 
             AdmissionPatientServiceDADDescription, 
             ED_visits_all_time:ED_admits_last_365days, 
             elective_admits_all_time:elective_admits_last_365_days, 
             num_schizophrenia_dx1_all_time, 
             Readmit_Flag) %>% 
      
      mutate(LOS = difftime(denom_DischargeDate, 
                            denom_AdmissionDate, 
                            units = "days") %>% 
                   as.numeric(), 
             AdmissionAge = as.numeric(AdmissionAge), 
             AdjustedAdmissionDate = ymd(AdjustedAdmissionDate)
             ) %>% 
      
      # convert all chars to factors: 
      mutate_if(is.character, factor) %>% 
      
      rename(Institution = denom_Institution, 
             AdmitAge = AdmissionAge, 
             ReadmitFlag = Readmit_Flag) %>% 
      
      clean_names()


# view results: 
str(df2.cleaned.data)  # 39179 obs. of  33 variables
head(df2.cleaned.data)
summary(df2.cleaned.data)

# skim(df2.cleaned.data)

1. Missing values in field adjusted_admission_date - 786 cases

# df2.cleaned.data %>%
#       filter(is.na(adjusted_admission_date)) %>% View("missing vals")

df2.cleaned.data %>% 
      filter(is.na(adjusted_admission_date)) %>% 
      select(admit_age) %>% 
      summary  # 786 cases



df2.cleaned.data %>% 
      filter(admission_facility_long_name == "NULL") %>% 
      select(admission_facility_long_name) %>% 
      summary  # 800 cases


df2.cleaned.data %>% 
      filter(gender == "NULL") %>% 
      select(gender) %>% 
      summary  # 786 cases

todo: delete these cases

These 786 cases should be removed. Don’t really know what’s going on, but there’s important info missing. e.g. all have Gender = NULL. It’s possible that these are weird PatientIDs that aren’t joining properly across data sources.
Then again: perhaps this is useful information. There may be patterns specific to these patients who have weird non-matching PatientIDs??

Recall that in SQL script 02_mh-readmissions_adding-predictor-fields.sql, all joins across data sources (ED, ADTC, ADR, Community) are done using PatientID, not PHN.

2. Num cases by site, with median and 90th percentile

df2.cleaned.data %>% 
      group_by(institution) %>% 
      summarise(median = median(los), 
                perc.90 = quantile(los, 0.90), 
                num_cases = n()) %>% 
      arrange(desc(num_cases)) %>% 
      kable() %>% 
      kable_styling(bootstrap_options = c("striped"))

institution	median	perc.90	num_cases
VGH	7	35.0	12401
SPH	7	40.0	12010
RHS	2	27.0	4987
LGH	6	32.0	4971
UBCH	24	60.0	1872
SMH	4	20.0	1552
PRGH	7	32.9	782
BCGH	2	6.0	247
SGH	2	8.0	243
RWLMH	2	3.7	114

3. LOS = 0 cases (1886 rows)

# summary(df2.cleaned.data$los)

# ecdf(df2.cleaned.data %>% pull(los)) %>% plot

df3.los.zero <- 
      df2.cleaned.data %>% 
      filter(los == 0) %>% 
      select(institution, 
             denom_admission_date, 
             denom_discharge_date, 
             admit_age, 
             denom_nursing_unit_code,
             admission_nursing_unit, 
             admitting_dr_service) # %>% nrow()  # 1886 rows  

df3.los.zero$denom_nursing_unit_code %>% summary

df3.los.zero %>% 
      select(institution, 
             denom_nursing_unit_code, 
             denom_admission_date, 
             denom_discharge_date) %>% 
      sample_n(20)

df2.cleaned.data %>% 
      select(los, 
             institution) %>% 
      filter(los <= 30) %>% 
      
      ggplot(aes(x = los)) + 
      geom_density() + 
      facet_wrap(~institution)

df3.los.zero %>% 
      group_by(denom_nursing_unit_code) %>% 
      summarize(count = n()) %>% 
      arrange(desc(count)) %>% 
      kable() %>% 
      kable_styling() %>% 
      scroll_box(width = "100%", height = "400px")

denom_nursing_unit_code	count
EMRG	480
REMR	365
EIP	227
EMERB	196
EMERA	130
EMERU	116
EA-SM	85
EMERC	63
EMERR	48
NULL	39
PER/INPT	16
UD1E	15
PAU	11
PPSY	9
MH-SM	8
MERS	7
EMERD	6
UD1W	6
WP3	6
4NW	5
EMERT	4
HLD	4
RPEU	4
SAU	3
2A-SM	2
2E	2
EMERM	2
ICU	2
M1S	2
MIU	2
SG1	2
UD2S	2
WP2	2
1AA-SM	1
3E	1
A2	1
HBU	1
HCE1	1
HCE2	1
HCW1	1
M3C	1
N/A	1
PAR	1
PED	1
R2W	1
RAMB	1
S5MH	1
USPO	1

Proportion by site:

df2.cleaned.data %>% 
      group_by(institution) %>% 
      summarise(num.zero.los = sum(ifelse(los == 0, 1,0)), 
                other.los = sum(ifelse(los > 0, 1, 0))) %>% 
      
      mutate(prop = num.zero.los/(num.zero.los + other.los), 
             total = num.zero.los + other.los) %>% 
      arrange(desc(total)) %>% 
      kable() %>% 
      kable_styling()

institution	num.zero.los	other.los	prop	total
VGH	525	11876	0.0423353	12401
SPH	590	11420	0.0491257	12010
RHS	376	4611	0.0753960	4987
LGH	239	4732	0.0480789	4971
UBCH	8	1864	0.0042735	1872
SMH	102	1450	0.0657216	1552
PRGH	25	757	0.0319693	782
BCGH	5	242	0.0202429	247
SGH	11	232	0.0452675	243
RWLMH	5	109	0.0438596	114

Tentative conclusion: the LOS = 0 cases are mostly patients who were ED inpatients. That is, a decision to admit was made, but they were discharged before they ever moved up into the nursing units.

todo: remove LOS = 0 cases?

We may want to exclude these patients in most of the rest of the analysis.

3.1 Create nested dataframe by institution

# nest by site: 
df4.site.nested <- 
      df2.cleaned.data %>% 
      group_by(institution) %>% 
      nest() 



# add column to find overall readmit rate: 

# load function: 
source(here::here("code",
                  "readmit-rates-overall_function.R"))

df4.site.nested %<>% 
      mutate(readmission_overall = map(data, 
                                       readmit_rates_overall_function, 
                                       exclude_los_zero = 1, 
                                       last_2_years = 1))

# result: 
# df4.site.nested$readmission_overall
# df4.site.nested %>% unnest(readmission_overall)


# extract overall readmit rate: 
df4.site.nested %<>% 
      mutate(overall_readmit = map_dbl(readmission_overall, 
                                       function(x){x %>% pull(prop)})) %>% 
      select(institution, 
             overall_readmit, 
             everything())

4. Distributions with and without LOS = 0 cases

# function for plotting densities: 
source(here::here("code",
                  "plot-density_function.R"))

# function for plotting histograms: 
source(here::here("code",
                  "plot-histogram_function.R"))



# map function across nested df: 
df4.site.nested %<>% 
      mutate(densities = map2(data, 
                               institution, 
                               ~plot_density_function(..1, 
                                                      ..2, 
                                                      exclude_los_zero = 1)), 
             
             # now add histograms: 
             histograms = map2(data, 
                               institution, 
                               ~plot_histogram_function(..1, 
                                                      ..2, 
                                                      exclude_los_zero = 0)))

Densities

# view densities: 
df4.site.nested$densities

## [[1]]

## 
## [[2]]

## 
## [[3]]

## 
## [[4]]

## 
## [[5]]

## 
## [[6]]

## 
## [[7]]

## 
## [[8]]

## 
## [[9]]

## 
## [[10]]

Histograms

df4.site.nested$histograms

## [[1]]

## 
## [[2]]

## 
## [[3]]

## 
## [[4]]

## 
## [[5]]

## 
## [[6]]

## 
## [[7]]

## 
## [[8]]

## 
## [[9]]

## 
## [[10]]

# densities:
# pdf(here::here("results",
#                "code output",
#                "2019-02-12_los-densities_since-2016.pdf"),
#     width = 10)
# df4.site.nested$densities
# dev.off()

# histograms: 
# pdf(here::here("results",
#                "code output",
#                "2019-02-12_los-histograms_since-2016.pdf"),
#     width = 10)
# df4.site.nested$histograms
# dev.off()

5. Age categories

Are there natural splits in age suggested by the data?

# function for plotting age cdfs: 
source(here::here("code",
                  "plot-age-cdf_function.R"))


# add plots in a column: 
df4.site.nested %<>% 
      mutate(age_ecdf = map2(data, 
                            institution,
                            plot_age_cdf_function))

# results: 
df4.site.nested$age_ecdf

## [[1]]

## 
## [[2]]

## 
## [[3]]

## 
## [[4]]

## 
## [[5]]

## 
## [[6]]

## 
## [[7]]

## 
## [[8]]

## 
## [[9]]

## 
## [[10]]

# save outputs: 
# pdf(here::here("results", 
#                "code output",
#                "2019-02-12_age-ecdf.pdf"))
# df4.site.nested$age_ecdf
# dev.off()

Segmenting LOS by age quantiles

# function for boxplot by quartile: 
source(here::here("code",
                  "los-boxplot_function.R"))

# map function across nesed df: 
df4.site.nested %<>% 
      mutate(los_boxplot = map2(data, 
                                institution, 
                                los_boxplot_function,
                                exclude_los_zero = 1))

# print boxplots: 
# df4.site.nested$los_boxplot[[1]]
df4.site.nested$los_boxplot

## [[1]]

## 
## [[2]]

## 
## [[3]]

## 
## [[4]]

## 
## [[5]]

## 
## [[6]]

## 
## [[7]]

## 
## [[8]]

## 
## [[9]]

## 
## [[10]]

# save outputs: 
# pdf(here::here("results",
#                "code output",
#                "2019-02-13_los-distribution-by-age-quartile.pdf"))
# df4.site.nested$los_boxplot
# dev.off()

6. Problems with BCGH and RWLMH

Note that age is missing in all cases, as well as gender
Explanation: PatientID in ADR isn’t matching with the PatientIDs in ADTC for these sites?

df2.cleaned.data %>% 
      filter(institution == "BCGH") %>%
      summary()

##       denom_phn   gender      admit_age                     lha_name  
##  9026468137:52   F   :  0   Min.   : NA   NULL                  :247  
##  9023956948:38   M   :  0   1st Qu.: NA   100 Mile House        :  0  
##  9020625092:23   NULL:247   Median : NA   Abbotsford            :  0  
##  9014980462:17              Mean   :NaN   Agassiz-Harrison      :  0  
##  9020458731:11              3rd Qu.: NA   Alberni               :  0  
##  9064682565:10              Max.   : NA   Armstrong-Spallumcheen:  0  
##  (Other)   :96              NA's   :247   (Other)               :  0  
##   postal_code     gp_code    adjusted_admission_date  institution 
##  NULL   :247   NULL   :170   Min.   :NA              BCGH   :247  
##  A0A0A0 :  0   23440  :  7   1st Qu.:NA              LGH    :  0  
##  OutofC :  0   63606  :  6   Median :NA              PRGH   :  0  
##  OutofP :  0   63195  :  3   Mean   :NA              RHS    :  0  
##  UnkBC  :  0   4559   :  1   3rd Qu.:NA              RWLMH  :  0  
##  V0A1H0 :  0   (Other):  3   Max.   :NA              SGH    :  0  
##  (Other):  0   NA's   : 57   NA's   :247             (Other):  0  
##  denom_admission_date denom_discharge_date
##  Min.   :2012-01-10   Min.   :2012-04-08  
##  1st Qu.:2013-12-19   1st Qu.:2013-12-21  
##  Median :2015-03-27   Median :2015-03-31  
##  Mean   :2015-04-10   Mean   :2015-04-14  
##  3rd Qu.:2016-09-18   3rd Qu.:2016-09-20  
##  Max.   :2018-02-16   Max.   :2018-02-17  
##                                           
##                 admission_facility_long_name denom_nursing_unit_code
##  NULL                         :247           NULL   :247            
##  Holy Family Hospital         :  0           10A    :  0            
##  Lions Gate Hospital          :  0           10C    :  0            
##  Mount Saint Joseph Hospital  :  0           1AA-SM :  0            
##  Powell River General Hospital:  0           2A-SM  :  0            
##  Richmond Hospital            :  0           2E     :  0            
##  (Other)                      :  0           (Other):  0            
##                       admission_nursing_unit
##  NULL                            :247       
##  1AA-SM                          :  0       
##  2 East - Med/ Post Coronary Care:  0       
##  2A-SM                           :  0       
##  3 East - Paediatrics            :  0       
##  3 West - Maternity              :  0       
##  (Other)                         :  0       
##              admitting_dr_service      admission_attending_doctor_service
##  NULL                  :247       NULL                  :247             
##  Anaesthesiology       :  0       Anaesthesiology       :  0             
##  Anatomical Pathology  :  0       Cardiac Surgery       :  0             
##  Cardiac Surgery       :  0       Cardiology            :  0             
##  Cardiology            :  0       Critical Care Medicine:  0             
##  Critical Care Medicine:  0       Diagnostic Radiology  :  0             
##  (Other)               :  0       (Other)               :  0             
##  admission_category_code     admission_category_group
##  NULL    :247            Elective        :  0        
##  1       :  0            Emergent        :  0        
##  18984780:  0            Newborn         :  0        
##  2       :  0            NULL            :247        
##  3       :  0            Urgent          :  0        
##  309203  :  0            Urgent\\Emergent:  0        
##  (Other) :  0                                        
##    admission_patient_service_dad_description ed_visits_all_time
##  NULL                   :247                 Min.   : 0.0000   
##  Alternate Level of Care:  0                 1st Qu.: 0.0000   
##  Cardiology             :  0                 Median : 0.0000   
##  Cardiovascular Surgery :  0                 Mean   : 0.8057   
##  Family Practice        :  0                 3rd Qu.: 1.0000   
##  Gastro-Enterology      :  0                 Max.   :20.0000   
##  (Other)                :  0                                   
##  ed_visits_last_30days ed_visits_last_180days ed_visits_last_365days
##  Min.   :0.00000       Min.   : 0.0000        Min.   : 0.0000       
##  1st Qu.:0.00000       1st Qu.: 0.0000        1st Qu.: 0.0000       
##  Median :0.00000       Median : 0.0000        Median : 0.0000       
##  Mean   :0.01619       Mean   : 0.1457        Mean   : 0.2186       
##  3rd Qu.:0.00000       3rd Qu.: 0.0000        3rd Qu.: 0.0000       
##  Max.   :2.00000       Max.   :13.0000        Max.   :14.0000       
##                                                                     
##  ed_admits_all_time ed_admits_last_30days ed_admits_last_180days
##  Min.   :0.0000     Min.   :0.000000      Min.   :0.00000       
##  1st Qu.:0.0000     1st Qu.:0.000000      1st Qu.:0.00000       
##  Median :0.0000     Median :0.000000      Median :0.00000       
##  Mean   :0.2024     Mean   :0.004049      Mean   :0.04453       
##  3rd Qu.:0.0000     3rd Qu.:0.000000      3rd Qu.:0.00000       
##  Max.   :3.0000     Max.   :1.000000      Max.   :1.00000       
##                                                                 
##  ed_admits_last_365days elective_admits_all_time
##  Min.   :0.00000        Min.   :0.0000          
##  1st Qu.:0.00000        1st Qu.:0.0000          
##  Median :0.00000        Median :0.0000          
##  Mean   :0.06883        Mean   :0.3846          
##  3rd Qu.:0.00000        3rd Qu.:0.0000          
##  Max.   :1.00000        Max.   :6.0000          
##                                                 
##  elective_admits_last_30_days elective_admits_last_180_days
##  Min.   :0.00000              Min.   :0.00000              
##  1st Qu.:0.00000              1st Qu.:0.00000              
##  Median :0.00000              Median :0.00000              
##  Mean   :0.01215              Mean   :0.06073              
##  3rd Qu.:0.00000              3rd Qu.:0.00000              
##  Max.   :1.00000              Max.   :1.00000              
##                                                            
##  elective_admits_last_365_days num_schizophrenia_dx1_all_time
##  Min.   :0.0000                Min.   :0.00000               
##  1st Qu.:0.0000                1st Qu.:0.00000               
##  Median :0.0000                Median :0.00000               
##  Mean   :0.1174                Mean   :0.01215               
##  3rd Qu.:0.0000                3rd Qu.:0.00000               
##  Max.   :2.0000                Max.   :1.00000               
##                                                              
##   readmit_flag        los        
##  Min.   :0.000   Min.   : 0.000  
##  1st Qu.:0.000   1st Qu.: 1.000  
##  Median :0.000   Median : 2.000  
##  Mean   :0.332   Mean   : 3.255  
##  3rd Qu.:1.000   3rd Qu.: 3.000  
##  Max.   :1.000   Max.   :98.000  
##

7. Readmission rates by age quartile

# load functions: 
source(here::here("code",
                  "readmit-rates-by-age_function.R"))


# map function across nested df: 
df4.site.nested %<>% 
      mutate(readmit_rates = map(data, 
                                 readmit_rates_function, 
                                 exclude_los_zero = 1), 
             
             # now add the column plots: 
             readmit_plot = pmap(list(readmit_rates, 
                                      as.character(institution), 
                                      overall_readmit), 
                                 readmit_rates_plot))

# results: 
df4.site.nested$readmit_plot[[1]]

df4.site.nested$readmit_plot

## [[1]]

## 
## [[2]]

## 
## [[3]]

## 
## [[4]]

## 
## [[5]]

## 
## [[6]]

## 
## [[7]]

## 
## [[8]]

## 
## [[9]]

## 
## [[10]]

Notes

Recall the effects of Age and LOS are correlated: it looks like people in age quantile 4 have lowest readmit %, but this is not necessarily the effect of age alone: see df4.site.nested$los_boxplot - the oldest patients also have longest LOS * So maybe it’s the long LOS that’s causing them to have low readmits, not the fact that they are older.
… Actually, it’s both, which is why both variables should be in the model.

# save results: 
# pdf(here::here("results",
#                "code output",
#                "2019-02-13_readmit-rates-by-age-quantile.pdf"),
#     width = 10)
# df4.site.nested$readmit_plot
# dev.off()

df4.site.nested %>% 
      unnest(readmit_rates) %>%
      kable() %>% 
      kable_styling(bootstrap_options = c("striped")) %>% 
      scroll_box(width = "100%", height = "400px")

institution	overall_readmit	age_quantile	cases	readmit_cases	prop	exclude_los_zero	last_2_years
VGH	0.1794504	0	11	2	0.1818182	1	1
VGH	0.1794504	1	964	162	0.1680498	1	1
VGH	0.1794504	2	1095	200	0.1826484	1	1
VGH	0.1794504	3	1063	235	0.2210724	1	1
VGH	0.1794504	4	1052	152	0.1444867	1	1
UBCH	0.0828804	0	6	0	0.0000000	1	1
UBCH	0.0828804	1	178	10	0.0561798	1	1
UBCH	0.0828804	2	175	18	0.1028571	1	1
UBCH	0.0828804	3	191	23	0.1204188	1	1
UBCH	0.0828804	4	186	10	0.0537634	1	1
SPH	0.1881597	0	12	3	0.2500000	1	1
SPH	0.1881597	1	1066	235	0.2204503	1	1
SPH	0.1881597	2	1079	239	0.2215014	1	1
SPH	0.1881597	3	1103	199	0.1804170	1	1
SPH	0.1881597	4	1098	144	0.1311475	1	1
LGH	0.1189904	0	8	0	0.0000000	1	1
LGH	0.1189904	1	402	40	0.0995025	1	1
LGH	0.1189904	2	425	54	0.1270588	1	1
LGH	0.1189904	3	399	61	0.1528822	1	1
LGH	0.1189904	4	430	43	0.1000000	1	1
RHS	0.1450428	0	3	0	0.0000000	1	1
RHS	0.1450428	1	373	44	0.1179625	1	1
RHS	0.1450428	2	405	73	0.1802469	1	1
RHS	0.1450428	3	422	53	0.1255924	1	1
RHS	0.1450428	4	431	67	0.1554524	1	1
PRGH	0.1423841	0	2	0	0.0000000	1	1
PRGH	0.1423841	1	71	18	0.2535211	1	1
PRGH	0.1423841	2	73	8	0.1095890	1	1
PRGH	0.1423841	3	77	9	0.1168831	1	1
PRGH	0.1423841	4	79	8	0.1012658	1	1
SMH	0.1646859	0	2	0	0.0000000	1	1
SMH	0.1646859	1	146	20	0.1369863	1	1
SMH	0.1646859	2	143	39	0.2727273	1	1
SMH	0.1646859	3	144	22	0.1527778	1	1
SMH	0.1646859	4	154	16	0.1038961	1	1
SGH	0.1038961	1	18	3	0.1666667	1	1
SGH	0.1038961	2	19	3	0.1578947	1	1
SGH	0.1038961	3	19	0	0.0000000	1	1
SGH	0.1038961	4	21	2	0.0952381	1	1
RWLMH	0.0303030	0	33	1	0.0303030	1	1
BCGH	0.3670886	0	79	29	0.3670886	1	1

8. Readmission rates by LOS decile

# load functions: 
source(here::here("code",
                  "readmit-rates-by-los_function.R"))


# map function across nested df: 
df4.site.nested %<>% 
      mutate(readmit_rates_by_los = map(data, 
                                        readmit_rates_los_function,
                                        exclude_los_zero = 1), 
             readmit_by_los_plot = pmap(list(readmit_rates_by_los, 
                                             as.character(institution),
                                             overall_readmit), 
                                        readmit_rates_by_los_plot))

# results: 
# df4.site.nested$readmit_rates_by_los

# df4.site.nested$readmit_by_los_plot[[2]]
df4.site.nested$readmit_by_los_plot

## [[1]]

## 
## [[2]]

## 
## [[3]]

## 
## [[4]]

## 
## [[5]]

## 
## [[6]]

## 
## [[7]]

## 
## [[8]]

## 
## [[9]]

## 
## [[10]]

# write output: 
# pdf(here::here("results",
#                "code output",
#                "2019-02-13_readmit-rates-by-LOS-decile.pdf"),
#     width = 10)
# df4.site.nested$readmit_by_los_plot
# dev.off()

9. Readmission rates by Gender

# load functions: 
source(here::here("code",
                  "readmit-rates-by-gender_function.R"))


# map function across nested df: 
df4.site.nested %<>% 
      mutate(readmit_rates_by_gender = map(data, 
                                           readmit_rates_gender_function,
                                           exclude_los_zero = 1), 
             readmit_plot_by_gender = pmap(list(readmit_rates_by_gender, 
                                                as.character(institution), 
                                                overall_readmit), 
                                           readmit_rates_by_gender_plot)
                                           
             ) 

# result: 
df4.site.nested %>% 
      unnest(readmit_rates_by_gender) %>% 
      kable() %>% 
      kable_styling(bootstrap_options = c("striped")) %>% 
      scroll_box(width = "100%", height = "400px")

institution	overall_readmit	gender	cases	readmit_cases	prop	exclude_los_zero	last_2_years
VGH	0.1794504	F	1790	325	0.1815642	1	1
VGH	0.1794504	M	2384	424	0.1778523	1	1
VGH	0.1794504	NULL	11	2	0.1818182	1	1
UBCH	0.0828804	F	420	28	0.0666667	1	1
UBCH	0.0828804	M	310	33	0.1064516	1	1
UBCH	0.0828804	NULL	6	0	0.0000000	1	1
SPH	0.1881597	F	1616	262	0.1621287	1	1
SPH	0.1881597	M	2730	555	0.2032967	1	1
SPH	0.1881597	NULL	12	3	0.2500000	1	1
LGH	0.1189904	F	835	109	0.1305389	1	1
LGH	0.1189904	M	821	89	0.1084044	1	1
LGH	0.1189904	NULL	8	0	0.0000000	1	1
RHS	0.1450428	F	672	90	0.1339286	1	1
RHS	0.1450428	M	959	147	0.1532847	1	1
RHS	0.1450428	NULL	3	0	0.0000000	1	1
PRGH	0.1423841	F	159	25	0.1572327	1	1
PRGH	0.1423841	M	141	18	0.1276596	1	1
PRGH	0.1423841	NULL	2	0	0.0000000	1	1
SMH	0.1646859	F	314	53	0.1687898	1	1
SMH	0.1646859	M	273	44	0.1611722	1	1
SMH	0.1646859	NULL	2	0	0.0000000	1	1
SGH	0.1038961	F	43	5	0.1162791	1	1
SGH	0.1038961	M	34	3	0.0882353	1	1
RWLMH	0.0303030	NULL	33	1	0.0303030	1	1
BCGH	0.3670886	NULL	79	29	0.3670886	1	1

# graphs: 
df4.site.nested$readmit_plot_by_gender

## [[1]]

## 
## [[2]]

## 
## [[3]]

## 
## [[4]]

## 
## [[5]]

## 
## [[6]]

## 
## [[7]]

## 
## [[8]]

## 
## [[9]]

## 
## [[10]]

# write output: 
# pdf(here::here("results",
#                "code output",
#                "2019-02-13_readmit-rates-by-gender.pdf"),
#     width = 10)
# df4.site.nested$readmit_plot_by_gender
# dev.off()

10. Pull nested data together to print

df5.0.overall.readmits <- 
      df4.site.nested %>% 
      unnest(readmission_overall) %>% 
      select(institution, 
             cases, 
             readmit_cases, 
             prop)
      

df5.1.age.readmits <- 
      df4.site.nested %>% 
      unnest(readmit_rates) %>% 
      select(institution, 
             age_quantile, 
             prop) %>% 
      spread(key = age_quantile, 
             value = prop) 

df5.2.los.readmits <- 
      df4.site.nested %>% 
      unnest(readmit_rates_by_los) %>% 
      select(institution, 
             los_decile, 
             prop) %>% 
      spread(key = los_decile, 
             value = prop) %>% 
      select(-`10`,
             `10`)

df5.3.gender.readmits <- 
      df4.site.nested %>% 
      unnest(readmit_rates_by_gender) %>% 
      select(institution, 
             gender, 
             prop) %>% 
      spread(key = gender, 
             value = prop) 


# join them all together: 
df5.summary <- 
      df5.0.overall.readmits %>% 
      inner_join(df5.1.age.readmits, 
                 by = c("institution" = "institution")) %>%
      inner_join(df5.2.los.readmits, 
                 by = c("institution" = "institution")) %>% 
      inner_join(df5.3.gender.readmits, 
                 by = c("institution" = "institution")) %>% 
      
      purrr::set_names(c("institution", 
                         "overall_cases", 
                         "overall_readmits",
                         "overall_prop", 
                         
                         "age_quantile_0", 
                         "age_quantile_1",
                         "age_quantile_2", 
                         "age_quantile_3", 
                         "age_quantile_4", 
                         
                         "los_decile_1", 
                         "los_decile_2", 
                         "los_decile_3", 
                         "los_decile_4", 
                         "los_decile_5", 
                         "los_decile_6", 
                         "los_decile_7", 
                         "los_decile_8", 
                         "los_decile_9", 
                         "los_decile_10", 
                         
                         "gender_female", 
                         "gender_male", 
                         "gender_NULL"))

# result
df5.summary %>% 
      kable() %>% 
      kable_styling(bootstrap_options = c("striped")) %>% 
      scroll_box(width = "100%", height = "400px")

institution	overall_cases	overall_readmits	overall_prop	age_quantile_0	age_quantile_1	age_quantile_2	age_quantile_3	age_quantile_4	los_decile_1	los_decile_2	los_decile_3	los_decile_4	los_decile_5	los_decile_6	los_decile_7	los_decile_8	los_decile_9	los_decile_10	gender_female	gender_male	gender_NULL
VGH	4185	751	0.1794504	0.1818182	0.1680498	0.1826484	0.2210724	0.1444867	NA	0.1807229	0.2176871	0.2061657	0.2202073	0.1953782	0.2019002	0.1789976	0.1294964	0.0898345	0.1815642	0.1778523	0.1818182
UBCH	736	61	0.0828804	0.0000000	0.0561798	0.1028571	0.1204188	0.0537634	0.1944444	0.0769231	0.1071429	0.0428571	0.0649351	0.0821918	0.0625000	0.0641026	0.0253165	0.1081081	0.0666667	0.1064516	0.0000000
SPH	4358	820	0.1881597	0.2500000	0.2204503	0.2215014	0.1804170	0.1311475	NA	0.1869031	0.1917211	0.2033582	0.2556818	0.2490637	0.1818182	0.1698925	0.1341176	0.1185682	0.1621287	0.2032967	0.2500000
LGH	1664	198	0.1189904	0.0000000	0.0995025	0.1270588	0.1528822	0.1000000	NA	0.1446281	0.1398964	0.0970149	0.1336406	0.1386139	0.1034483	0.0943396	0.1028571	0.0892857	0.1305389	0.1084044	0.0000000
RHS	1634	237	0.1450428	0.0000000	0.1179625	0.1802469	0.1255924	0.1554524	NA	NA	0.1343284	NA	0.1554054	0.1621622	0.1681416	0.1272727	0.1428571	0.1301775	0.1339286	0.1532847	0.0000000
PRGH	302	43	0.1423841	0.0000000	0.2535211	0.1095890	0.1168831	0.1012658	NA	0.1754386	0.1200000	0.1666667	0.0952381	0.1200000	0.1250000	0.1333333	0.1142857	0.2187500	0.1572327	0.1276596	0.0000000
SMH	589	97	0.1646859	0.0000000	0.1369863	0.2727273	0.1527778	0.1038961	NA	0.2500000	NA	0.1904762	0.1333333	0.1346154	0.1578947	0.1269841	0.1355932	0.1093750	0.1687898	0.1611722	0.0000000
SGH	77	8	0.1038961	NA	0.1666667	0.1578947	0.0000000	0.0952381	NA	0.2000000	NA	NA	0.1363636	NA	0.0833333	0.0909091	0.0000000	0.0000000	0.1162791	0.0882353	NA
RWLMH	33	1	0.0303030	0.0303030	NA	NA	NA	NA	NA	NA	NA	NA	0.0000000	NA	NA	0.0000000	0.2500000	0.0000000	NA	NA	0.0303030
BCGH	79	29	0.3670886	0.3670886	NA	NA	NA	NA	NA	NA	0.3888889	NA	NA	0.3928571	NA	0.3529412	0.4285714	0.2222222	NA	NA	0.3670886

# WRITE OUTPUT: 
write_csv(df5.summary, 
          here::here("results", 
                     "code output", 
                     "2019-02-13_summary-readmit-rates.csv"))

MH readmits - EDA

Nayef Ahmad

February 5, 2019