Exploring the HHS Hospital Compare Database: Finding the Right Hospitals Across the US

by: D’Cypher

Finding the right hospital can be overwhelming. Hospitals, especially those providing major care, often specialize in particular conditions such as heart attack or stroke. This project aims to reduce that research effort by helping to narrow the search to hospitals suited to a specific need.

Synopsis

This project is an R program that processes hospital data and recommends hospitals that meet given criteria.

The program uses data from the Hospital Compare web site (http://hospitalcompare.hhs.gov) run by the U.S. Department of Health and Human Services. The purpose of the web site is to provide data and information about the quality of care at over 4,000 Medicare-certified hospitals in the U.S.

The dataset covers essentially all major U.S. hospitals and is used for a variety of purposes, including determining whether hospitals should be fined for not providing high quality care to patients (see http://goo.gl/jAXFX for some background on this particular topic).

Data Processing

The Hospital Compare web site contains a large amount of data; this program looks at only a small subset. The analysis uses the following files:

  1. outcome-of-care-measures.csv: Contains information about 30-day mortality and readmission rates for heart attacks, heart failure, and pneumonia for over 4,000 hospitals.

  2. hospital-data.csv: Contains information about each hospital.

  3. Hospital_Revised_Flatfiles.pdf: Descriptions of the variables in each file (i.e., the code book).

This code reads in the data.

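# Set the working directory to the folder holding the CSV files (adjust this path)
setwd("C:/Users/wcai/Desktop/")
getwd()
# Read every column as character; the rate columns are converted to numeric later
outcome_data <- read.csv("outcome-of-care-measures.csv", colClasses = "character")
#head(outcome_data)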
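
As an optional alternative, read.csv can convert the “Not Available” placeholder to NA while reading, through its na.strings argument. The sketch below shows this on a tiny in-memory CSV rather than the real file; the column names and values are invented for illustration.

```r
# Toy CSV text standing in for outcome-of-care-measures.csv (values are invented)
toy_csv <- "Hospital.Name,State,Rate
HOSPITAL A,TX,14.2
HOSPITAL B,TX,Not Available
HOSPITAL C,MD,12.9"

# na.strings turns the placeholder into NA, so Rate is parsed as numeric directly
toy <- read.csv(text = toy_csv, na.strings = "Not Available",
                stringsAsFactors = FALSE)

str(toy$Rate)          # numeric vector containing one NA
sum(is.na(toy$Rate))   # number of hospitals without data
```

With this approach the rate column needs no later as.numeric() conversion, at the cost of letting R guess the type of every column.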

To get a sense of what the data looks like, this is a plot of the 30-day mortality rates for heart attacks for hospitals across the US:

# Histogram of 30-day death rates from heart attacks (column 11)
# "Not Available" entries become NA, so as.numeric() emits a coercion warning
outcome_data[, 11] <- as.numeric(outcome_data[, 11])
hist(outcome_data[, 11])
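
Selecting the column by name instead of by position can make this step easier to follow. The sketch below assumes that column 11 is the column the functions later in this document call Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack; the toy data frame and its values are invented.

```r
# Toy data frame; only the long column name is taken from the real file
toy <- data.frame(
  Hospital.Name = c("HOSPITAL A", "HOSPITAL B", "HOSPITAL C"),
  Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack =
    c("14.2", "Not Available", "12.9"),
  stringsAsFactors = FALSE
)

rate_col <- "Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack"

# "Not Available" cannot be parsed as a number, so as.numeric() yields NA with
# a warning; suppressWarnings() silences it once that behaviour is understood
rates <- suppressWarnings(as.numeric(toy[[rate_col]]))

summary(rates)   # includes a count of NA values
```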

Program ‘Best’: Finding the best hospital in a state

This is the first of three programs that process the data. It is a function called best that takes two inputs: a US state abbreviation and a category (“heart attack”, “heart failure”, or “pneumonia”). It returns the hospital with the lowest 30-day mortality rate for that category in the given state. If several hospitals tie for the lowest rate, the first one in alphabetical order is returned.

# "best" function takes a state and an outcome ("heart attack", "heart failure",
# or "pneumonia") and returns the hospital with
# the lowest mortality rate for that outcome in that state.
best <- function(state, outcome){
  
  #-- Input testing:
  possible_state <- (unique(outcome_data$State) == state)
  possible_outcome <- (c("heart attack", "heart failure", "pneumonia") == outcome)

  if(sum(possible_state) != 1){
    stop("invalid state")
  } else if(sum(possible_outcome) != 1){
    stop("invalid outcome")
  } else {
  
  #-- Creates State subset
  state_filter <- outcome_data[outcome_data$State == state,]
  
  # Turns the "Not Available" string into NA
  state_filter[state_filter == "Not Available"] <- NA
  
  state_subset <- data.frame(as.character(state_filter$Hospital.Name), 
                             as.character(state_filter$State),
                             as.numeric(state_filter$Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack),
                             as.numeric(state_filter$Hospital.30.Day.Death..Mortality..Rates.from.Heart.Failure),
                             as.numeric(state_filter$Hospital.30.Day.Death..Mortality..Rates.from.Pneumonia))
  
  colnames(state_subset) <- c("Hospital.Name", "State", "heart_attack", "heart_failure", "pneumonia")
  
  rm(state_filter)
  
  #-- Conditional Min and lookup
  if(outcome == "heart attack") {
    min_outcome <- min(state_subset$heart_attack, na.rm = TRUE)
    lookup_row <- which(state_subset$State == state 
                        & state_subset$heart_attack == min_outcome 
                        & complete.cases(state_subset$heart_attack) == T)
    lookup_col <- which(colnames(state_subset)=="Hospital.Name")
    best_hospitals <- sort(as.vector(state_subset[lookup_row,lookup_col]))
    
  } else if (outcome == "heart failure") {
    min_outcome <- min(state_subset$heart_failure, na.rm = TRUE)
    lookup_row <- which(state_subset$State == state 
                        & state_subset$heart_failure == min_outcome 
                        & complete.cases(state_subset$heart_failure) == T)
    lookup_col <- which(colnames(state_subset)=="Hospital.Name")
    best_hospitals <- sort(as.vector(state_subset[lookup_row,lookup_col]))
    
  } else if (outcome == "pneumonia"){
    min_outcome <- min(state_subset$pneumonia, na.rm = TRUE)
    lookup_row <- which(state_subset$State == state 
                        & state_subset$pneumonia == min_outcome 
                        & complete.cases(state_subset$pneumonia) == T)
    lookup_col <- which(colnames(state_subset)=="Hospital.Name")
    best_hospitals <- sort(as.vector(state_subset[lookup_row,lookup_col]))
    
  } else {
    stop("Not valid input for outcome.")
  }
  
  # Ties are broken alphabetically: print the first name in sorted order
  print(best_hospitals[1])
  }
}

Here are 4 examples of the input and output:

#"CYPRESS FAIRBANKS MEDICAL CENTER"
best("TX", "heart attack")
## [1] "CYPRESS FAIRBANKS MEDICAL CENTER"
#"FORT DUNCAN MEDICAL CENTER"
best("TX", "heart failure")
## [1] "FORT DUNCAN MEDICAL CENTER"
#"JOHNS HOPKINS HOSPITAL, THE"
best("MD", "heart attack")
## [1] "JOHNS HOPKINS HOSPITAL, THE"
#"GREATER BALTIMORE MEDICAL CENTER"
best("MD", "pneumonia")
## [1] "GREATER BALTIMORE MEDICAL CENTER"
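
The core of best, taking the minimum rate and breaking ties alphabetically, can be sketched on a toy data frame. This is a minimal illustration rather than the program above: best_sketch is a hypothetical helper, and the hospital names and rates are invented.

```r
# Toy data (hospital names and rates are invented for illustration)
toy <- data.frame(
  Hospital.Name = c("ZETA CLINIC", "ALPHA GENERAL", "MID HOSPITAL", "NO DATA HOSPITAL"),
  State = c("TX", "TX", "TX", "TX"),
  heart_attack = c(10.1, 10.1, 13.5, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: lowest rate wins, ties go to the alphabetically first name
best_sketch <- function(df, outcome_col) {
  rates <- df[[outcome_col]]
  lowest <- min(rates, na.rm = TRUE)
  # which() drops NA comparisons, so hospitals without data are excluded
  tied <- df$Hospital.Name[which(rates == lowest)]
  sort(tied)[1]
}

best_sketch(toy, "heart_attack")
```

Passing the column name as an argument avoids the three near-identical branches, one per outcome, at the cost of trusting the caller to supply a valid column.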

Program ‘Rank’: Ranking hospitals by outcome in a state

The next program is called ‘Rank’. It’s a function called rankhospital that takes three arguments: the 2-character abbreviated name of a state (state), an outcome (outcome), and the ranking of a hospital in that state for that outcome (num).

The function uses the outcome_data data frame loaded from outcome-of-care-measures.csv and returns a character vector with the name of the hospital that has the ranking specified by the num argument.

For example, the call rankhospital(“MD”, “heart failure”, 5) would return a character vector containing the name of the hospital with the 5th lowest 30-day death rate for heart failure. The num argument can take values “best”, “worst”, or an integer indicating the ranking (smaller numbers are better).

If the number given by num is larger than the number of hospitals with data in that state, the function returns “NA”. Hospitals that do not have data on a particular outcome are excluded from the set of hospitals when deciding the rankings.

# "rankhospital" function takes a state, an outcome ("heart attack", "heart failure",
# or "pneumonia"), and a rank, and returns the hospital that holds
# that rank for the outcome's mortality rate in that state.

rankhospital <- function(state, outcome, num = "best") {
  
  #--Input testing on state and outcome:
  possible_state <- (unique(outcome_data$State) == state)
  possible_outcome <- (c("heart attack", "heart failure", "pneumonia") == outcome)
  
  if(sum(possible_state) != 1){
    stop("invalid state")
  } else if(sum(possible_outcome) != 1){
    stop("invalid outcome")
  } else {
    
    #-- Creates State subset
    state_filter <- outcome_data[outcome_data$State == state,]
    
    # Turns the "Not Available" string into NA
    state_filter[state_filter == "Not Available"] <- NA
    
    state_subset <- data.frame(as.character(state_filter$Hospital.Name), 
                               as.character(state_filter$State),
                               as.numeric(state_filter$Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack),
                               as.numeric(state_filter$Hospital.30.Day.Death..Mortality..Rates.from.Heart.Failure),
                               as.numeric(state_filter$Hospital.30.Day.Death..Mortality..Rates.from.Pneumonia))
    
    colnames(state_subset) <- c("Hospital.Name", "State", "heart_attack", "heart_failure", "pneumonia")
    
    rm(state_filter)
    
    #-- Conditional outcome columns
    if(outcome == "heart attack"){ 
      outcome_col = 3
    } else if (outcome == "heart failure"){
      outcome_col = 4
    } else if (outcome == "pneumonia"){
      outcome_col = 5
    } else {
      stop("Not valid input for outcome.")
    }
    
    #--Ranking identifier: 
    n <- sum(complete.cases(state_subset[,outcome_col]))

    # Translate "best"/"worst" into numeric ranks so that the tie-breaking
    # step further down can index rows by num
    if(identical(num, "best")) {
      num <- 1
    } else if(identical(num, "worst")) {
      num <- n
    }

    if(!is.numeric(num)) {
      stop("Invalid input for num")
    } else if(num >= 1 & num <= n) {
      # sort() drops NA values, so position num is the num-th lowest rate
      Nth_score <- sort(state_subset[,outcome_col])[num]
    } else if(num < 0 | num > n) {
      return("NA")
    } else {
      stop("Invalid input for num")
    }

    #-- Hospitals whose rate equals the Nth score (which() drops NA comparisons)
    lookup_row <- which(state_subset[,outcome_col] == Nth_score)
    lookup_col <- which(colnames(state_subset)=="Hospital.Name")
    Nth_best <- as.vector(state_subset[lookup_row,lookup_col])
    
    countof_Nth_best <- as.numeric(length(Nth_best))
    
    if(countof_Nth_best == 1){
      print(Nth_best)
    } else if (countof_Nth_best > 1){    
#-- Breaks ties
    top_Nth_filter <- state_subset[state_subset[,outcome_col] <= Nth_score & complete.cases(state_subset[,outcome_col]) == T,]
    top_Nth <- data.frame(as.character(top_Nth_filter$Hospital.Name), 
               as.character(top_Nth_filter$State), 
               as.numeric(top_Nth_filter$heart_attack),
               as.numeric(top_Nth_filter$heart_failure),
               as.numeric(top_Nth_filter$pneumonia))
    colnames(top_Nth) <- c("Hospital.Name", "State", "heart_attack", "heart_failure", "pneumonia")
    rm(top_Nth_filter)
    
    outcome_col_name <- names(top_Nth)[outcome_col]
    hospital_col_name <- names(top_Nth)[1]
    with_order <- with(top_Nth, order(top_Nth[outcome_col_name], top_Nth[hospital_col_name]))
    top_Nth_ordered <- top_Nth[with_order, ]
    print(as.vector(top_Nth_ordered[num,1]))  
    } else {
      stop("No Hospitals Qualify")
    }
  }
}

Here are 2 examples of the input and output:

#"HARFORD MEMORIAL HOSPITAL"
rankhospital("MD", "heart attack", "worst")
## [1] "HARFORD MEMORIAL HOSPITAL"
#"NA"
rankhospital("ID", "heart failure", 20)
## [1] "NA"
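
The ranking rule (order by rate, break ties by name, skip hospitals without data) can be sketched compactly with order(). This is a minimal illustration on invented data, not the program above: rank_sketch is a hypothetical helper, and unlike rankhospital, which returns the string "NA", it returns a true missing value when the rank is out of range.

```r
# Toy data (hospital names and rates are invented for illustration)
toy <- data.frame(
  Hospital.Name = c("ZETA CLINIC", "ALPHA GENERAL", "MID HOSPITAL", "NO DATA HOSPITAL"),
  heart_failure = c(11.0, 11.0, 9.5, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rank by rate, then by name; NA when the rank is out of range
rank_sketch <- function(df, outcome_col, num = "best") {
  ranked <- df[!is.na(df[[outcome_col]]), ]           # drop hospitals without data
  ranked <- ranked[order(ranked[[outcome_col]], ranked$Hospital.Name), ]
  n <- nrow(ranked)
  if (identical(num, "best")) num <- 1
  if (identical(num, "worst")) num <- n
  if (!is.numeric(num) || num < 1 || num > n) return(NA_character_)
  ranked$Hospital.Name[num]
}

rank_sketch(toy, "heart_failure", 2)        # second-lowest rate, tie broken by name
rank_sketch(toy, "heart_failure", "worst")  # last row after ordering
rank_sketch(toy, "heart_failure", 10)       # more than the 3 ranked hospitals
```

Ordering once by rate and then by name means ties need no separate branch: the row at position num is already the right one.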

Program ‘Rank All’: Ranking hospitals in all states

This last program is called ‘Rank All’. It’s a function called rankall that takes two arguments: an outcome name (outcome) and a hospital ranking (num).

The function uses the outcome_data data frame loaded from outcome-of-care-measures.csv and returns a 2-column data frame containing the hospital in each state that has the ranking specified in num.

For example, the function call rankall(“heart attack”, “best”) would return a data frame containing the names of the hospitals that are the best in their respective states for 30-day heart attack death rates.

The function should return a value for every state (some may be NA). The first column in the data frame is named hospital, which contains the hospital name, and the second column is named state, which contains the 2-character abbreviation for the state name. Hospitals that do not have data on a particular outcome should be excluded from the set of hospitals when deciding the rankings.

# "rankall" function takes an outcome ("heart attack", "heart failure",
# or "pneumonia") and a rank, and returns the hospital that holds
# that rank for the outcome's mortality rate in each state.

rankall <- function(outcome, num = "best") {

state <- as.vector(unique(outcome_data$State))

hospital <- vector(mode ="character")

for (i in seq_along(state)){
  rankhospital <- function(state, outcome, num = "best") {
    
    #--Input testing on state and outcome:
    possible_state <- (unique(outcome_data$State) == state)
    possible_outcome <- (c("heart attack", "heart failure", "pneumonia") == outcome)
    
    if(sum(possible_state) != 1){
      stop("invalid state")
    } else if(sum(possible_outcome) != 1){
      stop("invalid outcome")
    } else {
      
      #-- Creates State subset
      state_filter <- outcome_data[outcome_data$State == state,]
      
      # Turns the "Not Available" string into NA
      state_filter[state_filter == "Not Available"] <- NA
      
      state_subset <- data.frame(as.character(state_filter$Hospital.Name), 
                                 as.character(state_filter$State),
                                 as.numeric(state_filter$Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack),
                                 as.numeric(state_filter$Hospital.30.Day.Death..Mortality..Rates.from.Heart.Failure),
                                 as.numeric(state_filter$Hospital.30.Day.Death..Mortality..Rates.from.Pneumonia))
      
      colnames(state_subset) <- c("Hospital.Name", "State", "heart_attack", "heart_failure", "pneumonia")
      
      rm(state_filter)
      
      #-- Conditional outcome columns
      if(outcome == "heart attack"){ 
        outcome_col = 3
      } else if (outcome == "heart failure"){
        outcome_col = 4
      } else if (outcome == "pneumonia"){
        outcome_col = 5
      } else {
        stop("Not valid input for outcome.")
      }
      
      #--Ranking identifier:    
      n <- sum(complete.cases(state_subset[,outcome_col]))

      # Translate "best"/"worst" into numeric ranks so that the tie-breaking
      # step further down can index rows by num
      if(identical(num, "best")) {
        num <- 1
      } else if(identical(num, "worst")) {
        num <- n
      }

      if(is.numeric(num) && num >= 1 && num <= n) {
        # sort() drops NA values, so position num is the num-th lowest rate
        Nth_score <- sort(state_subset[,outcome_col])[num]
      } else {
        # Out-of-range or invalid ranks yield "NA" for this state
        return("NA")
      }
      
      
      #-- Hospitals whose rate equals the Nth score (which() drops NA comparisons)
      lookup_row <- which(state_subset[,outcome_col] == Nth_score)
      lookup_col <- which(colnames(state_subset)=="Hospital.Name")
      Nth_best <- as.vector(state_subset[lookup_row,lookup_col])
      
      countof_Nth_best <- as.numeric(length(Nth_best))
      
      if(countof_Nth_best == 1){
        print(Nth_best)
      } else if (countof_Nth_best > 1){    
        #-- Breaks ties
        top_Nth_filter <- state_subset[state_subset[,outcome_col] <= Nth_score & complete.cases(state_subset[,outcome_col]) == T,]
        top_Nth <- data.frame(as.character(top_Nth_filter$Hospital.Name), 
                              as.character(top_Nth_filter$State), 
                              as.numeric(top_Nth_filter$heart_attack),
                              as.numeric(top_Nth_filter$heart_failure),
                              as.numeric(top_Nth_filter$pneumonia))
        colnames(top_Nth) <- c("Hospital.Name", "State", "heart_attack", "heart_failure", "pneumonia")
        rm(top_Nth_filter)
        
        outcome_col_name <- names(top_Nth)[outcome_col]
        hospital_col_name <- names(top_Nth)[1]
        with_order <- with(top_Nth, order(top_Nth[outcome_col_name], top_Nth[hospital_col_name]))
        top_Nth_ordered <- top_Nth[with_order, ]
        as.vector(top_Nth_ordered[num,1])  
      } else {
        stop("No Hospitals Qualify")
      }
    }
  }
  
  hospital[i] <- rankhospital(state[i], outcome, num)
}

Nth_best <- data.frame(hospital, state)
print(Nth_best)
}

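Here are 2 examples of the input and output. The ## [1] lines that precede each table are printed by the print() call inside the helper for states where no tie has to be broken; the data frame that follows is the function’s result: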

# 20th-ranked hospital for heart attack in each state
rankall("heart attack", 20)
## [1] "D W MCMILLAN MEMORIAL HOSPITAL"
## [1] "JOHN C LINCOLN DEER VALLEY HOSPITAL"
## [1] "SKY RIDGE MEDICAL CENTER"
## [1] "COVENANT MEDICAL CENTER"
## [1] "COFFEYVILLE REGIONAL MEDICAL CENTER"
## [1] "HEYWOOD HOSPITAL"
## [1] "MARION GENERAL HOSPITAL"
## [1] "FRANKLIN REGIONAL HOSPITAL"
## [1] "MEDWEST HAYWOOD"
## [1] "HOSPITAL METROPOLITANO DR TITO MATTEI"
## [1] "ST CROIX REG MED CTR"
##                                                        hospital state
## 1                                D W MCMILLAN MEMORIAL HOSPITAL    AL
## 2                                                            NA    AK
## 3                           JOHN C LINCOLN DEER VALLEY HOSPITAL    AZ
## 4                             ARKANSAS METHODIST MEDICAL CENTER    AR
## 5                                         SHERMAN OAKS HOSPITAL    CA
## 6                                      SKY RIDGE MEDICAL CENTER    CO
## 7                                       MIDSTATE MEDICAL CENTER    CT
## 8                                                            NA    DE
## 9                                                            NA    DC
## 10                               SOUTH FLORIDA BAPTIST HOSPITAL    FL
## 11                                UPSON REGIONAL MEDICAL CENTER    GA
## 12                                                           NA    HI
## 13                                                           NA    ID
## 14 JESSE BROWN VA MEDICAL CENTER - VA CHICAGO HEALTHCARE SYSTEM    IL
## 15                                           COMMUNITY HOSPITAL    IN
## 16                                      COVENANT MEDICAL CENTER    IA
## 17                          COFFEYVILLE REGIONAL MEDICAL CENTER    KS
## 18                             KING'S DAUGHTERS' MEDICAL CENTER    KY
## 19                               NORTH OAKS MEDICAL CENTER, LLC    LA
## 20                                            RUMFORD  HOSPITAL    ME
## 21                                       CIVISTA MEDICAL CENTER    MD
## 22                                             HEYWOOD HOSPITAL    MA
## 23                GENESYS REGIONAL MEDICAL CENTER - HEALTH PARK    MI
## 24                                HEALTHEAST WOODWINDS HOSPITAL    MN
## 25                                      MARION GENERAL HOSPITAL    MS
## 26                                             LIBERTY HOSPITAL    MO
## 27                                                           NA    MT
## 28                                                           NA    NE
## 29                                                           NA    NV
## 30                                   FRANKLIN REGIONAL HOSPITAL    NH
## 31                     CAPITAL HEALTH MEDICAL CENTER - HOPEWELL    NJ
## 32                                                           NA    NM
## 33                                 METROPOLITAN HOSPITAL CENTER    NY
## 34                                              MEDWEST HAYWOOD    NC
## 35                                                           NA    ND
## 36                                 CINCINNATI VA MEDICAL CENTER    OH
## 37                             JACKSON COUNTY MEMORIAL HOSPITAL    OK
## 38                ST ALPHONSUS MEDICAL CENTER - BAKER CITY, INC    OR
## 39                                               UPMC PASSAVANT    PA
## 40                        HOSPITAL METROPOLITANO DR TITO MATTEI    PR
## 41                                                           NA    RI
## 42                                      PALMETTO HEALTH BAPTIST    SC
## 43                                                           NA    SD
## 44                                   INDIAN PATH MEDICAL CENTER    TN
## 45                                       NIX HEALTH CARE SYSTEM    TX
## 46                                                           NA    UT
## 47                                                           NA    VT
## 48                                                           NA    VI
## 49                            CARILION GILES COMMUNITY HOSPITAL    VA
## 50                                       SWEDISH MEDICAL CENTER    WA
## 51                                       PLATEAU MEDICAL CENTER    WV
## 52                                         ST CROIX REG MED CTR    WI
## 53                                                           NA    WY
## 54                                                           NA    GU
# 20th-ranked hospital for pneumonia in each state
rankall("pneumonia", 20)
## [1] "SCOTTSDALE HEALTHCARE-SHEA MEDICAL CENTER"
## [1] "JOHNS HOPKINS BAYVIEW MEDICAL CENTER"
## [1] "CONCORD HOSPITAL"
## [1] "LOS ALAMOS MEDICAL CENTER"
## [1] "LINTON HOSPITAL - CAH"
## [1] "ST VINCENT CHARITY MEDICAL CENTER"
## [1] "SEQUOYAH MEMORIAL HOSPITAL"
## [1] "SISTEMA INTEGRADOS DE SALUD DEL SUR OESTE INC"
## [1] "MARLBORO PARK HOSPITAL"
## [1] "DAVIS HOSPITAL AND MEDICAL CENTER"
##                                              hospital state
## 1                              CHILTON MEDICAL CENTER    AL
## 2                                                  NA    AK
## 3           SCOTTSDALE HEALTHCARE-SHEA MEDICAL CENTER    AZ
## 4          BAPTIST HEALTH MEDICAL CENTER HEBER SPINGS    AR
## 5  FOUNTAIN VALLEY REGIONAL HOSPITAL & MEDICAL CENTER    CA
## 6                    VALLEY VIEW HOSPITAL ASSOCIATION    CO
## 7                             MIDSTATE MEDICAL CENTER    CT
## 8                                                  NA    DE
## 9                                                  NA    DC
## 10                    KENDALL REGIONAL MEDICAL CENTER    FL
## 11                           JASPER MEMORIAL HOSPITAL    GA
## 12                                                 NA    HI
## 13                        BOUNDARY COMMUNITY HOSPITAL    ID
## 14                      METHODIST HOSPITAL OF CHICAGO    IL
## 15                         ST MARY MEDICAL CENTER INC    IN
## 16                     OTTUMWA REGIONAL HEALTH CENTER    IA
## 17                           ANDERSON COUNTY HOSPITAL    KS
## 18                              NORTON HOSPITALS, INC    KY
## 19            THE REGIONAL MEDICAL CENTER OF ACADIANA    LA
## 20                       DOWN EAST COMMUNITY HOSPITAL    ME
## 21               JOHNS HOPKINS BAYVIEW MEDICAL CENTER    MD
## 22                               HOLY FAMILY HOSPITAL    MA
## 23                                   PENNOCK HOSPITAL    MI
## 24                                   BUFFALO HOSPITAL    MN
## 25                        GRENADA LAKE MEDICAL CENTER    MS
## 26                 BARNES-JEWISH WEST COUNTY HOSPITAL    MO
## 27                MARCUS DALY MEMORIAL HOSPITAL - CAH    MT
## 28                     FAITH REGIONAL HEALTH SERVICES    NE
## 29                              BOULDER CITY HOSPITAL    NV
## 30                                   CONCORD HOSPITAL    NH
## 31                           HOLY NAME MEDICAL CENTER    NJ
## 32                          LOS ALAMOS MEDICAL CENTER    NM
## 33              NIAGARA FALLS MEMORIAL MEDICAL CENTER    NY
## 34               SOUTHEASTERN REGIONAL MEDICAL CENTER    NC
## 35                              LINTON HOSPITAL - CAH    ND
## 36                  ST VINCENT CHARITY MEDICAL CENTER    OH
## 37                         SEQUOYAH MEMORIAL HOSPITAL    OK
## 38                          WALLOWA MEMORIAL HOSPITAL    OR
## 39                         ELK REGIONAL HEALTH CENTER    PA
## 40      SISTEMA INTEGRADOS DE SALUD DEL SUR OESTE INC    PR
## 41                                                 NA    RI
## 42                             MARLBORO PARK HOSPITAL    SC
## 43   ST MICHAEL'S HOSPITAL - CRITICAL ACCESS HOSPITAL    SD
## 44              METHODIST MEDICAL CENTER OF OAK RIDGE    TN
## 45                           DOCTORS HOSPITAL TIDWELL    TX
## 46                  DAVIS HOSPITAL AND MEDICAL CENTER    UT
## 47                                                 NA    VT
## 48                                                 NA    VI
## 49                   SENTARA NORFOLK GENERAL HOSPITAL    VA
## 50                     KADLEC REGIONAL MEDICAL CENTER    WA
## 51                          FAIRMONT GENERAL HOSPITAL    WV
## 52                           COMMUNITY MEMORIAL HSPTL    WI
## 53                         SHERIDAN VA MEDICAL CENTER    WY
## 54                                                 NA    GU
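
The per-state loop can also be expressed by splitting the data by state and applying one ranking step to each piece. The sketch below is a minimal illustration on invented data, not the program above: rankall_sketch is a hypothetical helper that assumes num is a positive integer, and it returns a true NA rather than the string "NA" for states with too few hospitals.

```r
# Toy data (hospital names and rates are invented for illustration)
toy <- data.frame(
  Hospital.Name = c("A HOSPITAL", "B HOSPITAL", "C HOSPITAL", "D HOSPITAL"),
  State = c("MD", "MD", "TX", "TX"),
  pneumonia = c(12.0, 10.5, NA, 9.8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per state with the hospital at rank num
rankall_sketch <- function(df, outcome_col, num = 1) {
  by_state <- split(df, df$State)                      # list of per-state data frames
  hospital <- vapply(by_state, function(s) {
    s <- s[!is.na(s[[outcome_col]]), ]                 # drop hospitals without data
    s <- s[order(s[[outcome_col]], s$Hospital.Name), ] # rate first, then name
    if (num > nrow(s)) NA_character_ else s$Hospital.Name[num]
  }, character(1))
  data.frame(hospital = unname(hospital), state = names(by_state),
             stringsAsFactors = FALSE)
}

rankall_sketch(toy, "pneumonia", 2)
```

Because split() orders its pieces by state abbreviation, the resulting rows come out alphabetically by state rather than in the order the states appear in the file.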

Conclusion

Researching the right hospital is a long process. These programs let the user narrow the search quickly by drawing on data about the quality of care at over 4,000 hospitals in the U.S. Narrowing that search is half the battle!

A full description of the variables in each of the files is in the included PDF named Hospital_Revised_Flatfiles.pdf.

Enjoy the program!