Setting up the defaults:

knitr::opts_chunk$set(echo = TRUE, results = "asis")

Introduction

Download the file ProgAssignment3-data.zip containing the data for Programming Assignment 3 from the Coursera web site. Unzip the file in a directory that will serve as your working directory. When you start up R, make sure to change your working directory to the directory where you unzipped the data.

The data for this assignment come from the Hospital Compare web site (http://hospitalcompare.hhs.gov) run by the U.S. Department of Health and Human Services. The purpose of the web site is to provide data and information about the quality of care at over 4,000 Medicare-certified hospitals in the U.S. This dataset essentially covers all major U.S. hospitals. This dataset is used for a variety of purposes, including determining whether hospitals should be fined for not providing high quality care to patients (see http://goo.gl/jAXFX for some background on this particular topic).

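The Hospital Compare web site contains a lot of data and we will only look at a small subset for this assignment. The zip file for this assignment contains three files:

outcome-of-care-measures.csv: 30-day mortality and readmission rates for heart attack, heart failure, and pneumonia

hospital-data.csv: information about each hospital

Hospital_Revised_Flatfiles.pdf: descriptions of the variables in each file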

A description of the variables in each of the files is in the included PDF file named Hospital_Revised_Flatfiles.pdf. This document contains information about many other files that are not included with this programming assignment. You will want to focus on the variables for Number 19 (“Outcome of Care Measures.csv”) and Number 11 (“Hospital Data.csv”). You may find it useful to print out this document (at least the pages for Tables 19 and 11) to have next to you while you work on this assignment. In particular, the numbers of the variables for each table indicate column indices in each table (i.e. “Hospital Name” is column 2 in the outcome-of-care-measures.csv file).

1 Plot the 30-day mortality rates for heart attack

Read the outcome data into R via the read.csv function and look at the first few rows.

outcome <- read.csv("outcome-of-care-measures.csv", colClasses = "character")
head(outcome)

  Provider.Number Hospital.Name
1 010001          SOUTHEAST ALABAMA MEDICAL CENTER
2 010005          MARSHALL MEDICAL CENTER SOUTH
3 010006          ELIZA COFFEE MEMORIAL HOSPITAL
4 010007          MIZELL MEMORIAL HOSPITAL
5 010008          CRENSHAW COMMUNITY HOSPITAL
6 010010          MARSHALL MEDICAL CENTER NORTH
  Address.1                  Address.2 Address.3 City         State
1 1108 ROSS CLARK CIRCLE                         DOTHAN       AL
2 2505 U S HIGHWAY 431 NORTH                     BOAZ         AL
3 205 MARENGO STREET                             FLORENCE     AL
4 702 N MAIN ST                                  OPP          AL
5 101 HOSPITAL CIRCLE                            LUVERNE      AL
6 8000 ALABAMA HIGHWAY 69                        GUNTERSVILLE AL
  ZIP.Code County.Name Phone.Number
1 36301    HOUSTON     3347938701
2 35957    MARSHALL    2565938310
3 35631    LAUDERDALE  2567688400
4 36467    COVINGTON   3344933541
5 36049    CRENSHAW    3343353374
6 35976    MARSHALL    2565718000
  Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack
1 14.3
2 18.5
3 18.1
4 Not Available
5 Not Available
6 Not Available
  Comparison.to.U.S..Rate...Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack
1 No Different than U.S. National Rate
2 No Different than U.S. National Rate
3 No Different than U.S. National Rate
4 Number of Cases Too Small
5 Number of Cases Too Small
6 Number of Cases Too Small
  Lower.Mortality.Estimate...Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack
1 12.1
2 14.7
3 14.8
4 Not Available
5 Not Available
6 Not Available
  Upper.Mortality.Estimate...Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack
1 17.0
2 23.0
3 21.8
4 Not Available
5 Not Available
6 Not Available
  Number.of.Patients...Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack
1 666
2 44
3 329
4 14
5 9
6 22
  Footnote...Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack
1
2
3
4 number of cases is too small (fewer than 25) to reliably tell how well the hospital is performing
5 number of cases is too small (fewer than 25) to reliably tell how well the hospital is performing
6 number of cases is too small (fewer than 25) to reliably tell how well the hospital is performing
  Hospital.30.Day.Death..Mortality..Rates.from.Heart.Failure
1 11.4
2 15.2
3 11.3
4 13.6
5 13.8
6 12.5
  Comparison.to.U.S..Rate...Hospital.30.Day.Death..Mortality..Rates.from.Heart.Failure
1 No Different than U.S. National Rate
2 Worse than U.S. National Rate
3 No Different than U.S. National Rate
4 No Different than U.S. National Rate
5 No Different than U.S. National Rate
6 No Different than U.S. National Rate
  Lower.Mortality.Estimate...Hospital.30.Day.Death..Mortality..Rates.from.Heart.Failure
1 9.5
2 12.2
3 9.1
4 10.0
5 9.9
6 9.9
  Upper.Mortality.Estimate...Hospital.30.Day.Death..Mortality..Rates.from.Heart.Failure
1 13.7
2 18.8
3 13.9
4 18.2
5 18.7
6 15.6
  Number.of.Patients...Hospital.30.Day.Death..Mortality..Rates.from.Heart.Failure
1 741
2 234
3 523
4 113
5 53
6 163
  Footnote...Hospital.30.Day.Death..Mortality..Rates.from.Heart.Failure
1
2
3
4
5
6
  Hospital.30.Day.Death..Mortality..Rates.from.Pneumonia
1 10.9
2 13.9
3 13.4
4 14.9
5 15.8
6 8.7
  Comparison.to.U.S..Rate...Hospital.30.Day.Death..Mortality..Rates.from.Pneumonia
1 No Different than U.S. National Rate
2 No Different than U.S. National Rate
3 No Different than U.S. National Rate
4 No Different than U.S. National Rate
5 No Different than U.S. National Rate
6 Better than U.S. National Rate
  Lower.Mortality.Estimate...Hospital.30.Day.Death..Mortality..Rates.from.Pneumonia
1 8.6
2 11.3
3 11.2
4 11.6
5 11.4
6 6.8
  Upper.Mortality.Estimate...Hospital.30.Day.Death..Mortality..Rates.from.Pneumonia
1 13.7
2 17.0
3 15.8
4 19.0
5 21.5
6 11.0
  Number.of.Patients...Hospital.30.Day.Death..Mortality..Rates.from.Pneumonia
1 371
2 372
3 836
4 239
5 61
6 315
  Footnote...Hospital.30.Day.Death..Mortality..Rates.from.Pneumonia
1
2
3
4
5
6
  Hospital.30.Day.Readmission.Rates.from.Heart.Attack
1 19.0
2 Not Available
3 17.8
4 Not Available
5 Not Available
6 Not Available
  Comparison.to.U.S..Rate...Hospital.30.Day.Readmission.Rates.from.Heart.Attack
1 No Different than U.S. National Rate
2 Number of Cases Too Small
3 No Different than U.S. National Rate
4 Number of Cases Too Small
5 Number of Cases Too Small
6 Number of Cases Too Small
  Lower.Readmission.Estimate...Hospital.30.Day.Readmission.Rates.from.Heart.Attack
1 16.6
2 Not Available
3 14.9
4 Not Available
5 Not Available
6 Not Available
  Upper.Readmission.Estimate...Hospital.30.Day.Readmission.Rates.from.Heart.Attack
1 21.7
2 Not Available
3 21.5
4 Not Available
5 Not Available
6 Not Available
  Number.of.Patients...Hospital.30.Day.Readmission.Rates.from.Heart.Attack
1 728
2 21
3 342
4 1
5 4
6 13
  Footnote...Hospital.30.Day.Readmission.Rates.from.Heart.Attack
1
2 number of cases is too small (fewer than 25) to reliably tell how well the hospital is performing
3
4 number of cases is too small (fewer than 25) to reliably tell how well the hospital is performing
5 number of cases is too small (fewer than 25) to reliably tell how well the hospital is performing
6 number of cases is too small (fewer than 25) to reliably tell how well the hospital is performing
  Hospital.30.Day.Readmission.Rates.from.Heart.Failure
1 23.7
2 22.5
3 19.8
4 27.1
5 24.7
6 23.9
  Comparison.to.U.S..Rate...Hospital.30.Day.Readmission.Rates.from.Heart.Failure
1 No Different than U.S. National Rate
2 No Different than U.S. National Rate
3 Better than U.S. National Rate
4 No Different than U.S. National Rate
5 No Different than U.S. National Rate
6 No Different than U.S. National Rate
  Lower.Readmission.Estimate...Hospital.30.Day.Readmission.Rates.from.Heart.Failure
1 21.3
2 19.2
3 17.2
4 22.4
5 19.9
6 20.1
  Upper.Readmission.Estimate...Hospital.30.Day.Readmission.Rates.from.Heart.Failure
1 26.5
2 26.1
3 22.9
4 31.9
5 30.2
6 28.2
  Number.of.Patients...Hospital.30.Day.Readmission.Rates.from.Heart.Failure
1 891
2 264
3 614
4 135
5 59
6 173
  Footnote...Hospital.30.Day.Readmission.Rates.from.Heart.Failure
1
2
3
4
5
6
  Hospital.30.Day.Readmission.Rates.from.Pneumonia
1 17.1
2 17.6
3 16.9
4 19.4
5 18.0
6 18.7
  Comparison.to.U.S..Rate...Hospital.30.Day.Readmission.Rates.from.Pneumonia
1 No Different than U.S. National Rate
2 No Different than U.S. National Rate
3 No Different than U.S. National Rate
4 No Different than U.S. National Rate
5 No Different than U.S. National Rate
6 No Different than U.S. National Rate
  Lower.Readmission.Estimate...Hospital.30.Day.Readmission.Rates.from.Pneumonia
1 14.4
2 15.0
3 14.7
4 15.9
5 14.0
6 15.7
  Upper.Readmission.Estimate...Hospital.30.Day.Readmission.Rates.from.Pneumonia
1 20.4
2 20.6
3 19.5
4 23.2
5 22.8
6 22.2
  Number.of.Patients...Hospital.30.Day.Readmission.Rates.from.Pneumonia
1 400
2 374
3 842
4 254
5 56
6 326
  Footnote...Hospital.30.Day.Readmission.Rates.from.Pneumonia
1
2
3
4
5
6

There are many columns in this dataset. You can see how many by typing ncol(outcome) (you can see the number of rows with the nrow function). In addition, you can see the names of each column by typing names(outcome) (the names are also in the PDF document).
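As a small illustration of these inspection functions, the sketch below uses a made-up three-column data frame in place of the real file. The name `toy` and all of its values are placeholders invented for this example, not part of the assignment data.

```r
# Hypothetical stand-in for the outcome data; values are made up for illustration
toy <- data.frame(
  Provider.Number = c("010001", "010005", "010006"),
  Hospital.Name   = c("HOSPITAL A", "HOSPITAL B", "HOSPITAL C"),
  State           = c("AL", "AL", "AK"),
  stringsAsFactors = FALSE
)

ncol(toy)    # number of columns
nrow(toy)    # number of rows
names(toy)   # column names, in column-index order

# A column can be selected by its index or, equivalently, by its name
byIndex <- toy[, 2]
byName  <- toy[, "Hospital.Name"]
identical(byIndex, byName)
```

Selecting by name is usually easier to read, while selecting by index matches the variable numbers printed in the PDF.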

To make a simple histogram of the 30-day death rates from heart attack (column 11 in the outcome dataset), run

outcome[, 11] <- as.numeric(outcome[, 11])
## Warning: NAs introduced by coercion
## You may get a warning about NAs being introduced; that is okay
hist(outcome[, 11]) 

Because we originally read the data in as character (by specifying colClasses = "character"), we need to coerce the column to be numeric. You may get a warning about NAs being introduced, but that is okay.
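To see where that warning comes from, here is a minimal sketch on a made-up character vector (the values are invented, but they have the same textual form as the rate column). Entries that are not numbers, such as "Not Available", become NA.

```r
# Made-up rates in the same textual form as the outcome file
rates <- c("14.3", "18.5", "Not Available", "18.1")

# as.numeric() cannot parse "Not Available", so that entry becomes NA.
# R would normally signal "NAs introduced by coercion"; it is suppressed here.
numericRates <- suppressWarnings(as.numeric(rates))

is.na(numericRates)               # flags the entry that had no data
mean(numericRates, na.rm = TRUE)  # summaries can skip the missing value
```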

2 Finding the best hospital in a state

Write a function called best that takes two arguments: the 2-character abbreviated name of a state and an outcome name. The function reads the outcome-of-care-measures.csv file and returns a character vector with the name of the hospital that has the best (i.e. lowest) 30-day mortality for the specified outcome in that state. The hospital name is the name provided in the Hospital.Name variable. The outcomes can be one of “heart attack”, “heart failure”, or “pneumonia”. Hospitals that do not have data on a particular outcome should be excluded from the set of hospitals when deciding the rankings.

Handling ties. If there is a tie for the best hospital for a given outcome, then the hospital names should be sorted in alphabetical order and the first hospital in that set should be chosen (i.e. if hospitals “b”, “c”, and “f” are tied for best, then hospital “b” should be returned).
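The tie rule can be sketched on toy data as follows. The hospital names and rates are made up to mirror the “b”, “c”, “f” example, and this is only one possible way to apply the rule.

```r
# Made-up hospitals; "b", "c", and "f" share the lowest rate
hospitals <- c("f", "a", "c", "b")
rates     <- c(10.1, 12.4, 10.1, 10.1)

# Keep only the hospitals at the minimum rate,
# then take the first name in alphabetical order
tied   <- hospitals[rates == min(rates)]
winner <- sort(tied)[1]
winner
```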

Answer:

best <- function(state, outcome){
  library(dplyr)
  ## Read outcome data
  outcomeData <- read.csv("outcome-of-care-measures.csv", colClasses = "character")

  ## Map each valid outcome name to its 30-day mortality column
  rateColumns <- c(
    "heart attack"  = "Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack",
    "heart failure" = "Hospital.30.Day.Death..Mortality..Rates.from.Heart.Failure",
    "pneumonia"     = "Hospital.30.Day.Death..Mortality..Rates.from.Pneumonia"
  )

  ## Check that state and outcome are valid
  if (!(state %in% outcomeData$State)) stop("invalid state")
  if (!(outcome %in% names(rateColumns))) stop("invalid outcome")
  rateColumn <- rateColumns[[outcome]]

  ## Keep the hospitals in the state that report a rate for this outcome
  hasRate <- outcomeData$State == state & outcomeData[[rateColumn]] != "Not Available"
  filtered <- outcomeData[hasRate, c("Hospital.Name", rateColumn)]
  names(filtered) <- c("Hospital.Name", "Rate")
  filtered$Rate <- as.numeric(filtered$Rate)

  ## Sort by rate, breaking ties alphabetically by hospital name
  sortedData <- arrange(filtered, Rate, Hospital.Name)

  ## Return hospital name in that state with lowest 30-day death rate
  sortedData$Hospital.Name[1]
}

The function should check the validity of its arguments. If an invalid state value is passed to best, the function should throw an error via the stop function with the exact message “invalid state”. If an invalid outcome value is passed to best, the function should throw an error via the stop function with the exact message “invalid outcome”.

Here is some sample output from the function.

best("TX", "heart attack")
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

[1] "CYPRESS FAIRBANKS MEDICAL CENTER"

best("TX", "heart failure")

[1] "FORT DUNCAN MEDICAL CENTER"

best("MD", "heart attack")

[1] "JOHNS HOPKINS HOSPITAL, THE"

best("MD", "pneumonia")

[1] "GREATER BALTIMORE MEDICAL CENTER"

best("BB", "heart attack")

Error in best("BB", "heart attack") : invalid state

best("NY", "hert attack")

Error in best("NY", "hert attack") : invalid outcome

Save your code for this function to a file named best.R.

3 Ranking hospitals by outcome in a state

Write a function called rankhospital that takes three arguments: the 2-character abbreviated name of a state (state), an outcome (outcome), and the ranking of a hospital in that state for that outcome (num). The function reads the outcome-of-care-measures.csv file and returns a character vector with the name of the hospital that has the ranking specified by the num argument. For example, the call

rankhospital("MD", "heart failure", 5)

would return a character vector containing the name of the hospital with the 5th lowest 30-day death rate for heart failure. The num argument can take values “best”, “worst”, or an integer indicating the ranking (smaller numbers are better). If the number given by num is larger than the number of hospitals in that state, then the function should return NA. Hospitals that do not have data on a particular outcome should be excluded from the set of hospitals when deciding the rankings.
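One way to think about the num argument is as a translation into a row position. The sketch below shows this on a made-up vector of names that is assumed to be already sorted from best to worst. The helper name `pickRank` is hypothetical and not part of the assignment.

```r
# Hypothetical helper (not part of the assignment): pick a name by rank
# from a vector that is already sorted from best to worst
pickRank <- function(sortedNames, num = "best") {
  if (identical(num, "best")) {
    position <- 1
  } else if (identical(num, "worst")) {
    position <- length(sortedNames)
  } else {
    position <- num
  }
  # Ranks outside the available range yield NA
  if (position < 1 || position > length(sortedNames)) {
    return(NA_character_)
  }
  sortedNames[position]
}

sortedNames <- c("HOSPITAL A", "HOSPITAL B", "HOSPITAL C")
pickRank(sortedNames, "best")
pickRank(sortedNames, "worst")
pickRank(sortedNames, 2)
pickRank(sortedNames, 5000)
```

Using identical() rather than == avoids comparing a number against the strings "best" and "worst" through implicit coercion.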

Handling ties. It may occur that multiple hospitals have the same 30-day mortality rate for a given cause of death. In those cases ties should be broken by using the hospital name. For example, in Texas (“TX”), the hospitals with lowest 30-day mortality rate for heart failure are shown here.

head(texas)

                        Hospital.Name Rate Rank
3935       FORT DUNCAN MEDICAL CENTER  8.1    1
4085  TOMBALL REGIONAL MEDICAL CENTER  8.5    2
4103 CYPRESS FAIRBANKS MEDICAL CENTER  8.7    3
3954           DETAR HOSPITAL NAVARRO  8.7    4
4010           METHODIST HOSPITAL,THE  8.8    5
3962  MISSION REGIONAL MEDICAL CENTER  8.8    6

Note that Cypress Fairbanks Medical Center and Detar Hospital Navarro both have the same 30-day rate (8.7). However, because Cypress comes before Detar alphabetically, Cypress is ranked number 3 in this scheme and Detar is ranked number 4. One can use the order function to sort multiple vectors in this manner (i.e. where one vector is used to break ties in another vector).
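A minimal sketch of that use of order is given below. It takes four rows from the Texas table above, deliberately shuffled, and the data frame name `texasSample` is made up for this example.

```r
# Four rows from the Texas table, deliberately out of order
texasSample <- data.frame(
  Hospital.Name = c("DETAR HOSPITAL NAVARRO",
                    "FORT DUNCAN MEDICAL CENTER",
                    "CYPRESS FAIRBANKS MEDICAL CENTER",
                    "TOMBALL REGIONAL MEDICAL CENTER"),
  Rate = c(8.7, 8.1, 8.7, 8.5),
  stringsAsFactors = FALSE
)

# order() sorts by its first argument and uses later arguments to break ties
ranking <- order(texasSample$Rate, texasSample$Hospital.Name)
ranked  <- texasSample[ranking, ]
ranked$Rank <- seq_len(nrow(ranked))
ranked
```

Because the rate is the first key and the name is the second, Cypress lands ahead of Detar even though both have a rate of 8.7.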

Answer:

rankhospital <- function(state, outcome, num = "best"){
  library(dplyr)
  ## Read outcome data
  outcomeData <- read.csv("outcome-of-care-measures.csv", colClasses = "character")

  ## Map each valid outcome name to its 30-day mortality column
  rateColumns <- c(
    "heart attack"  = "Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack",
    "heart failure" = "Hospital.30.Day.Death..Mortality..Rates.from.Heart.Failure",
    "pneumonia"     = "Hospital.30.Day.Death..Mortality..Rates.from.Pneumonia"
  )

  ## Check that state and outcome are valid
  if (!(state %in% outcomeData$State)) stop("invalid state")
  if (!(outcome %in% names(rateColumns))) stop("invalid outcome")
  rateColumn <- rateColumns[[outcome]]

  ## Keep the hospitals in the state that report a rate for this outcome
  hasRate <- outcomeData$State == state & outcomeData[[rateColumn]] != "Not Available"
  filtered <- outcomeData[hasRate, c("Hospital.Name", rateColumn)]
  names(filtered) <- c("Hospital.Name", "Rate")
  filtered$Rate <- as.numeric(filtered$Rate)

  ## Sort by rate, breaking ties alphabetically by hospital name
  sortedData <- arrange(filtered, Rate, Hospital.Name)

  ## Translate num ("best", "worst", or an integer) into a row position
  if (identical(num, "best")) {
    position <- 1
  } else if (identical(num, "worst")) {
    position <- nrow(sortedData)
  } else {
    position <- num
  }

  ## Return NA when the requested rank is outside the available range
  if (position < 1 || position > nrow(sortedData)) {
    return(NA_character_)
  }

  ## Return hospital name in that state with the given rank
  sortedData$Hospital.Name[position]
}

The function should check the validity of its arguments. If an invalid state value is passed to rankhospital, the function should throw an error via the stop function with the exact message “invalid state”. If an invalid outcome value is passed to rankhospital, the function should throw an error via the stop function with the exact message “invalid outcome”.
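The argument checks described above could be sketched as follows. The names `checkArguments`, `captureMessage`, and `validStates` are hypothetical, and the list of states is a made-up stand-in for the states found in the data file.

```r
# Hypothetical validator (names are illustrative, not from the assignment)
checkArguments <- function(state, outcome, validStates) {
  validOutcomes <- c("heart attack", "heart failure", "pneumonia")
  if (!(state %in% validStates)) stop("invalid state")
  if (!(outcome %in% validOutcomes)) stop("invalid outcome")
  invisible(TRUE)
}

# tryCatch() captures the error message instead of halting the script
captureMessage <- function(expr) {
  tryCatch({
    expr
    "no error"
  }, error = function(e) conditionMessage(e))
}

validStates <- c("MD", "NY", "TX")  # made-up subset of states
captureMessage(checkArguments("BB", "heart attack", validStates))
captureMessage(checkArguments("NY", "hert attack", validStates))
captureMessage(checkArguments("TX", "pneumonia", validStates))
```

Unlike print, stop halts the function immediately, so no result is returned for invalid input.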

Here is some sample output from the function.

rankhospital("TX", "heart failure", 4)

[1] "DETAR HOSPITAL NAVARRO"

rankhospital("MD", "heart attack", "worst") 

[1] "HARFORD MEMORIAL HOSPITAL"

rankhospital("MN", "heart attack", 5000)

[1] NA

Save your code for this function to a file named rankhospital.R.

4 Ranking hospitals in all states

Write a function called rankall that takes two arguments: an outcome name (outcome) and a hospital ranking (num). The function reads the outcome-of-care-measures.csv file and returns a 2-column data frame containing the hospital in each state that has the ranking specified in num. For example the function call rankall("heart attack", "best") would return a data frame containing the names of the hospitals that are the best in their respective states for 30-day heart attack death rates. The function should return a value for every state (some may be NA). The first column in the data frame is named hospital, which contains the hospital name, and the second column is named state, which contains the 2-character abbreviation for the state name. Hospitals that do not have data on a particular outcome should be excluded from the set of hospitals when deciding the rankings.

Handling ties. The rankall function should handle ties in the 30-day mortality rates in the same way that the rankhospital function handles ties.

NOTE: For the purpose of this part of the assignment (and for efficiency), your function should NOT call the rankhospital function from the previous section.

The function should check the validity of its arguments. If an invalid outcome value is passed to rankall, the function should throw an error via the stop function with the exact message “invalid outcome”. The num variable can take values “best”, “worst”, or an integer indicating the ranking (smaller numbers are better). If the number given by num is larger than the number of hospitals in that state, then the function should return NA.
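One possible way to rank within every state at once, without calling rankhospital, is to sort the whole table a single time and then split it by state. The sketch below shows the idea on made-up data. The data frame `toy`, its values, and the fixed `num` are assumptions for illustration only.

```r
# Made-up data: state "AA" has three hospitals, state "BB" has one
toy <- data.frame(
  hospital = c("H1", "H2", "H3", "H4"),
  state    = c("AA", "AA", "AA", "BB"),
  rate     = c(12.0, 10.5, 10.5, 9.9),
  stringsAsFactors = FALSE
)

# Sort once by state, rate, and name, then split the names by state
toy     <- toy[order(toy$state, toy$rate, toy$hospital), ]
byState <- split(toy$hospital, toy$state)

# Take the hospital ranked `num` in each state; missing ranks become NA
num    <- 2
picked <- vapply(byState, function(h) h[num], character(1))
result <- data.frame(hospital = unname(picked), state = names(picked),
                     stringsAsFactors = FALSE)
result
```

State "BB" has only one hospital, so asking for rank 2 gives NA there, while "AA" returns "H3" because "H2" wins the tie at 10.5 alphabetically.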

Answer:

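rankall <- function(outcome, num = "best"){
  ## Read outcome data
  outcomeData <- read.csv("outcome-of-care-measures.csv", colClasses = "character")

  ## Map each valid outcome name to its 30-day mortality column
  rateColumns <- c(
    "heart attack"  = "Hospital.30.Day.Death..Mortality..Rates.from.Heart.Attack",
    "heart failure" = "Hospital.30.Day.Death..Mortality..Rates.from.Heart.Failure",
    "pneumonia"     = "Hospital.30.Day.Death..Mortality..Rates.from.Pneumonia"
  )

  ## Check that outcome is valid
  if (!(outcome %in% names(rateColumns))) stop("invalid outcome")
  rateColumn <- rateColumns[[outcome]]

  ## Keep hospitals that report a rate, then sort by state, rate, and name
  hasRate <- outcomeData[[rateColumn]] != "Not Available"
  ranked <- data.frame(hospital = outcomeData$Hospital.Name[hasRate],
                       state = outcomeData$State[hasRate],
                       rate = as.numeric(outcomeData[[rateColumn]][hasRate]),
                       stringsAsFactors = FALSE)
  ranked <- ranked[order(ranked$state, ranked$rate, ranked$hospital), ]

  ## For each state, find the hospital of the given rank (NA if there is none)
  states <- sort(unique(outcomeData$State))
  hospital <- vapply(states, function(s){
    inState <- ranked$hospital[ranked$state == s]
    position <- if (identical(num, "best")) 1 else if (identical(num, "worst")) length(inState) else num
    if (position < 1 || position > length(inState)) NA_character_ else inState[position]
  }, character(1), USE.NAMES = FALSE)

  ## Return a data frame with the hospital names and the state abbreviations
  data.frame(hospital = hospital, state = states, stringsAsFactors = FALSE)
}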

Here is some sample output from the function.

head(rankall("heart attack", 20), 10)
                              hospital state
1                                 <NA>    AK
2       D W MCMILLAN MEMORIAL HOSPITAL    AL
3    ARKANSAS METHODIST MEDICAL CENTER    AR
4  JOHN C LINCOLN DEER VALLEY HOSPITAL    AZ
5                SHERMAN OAKS HOSPITAL    CA
6             SKY RIDGE MEDICAL CENTER    CO
7              MIDSTATE MEDICAL CENTER    CT
8                                 <NA>    DC
9                                 <NA>    DE
10      SOUTH FLORIDA BAPTIST HOSPITAL    FL

tail(rankall("pneumonia", "worst"), 3)
                                     hospital state
52 MAYO CLINIC HEALTH SYSTEM - NORTHLAND, INC    WI
53                     PLATEAU MEDICAL CENTER    WV
54           NORTH BIG HORN HOSPITAL DISTRICT    WY

tail(rankall("heart failure"), 10)
                                                            hospital state
45                         WELLMONT HAWKINS COUNTY MEMORIAL HOSPITAL    TN
46                                        FORT DUNCAN MEDICAL CENTER    TX
47 VA SALT LAKE CITY HEALTHCARE - GEORGE E. WAHLEN VA MEDICAL CENTER    UT
48                                          SENTARA POTOMAC HOSPITAL    VA
49                            GOV JUAN F LUIS HOSPITAL & MEDICAL CTR    VI
50                                              SPRINGFIELD HOSPITAL    VT
51                                         HARBORVIEW MEDICAL CENTER    WA
52                                    AURORA ST LUKES MEDICAL CENTER    WI
53                                         FAIRMONT GENERAL HOSPITAL    WV
54                                        CHEYENNE VA MEDICAL CENTER    WY

Save your code for this function to a file named rankall.R.