Introduction

In October 2016, IHME released an update to the Global Burden of Disease estimates. For the first time, these updates included prevalence figures for difference diseases. This document aims to compare prevalence estimates for QOF with those calculated by IHME. The latest GBD visualisation can be accesses here: https://vizhub.healthdata.org/gbd-compare/.

Libraries

To munge and plot the data I’ll need a few packages

library(dplyr)
library(ggplot2)
library(stringr)

Import data

I’ve already exported the data from the GBD and QOF websites, so I just need to read in the files as csvs.

gbdOrig <- read.csv("IHME-GBD_2015_DATA-46e40f0f-1.csv")
qofOrig <- read.csv("qof.csv")
causelookup <- read.csv("cause_lookup.csv")
utlalookup <- read.csv("region_lookup.csv")

Munging

I need to munge and join the tables together to be able to plot them.

qof <- left_join(qofOrig,utlalookup, by = c("UTLA11" = "UTLA13CD")) %>%
        group_by(Year, GBD_name, PrevLabel) %>%
        summarise(count = sum(SumOfNumerator), 
                  denominator = sum(SumOfDenominator)) %>%
        data.frame() %>%
        mutate(GBD_name = as.character(GBD_name))

qofEng <- group_by(qof,Year,PrevLabel)  %>%
        summarise(count = sum(count), 
                  denominator = sum(denominator)) %>%
        mutate(GBD_name = "England") %>%
        data.frame()

qof <- rbind(qof,qofEng) %>%
        mutate(Measure = "QOF prevalence",
               Percent = 100 * (count/denominator)) %>%
        select(-(count:denominator)) %>%
        left_join(causelookup, by = c("PrevLabel" = "QOF")) %>%
        filter(!is.na(GBD)) %>%
        mutate(`GBD (QOF)` = paste0(GBD," (",PrevLabel,")")) %>%
        select(Year,`GBD (QOF)`,GBD_name,Measure,Percent) %>%
        data.frame() %>%
        rename(Location = GBD_name,`GBD (QOF)` = GBD..QOF.) 
        
gbd <- filter(gbdOrig,metric_name == "Rate" & sex_name == "Both" & location_name != "Global" ) %>%
        select(year,measure_name,location_name,cause_name,val) %>%
        mutate(measure_name = paste("GBD",tolower(measure_name),sep = " ")) %>%
        left_join(causelookup, by = c("cause_name" = "GBD")) %>%
        filter(!(is.na(QOF))) %>%
        mutate(val = val / 1000, 
               `GBD (QOF)` = paste0(cause_name," (",QOF,")")) %>%
        select(year,`GBD (QOF)`,location_name,measure_name,val) %>%
        rename(Year = year, Location = location_name, Measure = measure_name, Percent = val)

df <- rbind(gbd,qof) %>%
        mutate(PrevInc = ifelse(grepl("prevalence",Measure),"Prevalence","Incidence")) %>%
        droplevels()
df$`GBD (QOF)` <- factor(df$`GBD (QOF)`)
df$PrevInc <- factor(df$PrevInc, levels = c("Prevalence","Incidence"))
wrapletters <- 8
df$Location <- str_wrap(df$Location,wrapletters)
df$Location <- factor(df$Location,levels = str_wrap(c("England","North East England","North West England",
                         "Yorkshire and the Humber","East Midlands","West Midlands",
                         "East of England","Greater London",
                         "South East England","South West England"),wrapletters))
df$Measure <- factor(df$Measure, levels = c("GBD acute prevalence",
                               "GBD chronic prevalence", 
                               "GBD prevalence", 
                               "QOF prevalence",
                               "GBD acute incidence",
                               "GBD incidence"))

Plotting the outputs

I’ve munged the data, I just need to present the outputs. Each title is the GBD term for the condition followed by the QOF term for the condition in brackets. Not all of the conditions that are plotted against each other are identical.