In October 2016, IHME released an update to the Global Burden of Disease estimates. For the first time, these updates included prevalence figures for difference diseases. This document aims to compare prevalence estimates for QOF with those calculated by IHME. The latest GBD visualisation can be accesses here: https://vizhub.healthdata.org/gbd-compare/.
To munge and plot the data I’ll need a few packages
library(dplyr)
library(ggplot2)
library(stringr)
I’ve already exported the data from the GBD and QOF websites, so I just need to read in the files as csvs.
gbdOrig <- read.csv("IHME-GBD_2015_DATA-46e40f0f-1.csv")
qofOrig <- read.csv("qof.csv")
causelookup <- read.csv("cause_lookup.csv")
utlalookup <- read.csv("region_lookup.csv")
I need to munge and join the tables together to be able to plot them.
qof <- left_join(qofOrig,utlalookup, by = c("UTLA11" = "UTLA13CD")) %>%
group_by(Year, GBD_name, PrevLabel) %>%
summarise(count = sum(SumOfNumerator),
denominator = sum(SumOfDenominator)) %>%
data.frame() %>%
mutate(GBD_name = as.character(GBD_name))
qofEng <- group_by(qof,Year,PrevLabel) %>%
summarise(count = sum(count),
denominator = sum(denominator)) %>%
mutate(GBD_name = "England") %>%
data.frame()
qof <- rbind(qof,qofEng) %>%
mutate(Measure = "QOF prevalence",
Percent = 100 * (count/denominator)) %>%
select(-(count:denominator)) %>%
left_join(causelookup, by = c("PrevLabel" = "QOF")) %>%
filter(!is.na(GBD)) %>%
mutate(`GBD (QOF)` = paste0(GBD," (",PrevLabel,")")) %>%
select(Year,`GBD (QOF)`,GBD_name,Measure,Percent) %>%
data.frame() %>%
rename(Location = GBD_name,`GBD (QOF)` = GBD..QOF.)
gbd <- filter(gbdOrig,metric_name == "Rate" & sex_name == "Both" & location_name != "Global" ) %>%
select(year,measure_name,location_name,cause_name,val) %>%
mutate(measure_name = paste("GBD",tolower(measure_name),sep = " ")) %>%
left_join(causelookup, by = c("cause_name" = "GBD")) %>%
filter(!(is.na(QOF))) %>%
mutate(val = val / 1000,
`GBD (QOF)` = paste0(cause_name," (",QOF,")")) %>%
select(year,`GBD (QOF)`,location_name,measure_name,val) %>%
rename(Year = year, Location = location_name, Measure = measure_name, Percent = val)
df <- rbind(gbd,qof) %>%
mutate(PrevInc = ifelse(grepl("prevalence",Measure),"Prevalence","Incidence")) %>%
droplevels()
df$`GBD (QOF)` <- factor(df$`GBD (QOF)`)
df$PrevInc <- factor(df$PrevInc, levels = c("Prevalence","Incidence"))
wrapletters <- 8
df$Location <- str_wrap(df$Location,wrapletters)
df$Location <- factor(df$Location,levels = str_wrap(c("England","North East England","North West England",
"Yorkshire and the Humber","East Midlands","West Midlands",
"East of England","Greater London",
"South East England","South West England"),wrapletters))
df$Measure <- factor(df$Measure, levels = c("GBD acute prevalence",
"GBD chronic prevalence",
"GBD prevalence",
"QOF prevalence",
"GBD acute incidence",
"GBD incidence"))
I’ve munged the data, I just need to present the outputs. Each title is the GBD term for the condition followed by the QOF term for the condition in brackets. Not all of the conditions that are plotted against each other are identical.