The Background

In my last post on bar chart races, I looked at COVID deaths versus global deaths from other communicable diseases. One issue that comparison ignores is that much of the global communicable disease mortality burden falls on developing economies. Infectious disease as a cause of death has declined significantly in advanced economies, which is partly what makes it so difficult to convince people that public health measures are necessary. They are not used to dealing with anything like this.

So, I wanted to compare deaths from infectious diseases in advanced economies to COVID deaths in those same economies.

What IS an advanced economy? The best list I can find is kept by the International Monetary Fund. I pulled the rate of deaths from the Global Burden of Disease Study query tool and populations for each country from the UN population database.

Just as before, I pulled the data from the GBD query and cleaned it up in a CSV before importing it into R. I also calculated the population in Excel by dividing the number of deaths for each country in each year by the corresponding death rate. This gave me that country's population for that year in units of 100,000.
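
To make the arithmetic concrete, here is a minimal sketch in base R. The two countries and their numbers are invented for illustration (they are not GBD figures), and the rate is assumed to be deaths per 100,000 people. It backs out the population from deaths and rate, then pools the rate across countries by dividing summed deaths by summed population, which is the same weighting the R code below relies on.

```r
# Hypothetical toy numbers, not GBD data.
toy <- data.frame(
  location = c("A", "B"),
  deaths   = c(500, 120),  # deaths in one year
  rate     = c(10, 4)      # assumed to be deaths per 100,000 people
)

# Population in units of 100,000 people: deaths / (deaths per 100,000)
toy$pop <- toy$deaths / toy$rate

# Multiplying the rate by the population recovers the death counts
toy$number <- toy$rate * toy$pop

# Pooled rate across both countries, weighted by population
pooled_rate <- sum(toy$number) / sum(toy$pop)
print(toy)
print(pooled_rate)
```

Summing deaths and populations first, rather than averaging the two rates, keeps a small country from counting as much as a large one.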

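library(dplyr)      # %>%, group_by, summarise, filter, arrange
library(ggplot2)    # ggplot, geom_line
library(readr)      # read_csv
library(tidyr)      # gather
library(lubridate)  # mdy

df<-read.csv("~/csv/oecdinfdiz.csv",stringsAsFactors = FALSE)
# val is the death rate and pop is the population in 100,000s, so this is a death count
df$number<-df$val*df$pop
diseases<-unique(df$cause)
# one row per disease per year, 1990 to 2017 (the original hard-coded 33 diseases)
data<-data.frame(disease=rep(diseases,28),year=rep(1990:2017,each=length(diseases)),rate=NA)
# pool deaths and population across countries, then compute the combined rate
numbers<-df%>%group_by(cause,year)%>%summarise(number=sum(number),pop=sum(pop))
numbers$rate<-numbers$number/numbers$pop
numbers$code<-paste(numbers$cause,numbers$year)
data$code<-paste(data$disease,data$year)
data$rate<-numbers$rate[match(data$code,numbers$code)]
# total infectious disease death rate per year
g<-data%>%group_by(year)%>%summarise(rate=sum(rate))
ggplot(g,aes(x=year,y=rate))+geom_line()

COVID

Now that I have the infectious diseases data, I just need the COVID data for the target countries.

deaths<-read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv")
names(deaths)[1:2]<-c("province","country")
ncol<-ncol(deaths)
lastdate<-names(deaths)[ncol]
# rename the GBD country names below to match the spellings in the COVID file
countries<-unique(df$location)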
countries[4]<-"US"
countries[6]<-"Taiwan*"
countries[17]<-"Czechia"
countries[28]<-"Korea, South"
deaths<-deaths%>%filter(country%in%countries)
deaths<-gather(deaths,date,cumulative_cases,'1/22/20':paste(lastdate))
deaths$province<-ifelse(is.na(deaths$province),"",deaths$province)
deaths<-deaths%>%group_by(date)%>%summarise(cumulative_cases=sum(cumulative_cases))
deaths$date<-mdy(deaths$date)
deaths<-deaths%>%arrange(date)
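
As a small illustration of the reshaping step, here is a sketch on an invented three-row table laid out like the wide COVID file, with one column per date. It swaps in tidyr's pivot_longer(), which supersedes gather(), and it names the value column cumulative_deaths. Both choices are mine for this example, not what the code above uses.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical toy table: one row per province/country, one column per date
wide <- tibble(
  province  = c(NA, "X", NA),
  country   = c("A", "B", "B"),
  `1/22/20` = c(0, 1, 2),
  `1/23/20` = c(1, 3, 2)
)

long <- wide %>%
  # stack every date column into (date, cumulative_deaths) pairs
  pivot_longer(cols = -c(province, country),
               names_to = "date", values_to = "cumulative_deaths") %>%
  # parse month/day/year strings into Date objects before sorting
  mutate(date = mdy(date)) %>%
  # add up all rows to get one total per day
  group_by(date) %>%
  summarise(cumulative_deaths = sum(cumulative_deaths)) %>%
  arrange(date)

print(long)
```

Parsing the dates before sorting matters because the raw column names are text, and text sorting would put "1/3/20" after "1/22/20".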

Conclusion

And there you have it. COVID-19 deaths in advanced economies versus deaths caused by other communicable diseases.