May 19, 2016

Tuberculosis: A Global Problem

The World Health Organization Says:

"Tuberculosis is one of the top 5 causes of death globally among women aged 15-44, and in 2014 alone killed approximately 1.5 million people worldwide."


Research Questions:

  • Do countries that have high incidence of TB share certain measurable attributes?

  • Do countries that have relatively low incidence of TB share certain measurable attributes that countries with high incidence of TB should strive to emulate?

Data Used

Tuberculosis Infection Counts

  • From Assignment 3: TB Case Counts for 100 countries (1995 - 2013)

Country Population Counts

  • From Assignment 3: Population counts for 100 countries (1995 - 2013).

World Bank Data:

  • Life Expectancy at Birth (in years) per Country (1995 - 2013);

  • Health Care Expenditure Per Capita per Country (1995 - 2013);

  • Gross National Income (GNI) Per Capita per Country (1995 - 2013);

  • The percentage of a country's population having access to electricity (2000, 2010, 2012)

United Nations Data:

  • Average years of Schooling per country (2000, 2005 - 2012)

Data Acquisition

All data is downloaded / scraped, transformed / 'tidied', and then stored in MySQL

Data                  Source          Format
TB Infection Counts   MySQL Table     "Long"
Population Counts     CSV File        "Long"
World Bank Data       CSV Files (4)   "Wide"
UN Data               Web Scrape      HTML, "Wide"
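
The "wide" to "long" tidying step can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the project's actual acquisition code; the column name `Country Name`, the helper `wide_to_long`, and the sample values are all assumptions made for the example.

```python
import csv
import io

# Hypothetical "wide" extract: one row per country, one column per year.
# The values are placeholders, not real World Bank figures.
WIDE_CSV = """Country Name,1995,1996,1997
Country A,70.1,70.4,
Country B,55.0,,55.9
"""


def wide_to_long(wide_text, value_name):
    """Turn one-column-per-year rows into (country, year, value) records."""
    reader = csv.DictReader(io.StringIO(wide_text))
    long_rows = []
    for row in reader:
        country = row["Country Name"]
        for column, cell in row.items():
            if column == "Country Name":
                continue
            if cell is None or cell.strip() == "":
                continue  # skip years with no reported value
            long_rows.append(
                {"country": country, "year": int(column), value_name: float(cell)}
            )
    return long_rows


long_rows = wide_to_long(WIDE_CSV, "life_expectancy")
for record in long_rows:
    print(record)
```

Each "long" record carries a single country-year observation, which is the shape that makes joining the separate sources on country and year straightforward.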


Challenges:

  • Each data source, as well as R's rworldmap package, uses a different country naming convention

  • Multiple country name "lookup" tables had to be created by hand in MySQL

  • Lack of consistency in the calendar years covered by data sets
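
Both challenges can be illustrated with a short Python sketch. The lookup entries below are illustrative spellings chosen for the example, not the hand-built MySQL tables themselves, and the helper names `canonical_name` and `common_years` are hypothetical. The year ranges are the ones listed under "Data Used".

```python
# Hypothetical lookup table: source-specific spelling -> one canonical name.
# The variant spellings are illustrative, not taken from the project's tables.
NAME_LOOKUP = {
    "Korea, Dem. Rep.": "North Korea",
    "Democratic People's Republic of Korea": "North Korea",
    "Congo, Dem. Rep.": "Democratic Republic of the Congo",
    "DR Congo": "Democratic Republic of the Congo",
}


def canonical_name(raw_name):
    """Map a source-specific country name to the canonical spelling."""
    cleaned = raw_name.strip()
    # Names without an entry are assumed to be canonical already.
    return NAME_LOOKUP.get(cleaned, cleaned)


def common_years(*year_sets):
    """Return the years covered by every data set, sorted ascending."""
    shared = set(year_sets[0])
    for years in year_sets[1:]:
        shared &= set(years)
    return sorted(shared)


tb_and_world_bank_years = range(1995, 2014)        # 1995 - 2013
electricity_years = [2000, 2010, 2012]
schooling_years = [2000] + list(range(2005, 2013))  # 2000, 2005 - 2012

shared = common_years(tb_and_world_bank_years, electricity_years, schooling_years)
print(canonical_name("Congo, Dem. Rep."))
print(shared)
```

Intersecting the year ranges shows how few calendar years are covered by every metric at once, which limits any analysis that needs all of them together.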

Analysis

  • TB Case Counts vs. TB Infection Rates: Geographical "Hotspots"


  • Trends in TB Case Counts & Infection Rates


  • Per Capita Metrics vs. TB Infection Rates

Average Annual TB Case Counts (1995 - 2013)

Average Annual TB Infection Rates (1995 - 2013)

Cause For Alarm

Countries With Both High TB Case Counts & High Infection Rates

  • South Africa

  • Zimbabwe

  • North Korea

  • Kenya

  • Democratic Republic of the Congo
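
One way to find countries that rank high on both measures is to intersect the two "top N" sets. The Python sketch below assumes the infection rate is expressed as cases per 100,000 population, which the slides do not state, and it uses made-up placeholder countries and numbers rather than the project data.

```python
def average_rate_per_100k(cases_by_year, population_by_year):
    """Average annual cases per 100,000 people over the years both series cover."""
    years = sorted(set(cases_by_year) & set(population_by_year))
    rates = [cases_by_year[y] * 100_000 / population_by_year[y] for y in years]
    return sum(rates) / len(rates)


def top_n(scores, n):
    """Return the n keys with the largest scores."""
    return set(sorted(scores, key=scores.get, reverse=True)[:n])


# Placeholder averages for four fictional countries (not real data).
avg_cases = {"A": 300_000, "B": 5_000, "C": 250_000, "D": 40_000}
avg_rates = {"A": 450.0, "B": 600.0, "C": 30.0, "D": 20.0}

# A "hotspot" here is a country in the top 2 for both counts and rates.
hotspots = top_n(avg_cases, 2) & top_n(avg_rates, 2)
print(hotspots)

example_rate = average_rate_per_100k(
    {2000: 50, 2001: 100}, {2000: 100_000, 2001: 100_000}
)
print(example_rate)
```

Country C shows why both views matter: a large raw count can come with a low rate when the population is large, so it drops out of the intersection.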

Trends

Evidence of Worldwide TB Case Spike 2006 - 2008

Per Capita Metrics vs. TB Infection Rates

Linear least squares regression finds little linear relationship between the per capita metrics and TB infection rates
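
For reference, a simple linear least squares fit and its R-squared can be computed as sketched below. This is a generic Python illustration on synthetic numbers, not the regression from the full report; the function name `fit_line` and both data sets are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x, with R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variance in y explained by the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Synthetic check 1: points that lie exactly on a straight line.
line_fit = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Synthetic check 2: a strongly curved (reciprocal) relationship.
curved_x = [1, 2, 5, 10, 20, 50]
curved_y = [1000 / x for x in curved_x]
curved_fit = fit_line(curved_x, curved_y)

print(line_fit)
print(curved_fit)
```

In the second case y is fully determined by x, yet the straight-line fit leaves much of the variance unexplained, so a low R-squared does not by itself rule out a relationship.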

However, there clearly are major differences in the per capita metrics of countries with low and high TB infection rates:

Type of Country   Life Expectancy   Health Care Exp.   GNI Per Capita   Electricity Access   Schooling
Low TB Rate       76.63 yrs         $966.90            $25,640          100.00%              9.05 yrs
High TB Rate      59.66 yrs         $114.20            $3,280           37.45%               6.07 yrs
Difference        16.97 yrs         $852.70            $22,360          62.55%               2.98 yrs
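
The "Difference" row can be reproduced directly from the two rows above it, and the same values also give the ratio between the groups. The Python sketch below uses only the figures in the table; the dictionary keys are labels invented for the example.

```python
# Values copied from the table: per capita metrics by type of country.
low_tb = {
    "life_expectancy_yrs": 76.63,
    "health_exp_usd": 966.90,
    "gni_usd": 25640,
    "electricity_pct": 100.00,
    "schooling_yrs": 9.05,
}
high_tb = {
    "life_expectancy_yrs": 59.66,
    "health_exp_usd": 114.20,
    "gni_usd": 3280,
    "electricity_pct": 37.45,
    "schooling_yrs": 6.07,
}

# For each metric: (absolute difference, low-rate value / high-rate value).
comparison = {}
for metric in low_tb:
    difference = low_tb[metric] - high_tb[metric]
    ratio = low_tb[metric] / high_tb[metric]
    comparison[metric] = (difference, ratio)
    print(f"{metric}: difference {difference:.2f}, ratio {ratio:.1f}x")
```

The ratios add context that the absolute differences hide: health care expenditure and GNI differ by roughly a factor of eight between the two groups.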


So what is going on here?

  • A relationship likely DOES exist between the per capita metrics and TB infection rates.

  • However, if such a relationship exists, it is probably non-linear, since the linear least squares fits did not capture it.
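
One simple way to probe for a non-linear but monotonic relationship is to compare the ordinary (Pearson) correlation with a rank-based (Spearman) correlation. This technique is not part of the original analysis; the Python sketch below is an illustration on synthetic data, and the helper names `pearson`, `ranks`, and `spearman` are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient: strength of the linear association."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes there are no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Synthetic data: y falls steeply, then flattens, as x grows (y = 1000 / x).
metric = [1, 2, 5, 10, 20, 50]
rate = [1000 / x for x in metric]

linear_r = pearson(metric, rate)
rank_r = spearman(metric, rate)
print(linear_r, rank_r)
```

When the rank correlation is much stronger than the linear one, as with this synthetic curve, a transformed or otherwise non-linear model is a reasonable next step.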

Conclusions

  • Linear least squares modeling is not useful for identifying relationships between per capita metrics and TB infection rates.

  • However, the differences in the per capita metrics of countries with low and high TB infection rates indicate that some sort of relationship likely DOES exist.

  • Therefore, other, non-linear types of modeling would be required to characterize it.


Access the Full Report & Separate Data Acquisition Module at: