J Davis, Ph.D.
May 6, 2016
Make available real-world data on polio incidence in the US from 1928 to 1968
Make available a method to interact with and explore the data in charts
Fulfill the requirements for a simplified app for Johns Hopkins' Data Science Certification series
Data Description & Instructions: The data come from 40 years of monitoring polio cases in the US. Variables: inc = incidence ratio; loc = location code; region = region as defined by the US Census; number = number of polio cases; population = population of the state; week = week of the year in which the data were collected; state = two-letter Postal Service state code. The data were originally downloaded from healthdata.gov. The first tab shows a frequency plot of polio incidence through time, the second tab is an interactive bar plot of the data, and the third contains the raw data in table form.
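To make the variable list concrete, here is a minimal Python sketch of one record and of how an incidence value could relate to the case count and population. The app itself is built with Shiny, so this is not its code. It assumes that inc is expressed as cases per 100,000 people, which the description above does not state, and the record values are invented for illustration only.

```python
# Hypothetical record using the field names from the data description.
# All values are made up for illustration; they are not real observations.
example_record = {
    "state": "XX",          # placeholder, not a real postal code
    "loc": "EXAMPLE",       # placeholder location code
    "region": "South",
    "week": 30,
    "number": 50,           # polio cases reported that week
    "population": 5_000_000,
}


def incidence_per_100k(number, population):
    """Return cases per 100,000 people (assumed definition of inc)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return number / population * 100_000


inc = incidence_per_100k(example_record["number"], example_record["population"])
print(f"inc = {inc:.2f} cases per 100,000")
```

If the actual data set defines inc on a different scale, only the multiplier in the function would need to change.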
The data were obtained from healthdata.gov and are a subset of the Tycho project. You can find out more about the Tycho project and open data at http://www.tycho.pitt.edu/
Years of Data Reported
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    1928    1936    1944    1945    1954    1968
Incidence Summary
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.0000  0.0000  0.0300  0.2175  0.1600 33.0800
Example of Data by Region
By case count, polio was most prevalent in the South and Midwest, probably due to a lag in vaccination regimens. Per capita, however, it was more frequent in the Northeast. The data are separated into regions based on US Census definitions.
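The difference between ranking regions by case count and ranking them per capita can be sketched in a few lines of Python. This is an illustrative sketch, not the app's own code: the rows below are invented placeholder values (including the state codes), and the per-100,000 scale is an assumption.

```python
from collections import defaultdict


def summarize_by_region(rows):
    """Total cases and cases per 100,000 people for each region.

    Assumes each state appears once in rows, so summing the population
    column gives the regional population.
    """
    cases = defaultdict(int)
    people = defaultdict(int)
    for row in rows:
        cases[row["region"]] += row["number"]
        people[row["region"]] += row["population"]
    return {
        region: {
            "cases": cases[region],
            "per_100k": cases[region] / people[region] * 100_000,
        }
        for region in cases
    }


# Hypothetical rows for a single week; not real observations.
rows = [
    {"state": "AA", "region": "South", "number": 120, "population": 6_000_000},
    {"state": "BB", "region": "South", "number": 80, "population": 4_000_000},
    {"state": "CC", "region": "Northeast", "number": 90, "population": 2_000_000},
]

summary = summarize_by_region(rows)
most_cases = max(summary, key=lambda r: summary[r]["cases"])
highest_rate = max(summary, key=lambda r: summary[r]["per_100k"])
print("Most cases:", most_cases)
print("Highest per-capita rate:", highest_rate)
```

In this made-up example the South has more cases in total while the Northeast has the higher rate, which is the same kind of contrast the slide describes.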
Example Data by Year
Polio data are also available by year. You can use the app to explore the data in table format and to get an overview of the decline of polio.
The intent of the app is to use data to educate and to tell the story of polio and its eradication in the US
The PolioHxApp enables exploration of the history of polio using real-world data
Vaccines often do not cure the diseases they target but prevent them instead; this was the case with polio
Future goals: (i) building out the app with D3.js as a front end for interactive graphical visualization; (ii) adding machine learning and time-series analysis to the data summary, with R as the compute engine
-Interact with the data on Shiny here (click or paste into browser): https://jddavis.shinyapps.io/FinalAppPolioData/