Warmup - Predict Blood Donations

With Various Classification Algorithms (An R Shiny Powered Application)

Ash Chakraborty
Data Analyst


"Warmup: Predict Blood Donations" - A DrivenData.org Competition

  • DrivenData.org hosts data science competitions for social good;
  • The Predict Blood Donations competition provides a dataset from a mobile blood donation vehicle in Taiwan;
  • The aim is to predict whether or not a donor will return for a donation in March 2007;
  • This R Shiny-powered application helps contestants quickly explore the simplest parametric approach: they enter predictor values and observe the in-sample prediction, along with the prediction performance, of various classification algorithms;
  • Such an application may eventually help mobile-center phlebotomists identify high-likelihood donors.

Blood Donations are Crucial!

Good data-driven systems for tracking and predicting donations and supply needs can improve the entire supply chain, making sure that more patients get the blood transfusions they need.

  • Every two seconds someone in the U.S. needs blood.
  • More than 41,000 blood donations are needed every day.
  • A total of 30 million blood components are transfused each year in the U.S.
  • A single car accident victim can require as many as 100 pints of blood.

Source: About Blood Donations

Analyzing Available Data

The dataset below comes from the mobile blood donation vehicle. Let's take a look at the relationships between the predictors used in the basic parametric model.

[Figure: plot of the relationships between the model predictors (chunk plot1)]

We observe the following:

  • Donors who began donating between 1.5 and 4 years ago seem likely to return.
  • Donors who have donated recently (within the last 1.5 years) seem likely to return.
  • Higher donation volume also seems to be a good indicator of returning donors.

Don't Sweat the Basics

This webapp can jump-start competitors on more advanced model analysis by showing how some common classification algorithms perform on the entire provided dataset. Each algorithm is fed a parametric model of the form: isMarch07Donor ~ First Donation + Last Donation + Donation Volume
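
As a rough illustration of the model form above, the following base R sketch fits a logistic regression (one common classification algorithm) and reports in-sample performance. It runs on synthetic data, not the competition dataset. The column names, units, simulated values, and the 0.5 cutoff are all assumptions made for illustration, and logistic regression is not necessarily one of the algorithms the app uses.

```r
# Minimal sketch on SYNTHETIC data; every name and number here is hypothetical.
set.seed(42)
n <- 500

# Simulated predictors (units assumed: months and arbitrary volume units).
first_donation  <- runif(n, min = 2, max = 98)   # time since first donation
last_donation   <- first_donation * runif(n)     # time since last donation (never earlier than the first)
donation_volume <- 250 * (1 + rpois(n, lambda = 4))

# Simulated outcome: recent, high-volume donors are made more likely to return.
linear_pred      <- -1 - 0.05 * last_donation + 0.0006 * donation_volume
is_march07_donor <- rbinom(n, size = 1, prob = plogis(linear_pred))

donors <- data.frame(first_donation, last_donation,
                     donation_volume, is_march07_donor)

# Fit the parametric model: outcome ~ first + last + volume.
fit <- glm(is_march07_donor ~ first_donation + last_donation + donation_volume,
           data = donors, family = binomial)

# In-sample predicted probabilities, class labels, and performance.
p_hat     <- predict(fit, type = "response")
predicted <- as.integer(p_hat > 0.5)
accuracy  <- mean(predicted == donors$is_march07_donor)
confusion <- table(predicted = predicted, actual = donors$is_march07_donor)

print(accuracy)
print(confusion)

# Prediction for one user-entered donor profile, as the app's inputs might supply.
new_donor <- data.frame(first_donation = 30, last_donation = 2,
                        donation_volume = 1500)
p_new <- predict(fit, newdata = new_donor, type = "response")
print(p_new)
```

Because the accuracy here is measured on the same rows used for fitting, it is an in-sample figure and will tend to overstate how well the model predicts unseen donors.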

Once productionized, this webapp may be used by practicing phlebotomists to determine the donation likelihood of donors on file at designated donation stops.
