Introducing the custom binning App

Jurriaan Nagelkerke
23-8-2015

Value of binning for predictive modeling

Binning is very useful in predictive modeling:

  • To identify how variables are related
  • To visualize correlations
  • To get rid of outliers and missing values

The example below shows how easy it is to spot a correlation between sepal length and the species “setosa” in the Iris dataset when using the binned version of sepal width.

[Plot: the species “setosa” across bins of the sepal variable in the Iris dataset]
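A comparable view can be sketched in a few lines of R. This is only an illustration, not the code behind the original plot: the choice of sepal width, the four equal-width bins, and the names `width_bin` and `setosa_share` are all assumptions made here.

```r
# Load the built-in Iris dataset.
data(iris)

# Bin sepal width into four equal-width intervals (the bin count is an assumption).
iris$width_bin <- cut(iris$Sepal.Width, breaks = 4, include.lowest = TRUE)

# Share of setosa flowers within each bin.
setosa_share <- tapply(iris$Species == "setosa", iris$width_bin, mean)
print(round(setosa_share, 2))
```

Passing `setosa_share` to `barplot()` would turn the table into a chart in which the relationship is visible at a glance.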

Features of the app

The following features make the app very useful:

  • You can select a data source (in the current version: MLBattend - an R example set from the UsingR package, see http://www.inside-r.org/packages/cran/UsingR/docs/MLBattend for details)
  • You can select any numeric field within the dataset
  • You can select the number of equal-sized bins you would like to have
  • You can evaluate the cutoff points visually, since the cutoffs for the bins are shown in the histogram of the original variable
  • You can adjust the exact cutoff values to get more meaningful cutoffs. For example: 1,000,000 as a cutoff instead of 995289.29
  • You can copy and paste the resulting code to create the binned variable. In case you've adjusted the cutoffs, these adjusted values are used; otherwise the calculated cutoffs are used.
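
The workflow in the list above can be sketched in R as follows. This is a minimal sketch, not the code the app generates. It assumes that “equal-sized” means bins with roughly equal numbers of observations (for equal-width bins, the cutoffs would come from `seq(min(x), max(x), length.out = n_bins + 1)` instead). It uses the built-in `mtcars$hp` as a stand-in for a field selected in the app, and the names `calculated`, `adjusted` and `hp_binned` are hypothetical.

```r
# Hypothetical stand-in for a numeric field chosen in the app.
x <- mtcars$hp
n_bins <- 4

# Calculated cutoffs: quantiles give bins with roughly equal numbers of rows.
calculated <- quantile(x, probs = seq(0, 1, length.out = n_bins + 1), names = FALSE)

# Adjusted cutoffs: round the inner boundaries to more meaningful values,
# keeping the outer boundaries so every observation still falls in a bin.
adjusted <- calculated
inner <- 2:n_bins
adjusted[inner] <- round(calculated[inner], -1)  # round to the nearest 10

# Create the binned variable from the adjusted cutoffs.
hp_binned <- cut(x, breaks = adjusted, include.lowest = TRUE, dig.lab = 6)
print(table(hp_binned))
```

Leaving the minimum and maximum untouched is a deliberate choice in this sketch: rounding the outer cutoffs could push the smallest or largest observations outside every bin and turn them into missing values.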

How to get to the app

The app can be found at the following URL: https://jurrr.shinyapps.io/course-project

If you would like to know more about the source code (server.R and ui.R), see: https://github.com/jurrr/DevelopingDataProducts_CourseProject

Happy binning!!

Thanks for trying out this app! I hope it helps you transform your analysis data!

Bye-bye!