Introducing the custom binning App

Jurriaan Nagelkerke
23-8-2015

Value of binning for predictive modeling

Binning is very useful in predictive modeling:

  • To identify how variables are related
  • To visualize correlations
  • To deal with outliers and missing values

Features of the app

The following features make the app very useful:

  • You can select a data source (in the current version: MLBattend, an example data set from the R package UsingR; see http://www.inside-r.org/packages/cran/UsingR/docs/MLBattend for details)
  • You can select any numeric field within the dataset
  • You can select the number of equal-sized bins you would like to have
  • You can evaluate the cutoff points visually, since the cutoff points for the bins are drawn in the histogram of the original variable
  • You can adjust the exact cutoff values to get more meaningful cutoffs. For example: 1,000,000 as a cutoff instead of 995289.29
  • You can copy and paste the resulting code to create the binned variable. In case you've adjusted the cutoffs, these adjusted values are used; otherwise the calculated cutoffs are used.
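
The binning steps described above can be sketched in a few lines of R. This is only an illustration, not the code the app generates: the vector x holds made-up attendance-like values, "equal-sized" is assumed to mean bins with roughly equal numbers of observations (quantile-based), and the names n_bins, cutoffs, adjusted and x_binned are hypothetical.

```r
# Hypothetical attendance-like values (not taken from MLBattend), one missing.
x <- c(620000, 810000, 995000, 1062000, 1068000, 1310000,
       1577000, 1833000, 2100000, 2450000, NA)

n_bins <- 4  # number of bins chosen by the user

# Calculated cutoffs: quantiles that split the non-missing values into
# n_bins groups of roughly equal size (an assumption about "equal-sized").
cutoffs <- quantile(x, probs = seq(0, 1, length.out = n_bins + 1), na.rm = TRUE)

# Adjusted cutoffs: round the inner break points to the nearest 100,000
# so that they are easier to read and communicate.
adjusted <- cutoffs
adjusted[2:n_bins] <- round(cutoffs[2:n_bins], -5)

# Create the binned variable; include.lowest keeps the minimum in the first bin.
x_binned <- cut(x, breaks = adjusted, include.lowest = TRUE, dig.lab = 8)

# Give missing values a bin of their own instead of leaving them as NA.
x_binned <- addNA(x_binned)
levels(x_binned)[is.na(levels(x_binned))] <- "missing"

table(x_binned)
```

Rounding only the inner break points leaves the minimum and maximum untouched, so every non-missing value still falls into a bin.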

How to get to the app

The app can be found at the following URL: https://jurrr.shinyapps.io/course-project

The source code (server.R and ui.R) is available at: https://github.com/jurrr/DevelopingDataProducts_CourseProject

Below are some details on the data used to develop the Shiny app for custom binning:

library(UsingR)
data(MLBattend)
str(MLBattend)
'data.frame':   838 obs. of  10 variables:
 $ franchise   : Factor w/ 33 levels "ANA","ARI","ATL",..: 4 5 10 12 21 33 6 7 15 19 ...
 $ league      : Factor w/ 2 levels "AL","NL": 1 1 1 1 1 1 1 1 1 1 ...
 $ division    : Factor w/ 3 levels "CENT","EAST",..: 2 2 2 2 2 2 3 3 3 3 ...
 $ year        : num  69 69 69 69 69 69 69 69 69 69 ...
 $ attendance  : num  1062069 1833246 619970 1577481 1067996 ...
 $ runs.scored : num  779 743 573 701 562 694 528 625 586 790 ...
 $ runs.allowed: num  517 736 717 601 587 644 652 723 688 618 ...
 $ wins        : num  109 87 62 90 80 86 71 68 69 97 ...
 $ losses      : num  53 75 99 72 81 76 91 94 93 65 ...
 $ games.behind: num  0 22 46.5 19 28.5 23 26 29 28 0 ...
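
Since the app only offers numeric fields for binning, the factor columns (franchise, league, division) have to be filtered out. One possible way to do that is sketched below. It is not taken from the app's source: sample_rows is a hypothetical stand-in built from the first rows of the str() output above, so the snippet runs without UsingR installed.

```r
# Small stand-in built from the first rows shown in the str() output;
# with UsingR installed, MLBattend itself can be used instead.
sample_rows <- data.frame(
  league     = factor(c("AL", "AL", "AL", "AL", "AL"), levels = c("AL", "NL")),
  year       = c(69, 69, 69, 69, 69),
  attendance = c(1062069, 1833246, 619970, 1577481, 1067996),
  wins       = c(109, 87, 62, 90, 80)
)

# Keep only the names of numeric columns; factors such as league are excluded.
is_num <- vapply(sample_rows, is.numeric, logical(1))
numeric_fields <- names(sample_rows)[is_num]
numeric_fields
```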

Happy binning!!

Thanks for trying out this app! I hope it helps you transform your analysis data.

Bye bye!