Supervised Segmentation

Anil Akyildirim

9/17/2019

Purpose

Agenda

About the Reference Problem

Imagine we have historical medical data for patients who suffered from the same illness. During treatment, the patients were given different medications, and each responded to the treatment differently. The business problem in question is: “Without wasting medication or time, can we predict the right medication for a future patient with this same illness?”

Attribute Selection

Brief Overview of the DataSet

##   Age Sex     BP Cholesterol Na_to_K  Drug
## 1  23   F   HIGH        HIGH  25.355 drugY
## 2  47   M    LOW        HIGH  13.093 drugC
## 3  47   M    LOW        HIGH  10.114 drugC
## 4  28   F NORMAL        HIGH   7.798 drugX
## 5  61   F    LOW        HIGH  18.043 drugY
## 6  22   F NORMAL        HIGH   8.607 drugX

Brief Overview of the DataSet

##       Age        Sex          BP     Cholesterol     Na_to_K      
##  Min.   :15.00   F: 96   HIGH  :77   HIGH  :103   Min.   : 6.269  
##  1st Qu.:31.00   M:104   LOW   :64   NORMAL: 97   1st Qu.:10.445  
##  Median :45.00           NORMAL:59                Median :13.937  
##  Mean   :44.31                                    Mean   :16.084  
##  3rd Qu.:58.00                                    3rd Qu.:19.380  
##  Max.   :74.00                                    Max.   :38.247  
##     Drug   
##  drugA:23  
##  drugB:16  
##  drugC:16  
##  drugX:54  
##  drugY:91  
## 

Brief Overview of the DataSet

## 'data.frame':    200 obs. of  6 variables:
##  $ Age        : int  23 47 47 28 61 22 49 41 60 43 ...
##  $ Sex        : Factor w/ 2 levels "F","M": 1 2 2 1 1 1 1 2 2 2 ...
##  $ BP         : Factor w/ 3 levels "HIGH","LOW","NORMAL": 1 2 2 3 2 3 3 2 3 2 ...
##  $ Cholesterol: Factor w/ 2 levels "HIGH","NORMAL": 1 1 1 1 1 1 1 1 1 2 ...
##  $ Na_to_K    : num  25.4 13.1 10.1 7.8 18 ...
##  $ Drug       : Factor w/ 5 levels "drugA","drugB",..: 5 3 3 4 5 4 5 3 5 5 ...

Brief Overview of the DataSet

## [1] 0
## [1] 0
## [1] 0
## [1] 0
## [1] 0
## [1] 0
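The six zeros above look like one missing-value count per column, although the code that produced them is not shown. A minimal sketch of such a check in base R, assuming a data frame named `drugs` (a hypothetical name) rebuilt here from the six rows printed earlier:

```r
# Hypothetical data frame rebuilt from the first six rows shown earlier;
# the name `drugs` is an assumption, not taken from the original code.
drugs <- data.frame(
  Age         = c(23, 47, 47, 28, 61, 22),
  Sex         = factor(c("F", "M", "M", "F", "F", "F")),
  BP          = factor(c("HIGH", "LOW", "LOW", "NORMAL", "LOW", "NORMAL")),
  Cholesterol = factor(rep("HIGH", 6)),
  Na_to_K     = c(25.355, 13.093, 10.114, 7.798, 18.043, 8.607),
  Drug        = factor(c("drugY", "drugC", "drugC", "drugX", "drugY", "drugX"))
)

# Count missing values in each column; all zeros means nothing to impute.
na_counts <- colSums(is.na(drugs))
print(na_counts)
```

Running the same `colSums(is.na(...))` call on the full 200-row data set would give one count per attribute, which is consistent with the six values shown.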

How to Calculate Entropy (Measure of Purity)

\[entropy = -p_{1}\log(p_{1}) - p_{2}\log(p_{2}) - \cdots\]
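The formula can be sketched as a small base R helper. The base of the logarithm is not stated here; the parent entropy reported later (1.364655) is consistent with the natural log, so this sketch assumes it. The function name `entropy_from_counts` is hypothetical.

```r
# Entropy of a discrete distribution given class counts.
# Uses the natural log (an assumption), so the result is in nats.
entropy_from_counts <- function(counts) {
  p <- counts / sum(counts)  # convert counts to probabilities
  p <- p[p > 0]              # treat 0 * log(0) as 0
  -sum(p * log(p))
}

pure_segment <- entropy_from_counts(c(10, 0))  # one class only
even_split   <- entropy_from_counts(c(5, 5))   # two equally likely classes

print(pure_segment)
print(even_split)
```

A segment containing a single class has entropy 0, and an even two-class split gives `log(2)`, the maximum for two classes. The result is never negative, because each term `-p * log(p)` is non-negative for `p` between 0 and 1.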

Overview of Frequency of Each Attribute

## $`x:`
##                  x label Freq Percent Valid Percent Cumulative Percent
## 2    Valid       F         96    48.0          48.0               48.0
## 3                M        104    52.0          52.0              100.0
## 31           Total        200   100.0         100.0                   
## 1  Missing <blank>          0     0.0                                 
## 4             <NA>          0     0.0                                 
## 6            Total        200   100.0

Overview of Frequency of Each Attribute

## $`x:`
##                  x label Freq Percent Valid Percent Cumulative Percent
## 2    Valid    HIGH         77    38.5          38.5               38.5
## 3              LOW         64    32.0          32.0               70.5
## 4           NORMAL         59    29.5          29.5              100.0
## 41           Total        200   100.0         100.0                   
## 1  Missing <blank>          0     0.0                                 
## 5             <NA>          0     0.0                                 
## 7            Total        200   100.0

Overview of Frequency of Each Attribute

## $`x:`
##                  x label Freq Percent Valid Percent Cumulative Percent
## 2    Valid    HIGH        103    51.5          51.5               51.5
## 3           NORMAL         97    48.5          48.5              100.0
## 31           Total        200   100.0         100.0                   
## 1  Missing <blank>          0     0.0                                 
## 4             <NA>          0     0.0                                 
## 6            Total        200   100.0

Overview of Frequency of Each Attribute

## $`x:`
##                  x label Freq Percent Valid Percent Cumulative Percent
## 2    Valid   drugA         23    11.5          11.5               11.5
## 3            drugB         16     8.0           8.0               19.5
## 4            drugC         16     8.0           8.0               27.5
## 5            drugX         54    27.0          27.0               54.5
## 6            drugY         91    45.5          45.5              100.0
## 61           Total        200   100.0         100.0                   
## 1  Missing <blank>          0     0.0                                 
## 7             <NA>          0     0.0                                 
## 9            Total        200   100.0

Probability of Each Attribute and Target
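The probabilities follow directly from the frequency tables above. A minimal sketch in base R, with the counts typed in by hand from those tables (the variable names are hypothetical):

```r
# Counts copied from the frequency tables shown above.
sex_counts  <- c(F = 96, M = 104)
bp_counts   <- c(HIGH = 77, LOW = 64, NORMAL = 59)
chol_counts <- c(HIGH = 103, NORMAL = 97)
drug_counts <- c(drugA = 23, drugB = 16, drugC = 16, drugX = 54, drugY = 91)

# prop.table() divides each count by the total, giving probabilities.
sex_p  <- prop.table(sex_counts)
bp_p   <- prop.table(bp_counts)
chol_p <- prop.table(chol_counts)
drug_p <- prop.table(drug_counts)

print(sex_p)
print(bp_p)
print(chol_p)
print(drug_p)
```

These match the Percent column of the frequency tables divided by 100, for example 0.455 for drugY.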

Calculating Entropy for the entire population (Parent)

## [1] 1.364655
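The value above can be checked from the Drug frequency table alone. This is a sketch in base R that assumes the natural log; the original code is not shown and may have used a package function instead.

```r
# Drug counts for the whole population, from the frequency table above.
drug_counts <- c(drugA = 23, drugB = 16, drugC = 16, drugX = 54, drugY = 91)

p <- drug_counts / sum(drug_counts)  # class probabilities
parent_entropy <- -sum(p * log(p))   # entropy in nats (natural log assumed)

print(parent_entropy)
```

With these counts the result agrees with the reported parent entropy to the digits shown.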

Gender and Drug Distribution

Female and Drug Distribution

##     Age Sex     BP Cholesterol Na_to_K  Drug
## 1    23   F   HIGH        HIGH  25.355 drugY
## 4    28   F NORMAL        HIGH   7.798 drugX
## 5    61   F    LOW        HIGH  18.043 drugY
## 6    22   F NORMAL        HIGH   8.607 drugX
## 7    49   F NORMAL        HIGH  16.275 drugY
## 11   47   F    LOW        HIGH  11.767 drugC
## 12   34   F   HIGH      NORMAL  19.199 drugY
## 14   74   F    LOW        HIGH  20.942 drugY
## 15   50   F NORMAL        HIGH  12.703 drugX
## 16   16   F   HIGH      NORMAL  15.516 drugY
## 20   32   F   HIGH      NORMAL  25.974 drugY
## 24   48   F    LOW        HIGH  15.036 drugY
## 25   33   F    LOW        HIGH  33.486 drugY
## 26   28   F   HIGH      NORMAL  18.809 drugY
## 28   49   F NORMAL      NORMAL   9.381 drugX
## 29   39   F    LOW      NORMAL  22.697 drugY
## 31   18   F NORMAL      NORMAL   8.750 drugX
## 34   65   F   HIGH      NORMAL  31.876 drugY
## 39   39   F NORMAL      NORMAL   9.709 drugX
## 41   73   F NORMAL        HIGH  19.221 drugY
## 42   58   F   HIGH      NORMAL  14.239 drugB
## 45   50   F NORMAL      NORMAL  12.295 drugX
## 46   66   F NORMAL      NORMAL   8.107 drugX
## 47   37   F   HIGH        HIGH  13.091 drugA
## 50   28   F    LOW        HIGH  19.796 drugY
## 51   58   F   HIGH        HIGH  19.416 drugY
## 54   24   F   HIGH      NORMAL  18.457 drugY
## 55   68   F   HIGH      NORMAL  10.189 drugB
## 56   26   F    LOW        HIGH  14.160 drugC
## 61   38   F    LOW      NORMAL  29.875 drugY
## 65   60   F   HIGH        HIGH  13.303 drugB
## 66   68   F NORMAL      NORMAL  27.050 drugY
## 70   18   F   HIGH      NORMAL  24.276 drugY
## 72   28   F NORMAL        HIGH  19.675 drugY
## 73   24   F NORMAL        HIGH  10.605 drugX
## 74   41   F NORMAL      NORMAL  22.905 drugY
## 77   36   F   HIGH        HIGH  11.198 drugA
## 78   26   F   HIGH      NORMAL  19.161 drugY
## 79   19   F   HIGH        HIGH  13.313 drugA
## 80   32   F    LOW      NORMAL  10.840 drugX
## 83   32   F    LOW        HIGH   9.712 drugC
## 84   38   F   HIGH      NORMAL  11.326 drugA
## 85   47   F    LOW        HIGH  10.067 drugC
## 87   51   F NORMAL        HIGH  13.597 drugX
## 89   37   F   HIGH      NORMAL  23.091 drugY
## 90   50   F NORMAL      NORMAL  17.211 drugY
## 93   29   F   HIGH        HIGH  29.450 drugY
## 94   42   F    LOW      NORMAL  29.271 drugY
## 97   58   F    LOW        HIGH  38.247 drugY
## 98   56   F   HIGH        HIGH  25.395 drugY
## 100  15   F   HIGH      NORMAL  16.725 drugY
## 102  45   F   HIGH        HIGH  12.854 drugA
## 103  28   F    LOW        HIGH  13.127 drugC
## 112  47   F NORMAL      NORMAL   6.683 drugX
## 114  65   F    LOW      NORMAL  13.769 drugX
## 115  20   F NORMAL      NORMAL   9.281 drugX
## 118  40   F NORMAL        HIGH  10.103 drugX
## 119  32   F   HIGH      NORMAL  10.292 drugA
## 120  61   F   HIGH        HIGH  25.475 drugY
## 124  36   F NORMAL        HIGH  16.753 drugY
## 125  53   F   HIGH      NORMAL  12.495 drugB
## 126  19   F   HIGH      NORMAL  25.969 drugY
## 130  32   F NORMAL        HIGH   7.477 drugX
## 131  70   F NORMAL        HIGH  20.489 drugY
## 135  42   F   HIGH        HIGH  21.036 drugY
## 137  55   F   HIGH        HIGH  10.977 drugB
## 138  35   F   HIGH        HIGH  12.894 drugA
## 140  69   F NORMAL        HIGH  10.065 drugX
## 142  64   F    LOW      NORMAL  25.741 drugY
## 147  37   F    LOW      NORMAL  12.006 drugX
## 148  26   F   HIGH      NORMAL  12.307 drugA
## 149  61   F    LOW      NORMAL   7.340 drugX
## 154  72   F    LOW      NORMAL  14.642 drugX
## 159  59   F    LOW        HIGH  10.444 drugC
## 160  34   F    LOW      NORMAL  12.923 drugX
## 161  30   F NORMAL        HIGH  10.443 drugX
## 162  57   F   HIGH      NORMAL   9.945 drugB
## 164  21   F   HIGH      NORMAL  28.632 drugY
## 167  58   F    LOW        HIGH  26.645 drugY
## 168  57   F NORMAL        HIGH  14.216 drugX
## 169  51   F    LOW      NORMAL  23.003 drugY
## 170  20   F   HIGH        HIGH  11.262 drugA
## 171  28   F NORMAL        HIGH  12.879 drugX
## 173  39   F NORMAL      NORMAL  17.225 drugY
## 174  41   F    LOW      NORMAL  18.739 drugY
## 176  73   F   HIGH        HIGH  18.348 drugY
## 180  67   F NORMAL        HIGH  15.891 drugY
## 181  22   F   HIGH      NORMAL  22.818 drugY
## 182  59   F NORMAL        HIGH  13.884 drugX
## 183  20   F    LOW      NORMAL  11.686 drugX
## 184  36   F   HIGH      NORMAL  15.490 drugY
## 185  18   F   HIGH        HIGH  37.188 drugY
## 186  57   F NORMAL      NORMAL  25.893 drugY
## 195  46   F   HIGH        HIGH  34.686 drugY
## 196  56   F    LOW        HIGH  11.567 drugC
## 200  40   F    LOW      NORMAL  11.349 drugX
##    female.Sex female.Drug
## 1           F       drugY
## 2           F       drugX
## 3           F       drugY
## 4           F       drugX
## 5           F       drugY
## 6           F       drugC
## 7           F       drugY
## 8           F       drugY
## 9           F       drugX
## 10          F       drugY
## 11          F       drugY
## 12          F       drugY
## 13          F       drugY
## 14          F       drugY
## 15          F       drugX
## 16          F       drugY
## 17          F       drugX
## 18          F       drugY
## 19          F       drugX
## 20          F       drugY
## 21          F       drugB
## 22          F       drugX
## 23          F       drugX
## 24          F       drugA
## 25          F       drugY
## 26          F       drugY
## 27          F       drugY
## 28          F       drugB
## 29          F       drugC
## 30          F       drugY
## 31          F       drugB
## 32          F       drugY
## 33          F       drugY
## 34          F       drugY
## 35          F       drugX
## 36          F       drugY
## 37          F       drugA
## 38          F       drugY
## 39          F       drugA
## 40          F       drugX
## 41          F       drugC
## 42          F       drugA
## 43          F       drugC
## 44          F       drugX
## 45          F       drugY
## 46          F       drugY
## 47          F       drugY
## 48          F       drugY
## 49          F       drugY
## 50          F       drugY
## 51          F       drugY
## 52          F       drugA
## 53          F       drugC
## 54          F       drugX
## 55          F       drugX
## 56          F       drugX
## 57          F       drugX
## 58          F       drugA
## 59          F       drugY
## 60          F       drugY
## 61          F       drugB
## 62          F       drugY
## 63          F       drugX
## 64          F       drugY
## 65          F       drugY
## 66          F       drugB
## 67          F       drugA
## 68          F       drugX
## 69          F       drugY
## 70          F       drugX
## 71          F       drugA
## 72          F       drugX
## 73          F       drugX
## 74          F       drugC
## 75          F       drugX
## 76          F       drugX
## 77          F       drugB
## 78          F       drugY
## 79          F       drugY
## 80          F       drugX
## 81          F       drugY
## 82          F       drugA
## 83          F       drugX
## 84          F       drugY
## 85          F       drugY
## 86          F       drugY
## 87          F       drugY
## 88          F       drugY
## 89          F       drugX
## 90          F       drugX
## 91          F       drugY
## 92          F       drugY
## 93          F       drugY
## 94          F       drugY
## 95          F       drugC
## 96          F       drugX

Female and Drug Distribution

##   female.Drug freq
## 1       drugA    9
## 2       drugB    6
## 3       drugC    7
## 4       drugX   27
## 5       drugY   47

Calculating Entropy for Child 1 (Entropy for Female)

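## [1] -1.280404

Entropy cannot be negative, so the sign above indicates that the leading minus of the entropy formula was dropped. Recomputing from the female frequency table (9, 6, 7, 27, 47 out of 96) with the natural log gives approximately 1.2926.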

Male and Drug Distribution

##     Age Sex     BP Cholesterol Na_to_K  Drug
## 2    47   M    LOW        HIGH  13.093 drugC
## 3    47   M    LOW        HIGH  10.114 drugC
## 8    41   M    LOW        HIGH  11.037 drugC
## 9    60   M NORMAL        HIGH  15.171 drugY
## 10   43   M    LOW      NORMAL  19.368 drugY
## 13   43   M    LOW        HIGH  15.376 drugY
## 17   69   M    LOW      NORMAL  11.455 drugX
## 18   43   M   HIGH        HIGH  13.972 drugA
## 19   23   M    LOW        HIGH   7.298 drugC
## 21   57   M    LOW      NORMAL  19.128 drugY
## 22   63   M NORMAL        HIGH  25.917 drugY
## 23   47   M    LOW      NORMAL  30.568 drugY
## 27   31   M   HIGH        HIGH  30.366 drugY
## 30   45   M    LOW        HIGH  17.951 drugY
## 32   74   M   HIGH        HIGH   9.567 drugB
## 33   49   M    LOW      NORMAL  11.014 drugX
## 35   53   M NORMAL        HIGH  14.133 drugX
## 36   46   M NORMAL      NORMAL   7.285 drugX
## 37   32   M   HIGH      NORMAL   9.445 drugA
## 38   39   M    LOW      NORMAL  13.938 drugX
## 40   15   M NORMAL        HIGH   9.084 drugX
## 43   50   M NORMAL      NORMAL  15.790 drugY
## 44   23   M NORMAL        HIGH  12.260 drugX
## 48   68   M    LOW        HIGH  10.291 drugC
## 49   23   M NORMAL        HIGH  31.686 drugY
## 52   67   M NORMAL      NORMAL  10.898 drugX
## 53   62   M    LOW      NORMAL  27.183 drugY
## 57   65   M   HIGH      NORMAL  11.340 drugB
## 58   40   M   HIGH        HIGH  27.826 drugY
## 59   60   M NORMAL      NORMAL  10.091 drugX
## 60   34   M   HIGH        HIGH  18.703 drugY
## 62   24   M   HIGH      NORMAL   9.475 drugA
## 63   67   M    LOW      NORMAL  20.693 drugY
## 64   45   M    LOW      NORMAL   8.370 drugX
## 67   29   M   HIGH        HIGH  12.856 drugA
## 68   17   M NORMAL      NORMAL  10.832 drugX
## 69   54   M NORMAL        HIGH  24.658 drugY
## 71   70   M   HIGH        HIGH  13.967 drugB
## 75   31   M   HIGH      NORMAL  17.069 drugY
## 76   26   M    LOW      NORMAL  20.909 drugY
## 81   60   M   HIGH        HIGH  13.934 drugB
## 82   64   M NORMAL        HIGH   7.761 drugX
## 86   59   M   HIGH        HIGH  13.935 drugB
## 88   69   M    LOW        HIGH  15.478 drugY
## 91   62   M NORMAL        HIGH  16.594 drugY
## 92   41   M   HIGH      NORMAL  15.156 drugY
## 95   56   M    LOW        HIGH  15.015 drugY
## 96   36   M    LOW      NORMAL  11.424 drugX
## 99   20   M   HIGH      NORMAL  35.639 drugY
## 101  31   M   HIGH      NORMAL  11.871 drugA
## 104  56   M NORMAL        HIGH   8.966 drugX
## 105  22   M   HIGH      NORMAL  28.294 drugY
## 106  37   M    LOW      NORMAL   8.968 drugX
## 107  22   M NORMAL        HIGH  11.953 drugX
## 108  42   M    LOW        HIGH  20.013 drugY
## 109  72   M   HIGH      NORMAL   9.677 drugB
## 110  23   M NORMAL        HIGH  16.850 drugY
## 111  50   M   HIGH        HIGH   7.490 drugA
## 113  35   M    LOW      NORMAL   9.170 drugX
## 116  51   M   HIGH        HIGH  18.295 drugY
## 117  67   M NORMAL      NORMAL   9.514 drugX
## 121  28   M NORMAL        HIGH  27.064 drugY
## 122  15   M   HIGH      NORMAL  17.206 drugY
## 123  34   M NORMAL        HIGH  22.456 drugY
## 127  66   M   HIGH        HIGH  16.347 drugY
## 128  35   M NORMAL      NORMAL   7.845 drugX
## 129  47   M    LOW      NORMAL  33.542 drugY
## 132  52   M    LOW      NORMAL  32.922 drugY
## 133  49   M    LOW      NORMAL  13.598 drugX
## 134  24   M NORMAL        HIGH  25.786 drugY
## 136  74   M    LOW      NORMAL  11.939 drugX
## 139  51   M   HIGH      NORMAL  11.343 drugB
## 141  49   M   HIGH      NORMAL   6.269 drugA
## 143  60   M   HIGH      NORMAL   8.621 drugB
## 144  74   M   HIGH      NORMAL  15.436 drugY
## 145  39   M   HIGH        HIGH   9.664 drugA
## 146  61   M NORMAL        HIGH   9.443 drugX
## 150  22   M    LOW        HIGH   8.151 drugC
## 151  49   M   HIGH      NORMAL   8.700 drugA
## 152  68   M   HIGH        HIGH  11.009 drugB
## 153  55   M NORMAL      NORMAL   7.261 drugX
## 155  37   M    LOW      NORMAL  16.724 drugY
## 156  49   M    LOW        HIGH  10.537 drugC
## 157  31   M   HIGH      NORMAL  11.227 drugA
## 158  53   M    LOW        HIGH  22.963 drugY
## 163  43   M NORMAL      NORMAL  12.859 drugX
## 165  16   M   HIGH      NORMAL  19.007 drugY
## 166  38   M    LOW        HIGH  18.295 drugY
## 172  45   M    LOW      NORMAL  10.017 drugX
## 175  42   M   HIGH      NORMAL  12.766 drugA
## 177  48   M   HIGH      NORMAL  10.446 drugA
## 178  25   M NORMAL        HIGH  19.011 drugY
## 179  39   M NORMAL        HIGH  15.969 drugY
## 187  70   M   HIGH        HIGH   9.849 drugB
## 188  47   M   HIGH        HIGH  10.403 drugA
## 189  65   M   HIGH      NORMAL  34.997 drugY
## 190  64   M   HIGH      NORMAL  20.932 drugY
## 191  58   M   HIGH        HIGH  18.991 drugY
## 192  23   M   HIGH        HIGH   8.011 drugA
## 193  72   M    LOW        HIGH  16.310 drugY
## 194  72   M    LOW        HIGH   6.769 drugC
## 197  16   M    LOW        HIGH  12.006 drugC
## 198  52   M NORMAL        HIGH   9.894 drugX
## 199  23   M NORMAL      NORMAL  14.020 drugX
##     male.Sex male.Drug
## 1          M     drugC
## 2          M     drugC
## 3          M     drugC
## 4          M     drugY
## 5          M     drugY
## 6          M     drugY
## 7          M     drugX
## 8          M     drugA
## 9          M     drugC
## 10         M     drugY
## 11         M     drugY
## 12         M     drugY
## 13         M     drugY
## 14         M     drugY
## 15         M     drugB
## 16         M     drugX
## 17         M     drugX
## 18         M     drugX
## 19         M     drugA
## 20         M     drugX
## 21         M     drugX
## 22         M     drugY
## 23         M     drugX
## 24         M     drugC
## 25         M     drugY
## 26         M     drugX
## 27         M     drugY
## 28         M     drugB
## 29         M     drugY
## 30         M     drugX
## 31         M     drugY
## 32         M     drugA
## 33         M     drugY
## 34         M     drugX
## 35         M     drugA
## 36         M     drugX
## 37         M     drugY
## 38         M     drugB
## 39         M     drugY
## 40         M     drugY
## 41         M     drugB
## 42         M     drugX
## 43         M     drugB
## 44         M     drugY
## 45         M     drugY
## 46         M     drugY
## 47         M     drugY
## 48         M     drugX
## 49         M     drugY
## 50         M     drugA
## 51         M     drugX
## 52         M     drugY
## 53         M     drugX
## 54         M     drugX
## 55         M     drugY
## 56         M     drugB
## 57         M     drugY
## 58         M     drugA
## 59         M     drugX
## 60         M     drugY
## 61         M     drugX
## 62         M     drugY
## 63         M     drugY
## 64         M     drugY
## 65         M     drugY
## 66         M     drugX
## 67         M     drugY
## 68         M     drugY
## 69         M     drugX
## 70         M     drugY
## 71         M     drugX
## 72         M     drugB
## 73         M     drugA
## 74         M     drugB
## 75         M     drugY
## 76         M     drugA
## 77         M     drugX
## 78         M     drugC
## 79         M     drugA
## 80         M     drugB
## 81         M     drugX
## 82         M     drugY
## 83         M     drugC
## 84         M     drugA
## 85         M     drugY
## 86         M     drugX
## 87         M     drugY
## 88         M     drugY
## 89         M     drugX
## 90         M     drugA
## 91         M     drugA
## 92         M     drugY
## 93         M     drugY
## 94         M     drugB
## 95         M     drugA
## 96         M     drugY
## 97         M     drugY
## 98         M     drugY
## 99         M     drugA
## 100        M     drugY
## 101        M     drugC
## 102        M     drugC
## 103        M     drugX
## 104        M     drugX

Male and Drug Distribution

##   male.Drug freq
## 1     drugA   14
## 2     drugB   10
## 3     drugC    9
## 4     drugX   27
## 5     drugY   44

Calculating Entropy for Child 2 (Entropy for Male)

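## [1] -1.398591

Entropy cannot be negative, so the sign above reflects a dropped leading minus in the entropy formula. Recomputing from the male frequency table (14, 10, 9, 27, 44 out of 104) with the natural log gives approximately 1.4209.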

How to Calculate Information Gain

\[IG(parent, children) = entropy(parent) - \left[p(female) \times entropy(female) + p(male) \times entropy(male)\right]\]

Calculating Information Gain for Female and Male as Children

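## [1] 2.706516

Information gain cannot exceed the parent entropy (1.364655), so this value is not valid: it results from substituting the negative child entropies into the formula. Using the positive magnitudes of the reported child entropies gives 1.364655 - (0.48 × 1.280404 + 0.52 × 1.398591) ≈ 0.0228, and recomputing the child entropies from the frequency tables gives about 0.0053. Either way, splitting on Sex yields very little information gain.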
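The information gain formula can be sketched in base R using only the frequency tables shown earlier. The helper names `entropy_from_counts` and `information_gain` are hypothetical, and the natural log is assumed because it reproduces the reported parent entropy.

```r
# Entropy (in nats) of a segment described by its class counts.
entropy_from_counts <- function(counts) {
  p <- counts / sum(counts)
  p <- p[p > 0]  # treat 0 * log(0) as 0
  -sum(p * log(p))
}

# Information gain: parent entropy minus the size-weighted child entropies.
information_gain <- function(parent, children) {
  n <- sum(parent)
  weighted <- sum(sapply(children, function(ch) {
    (sum(ch) / n) * entropy_from_counts(ch)
  }))
  entropy_from_counts(parent) - weighted
}

# Drug counts from the frequency tables shown earlier.
parent <- c(drugA = 23, drugB = 16, drugC = 16, drugX = 54, drugY = 91)
female <- c(drugA = 9,  drugB = 6,  drugC = 7,  drugX = 27, drugY = 47)
male   <- c(drugA = 14, drugB = 10, drugC = 9,  drugX = 27, drugY = 44)

ig_sex <- information_gain(parent, list(female = female, male = male))
print(ig_sex)
```

Because every entropy term is non-negative and the weights sum to 1, the result always lies between 0 and the parent entropy. For these counts it is small (roughly 0.005), which suggests Sex is a weak attribute to split on.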

How does this display in Tree Induction

Supervised Segmentation

Probability Estimation