Introduction

Preprocessing

First, convert each height from feet and inches to inches: Height = Height_Feet * 12 + Height_Inches. Tallying the converted heights by gender gives the frequency table below, which holds all the information the classifiers need for male and female heights:

##    height Gender Freq
## 1      52 Female    3
## 2      53 Female    5
## 3      54 Female   12
## 4      55 Female   24
## 5      56 Female   44
## 6      57 Female  101
## 7      58 Female  163
## 8      59 Female  260
## 9      60 Female  404
## 10     61 Female  549
## 11     62 Female  693
## 12     63 Female  869
## 13     64 Female 1076
## 14     65 Female 1013
## 15     66 Female  951
## 16     67 Female  823
## 17     68 Female  695
## 18     69 Female  494
## 19     70 Female  299
## 20     71 Female  217
## 21     72 Female  110
## 22     73 Female   58
## 23     74 Female   20
## 24     75 Female   12
## 25     76 Female    5
## 26     77 Female    0
## 27     78 Female    0
## 28     79 Female    0
## 29     80 Female    0
## 30     81 Female    0
## 31     82 Female    0
## 32     83 Female    0
## 33     52   Male    0
## 34     53   Male    0
## 35     54   Male    0
## 36     55   Male    0
## 37     56   Male    0
## 38     57   Male    0
## 39     58   Male    0
## 40     59   Male    0
## 41     60   Male    1
## 42     61   Male   10
## 43     62   Male   14
## 44     63   Male   53
## 45     64   Male  117
## 46     65   Male  241
## 47     66   Male  369
## 48     67   Male  500
## 49     68   Male  700
## 50     69   Male  787
## 51     70   Male  849
## 52     71   Male  882
## 53     72   Male  873
## 54     73   Male  779
## 55     74   Male  610
## 56     75   Male  432
## 57     76   Male  274
## 58     77   Male  155
## 59     78   Male   83
## 60     79   Male   38
## 61     80   Male   24
## 62     81   Male    5
## 63     82   Male    3
## 64     83   Male    1
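
The conversion and tally described above can be sketched as follows. This is a minimal illustration, not the code that produced the table: the `records` list is made up, and the tuple layout `(gender, feet, inches)` is an assumption about how the raw data might be stored.

```python
from collections import Counter

def to_inches(feet, inches):
    """Height = Height_Feet * 12 + Height_Inches."""
    return feet * 12 + inches

# Hypothetical raw records (gender, feet, inches); not taken from the data set.
records = [
    ("Female", 5, 4),
    ("Female", 5, 4),
    ("Male", 5, 10),
    ("Male", 6, 0),
]

# Count each (height, gender) pair, mirroring the height / Gender / Freq columns.
freq = Counter((to_inches(ft, inch), gender) for gender, ft, inch in records)

# Print grouped by gender, then by height, like the table above.
for (height, gender), count in sorted(freq.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(height, gender, count)
```

A `Counter` only stores pairs that actually occur, so zero-frequency rows such as those in the table would have to be filled in separately.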

Histogram

Histogram Classifier

##   height male_female    male_pos
## 1     55      Female 0.000000000
## 2     60      Female 0.002469136
## 3     65      Female 0.192185008
## 4     70        Male 0.739547038
## 5     75        Male 0.972972973
## 6     80        Male 1.000000000
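
The `male_pos` column above is consistent with taking, for each height bin, the share of records in that bin that are male. The sketch below uses the counts from the full frequency table; the function name and the 0.5 decision threshold are assumptions made for illustration.

```python
# Counts per one-inch bin, copied from the frequency table (heights 52..83).
HEIGHTS = list(range(52, 84))
FEMALE = [3, 5, 12, 24, 44, 101, 163, 260, 404, 549, 693, 869, 1076, 1013, 951, 823,
          695, 494, 299, 217, 110, 58, 20, 12, 5, 0, 0, 0, 0, 0, 0, 0]
MALE = [0, 0, 0, 0, 0, 0, 0, 0, 1, 10, 14, 53, 117, 241, 369, 500,
        700, 787, 849, 882, 873, 779, 610, 432, 274, 155, 83, 38, 24, 5, 3, 1]

female_count = dict(zip(HEIGHTS, FEMALE))
male_count = dict(zip(HEIGHTS, MALE))

def histogram_male_posterior(height):
    """Share of records at this height that are male, or None for an empty bin."""
    total = male_count.get(height, 0) + female_count.get(height, 0)
    if total == 0:
        return None
    return male_count[height] / total

for h in (55, 60, 65, 70, 75, 80):
    p = histogram_male_posterior(h)
    label = "Male" if p > 0.5 else "Female"  # assumed threshold
    print(h, label, round(p, 9))
```

For example, at 65 inches there are 241 male and 1013 female records, so the estimate is 241 / 1254, which matches the 0.192185008 shown in the table.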

Parameters of the two normal distributions, one fitted to the female heights and one to the male heights:

## [1] "Female--  Mean: 64.7257303370787,     Standard Deviation: 3.47843448028316"
## [1] "Male----  Mean: 70.7680769230769,    Standard Deviation: 3.30966736751305"
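
These parameters can be recovered from the frequency table alone. The sketch below computes a frequency-weighted mean and standard deviation; the use of the sample standard deviation (denominator n - 1) is an assumption, as is the helper name.

```python
import math

# Counts per one-inch bin, copied from the frequency table (heights 52..83).
HEIGHTS = list(range(52, 84))
FEMALE = [3, 5, 12, 24, 44, 101, 163, 260, 404, 549, 693, 869, 1076, 1013, 951, 823,
          695, 494, 299, 217, 110, 58, 20, 12, 5, 0, 0, 0, 0, 0, 0, 0]
MALE = [0, 0, 0, 0, 0, 0, 0, 0, 1, 10, 14, 53, 117, 241, 369, 500,
        700, 787, 849, 882, 873, 779, 610, 432, 274, 155, 83, 38, 24, 5, 3, 1]

def weighted_mean_sd(heights, counts):
    """Mean and sample standard deviation of heights repeated `counts` times."""
    n = sum(counts)
    mean = sum(h * c for h, c in zip(heights, counts)) / n
    squared_dev = sum(c * (h - mean) ** 2 for h, c in zip(heights, counts))
    return mean, math.sqrt(squared_dev / (n - 1))  # n - 1 is an assumption

female_mean, female_sd = weighted_mean_sd(HEIGHTS, FEMALE)
male_mean, male_sd = weighted_mean_sd(HEIGHTS, MALE)
print("Female", female_mean, female_sd)
print("Male", male_mean, male_sd)
```

The class sizes fall out of the same table: the Freq column sums to 8900 for Female and 7800 for Male.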

Bayesian Classifier

##   height male_female     male_pos
## 1     55      Female 0.0005405442
## 2     60      Female 0.0115206399
## 3     65      Female 0.1682957181
## 4     70        Male 0.7389322760
## 5     75        Male 0.9696021681
## 6     80        Male 0.9965588763
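
A Bayesian classifier of this kind combines the two fitted normal densities with class priors through Bayes' rule. The sketch below plugs in the means and standard deviations printed above. Weighting each class by its record count (8900 female, 7800 male, the Freq column sums) is an assumption; with it, the sketch gives values close to the printed ones. The function names are illustrative.

```python
import math

# Fitted parameters as printed above; counts are the Freq column sums.
FEMALE = {"n": 8900, "mean": 64.7257303370787, "sd": 3.47843448028316}
MALE = {"n": 7800, "mean": 70.7680769230769, "sd": 3.30966736751305}

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def bayes_male_posterior(height):
    """P(Male | height), with priors assumed proportional to class counts."""
    male = MALE["n"] * normal_pdf(height, MALE["mean"], MALE["sd"])
    female = FEMALE["n"] * normal_pdf(height, FEMALE["mean"], FEMALE["sd"])
    return male / (male + female)

for h in (55, 60, 65, 70, 75, 80):
    p = bayes_male_posterior(h)
    print(h, "Male" if p > 0.5 else "Female", p)
```

Unlike the histogram classifier, this estimate is defined for every height, including bins with no records, which is why it stays strictly between 0 and 1 at 55 and 80 inches.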

Repeat the above process using only the first 200 height records:

##    height Gender Freq
## 1      55 Female    1
## 2      56 Female    1
## 3      57 Female    6
## 4      58 Female    2
## 5      59 Female    4
## 6      60 Female    2
## 7      61 Female    6
## 8      62 Female    9
## 9      63 Female   10
## 10     64 Female   11
## 11     65 Female   13
## 12     66 Female   14
## 13     67 Female   10
## 14     68 Female    7
## 15     69 Female    9
## 16     70 Female    2
## 17     71 Female    4
## 18     72 Female    0
## 19     73 Female    1
## 20     74 Female    0
## 21     75 Female    0
## 22     76 Female    0
## 23     77 Female    0
## 24     78 Female    0
## 25     80 Female    0
## 26     55   Male    0
## 27     56   Male    0
## 28     57   Male    0
## 29     58   Male    0
## 30     59   Male    0
## 31     60   Male    0
## 32     61   Male    0
## 33     62   Male    0
## 34     63   Male    0
## 35     64   Male    1
## 36     65   Male    2
## 37     66   Male    6
## 38     67   Male    7
## 39     68   Male    6
## 40     69   Male    7
## 41     70   Male   13
## 42     71   Male   13
## 43     72   Male    7
## 44     73   Male    3
## 45     74   Male    9
## 46     75   Male    5
## 47     76   Male    4
## 48     77   Male    2
## 49     78   Male    1
## 50     80   Male    2

Histogram Classifier (first 200 records)

##   height male_female  male_pos
## 1     55      Female 0.0000000
## 2     60      Female 0.0000000
## 3     65      Female 0.1333333
## 4     70        Male 0.8666667
## 5     75        Male 1.0000000
## 6     80        Male 1.0000000

Parameters for the two normal distributions:

## [1] "Female--  Mean: 64.4285714285714,     Standard Deviation: 3.75778899367409"
## [1] "Male----  Mean: 70.9431818181818,    Standard Deviation: 3.45864884436555"

Bayesian Classifier (first 200 records)

##   height male_female     male_pos
## 1     55      Female 0.0005387538
## 2     60      Female 0.0126171389
## 3     65      Female 0.1803789018
## 4     70        Male 0.7335960589
## 5     75        Male 0.9615866723
## 6     80        Male 0.9939877375
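
With only 200 records the bins get sparse: the 200-record frequency table has no row for 79 inches, and many bins contain a single gender. The sketch below applies the per-bin male share to those counts and returns `None` where a bin is empty. The dictionary layout and function name are assumptions made for illustration.

```python
# Counts from the 200-record frequency table: height -> (female, male).
COUNTS_200 = {
    55: (1, 0), 56: (1, 0), 57: (6, 0), 58: (2, 0), 59: (4, 0), 60: (2, 0),
    61: (6, 0), 62: (9, 0), 63: (10, 0), 64: (11, 1), 65: (13, 2), 66: (14, 6),
    67: (10, 7), 68: (7, 6), 69: (9, 7), 70: (2, 13), 71: (4, 13), 72: (0, 7),
    73: (1, 3), 74: (0, 9), 75: (0, 5), 76: (0, 4), 77: (0, 2), 78: (0, 1),
    80: (0, 2),
}

def histogram_male_posterior(height, counts=COUNTS_200):
    """Share of records at this height that are male, or None for an empty bin."""
    female, male = counts.get(height, (0, 0))
    total = female + male
    if total == 0:
        return None  # no records at this height, so the histogram says nothing
    return male / total

for h in (55, 60, 65, 70, 75, 79, 80):
    print(h, histogram_male_posterior(h))
```

At 65 inches the estimate is 2 / 15 and at 70 inches it is 13 / 15, matching the 0.1333333 and 0.8666667 in the histogram classifier output for the 200-record sample.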