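Character recognition is one of the more challenging tasks in machine learning. In this example, we build a support vector machine (SVM) model and use it for character recognition. The data set holds 20,000 observations of the 26 letters A to Z, each described by 16 integer features.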
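A minimal sketch of the fit-and-predict workflow is given here, since the code behind the output is not shown. It assumes the kernlab package, which matches the kernel names used in this example (vanilladot, rbfdot, polydot, tanhdot) but is not confirmed by the output itself. The `toy` data frame is a hypothetical stand-in for the real letter data, and the 75/25 split by row order is an assumption. The only grounded part of that split is the proportion: 20,000 observations in total and 5,000 test predictions.

```r
library(kernlab)  # provides ksvm()

set.seed(42)

# Hypothetical stand-in for the letter data: a 3-level factor label plus two
# integer features in 0..15. Replace with the real 20000 x 17 data frame.
n <- 300
charac <- factor(sample(c("A", "B", "C"), n, replace = TRUE))
shift <- c(A = 2, B = 7, C = 12)[as.character(charac)]
clamp <- function(v) pmin(15L, pmax(0L, as.integer(round(v))))
toy <- data.frame(
  charac = charac,
  xhash  = clamp(rnorm(n, mean = shift, sd = 1.5)),
  yhash  = clamp(rnorm(n, mean = 15 - shift, sd = 1.5))
)

# Assumed split: first 75% of rows for training, the rest for testing.
n_train <- floor(0.75 * n)
train <- toy[seq_len(n_train), ]
test  <- toy[(n_train + 1):n, ]

# Linear kernel. With no kpar given, kernlab prints
# "Setting default kernel parameters" for this kernel.
model <- ksvm(charac ~ ., data = train, kernel = "vanilladot")

charac_predictions <- predict(model, test)

# Confusion matrix: rows are predictions, columns are true labels.
conf_mat <- table(charac_predictions, test$charac)
print(conf_mat)
```

On the real data the same three calls (`ksvm()`, `predict()`, `table()`) would give a 26 x 26 confusion matrix like the ones shown in the output.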
## 'data.frame': 20000 obs. of 17 variables:
## $ charac : Factor w/ 26 levels "A","B","C","D",..: 12 15 18 3 20 16 10 6 23 10 ...
## $ xhash : int 3 2 3 5 3 2 3 2 5 6 ...
## $ yhash : int 7 4 7 6 4 2 9 1 7 9 ...
## $ breadth: int 3 3 3 6 3 3 5 3 5 9 ...
## $ depth : int 5 3 5 5 3 3 7 2 5 7 ...
## $ onpix : int 1 2 3 5 1 2 6 1 4 5 ...
## $ xb : int 0 7 6 5 5 6 9 5 3 8 ...
## $ yb : int 1 7 10 6 11 11 8 11 11 5 ...
## $ x2b : int 6 7 7 3 3 5 3 3 2 7 ...
## $ y2b : int 6 4 3 6 7 4 3 5 2 7 ...
## $ xyb : int 0 7 7 7 11 10 8 11 10 8 ...
## $ x2yb : int 0 5 4 7 9 7 4 9 9 6 ...
## $ xy2b : int 6 8 8 11 5 2 6 5 7 7 ...
## $ xedge : int 0 2 2 5 1 1 4 1 5 2 ...
## $ xedge_y: int 8 8 7 11 11 10 8 10 11 6 ...
## $ yedge : int 0 3 5 7 2 4 6 3 1 4 ...
## $ yedge_x: int 8 8 11 8 5 6 4 6 7 6 ...
## charac xhash yhash breadth
## U : 813 Min. : 0.000 Min. : 0.000 Min. : 0.000
## D : 805 1st Qu.: 3.000 1st Qu.: 5.000 1st Qu.: 4.000
## P : 803 Median : 4.000 Median : 7.000 Median : 5.000
## T : 796 Mean : 4.024 Mean : 7.035 Mean : 5.122
## M : 792 3rd Qu.: 5.000 3rd Qu.: 9.000 3rd Qu.: 6.000
## A : 789 Max. :15.000 Max. :15.000 Max. :15.000
## (Other):15202
## depth onpix xb yb
## Min. : 0.000 Min. : 0.000 Min. : 0.000 Min. : 0.0
## 1st Qu.: 4.000 1st Qu.: 2.000 1st Qu.: 6.000 1st Qu.: 6.0
## Median : 6.000 Median : 3.000 Median : 7.000 Median : 7.0
## Mean : 5.372 Mean : 3.506 Mean : 6.898 Mean : 7.5
## 3rd Qu.: 7.000 3rd Qu.: 5.000 3rd Qu.: 8.000 3rd Qu.: 9.0
## Max. :15.000 Max. :15.000 Max. :15.000 Max. :15.0
##
## x2b y2b xyb x2yb
## Min. : 0.000 Min. : 0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.: 3.000 1st Qu.: 4.000 1st Qu.: 7.000 1st Qu.: 5.000
## Median : 4.000 Median : 5.000 Median : 8.000 Median : 6.000
## Mean : 4.629 Mean : 5.179 Mean : 8.282 Mean : 6.454
## 3rd Qu.: 6.000 3rd Qu.: 7.000 3rd Qu.:10.000 3rd Qu.: 8.000
## Max. :15.000 Max. :15.000 Max. :15.000 Max. :15.000
##
## xy2b xedge xedge_y yedge
## Min. : 0.000 Min. : 0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.: 7.000 1st Qu.: 1.000 1st Qu.: 8.000 1st Qu.: 2.000
## Median : 8.000 Median : 3.000 Median : 8.000 Median : 3.000
## Mean : 7.929 Mean : 3.046 Mean : 8.339 Mean : 3.692
## 3rd Qu.: 9.000 3rd Qu.: 4.000 3rd Qu.: 9.000 3rd Qu.: 5.000
## Max. :15.000 Max. :15.000 Max. :15.000 Max. :15.000
##
## yedge_x
## Min. : 0.000
## 1st Qu.: 7.000
## Median : 8.000
## Mean : 7.801
## 3rd Qu.: 9.000
## Max. :15.000
##
## Setting default kernel parameters
##
## charac_predictions A B C D E F G H I J K L M N
## A 191 0 0 2 0 0 0 0 0 1 0 0 0 0
## B 0 171 0 7 1 1 0 5 1 0 2 0 0 0
## C 0 0 150 0 1 0 5 3 0 0 5 2 0 0
## D 0 6 0 184 0 0 4 9 5 4 1 0 0 5
## E 0 2 9 0 161 3 1 0 0 0 1 9 0 0
## F 0 0 1 0 1 162 1 6 4 0 0 0 0 0
## G 0 3 7 1 9 4 149 2 1 0 4 5 0 0
## H 0 0 2 1 0 0 1 133 0 1 3 2 2 8
## I 0 3 0 0 0 1 0 0 163 15 0 0 0 0
## J 1 0 0 0 0 0 0 2 5 159 0 0 0 0
## K 1 2 5 1 1 0 5 4 0 0 137 1 0 0
## L 0 0 1 0 1 0 3 0 0 0 2 163 0 0
## M 1 0 2 1 0 0 1 0 0 0 0 0 184 5
## N 0 0 0 3 0 0 0 2 0 2 0 0 1 177
## O 0 0 4 0 0 0 1 11 0 1 0 0 0 1
## P 0 0 0 0 0 2 1 0 0 0 0 0 0 0
## Q 0 0 0 0 2 0 10 3 0 0 0 4 0 0
## R 1 15 0 1 1 0 3 9 0 0 16 0 5 2
## S 1 4 0 0 3 3 2 0 2 1 0 2 0 0
## T 0 1 1 0 4 8 0 0 0 0 0 0 0 0
## U 1 0 3 0 0 0 0 2 0 0 0 0 2 0
## V 0 1 0 0 0 0 4 0 0 0 0 0 0 2
## W 0 0 0 0 0 0 1 0 0 0 0 0 8 0
## X 0 1 0 0 1 0 0 3 0 0 8 4 0 0
## Y 4 0 0 0 0 1 0 0 1 0 0 0 0 1
## Z 1 0 0 0 3 1 0 0 4 2 0 0 0 0
##
## charac_predictions O P Q R S T U V W X Y Z
## A 4 0 5 2 0 0 2 2 1 0 1 1
## B 0 0 1 4 10 0 0 4 0 0 0 0
## C 2 0 1 0 1 0 0 0 0 0 0 0
## D 6 1 0 6 1 0 0 0 0 2 0 0
## E 0 0 5 0 10 0 0 0 0 2 0 7
## F 0 15 0 0 7 2 0 0 0 1 4 0
## G 0 3 10 3 5 1 0 1 2 2 0 0
## H 16 0 0 3 0 0 1 4 2 0 2 0
## I 0 0 1 0 1 0 0 0 0 2 1 1
## J 0 0 2 0 1 0 0 0 0 1 0 4
## K 0 1 0 8 0 2 2 0 0 5 0 0
## L 0 0 0 1 4 0 0 0 0 1 0 1
## M 2 0 0 0 0 0 5 0 8 0 1 0
## N 0 0 0 1 0 0 0 1 0 0 0 0
## O 144 0 4 1 0 0 2 0 2 0 0 0
## P 0 145 0 0 0 0 0 1 0 0 1 0
## Q 0 1 179 2 6 0 0 0 0 0 3 1
## R 1 0 0 151 2 0 0 0 0 2 0 0
## S 0 1 7 0 119 4 0 0 0 1 1 15
## T 0 0 0 1 4 181 0 0 0 3 3 3
## U 2 0 0 0 0 1 173 0 0 0 0 0
## V 0 1 0 0 0 0 0 170 1 0 7 0
## W 5 0 0 0 0 0 1 2 168 0 0 0
## X 0 0 0 0 0 2 0 0 0 186 1 0
## Y 0 2 0 0 0 4 0 2 0 0 194 0
## Z 0 0 1 0 24 1 0 0 0 0 0 135
## correct_predictn
## FALSE TRUE
## 771 4229
## correct_predictn
## FALSE TRUE
## 0.1542 0.8458
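The two `correct_predictn` tables above are a count of right and wrong predictions followed by the same counts as proportions. A small base R sketch of that arithmetic, using hypothetical toy vectors in place of the real predictions (the names `predicted` and `actual` are illustrative only), might look like this:

```r
# Toy vectors standing in for the predicted and true labels (hypothetical values).
lv <- c("A", "B", "C", "D")
predicted <- factor(c("A", "B", "B", "D", "A"), levels = lv)
actual    <- factor(c("A", "B", "C", "D", "D"), levels = lv)

# TRUE where the prediction matches the true label.
correct_predictn <- predicted == actual

print(table(correct_predictn))              # counts of FALSE / TRUE
print(prop.table(table(correct_predictn)))  # the same counts as proportions

# The counts reported above for the first model, turned into proportions.
reported <- c("FALSE" = 771, "TRUE" = 4229)
print(reported / sum(reported))  # 0.1542 and 0.8458
```

The reported counts sum to 5,000 test cases, so 4229 / 5000 = 0.8458 is the share of test characters classified correctly.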
##
## charac_predictions_rbf A B C D E F G H I J K L M
## A 197 0 0 0 0 0 0 0 0 0 0 0 0
## B 0 196 0 3 2 3 1 5 0 0 0 0 5
## C 1 0 162 0 2 0 1 0 0 0 0 0 0
## D 0 2 0 193 0 1 3 4 3 1 0 0 0
## E 0 1 7 0 171 6 0 0 0 1 0 9 0
## F 0 0 0 0 0 169 0 1 3 0 0 0 0
## G 0 0 5 0 10 0 184 4 0 0 0 5 1
## H 0 0 1 2 0 1 0 162 0 1 3 1 0
## I 0 1 0 0 0 0 0 0 171 8 0 0 0
## J 0 0 0 0 0 0 0 0 7 170 0 0 0
## K 0 0 0 0 1 0 1 5 0 0 161 0 0
## L 0 0 0 0 0 0 0 0 0 0 0 170 0
## M 1 0 1 0 0 0 0 0 0 0 0 0 192
## N 0 0 0 1 0 0 0 1 0 1 0 0 1
## O 0 0 6 0 0 0 1 5 0 2 0 0 0
## P 0 0 0 0 0 2 0 0 0 0 0 0 0
## Q 0 0 0 0 0 0 0 1 0 1 0 0 0
## R 0 7 1 1 1 0 0 4 0 0 11 2 0
## S 0 2 0 0 0 0 0 0 1 1 0 2 0
## T 0 0 0 0 0 3 0 0 0 0 0 0 0
## U 0 0 2 1 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0 0 0
## W 0 0 0 0 0 0 1 0 0 0 0 0 3
## X 0 0 0 0 0 0 0 0 0 0 4 3 0
## Y 3 0 0 0 0 1 0 2 0 0 0 0 0
## Z 0 0 0 0 2 0 0 0 1 0 0 0 0
##
## charac_predictions_rbf N O P Q R S T U V W X Y Z
## A 0 0 0 0 0 0 0 0 0 0 0 0 0
## B 0 0 0 1 2 1 0 0 5 1 0 0 0
## C 0 0 0 0 0 1 0 0 0 0 0 0 0
## D 4 6 1 0 3 0 0 0 0 0 1 0 0
## E 0 0 0 0 0 1 1 0 0 0 1 0 3
## F 0 0 13 0 0 6 2 0 1 0 0 0 0
## G 0 1 1 2 1 0 0 0 0 2 0 0 0
## H 3 0 0 1 0 0 4 0 1 1 0 2 0
## I 0 0 0 0 0 0 0 0 0 0 0 0 0
## J 0 0 0 0 0 0 0 0 0 0 0 0 0
## K 0 0 1 0 2 0 0 2 0 0 3 0 0
## L 0 0 0 0 0 0 0 0 0 0 0 0 0
## M 1 0 0 0 0 0 0 2 0 3 0 1 0
## N 180 0 0 0 1 0 0 0 0 0 0 0 0
## O 4 166 0 3 0 0 0 1 0 2 1 0 0
## P 0 0 152 0 0 0 1 0 0 0 0 1 0
## Q 0 3 1 208 0 0 0 0 0 0 0 0 3
## R 7 0 0 0 174 1 0 0 4 0 1 0 0
## S 0 0 0 1 0 184 2 0 0 0 0 0 1
## T 0 0 0 0 0 0 184 0 0 0 0 2 0
## U 0 2 0 0 0 0 1 178 0 1 0 0 0
## V 0 0 0 0 0 0 0 0 174 1 0 3 0
## W 1 4 0 0 0 0 0 3 1 173 0 0 0
## X 0 0 0 0 0 0 2 0 0 0 201 1 1
## Y 1 0 1 0 0 0 1 0 1 0 0 209 0
## Z 0 0 0 0 0 1 0 0 0 0 0 0 160
## correct_predictn_rbf
## FALSE TRUE
## 359 4641
## correct_predictn_rbf
## FALSE TRUE
## 0.0718 0.9282
## Setting default kernel parameters
##
## charac_predictions_poly A B C D E F G H I J K L
## A 191 0 0 2 0 0 0 0 0 1 0 0
## B 0 171 0 7 1 1 0 5 1 0 2 0
## C 0 0 150 0 1 0 5 3 0 0 5 2
## D 0 6 0 184 0 0 4 9 5 4 1 0
## E 0 2 9 0 161 3 1 0 0 0 1 9
## F 0 0 1 0 1 162 1 6 4 0 0 0
## G 0 3 7 1 9 4 149 2 1 0 4 5
## H 0 0 2 1 0 0 1 133 0 1 3 2
## I 0 3 0 0 0 1 0 0 163 15 0 0
## J 1 0 0 0 0 0 0 2 5 159 0 0
## K 1 2 5 1 1 0 5 4 0 0 137 1
## L 0 0 1 0 1 0 3 0 0 0 2 163
## M 1 0 2 1 0 0 1 0 0 0 0 0
## N 0 0 0 3 0 0 0 2 0 2 0 0
## O 0 0 4 0 0 0 1 11 0 1 0 0
## P 0 0 0 0 0 2 1 0 0 0 0 0
## Q 0 0 0 0 2 0 10 3 0 0 0 4
## R 1 15 0 1 1 0 3 9 0 0 16 0
## S 1 4 0 0 3 3 2 0 2 1 0 2
## T 0 1 1 0 4 8 0 0 0 0 0 0
## U 1 0 3 0 0 0 0 2 0 0 0 0
## V 0 1 0 0 0 0 4 0 0 0 0 0
## W 0 0 0 0 0 0 1 0 0 0 0 0
## X 0 1 0 0 1 0 0 3 0 0 8 4
## Y 4 0 0 0 0 1 0 0 1 0 0 0
## Z 1 0 0 0 3 1 0 0 4 2 0 0
##
## charac_predictions_poly M N O P Q R S T U V W X
## A 0 0 4 0 5 2 0 0 2 2 1 0
## B 0 0 0 0 1 4 10 0 0 4 0 0
## C 0 0 2 0 1 0 1 0 0 0 0 0
## D 0 5 6 1 0 6 1 0 0 0 0 2
## E 0 0 0 0 5 0 10 0 0 0 0 2
## F 0 0 0 15 0 0 7 2 0 0 0 1
## G 0 0 0 3 10 3 5 1 0 1 2 2
## H 2 8 16 0 0 3 0 0 1 4 2 0
## I 0 0 0 0 1 0 1 0 0 0 0 2
## J 0 0 0 0 2 0 1 0 0 0 0 1
## K 0 0 0 1 0 8 0 2 2 0 0 5
## L 0 0 0 0 0 1 4 0 0 0 0 1
## M 184 5 2 0 0 0 0 0 5 0 8 0
## N 1 177 0 0 0 1 0 0 0 1 0 0
## O 0 1 144 0 4 1 0 0 2 0 2 0
## P 0 0 0 145 0 0 0 0 0 1 0 0
## Q 0 0 0 1 179 2 6 0 0 0 0 0
## R 5 2 1 0 0 151 2 0 0 0 0 2
## S 0 0 0 1 7 0 119 4 0 0 0 1
## T 0 0 0 0 0 1 4 181 0 0 0 3
## U 2 0 2 0 0 0 0 1 173 0 0 0
## V 0 2 0 1 0 0 0 0 0 170 1 0
## W 8 0 5 0 0 0 0 0 1 2 168 0
## X 0 0 0 0 0 0 0 2 0 0 0 186
## Y 0 1 0 2 0 0 0 4 0 2 0 0
## Z 0 0 0 0 1 0 24 1 0 0 0 0
##
## charac_predictions_poly Y Z
## A 1 1
## B 0 0
## C 0 0
## D 0 0
## E 0 7
## F 4 0
## G 0 0
## H 2 0
## I 1 1
## J 0 4
## K 0 0
## L 0 1
## M 1 0
## N 0 0
## O 0 0
## P 1 0
## Q 3 1
## R 0 0
## S 1 15
## T 3 3
## U 0 0
## V 7 0
## W 0 0
## X 1 0
## Y 194 0
## Z 0 135
## correct_predictn_poly
## FALSE TRUE
## 771 4229
## correct_predictn_poly
## FALSE TRUE
## 0.1542 0.8458
## Setting default kernel parameters
##
## charac_predictions_tan A B C D E F G H I J K L M N O P Q
## A 29 34 1 27 6 3 12 39 18 32 30 33 63 22 42 2 67
## B 2 55 10 19 13 6 19 8 4 3 9 3 6 0 8 4 27
## C 0 0 13 0 0 0 6 0 1 4 3 1 0 0 2 0 0
## D 2 3 1 23 0 15 1 0 1 0 0 4 0 4 6 9 2
## E 0 0 23 0 7 0 10 0 2 0 2 2 0 0 0 0 1
## F 0 4 9 2 9 49 1 6 51 21 1 0 0 7 1 88 0
## G 1 1 9 1 3 1 3 16 0 0 25 19 0 10 8 0 27
## H 5 6 6 27 1 7 9 9 0 2 1 6 23 15 21 17 16
## I 74 27 12 35 17 17 12 2 17 70 5 18 0 3 0 5 1
## J 29 13 8 18 25 1 34 15 38 20 10 14 7 4 28 5 18
## K 7 1 28 0 11 16 9 5 1 11 14 20 16 7 8 1 1
## L 38 15 34 11 67 0 48 17 27 1 29 25 0 5 25 0 14
## M 1 3 0 1 0 0 1 5 0 0 2 0 11 9 0 0 0
## N 0 0 0 1 0 0 0 4 0 1 0 0 26 12 1 1 1
## O 1 5 1 7 0 0 3 42 0 5 2 19 34 34 21 4 7
## P 0 5 0 5 0 8 1 2 0 4 0 0 0 2 0 5 0
## Q 1 8 1 0 1 0 15 2 0 1 1 5 1 2 0 1 4
## R 5 5 6 3 5 6 1 1 2 3 2 2 0 1 1 1 0
## S 0 17 1 8 11 3 3 3 1 2 2 4 0 1 3 3 11
## T 4 0 16 10 8 36 0 4 5 0 15 0 1 11 2 12 0
## U 0 0 1 3 1 0 1 3 0 0 2 1 11 12 1 0 0
## V 2 0 4 0 2 15 1 11 0 1 15 1 1 30 2 9 12
## W 0 1 0 0 0 1 2 0 1 0 4 3 2 6 1 2 3
## X 0 0 1 0 0 1 0 0 16 3 5 7 0 0 0 0 0
## Y 0 0 0 0 0 0 0 0 1 0 0 1 0 4 0 1 4
## Z 1 6 0 0 2 1 0 0 0 2 0 4 0 0 1 0 0
##
## charac_predictions_tan R S T U V W X Y Z
## A 38 9 1 14 3 6 24 1 6
## B 41 44 2 0 1 2 19 4 26
## C 0 2 0 1 2 0 0 0 0
## D 3 0 18 0 0 0 2 5 2
## E 0 0 2 0 0 0 1 0 6
## F 1 23 44 4 36 14 5 36 19
## G 1 4 1 6 1 1 2 0 0
## H 12 0 1 13 13 15 0 14 1
## I 22 20 8 0 1 0 32 7 21
## J 23 32 3 1 0 0 34 0 20
## K 6 0 2 11 11 5 0 3 0
## L 12 16 5 11 0 0 40 0 35
## M 1 0 1 0 2 1 0 0 0
## N 0 0 0 5 19 44 0 13 0
## O 3 2 0 25 0 2 1 0 0
## P 0 0 8 0 13 0 0 20 0
## Q 10 11 1 5 4 5 1 8 2
## R 5 6 0 0 0 0 3 0 1
## S 2 8 8 0 2 0 4 2 12
## T 0 6 11 42 6 1 22 19 14
## U 0 0 7 14 15 16 3 2 0
## V 0 3 28 27 38 54 6 49 1
## W 1 0 1 0 6 2 0 7 0
## X 2 0 35 2 2 0 8 18 1
## Y 0 1 10 5 12 16 0 8 0
## Z 0 8 1 0 0 0 1 3 1
## correct_predictn_tan
## FALSE TRUE
## 4588 412
## correct_predictn_tan
## FALSE TRUE
## 0.9176 0.0824
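Based on the results obtained with four kernels, namely vanilladot (linear), rbfdot (radial basis), polydot (polynomial) and tanhdot (hyperbolic tangent sigmoid), the radial basis kernel gives the most accurate result. It classifies 92.82% of the 5,000 test characters correctly, compared with 84.58% for both the linear and polynomial kernels and only 8.24% for the hyperbolic tangent kernel. The linear and polynomial results are identical, most likely because polydot was run with its default parameters, where the polynomial degree is 1.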
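The comparison across kernels can be sketched as a loop over the four kernel names. This assumes kernlab's `ksvm()` with default kernel parameters; the `toy` data frame and its columns `f1` and `f2` are hypothetical stand-ins, so the accuracies it prints are not the ones reported here. The last lines illustrate a likely reason for the identical linear and polynomial results: with default parameters (degree 1, scale 1, offset 1), `polydot()` is the linear kernel plus a constant.

```r
library(kernlab)

set.seed(1)

# Hypothetical two-feature, three-class stand-in for the letter data.
n <- 300
label <- factor(sample(c("A", "B", "C"), n, replace = TRUE))
centre <- c(A = 2, B = 7, C = 12)[as.character(label)]
toy <- data.frame(
  charac = label,
  f1 = rnorm(n, mean = centre, sd = 1.5),
  f2 = rnorm(n, mean = 15 - centre, sd = 1.5)
)
train <- toy[1:225, ]
test  <- toy[226:300, ]

# The four kernels compared in this example.
kernels <- c("vanilladot", "rbfdot", "polydot", "tanhdot")

# Fit one model per kernel and record the share of correct test predictions.
accuracy <- sapply(kernels, function(k) {
  model <- ksvm(charac ~ ., data = train, kernel = k)
  mean(predict(model, test) == test$charac)
})
print(round(accuracy, 4))

# Default polydot() versus vanilladot() on two small vectors.
x <- c(1, 2, 3)
y <- c(4, 5, 6)
lin  <- as.numeric(vanilladot()(x, y))  # <x, y> = 32
poly <- as.numeric(polydot()(x, y))     # (1 * <x, y> + 1)^1 = 33
print(c(linear = lin, polynomial = poly))
```

To get a genuinely non-linear polynomial kernel, the degree would have to be set explicitly, for example through the `kpar = list(degree = 2)` argument of `ksvm()`.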