Pre-rotated Multi-scaling Digit Recognition: MATLAB Scripts

The main program of Pre-rotated Multi-scaling Digit Recognition is divided into four parts and uses two calibrated and trained versions of a CNN:

    1. CNNDigitTrainedNet(DigitCNNLayer); and
    2. CNNAugmented180Net(DigitCNNLayer + Rotation Augmented)

Each CNN ends with a 10-neuron fully connected layer for label classification. Both versions are calibrated on the MNIST database; see my report for details of the network architectures. CNNDigitTrainedNet and CNNAugmented180Net each take more than 30 minutes to train on a single GPU. To save time and resources, you only need to load the pre-calibrated "CNNDigitTrainedNet.mat" and "CNNAugmented180Net.mat" for label classification. To locate digits at multiple scales, I use a Gaussian filter together with an automatic 9-square-box search method.

Four Input File Images:

Main.m files

Case 1 : Computer-Generated Digits Images

clear all;  clc;  close all;

    imagepaper = imread('computer_generated.png') ;
    imagepaper = im2double(imagepaper);
    figure ; imshow(imagepaper) ; axis on
    [binaryImageWithBackground, binaryImageWithoutBackground] = dk_BackgroundRemover(imagepaper);
    SquareBoxScaleUp = 2.5   

    [searchDigitLocation, numbOfDigitFound] = dk_search_digit(binaryImageWithoutBackground, SquareBoxScaleUp) ;

    dk_fit_digit_9squareboxCase01(imagepaper, searchDigitLocation) ;
  
    load CNNDigitTrainedNet.mat
    load TestImageNormalizedCase01.mat
  
    computergeneratedLabelsInOrder = [3 4 3 8 2 0 7 2 1 0 9 6 5 0 1 0 7]
    TestLabels= computergeneratedLabelsInOrder  ;
    
    TestImage28X28X1 = TestImageNormalizedCase01(:,:,2,:);
    [TestPred, Testscores] = classify(CNNDigitTrainedNet,  TestImage28X28X1 ) ;
    confusionmat( categorical(TestLabels), TestPred )
    % (:) forces column vectors so that == compares element by element
    accuracy = sum(TestPred(:) == categorical(TestLabels(:)) ) /numel(TestLabels)

Case 2 : Computer-Generated-Pre-Rotated Digits Images:

clear all;  clc;  close all;

    imagepaper = imread('computer_generated_rotated.png') ; 
    imagepaper = im2double(imagepaper);
    figure ; imshow(imagepaper) ; axis on
    [binaryImageWithBackground, binaryImageWithoutBackground] = dk_BackgroundRemover(imagepaper);
    SquareBoxScaleUp = 2.0  

    [searchDigitLocation, numbOfDigitFound] = dk_search_digit(binaryImageWithoutBackground, SquareBoxScaleUp) ;
    rotatedAngle = -15  

    dk_fit_digit_9squarebox_rotatedCase02(imagepaper,searchDigitLocation, rotatedAngle ) ;
    
    load  CNNAugmented180Net.mat
    load TestImageNormalizedRotatedCase02.mat
  
    computergeneratedrotatedLabelsInOrder = [3 4 3 8 2  1 0 7 2 0  9 6 5 7 1  0 0]
    TestLabels =   computergeneratedrotatedLabelsInOrder ; 
    
    TestImage28X28X1 = TestImageNormalizedRotatedCase02(:,:,2,:);
    [TestPred, Testscores] = classify(CNNAugmented180Net,  TestImage28X28X1 ) ;
    confusionmat( categorical(TestLabels), TestPred )
    % (:) forces column vectors so that == compares element by element
    accuracy = sum(TestPred(:) == categorical(TestLabels(:)) ) /numel(TestLabels)

Case 3 : Handwritten Digits Images

clear all;  clc;  close all;

    imagepaper = imread('handwritten.png');
    imagepaper = im2double(imagepaper);
    figure ; imshow(imagepaper) ; axis on
    [binaryImageWithBackground, binaryImageWithoutBackground] = dk_BackgroundRemover(imagepaper);
    dk_search_digit(binaryImageWithoutBackground, 2.5) ;
    load finalpositionCase03.mat
    searchDigitLocation = finalpositionCase03 ;
    dk_fit_digit_9squareboxHandwrittenCase03(imagepaper,searchDigitLocation ) ;
    
    
    load CNNDigitTrainedNet.mat
    load TestImageNormalizedHandwrittenCase03
  
    HandwrittenLabelsInOrder = [7 7 0 0 3 8 2 4 0 2 1 8 4 9 6 8 6 1 3 1 8 4 1 1 5 8 2 3 4 2 ]
    TestLabels = HandwrittenLabelsInOrder  ;
    
    TestImage28X28X1 =   TestImageNormalizedHandwrittenCase03(:,:,1,:);
    [TestPred, Testscores] = classify(CNNDigitTrainedNet,  TestImage28X28X1 ) ;
    confusionmat( categorical(TestLabels), TestPred )
    % (:) forces column vectors so that == compares element by element
    accuracy = sum(TestPred(:) == categorical(TestLabels(:)) ) /numel(TestLabels)

Case 4: Handwritten-Pre-Rotated Digits Images

clear all;  clc;  close all;

    imagepaper = imread('handwritten_rotated.png');
    imagepaper = im2double(imagepaper);
    figure ; imshow(imagepaper) ; axis on
    [binaryImageWithBackground, binaryImageWithoutBackground] = dk_BackgroundRemover(imagepaper);
    dk_search_digit(binaryImageWithoutBackground, 1.59) ;
    load finalpositionCase04.mat
    searchDigitLocation = finalpositionCase04 ;    
    rotatedAngle = +15
    
    dk_fit_digit_9squarebox_handwrittenrotatedCase04(imagepaper,  searchDigitLocation,  rotatedAngle)
    
    load CNNDigitTrainedNet.mat
    load TestImageNormalizedHandwrittenRotatedCase04
  
    HandwrittenRotatedLabelsInOrder = [0 3 0 8 7 6 4 4 1 2 1 2 2 8 5 7 7 1 6 6  4 ]
    TestLabels =  categorical(HandwrittenRotatedLabelsInOrder);
    
    TestImage28X28X1 = TestImageNormalizedHandwrittenRotatedCase04(:,:,1,:); 
    [TestPred, Testscores] = classify(CNNDigitTrainedNet,  TestImage28X28X1 ) ;
    confusionmat( categorical(TestLabels), TestPred )
    % (:) forces column vectors so that == compares element by element
    accuracy = sum(TestPred(:) == categorical(TestLabels(:)) ) /numel(TestLabels)
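
The accuracy line shared by the four cases can be tried on its own with a few made-up labels. The sketch below is only an illustration: the label values and the names `classes`, `trueLabels` and `predLabels` are assumptions, not outputs of the networks. Both vectors are kept as columns with the same ten categories (0 to 9), so `==` compares them element by element.

```matlab
% Hypothetical ground truth and predictions for five digits (made-up values).
classes    = 0:9;                                   % the ten digit classes
trueLabels = categorical([3 4 3 8 2]', classes);    % column vector
predLabels = categorical([3 4 5 8 2]', classes);    % one mistake: a 3 predicted as 5

% Fraction of positions where prediction and ground truth agree.
accuracy = sum(predLabels == trueLabels) / numel(trueLabels);
fprintf('accuracy = %.2f\n', accuracy);
```

Four of the five made-up predictions match, so the sketch reports an accuracy of 0.80.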

Deep Net Calibration

Load train digit data

clear all;
memory

filenameImagesTrain = 'MNISTDB/train-images.idx3-ubyte';
filenameLabelsTrain = 'MNISTDB/train-labels.idx1-ubyte';
filenameImagesTest =  'MNISTDB/t10k-images.idx3-ubyte';
filenameLabelsTest =  'MNISTDB/t10k-labels.idx1-ubyte';

TrainImages = reshape((loadMNISTImages( filenameImagesTrain )),[28,28,1,60000]);
TrainLabels = ((loadMNISTLabels( filenameLabelsTrain  )));
TestImages = reshape((loadMNISTImages( filenameImagesTest )),[28,28,1,10000]);
TestLabels = ((loadMNISTLabels(filenameLabelsTest  )));

CNN Layer Configuration

InputLayer = imageInputLayer([28 28 1],'DataAugmentation','none','Normalization','none','Name','InputLayer');
CNNLayer01 = convolution2dLayer(4,32,'Stride',1,'Padding',0,'BiasLearnRateFactor',2,'NumChannels',1,...
        'WeightLearnRateFactor',2, 'WeightL2Factor',1,'BiasL2Factor',1,'Name','CNNLayer01');
CNNLayer01.Weights = randn([4 4 1 32])*0.1;
CNNLayer01.Bias = randn([1 1 32])*0.1;
ReluLayer01 = reluLayer('Name','ReluLayer01');
NormalizationLayer01 = crossChannelNormalizationLayer(3,'Name','NormalizationLayer01','Alpha',0.0001,'Beta',0.75,'K',2);
MaxPooling01 = maxPooling2dLayer(3,'Stride',3,'Name','MaxPooling01','Padding',1);
DropLayer01 = dropoutLayer(0.35, 'Name','DropLayer01');
CNNLayer02 = convolution2dLayer(3,16,'Stride',1, 'Padding',0,'BiasLearnRateFactor',1,'NumChannels',32,...
    'WeightLearnRateFactor',1, 'WeightL2Factor',1,'BiasL2Factor',1,'Name','CNNLayer02');
CNNLayer02.Weights = randn([3 3 32 16])*0.0001;
CNNLayer02.Bias = randn([1 1 16])*0.00001;
ReluLayer02 = reluLayer('Name','ReluLayer02');
NormalizationLayer02 = crossChannelNormalizationLayer(3,'Name','NormalizationLayer02','Alpha',0.0001,'Beta',0.75,'K',2);
DropLayer02 = dropoutLayer(0.25, 'Name','DropLayer02');
FullyConnectedOutput = fullyConnectedLayer(10,'WeightLearnRateFactor',1,'BiasLearnRateFactor',1,...
    'WeightL2Factor',1,'BiasL2Factor',1,'Name','FullyConnectedOutput');
FullyConnectedOutput.Weights = randn([10 784])*0.0001;
FullyConnectedOutput.Bias = randn([10 1])*0.0001+1;
SoftMaxL= softmaxLayer('Name','SoftMaxL');
classifyLabelOutput = classificationLayer('Name','classifyLabelOutput');
    
% options = trainingOptions('sgdm','LearnRateSchedule','piecewise','LearnRateDropFactor',0.75,... 
%       'LearnRateDropPeriod',1,'L2Regularization',0.0001,'MaxEpochs',16,'Momentum',0.9,'Shuffle','once',... 
%       'MiniBatchSize',15,'Verbose',1,'InitialLearnRate',0.043);
  
options = trainingOptions('sgdm','LearnRateSchedule','piecewise','LearnRateDropFactor',0.75,... 
      'LearnRateDropPeriod',1,'L2Regularization',0.0001,'MaxEpochs',16,'Momentum',0.9,'Shuffle','once',... 
      'MiniBatchSize',15,'Verbose',1,'InitialLearnRate',0.043,'Plots','training-progress');  
  
DigitCNNLayer =[InputLayer, CNNLayer01, ReluLayer01,NormalizationLayer01, MaxPooling01, DropLayer01,...
        CNNLayer02, ReluLayer02, NormalizationLayer02,DropLayer02, FullyConnectedOutput, SoftMaxL, classifyLabelOutput];
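
The `[10 784]` size given to `FullyConnectedOutput.Weights` follows from the layer sizes above. As a rough check, assuming the usual output-size formula floor((in + 2*pad - k)/stride) + 1 for convolution and pooling layers, the feature-map sizes can be worked out as below. The names `outSize`, `s0` to `s3` and `numFeatures` are made up for this sketch.

```matlab
% Side length of a square feature map after a conv/pool layer
% (assumed formula: floor((in + 2*pad - k) / stride) + 1).
outSize = @(in, k, stride, pad) floor((in + 2*pad - k) / stride) + 1;

s0 = 28;                        % InputLayer: 28x28x1 image
s1 = outSize(s0, 4, 1, 0);      % CNNLayer01: 4x4 filters, stride 1, padding 0
s2 = outSize(s1, 3, 3, 1);      % MaxPooling01: 3x3 window, stride 3, padding 1
s3 = outSize(s2, 3, 1, 0);      % CNNLayer02: 3x3 filters, stride 1, padding 0

numFeatures = s3 * s3 * 16;     % CNNLayer02 has 16 filters
fprintf('%d -> %d -> %d -> %d, features = %d\n', s0, s1, s2, s3, numFeatures);
```

Under that formula the side lengths go 28, 25, 9, 7, and 7 x 7 x 16 = 784 inputs reach the fully connected layer. The ReLU, normalization and dropout layers do not change the size.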

Calibrate CNNDigitTrainedNet

CNNDigitTrainedNet  = trainNetwork(TrainImages,categorical(TrainLabels),DigitCNNLayer,options) ;

[TestPred, Testscores] = classify(CNNDigitTrainedNet, TestImages )  ; 
 accuracy = sum(TestPred == categorical(TestLabels) ) /numel(TestLabels)  ;
 confusionmat(TestPred , categorical(TestLabels))

Calibrate CNNAugmented180Net

imageSize = [28 28 1];
imageAugmenter = imageDataAugmenter('RandRotation',[-180,180])

augimds = augmentedImageDatastore(imageSize,TrainImages,categorical(TrainLabels),'DataAugmentation',imageAugmenter);

options = trainingOptions('sgdm','LearnRateSchedule','piecewise','LearnRateDropFactor',0.75,... 
      'LearnRateDropPeriod',1,'L2Regularization',0.0001,'MaxEpochs',16,'Momentum',0.9,'Shuffle','once',... 
      'MiniBatchSize',15,'Verbose',1,'InitialLearnRate',0.043,'Plots','training-progress'); 

CNNAugmented180Net = trainNetwork(augimds,DigitCNNLayer,options) ;

[TestPred, Testscores] = classify(CNNAugmented180Net, TestImages )  ; 
 accuracy = sum(TestPred == categorical(TestLabels) ) /numel(TestLabels)  ;
 confusionmat(TestPred , categorical(TestLabels)) 

Functions:

dk_BackgroundRemover

function [binaryImageWithBackground, binaryImageWithoutBackground] = dk_BackgroundRemover(imagepaper)

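% Invert so that dark ink becomes bright, then rescale to the range [0, 1]
invertedImage = 1 - rgb2gray(imagepaper);
normalizedImage = (invertedImage - min(min(invertedImage))) / (max(max(invertedImage)) - min(min(invertedImage)));
squarebox = 9 ;
% Estimate the background with two order-statistic filters over a 9x9 window:
% first the smallest value, then the 5th-smallest value
backgroundNoiseImage = ordfilt2(normalizedImage, 1, ones(squarebox), 'symmetric');
backgroundNoiseImage = ordfilt2(backgroundNoiseImage, 5, ones(squarebox), 'symmetric');
% Subtract the background estimate and threshold at 0.5 to get the digit mask
substractNormalizedImagefrombackground = imsubtract(normalizedImage, backgroundNoiseImage);
binaryImageWithoutBackground = imbinarize(substractNormalizedImagefrombackground, 0.5);
%binaryImageWithoutBackground = imbinarize(substractNormalizedImagefrombackground, 0.7);
% Keep the intensities inside the mask and invert back (dark digits on white)
binaryImageWithBackground = 1 - (substractNormalizedImagefrombackground .* binaryImageWithoutBackground);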

end

dk_classify_all_digits

function dk_classify_all_digits( digits_validation, digits_training , location, scaling)
    TestSetNumber = length(digits_validation);
    predictedCorrect = 0;
    for kk = 1 : TestSetNumber 
        label = dk_classify_digit( digits_validation(kk).image, digits_training, location, scaling);
        if label == digits_validation(kk).label
                predictedCorrect = predictedCorrect + 1;
        end
    end
    accuracy = predictedCorrect / TestSetNumber;
    fprintf('Test Image Number is  %d \n', TestSetNumber)
    fprintf('Digit Image Correct-Predicted is %.2f%% \n\n', accuracy *100)

end

dk_classifyAllDigitsFoundOnPapers

function dk_classifyAllDigitsFoundOnPapers( binaryImageWithBackground, TestDigitsFound, numbOfDigitFound, actualLabelsInOrder, SquareBoxScaleUp) 
    scaling = 39; location = [20, 20]; load digits.mat
    digits_training = dk_prepare_digits(digits_training,location,scaling);
    predictionVector = zeros(1, numbOfDigitFound);

    for i = 1 : numbOfDigitFound
        predictionVector(i) = dk_classify_digit(binaryImageWithBackground,  digits_training, TestDigitsFound(i).location, TestDigitsFound(i).scaling);
    end
    disp('Predicted  Actual')
    disp([predictionVector(:) actualLabelsInOrder(:)])
    errorVector = find(predictionVector - actualLabelsInOrder ~= 0);
    fprintf('SquareBoxScaleUp   =  %.2f \n', SquareBoxScaleUp ) 
    fprintf('Digits Wrongly Predicted = %d   \n', length(errorVector) )
    accuracy = (1- (length(errorVector)/numbOfDigitFound))*100;
    fprintf('Prediction Accuracy =  %.2f%%.\n', accuracy )
end

dk_fit_digit_9squarebox

function dk_fit_digit_9squarebox(imagepaper,searchDigitLocation )
    figure(1); imagesc(imagepaper); 
    axis image;colormap gray ; hold on ;
    for jj = 1 : size(searchDigitLocation, 1)
        [centres, radius] = dk_place_regions(searchDigitLocation(jj, 1:2), searchDigitLocation(jj, 3));
        plot_squares(imagepaper, centres, radius)
        xlocationMin = min(centres(1,:)) ;
        xlocationMax = max(centres(1,:)) ;
        ylocationMin = min(centres(2,:)) ;
        ylocationMax = max(centres(2,:)) ;
        I2ImRd = imcrop(imagepaper ,[ xlocationMin  ylocationMin (xlocationMax-xlocationMin)  (ylocationMax-ylocationMin )]);
        resizedImage = imresize(I2ImRd, [28, 28]) ;
        b2wrResizedImage = (1 - resizedImage) ;
        name = sprintf('dkDataFolder/ComputerGeneratedFolder/fig0%s.png',num2str(jj) ) ;
        imwrite(b2wrResizedImage, name ) ;
    end
    hold off
end

dk_fit_digit_9squarebox_rotated

function dk_fit_digit_9squarebox_rotated(imagepaper,searchDigitLocation, rotatedAngle )
    figure(1); imagesc(imagepaper); 
    axis image;colormap gray ; hold on ;
    for jj = 1 : size(searchDigitLocation, 1)
        [centres, radius] = dk_place_regions(searchDigitLocation(jj, 1:2), searchDigitLocation(jj, 3));
        plot_squares(imagepaper, centres, radius) ;
        xlocationMin = min(centres(1,:))  ;
        xlocationMax = max(centres(1,:))  ;
        ylocationMin = min(centres(2,:))  ;
        ylocationMax = max(centres(2,:))  ;
        I2ImRd = imcrop(imagepaper ,[ xlocationMin  ylocationMin (xlocationMax-xlocationMin)  (ylocationMax-ylocationMin )]);
        resizedImage = imresize(I2ImRd, [28, 28])  ;
        b2wrResizedImage = (1 - resizedImage)   ;
        b2wrResizedImageRotated = imrotate( b2wrResizedImage,rotatedAngle) ;
        name = sprintf('dkDataFolder/ComputerGeneratedPreRotatedFolder/fig45Rotated0%s.png',num2str(jj) );
        imwrite(b2wrResizedImageRotated, name ) ;
    end
    hold off
end

dk_gaussian_filter

function filterOut = dk_gaussian_filter(image, Sigma )
    hsize =  2*Sigma  ; 
%     hsize =  4*Sigma  ; 
    Gau2DFilter = fspecial('gaussian', hsize, Sigma);
    filterOut = imfilter(image, Gau2DFilter, 'symmetric');
end

dk_gaussian_gradients

function [ grad_x, grad_y ] = dk_gaussian_gradients(image, Sigma )

GSFilterImg = dk_gaussian_filter(image, Sigma);
yh = [-0.5 0 0.5]' ;
xh = [-0.5 0 0.5]  ;
grad_y = imfilter(GSFilterImg, yh, 'symmetric');
grad_x = imfilter(GSFilterImg, xh, 'symmetric');

end
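
To see what the two kernels in dk_gaussian_gradients compute, they can be applied to a simple ramp image without the Gaussian smoothing step. This is only a sketch: the ramp and the name `ramp` are made up for illustration, and it assumes the Image Processing Toolbox (for `imfilter`) is available.

```matlab
% Made-up test image: intensity grows by 2 per pixel along x, constant along y.
[X, ~] = meshgrid(1:8, 1:6);
ramp = 2 * X;

xh = [-0.5 0 0.5];      % horizontal central-difference kernel
yh = [-0.5 0 0.5]';     % vertical central-difference kernel

% imfilter correlates by default, so each output is 0.5*(next - previous).
grad_x = imfilter(ramp, xh, 'symmetric');
grad_y = imfilter(ramp, yh, 'symmetric');

disp(grad_x(1, :));     % slope along x for the first row
disp(grad_y(1, :));     % slope along y for the first row
```

Away from the left and right borders grad_x equals the slope of the ramp (2), and grad_y is zero everywhere because the image does not change along y. At the borders the 'symmetric' padding repeats the edge pixel, which halves the estimated slope there.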

dk_get_patch

function patch = dk_get_patch( image, x, y, patch_radius )
    [yLength, xLength, ~] = size(image);
    xPatchLeft   = x - patch_radius ;
    xPatchRight  = x + patch_radius ;
    yPatchLeft   = y - patch_radius ;
    yPatchRight  = y + patch_radius ;
    if  xPatchLeft  < 1   ||  xPatchRight > xLength
        error('X_Patch out of bound.') 
    elseif yPatchLeft  < 1   ||  yPatchRight > yLength
           error('Y_Patch out of bound.')
    else
        rowN = yPatchLeft : yPatchRight;
        colN = xPatchLeft : xPatchRight;
        patch = image(rowN,colN);
    end
end
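
The indexing in dk_get_patch treats x as the column index and y as the row index, and returns a square of side 2*patch_radius + 1. A small sketch with a made-up 5 x 7 matrix (the values are assumptions chosen only so that each entry is easy to recognise):

```matlab
% Made-up image: 5 rows (y) by 7 columns (x), entries 1..35 in column order.
image = reshape(1:35, 5, 7);

x = 4; y = 3; patch_radius = 1;

% Same indexing as the else-branch of dk_get_patch: rows come from y, columns from x.
rowN  = (y - patch_radius) : (y + patch_radius);
colN  = (x - patch_radius) : (x + patch_radius);
patch = image(rowN, colN);

disp(patch);
```

The patch is the 3 x 3 block centred on row 3, column 4 of the made-up matrix.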

dk_gradient_descriptor

function descriptor = dk_gradient_descriptor( patched_image, location, scaling )

factor = 0.05 ;
SigmaScaled = round(scaling * factor);
imageGaussFilter = dk_gaussian_filter(patched_image, SigmaScaled);
[GaussGradientScaled_X, GaussGradientScaled_Y] = dk_gaussian_gradients(imageGaussFilter, SigmaScaled);
[localCentres, localRadius] = dk_place_regions(location,scaling );
squarebox = 9 ;
binNum = 8;
descriptor = zeros(squarebox*binNum, 1); % 8 bins x 9 squareboxes = 72 descriptors
for kk = 1 : squarebox  
    PatchedGaussGradientScaled_X = dk_get_patch(GaussGradientScaled_X, localCentres(1, kk), localCentres(2, kk), localRadius);
    PatchedGaussGradientScaled_Y = dk_get_patch(GaussGradientScaled_Y, localCentres(1, kk), localCentres(2, kk), localRadius);
    descriptor((kk-1)*binNum+1 : kk*binNum) = dk_gradient_histogram(PatchedGaussGradientScaled_X, PatchedGaussGradientScaled_Y);   
end  

% Descriptor Normalization
descriptor = (descriptor - min(descriptor)) / (max(descriptor) - min(descriptor));

end
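
The layout of the 72-element descriptor and the final min-max normalization can be illustrated without any image data. In the sketch below the histogram of each square box is replaced by a made-up constant (the box index `kk`), purely as a placeholder for the output of dk_gradient_histogram:

```matlab
squarebox = 9;      % number of square boxes
binNum    = 8;      % histogram bins per box
descriptor = zeros(squarebox*binNum, 1);

for kk = 1 : squarebox
    idx = (kk-1)*binNum+1 : kk*binNum;   % box kk fills entries 8*(kk-1)+1 .. 8*kk
    descriptor(idx) = kk;                % placeholder values, not a real histogram
end

% Min-max normalization, as at the end of dk_gradient_descriptor.
descriptor = (descriptor - min(descriptor)) / (max(descriptor) - min(descriptor));

fprintf('length = %d, min = %g, max = %g\n', numel(descriptor), min(descriptor), max(descriptor));
```

After normalization the smallest entry is 0 and the largest is 1. Note that if every entry of the descriptor were equal, the denominator would be zero and the result would be NaN, so a completely flat patch may need separate handling.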

dk_place_regions

function [localCentres, localRadius] = dk_place_regions(location, scaling)

% compute  local centers, local radius for 9 squareboxes 

localRadius = floor((scaling - 3) / 6);
relativeCenter = (3 * localRadius) + 2;
relativeCenterXY = [relativeCenter, relativeCenter];
relativeCenterLocation = [(relativeCenter - 2*localRadius), relativeCenter, (relativeCenter + 2*localRadius)];
[grid_X, grid_Y] = meshgrid(relativeCenterLocation);
distanceGridXY = bsxfun(@minus, [grid_X(:)'; grid_Y(:)'], relativeCenterXY');
localCentres = bsxfun(@plus, distanceGridXY, location');
localRadius = localRadius + 1;

end
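
As a worked example of the 9-square-box layout, the sketch below repeats the arithmetic of dk_place_regions for location = [20, 20] and scaling = 39, the values used in dk_classifyAllDigitsFoundOnPapers. The short names `r`, `offsets`, `dx` and `dy` are made up here; the offsets are written directly instead of going through relativeCenter, which gives the same result.

```matlab
location = [20, 20];    % [x, y] centre of the whole patch
scaling  = 39;          % side length of the whole patch in pixels

r = floor((scaling - 3) / 6);              % base radius before the final +1
offsets = [-2*r, 0, 2*r];                  % box centres relative to the patch centre
[dx, dy] = meshgrid(offsets);

% 2x9 matrix: row 1 holds the x-coordinates, row 2 the y-coordinates.
localCentres = bsxfun(@plus, [dx(:)'; dy(:)'], location');
localRadius  = r + 1;

disp(localCentres);
disp(localRadius);
```

For these values the centres lie on a 3 x 3 grid at 8, 20 and 32 in each direction and the radius is 7, so the boxes cover pixels 1 to 39 and neighbouring boxes overlap by a few pixels.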

dk_prepare_digits

function  digits_training = dk_prepare_digits(digits_training,location, scaling )

    for i = 1 : length( digits_training)
         digits_training(i).descriptor = dk_gradient_descriptor( digits_training(i).image, location, scaling);
    end

end

dk_search_digit

function [searchDigitLocation,numbOfDigitFound] = dk_search_digit(binaryImageWithoutBackground, SquareBoxScaleUp)
    connected = 4 ;
    [legnthPatchForEachDigit,numbOfDigitFound] = bwlabel(binaryImageWithoutBackground, connected);
    searchDigitLocation = zeros(numbOfDigitFound, 3);
    for kk = 1 : numbOfDigitFound
        [rowVector, colVector] = find(legnthPatchForEachDigit == kk);
        fitWithSquareBox = max([max(rowVector) - min(rowVector), max(colVector) - min(colVector)]);
        searchDigitLocation(kk, :) =round( [((max(colVector) + min(colVector))/2),((max(rowVector) + min(rowVector))/2),(SquareBoxScaleUp * fitWithSquareBox)] ) ;
    end  
end
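
A quick way to check dk_search_digit is to run it on a small synthetic mask. The sketch below assumes dk_search_digit.m is on the MATLAB path and the Image Processing Toolbox (for `bwlabel`) is installed. The mask `bw` and its two rectangular blobs are made up for illustration.

```matlab
% Made-up binary mask with two separate rectangular blobs.
bw = false(20, 30);
bw(5:13, 4:8)  = true;     % blob A: rows 5..13, columns 4..8
bw(3:7, 20:28) = true;     % blob B: rows 3..7, columns 20..28

SquareBoxScaleUp = 2.0;
[searchDigitLocation, numbOfDigitFound] = dk_search_digit(bw, SquareBoxScaleUp);

% Each row is [x_centre, y_centre, box_size].
disp(numbOfDigitFound);
disp(searchDigitLocation);
```

Each blob has a longest side of 8 pixels (max minus min), so both get a box size of 2.0 * 8 = 16. Blob A is centred at (x, y) = (6, 9) and blob B at (24, 5).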

dk_TestDigitsArray

function TestDigits = dk_TestDigitsArray(searchDigitLocation, numbOfDigitFound, actualLabelsInOrder)
    TestDigits = struct('location', [], 'scaling', [], 'label', []);
    for kk = 1 : numbOfDigitFound
        TestDigits(kk).location = searchDigitLocation(kk, 1:2);
        TestDigits(kk).scaling  = searchDigitLocation(kk, 3);
        TestDigits(kk).label    = actualLabelsInOrder(kk);  
    end

end

Outputs