Tuesday, November 22, 2005
Wednesday, October 05, 2005
Training a neural network for Bangla character recognition
[::] Background:
A network is to be designed and trained to recognize 4 letters of the Bangla alphabet (aw, ba, ka, ra).
Each letter is represented as a grid of boolean values, 15 columns wide by 21 rows high. A matrix ALPHABET contains the bit maps of the 4 letters, one 315-element column per letter.
The network receives the 315 boolean values as a 315-element input vector. It is then required to identify the letter by responding with a 4-element output vector. Each of the 4 elements of the output vector represents a letter.
A target vector is defined for each letter, and the target vectors are collected in the matrix TARGETS. Each target vector has 4 elements, all zeros except for a single 1. 'aw' has a 1 in the first element, 'ba' in the second, etc.
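As a small illustration of this encoding (a sketch, not part of the original scripts; the names `letters` and `kaTarget` are made up for this example), the four target vectors are simply the columns of a 4-by-4 identity matrix:

```matlab
% One-hot target vectors: column k is the target for letter k.
letters = {'aw', 'ba', 'ka', 'ra'};   % illustrative labels only
targets = eye(4);                     % same construction as in bocrProb2.m

% Target for 'ka' (third letter): a 1 in position 3, zeros elsewhere.
kaTarget = targets(:, 3);
disp(kaTarget');
```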
A feed-forward network was trained to recognize Bangla character bit maps in the presence of noise.
The neural network needs 315 inputs and 4 neurons in its output layer to identify the letters. The network is a two-layer log-sigmoid / log-sigmoid network. The log-sigmoid transfer function was picked because its output range (0 to 1) is well suited to learning to output boolean values. The hidden (first) layer has 10 neurons. This number was picked by guesswork.
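To make the layer sizes concrete, here is a rough sketch of the forward pass of such a 315-10-4 log-sigmoid network in plain MATLAB, without the toolbox. The weights are random placeholders rather than trained values, and the names `W1`, `b1`, `W2`, `b2` and `logsigFn` are assumptions made for this example.

```matlab
% Forward pass of a 315-10-4 log-sigmoid / log-sigmoid network (untrained).
R  = 315;  % inputs (one per pixel of the 15 x 21 bit map)
S1 = 10;   % hidden neurons
S2 = 4;    % output neurons, one per letter

logsigFn = @(n) 1 ./ (1 + exp(-n));   % log-sigmoid, output in (0, 1)

% Random placeholder weights and biases; a real network would learn these.
W1 = 0.1 * randn(S1, R);   b1 = zeros(S1, 1);
W2 = 0.1 * randn(S2, S1);  b2 = zeros(S2, 1);

p  = double(rand(R, 1) > 0.5);   % a random boolean input vector
a1 = logsigFn(W1 * p + b1);      % hidden layer output, 10 x 1
a2 = logsigFn(W2 * a1 + b2);     % network output, 4 x 1
```

The output `a2` holds one value between 0 and 1 per letter; training pushes the correct one toward 1 and the others toward 0.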
The network is trained to output a 1 in the correct position of the output vector and to fill the rest of the output vector with 0’s. However, noisy input vectors may result in the network not creating perfect 1’s and 0’s. After the network is trained the output is passed through the competitive transfer function compet. This makes sure that the output corresponding to the letter most like the noisy input vector takes on a value of 1, and all others have a value of 0. The result of this post-processing is the output that is actually used.
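The effect of this post-processing can be sketched without the toolbox. The code below is not the toolbox's `compet` itself, only a hand-written stand-in that puts a 1 at the largest element of each column and 0 elsewhere; the output values in `A` are made up.

```matlab
% Winner-take-all post-processing on raw network outputs.
% Each column of A is one 4-element output vector.
A = [0.10 0.70;
     0.20 0.15;
     0.65 0.05;
     0.30 0.40];

[maxVal, winner] = max(A, [], 1);     % row index of the largest value per column
AA = zeros(size(A));
AA(sub2ind(size(A), winner, 1:size(A, 2))) = 1;

disp(AA);   % column 1 selects letter 3 ('ka'), column 2 selects letter 1 ('aw')
```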
To create a network that can handle noisy input vectors, it had to be trained on both ideal and noisy vectors. To do this, the network was first trained on ideal vectors until it had a low sum-squared error. Then the network was trained on 10 sets of ideal and noisy vectors. In each set, two copies of the noise-free alphabet were presented together with the noisy vectors; the noise-free copies were used to maintain the network’s ability to classify ideal input vectors. Unfortunately, after this training the network might have learned to classify some difficult noisy vectors at the expense of properly classifying a noise-free vector. Therefore, the network was trained again on just ideal vectors. This ensured that the network would respond perfectly when presented with an ideal letter. All training was done using backpropagation with both adaptive learning rate and momentum, with the function traingdx.
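The layout of one noisy training batch can be sketched as follows. This only illustrates how the inputs and targets line up: `alphabet` is replaced by a random stand-in matrix so the snippet runs on its own, and the noise levels 0.1 and 0.2 are the ones used in the nnBOCR2.m script.

```matlab
% Build one training batch: two clean copies plus two noisy copies.
R = 315;  Q = 4;
alphabet = double(rand(R, Q) > 0.5);   % stand-in for the real bit maps
targets  = eye(Q);

P = [alphabet, alphabet, ...
     alphabet + 0.1 * randn(R, Q), ...
     alphabet + 0.2 * randn(R, Q)];    % 315 x 16 inputs
T = repmat(targets, 1, 4);             % 4 x 16 targets, same column order

% Column k of P always belongs to letter mod(k-1, Q) + 1.
[maxVal, labelOfColumn] = max(T, [], 1);
```

Fresh noise is drawn on every pass, so the network sees 10 different noisy batches while the clean copies stay the same.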
To test the system, noisy versions of all 4 letters were created and presented to the network.
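When a whole batch of noisy letters is tested at once, misclassifications can be counted by comparing the winner-take-all outputs with the targets. A wrong column differs from its target in exactly two places (the missing 1 and the misplaced 1), which is why the sum of absolute differences is halved. A minimal sketch with made-up outputs:

```matlab
% Count misclassified letters from one-hot outputs and targets.
T  = eye(4);                 % targets for aw, ba, ka, ra
AA = [1 0 0 0;               % hypothetical post-processed outputs:
      0 0 0 0;               % 'ba' (column 2) was mistaken for 'ka'
      0 1 1 0;
      0 0 0 1];

numErrors = sum(sum(abs(AA - T))) / 2;   % each wrong column contributes 2
errorRate = numErrors / size(T, 2);      % fraction of letters misclassified

fprintf('%d error(s), error rate %.2f\n', numErrors, errorRate);
```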
Script files used are:
1) imagePreprocess2.m
2) bocrProb2.m
3) nnBOCR2.m
4) vector2array.m
5) recognitionTest.m
[::] Source Code:
// imagePreprocess2.m // begin of code
%read images
Iaw = imread('J:\MatLabWork\nn-bocr2\testImages\aw.bmp');
Iba = imread('J:\MatLabWork\nn-bocr2\testImages\ba.bmp');
Ika = imread('J:\MatLabWork\nn-bocr2\testImages\ka.bmp');
Ira = imread('J:\MatLabWork\nn-bocr2\testImages\ra.bmp');
level = graythresh(Iaw);
bwIaw = im2bw(Iaw,level);
level = graythresh(Iba);
bwIba = im2bw(Iba,level);
level = graythresh(Ika);
bwIka = im2bw(Ika,level);
level = graythresh(Ira);
bwIra = im2bw(Ira,level);
%imview(Iaw)
%Iaw --> 79x65
%imview(Iba)
%Iba --> 56x64
%imview(Ika)
%Ika --> 70x72
%imview(Ira)
%Ira --> 56x64
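row = 15; % width of the resized bit map (number of image columns)
col = 21; % height of the resized bit map (number of image rows)
letterCount = 4;
elements = row * col; % 315 network inputs per letter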
%imresize(bwIaw,0.25);
Jaw = imresize( bwIaw, [col row], 'bilinear');
%imresize(bwIba,0.25);
Jba = imresize( bwIba, [col row], 'bilinear' );
%imresize(bwIka,0.25);
Jka = imresize( bwIka, [col row], 'bilinear' );
%imresize(bwIra,0.25);
Jra = imresize( bwIra, [col row], 'bilinear' );
// imagePreprocess2.m // end of code
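The resized images are 21-by-15 logical arrays, while the network expects 315-element column vectors. One way to bridge the two, assuming the same row-by-row ordering as the hand-typed bit maps in bocrProb2.m, is to transpose before reshaping, because MATLAB's `reshape` reads column by column. The snippet uses a small random array in place of `Jaw` so it runs on its own; the names `J`, `v` and `back` are illustrative.

```matlab
% Flatten a 21 x 15 binary image into a 315 x 1 vector, row by row.
height = 21;  width = 15;
J = rand(height, width) > 0.5;      % stand-in for a resized letter such as Jaw

v = double(reshape(J', [], 1));     % transpose so image rows are read first

% Undo the flattening (same element order as vector2array(v, 15, 21)).
back = reshape(v, width, height)';
```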
// bocrProb2.m // begin of code
function [alphabet,targets] = bocrProb2()
%BOCRPROB2 Character recognition problem definition
%
% [ALPHABET,TARGETS] = bocrProb2()
% Returns:
% ALPHABET - 315x4 matrix of 15x21 bit maps for each letter.
% TARGETS - 4x4 target vectors.
% nAwsher
letterAw = [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 ...
1 1 1 1 1 1 0 0 1 1 1 0 1 1 1 ...
1 1 0 1 1 0 0 0 0 1 1 0 1 1 1 ...
1 1 0 1 1 0 0 0 0 1 1 0 1 1 1 ...
1 1 0 1 1 0 0 0 1 1 1 0 1 1 1 ...
1 1 1 0 1 0 0 0 1 0 1 0 1 1 1 ...
1 1 1 0 1 1 0 1 1 0 1 0 1 1 1 ...
1 1 1 0 1 1 1 1 1 0 1 0 1 1 1 ...
1 1 1 1 0 1 1 1 1 0 1 0 1 1 1 ...
1 1 1 1 0 1 1 1 1 0 1 0 1 1 1 ...
1 1 1 1 1 0 1 1 0 1 0 0 1 1 1 ...
1 1 1 1 1 0 0 0 0 1 1 0 1 1 1 ...
1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]';
letterBa = [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 ...
1 1 1 1 1 1 1 1 1 0 0 0 1 1 1 ...
1 1 1 1 1 1 1 0 0 0 1 0 1 1 1 ...
1 1 1 1 1 1 0 0 1 1 1 0 1 1 1 ...
1 1 1 1 1 0 0 1 1 1 1 0 1 1 1 ...
1 1 1 0 0 0 1 1 1 1 1 0 1 1 1 ...
1 1 1 0 0 0 1 1 1 1 1 0 1 1 1 ...
1 1 1 0 0 0 0 1 1 1 1 0 1 1 1 ...
1 1 1 1 1 0 0 0 1 1 1 0 1 1 1 ...
1 1 1 1 1 1 1 0 0 1 1 0 1 1 1 ...
1 1 1 1 1 1 1 1 0 0 1 0 1 1 1 ...
1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]';
letterKa = [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 ...
1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 ...
1 1 1 1 1 1 0 0 0 0 0 1 1 1 1 ...
1 1 1 1 0 0 1 1 0 1 0 0 1 1 1 ...
1 1 1 0 0 1 1 1 0 1 1 0 1 1 1 ...
1 1 0 0 1 1 1 1 0 1 1 1 1 1 1 ...
1 1 0 0 1 1 1 1 0 1 1 1 1 1 1 ...
1 1 1 0 0 1 1 1 0 1 0 0 0 1 1 ...
1 1 1 1 1 0 1 1 0 1 0 0 1 1 1 ...
1 1 1 1 1 1 0 1 0 1 0 0 1 1 1 ...
1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]';
letterRa = [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 ...
1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 ...
1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 ...
1 1 1 1 1 1 1 1 0 0 0 1 1 1 1 ...
1 1 1 1 1 1 0 0 0 1 0 1 1 1 1 ...
1 1 1 1 1 0 0 1 1 1 0 1 1 1 1 ...
1 1 1 1 0 0 1 1 1 1 0 1 1 1 1 ...
1 1 1 0 0 1 1 1 1 1 0 1 1 1 1 ...
1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 ...
1 1 1 0 0 0 1 1 1 1 0 1 1 1 1 ...
1 1 1 1 1 0 0 0 1 1 0 1 1 1 1 ...
1 1 1 1 1 1 1 0 0 1 0 1 1 1 1 ...
1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 ...
1 1 1 1 1 0 1 1 1 0 0 1 1 1 1 ...
1 1 1 1 1 0 0 1 1 1 0 1 1 1 1 ...
1 1 1 1 1 0 0 1 1 1 0 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]';
alphabet = [letterAw,letterBa,letterKa,letterRa];
targets = eye(4);
// bocrProb2.m // end of code
// nnBOCR2.m // begin of code
% NEWFF - Initializes feed-forward networks.
% TRAINGDX - Trains a feed-forward network with faster backpropagation.
% SIM - Simulates feed-forward networks.
% CHARACTER RECOGNITION:
% Using the above functions a feed-forward network is trained
% to recognize character bit maps, in the presence of noise.
fprintf('Strike any key to continue...\n');
pause % Strike any key to continue...
% DEFINING THE MODEL PROBLEM
% ==========================
% The script file bocrProb2 defines a matrix ALPHABET
% which contains the bit maps of 4 letters of the
% bangla alphabet (aw, ba, ka, ra).
% This file also defines target vectors TARGETS for
% each letter. Each target vector has 4 elements with
% all zeros, except for a single 1. Aw has a 1 in the
% first element, Ba in the second, etc.
[alphabet,targets] = bocrProb2;
%[alphabet,targets] = bocrProb2(Jaw,Jba,Jka,Jra);
[R,Q] = size(alphabet);
[S2,Q] = size(targets);
fprintf('Strike any key to define the network...\n');
pause % Strike any key to define the network...
% DEFINING THE NETWORK
% ====================
% The character recognition network will have 10 LOGSIG
% neurons in its hidden layer.
S1 = 10;
net = newff(minmax(alphabet),[S1 S2],{'logsig' 'logsig'},'traingdx');
net.LW{2,1} = net.LW{2,1}*0.01;
net.b{2} = net.b{2}*0.01;
fprintf('Strike any key to train the network...\n');
pause % Strike any key to train the network...
% TRAINING THE NETWORK WITHOUT NOISE
% ==================================
net.performFcn = 'sse'; % Sum-Squared Error performance function
net.trainParam.goal = 0.1; % Sum-squared error goal.
net.trainParam.show = 20; % Frequency of progress displays (in epochs).
net.trainParam.epochs = 5000; % Maximum number of epochs to train.
net.trainParam.mc = 0.95; % Momentum constant.
% Training begins...please wait...
P = alphabet;
T = targets;
[net,tr] = train(net,P,T);
% ...and finally finishes.
fprintf('Strike any key to train the network with noise...\n');
pause % Strike any key to train the network with noise...
% TRAINING THE NETWORK WITH NOISE
% ===============================
% A copy of the network will now be made. This copy will
% be trained with noisy examples of letters of the alphabet.
netn = net;
netn.trainParam.goal = 0.6; % Sum-squared error goal.
netn.trainParam.epochs = 300; % Maximum number of epochs to train.
% The network will be trained on 10 sets of noisy data.
fprintf('Strike any key to begin training...\n');
pause % Strike any key to begin training...
% Training begins...please wait...
T = [targets targets targets targets];
for pass = 1:10
fprintf('Pass = %.0f\n',pass);
P = [alphabet, alphabet, ...
(alphabet + randn(R,Q)*0.1), ...
(alphabet + randn(R,Q)*0.2)];
[netn,tr] = train(netn,P,T);
echo off
end
echo on
% ...and finally finishes.
fprintf('Strike any key to finish training the network...\n');
pause % Strike any key to finish training the network...
% TRAINING THE SECOND NETWORK WITHOUT NOISE
% =========================================
% The second network is now retrained without noise to
% ensure that it correctly categorizes non-noisy letters.
netn.trainParam.goal = 0.1; % Sum-squared error goal.
netn.trainParam.epochs = 500; % Maximum number of epochs to train.
netn.trainParam.show = 5; % Frequency of progress displays (in epochs).
% Training begins...please wait...
P = alphabet;
T = targets;
[netn,tr] = train(netn,P,T);
% ...and finally finishes.
fprintf('Strike any key to test the networks...\n');
pause % Strike any key to test the networks...
% TESTING THE NETWORKS
% ====================
% SET TESTING PARAMETERS
noise_range = 0:.05:.5;
max_test = 100;
network1 = [];
network2 = [];
T = targets;
% PERFORM THE TEST
for noiselevel = noise_range
fprintf('Testing networks with noise level of %.2f.\n',noiselevel);
errors1 = 0;
errors2 = 0;
for i=1:max_test
P = alphabet + randn(R,Q)*noiselevel; % R and Q come from size(alphabet) above
% TEST NETWORK 1
A = sim(net,P);
AA = compet(A);
errors1 = errors1 + sum(sum(abs(AA-T)))/2;
% TEST NETWORK 2
An = sim(netn,P);
AAn = compet(An);
errors2 = errors2 + sum(sum(abs(AAn-T)))/2;
echo off
end
% AVERAGE ERRORS FOR max_test SETS OF Q TARGET VECTORS.
network1 = [network1 errors1/Q/max_test];
network2 = [network2 errors2/Q/max_test];
end
echo on
fprintf('Strike any key to display the test results...\n');
pause % Strike any key to display the test results...
% DISPLAY RESULTS
% ===============
% Here is a plot showing the percentage of errors for
% the two networks for varying levels of noise.
clf
plot(noise_range,network1*100,'--',noise_range,network2*100);
title('Percentage of Recognition Errors');
xlabel('Noise Level');
ylabel('Network 1 - - Network 2 ---');
% Network 1, trained without noise, has more errors due
% to noise than does Network 2, which was trained with noise.
echo off
disp('End of nnBOCR2')
// nnBOCR2.m // end of code
// vector2array.m // begin of code
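function [letterArray] = vector2array(letterVector, row, col)
% Rebuild a col-by-row image from a vector stored row by row.
% Here "row" is the number of values per image row (the width)
% and "col" is the number of image rows (the height).
letterArray = reshape(letterVector(1:row*col), row, col)';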
// vector2array.m // end of code
// recognitionTest.m // begin of code
function recognitionTest(net, row, col, alphabet, letterNum)
elements = row * col;
[letterArray] = vector2array(alphabet(:,letterNum), row, col);
imshow(letterArray)
fprintf('imshow(..) ... original letter\n');
fprintf('Strike any key to continue...\n');
pause
noisyLetter = alphabet(:,letterNum) + randn(elements,1) * 0.2;
[letterArray] = vector2array(noisyLetter, row, col);
imshow(letterArray)
%plotchar(noisyAw);
fprintf('imshow(..) ... noisy letter\n');
fprintf('Strike any key to continue...\n');
pause
A2 = sim(net,noisyLetter);
A2 = compet(A2); % winner-take-all: 1 for the best match, 0 elsewhere
answer = find(A2 == 1);
[letterArray] = vector2array(alphabet(:,answer), row, col);
fprintf('imshow(..) ... recognized letter\n');
imshow(letterArray)
disp('End of recognitionTest')
// recognitionTest.m // end of code
[::] Test Cases & Results:

original 'aw'

original 'ba'

original 'ka'

original 'ra'

noisy 'aw'

noisy 'ba'

noisy 'ka'

noisy 'ra'

training with traingdx

percentage of recognition errors
[::] Observation:
The training was successful. The system was able to recognize the 4 letters with noise.
[::] Software used:
Matlab 7
[::] References:
[1] Howard Demuth, Mark Beale, Martin Hagan. Neural Network Toolbox (For Use with MATLAB), User’s Guide Version 4.
Saturday, September 03, 2005
Segmentation Test 1
test using the segmentation code
MATLAB version 7.0.0

test image: source

test image: histogram

test image: extracted line

test image: histogram of extracted line

test image: extracted word
Sample Code: Segmentation
Below is the sample code we used for line and word segmentation, as a starting point for the OCR segmentation process.
%---------------------------------------------------------
% line segmentation
%---------------------------------------------------------
I = imread('imageFileName');
level = graythresh(I);
bwI = im2bw(I,level);
[rows, cols] = size(bwI);
% Horizontal projection: number of black (0) pixels in each image row.
rowI1 = cols - sum(bwI, 2);
plot(rowI1)
[rowL, colL] = size(rowI1);
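% Scan the row profile: a blank row followed by a row with ink starts a
% text line; a row with ink followed by a blank row ends it.
l = 1;
bool = 1; % 1 while searching for a line start, 0 while inside a line
for k = 1 : rowL-1
    if bool == 1
        if rowI1(k,1) == 0 && rowI1(k+1,1) > 0
            startI(l) = k;
            bool = 0;
        end
    else
        if rowI1(k,1) > 0 && rowI1(k+1,1) == 0
            endL(l) = k;
            bool = 1;
            l = l + 1;
        end
    end
end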
clear line
lineNo = 5; % index of the text line to extract (hard-coded for this test)
% Copy the rows of the chosen text line, keeping all columns.
line = bwI(startI(lineNo):endL(lineNo), :);
imshow(line)
%---------------------------------------------------------
% word segmentation
%---------------------------------------------------------
[lineRow, lineCol] = size(line);
% Vertical projection: number of black (0) pixels in each column of the line.
linePix = lineRow - sum(line, 1);
l = 1;
bool = 1; % 1 while searching for a word start, 0 while inside a word
for k = 1 : lineCol - 1
    if bool == 1
        if linePix(1,k) == 0 && linePix(1,k+1) > 0
            wordStart(l) = k;
            bool = 0;
        end
    else
        if linePix(1,k) > 0 && linePix(1,k+1) == 0
            wordEnd(l) = k;
            bool = 1;
            l = l + 1;
        end
    end
end
clear word
wordNo = 2; % index of the word to extract (hard-coded for this test)
% Copy the columns of the chosen word, keeping all rows of the line.
word = line(:, wordStart(wordNo):wordEnd(wordNo));
imshow(word)
%---------------------------------------------------------
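The same start/end detection can be tried on a tiny synthetic projection profile, which makes the transition logic easier to follow than on a full page image. This is a sketch that follows the convention of the code above (a start is the last blank position before ink, an end is the last inked position); the profile values are made up, and `proj`, `runStart` and `runEnd` are names chosen for this example.

```matlab
% Find runs of non-zero values in a projection profile using diff.
proj = [0 0 3 5 2 0 0 4 4 0];         % black-pixel counts per row (made up)

hasInk = double(proj > 0);            % 1 where the row contains ink
change = diff(hasInk);                % +1: blank -> ink, -1: ink -> blank

runStart = find(change == 1);         % index of the blank just before each run
runEnd   = find(change == -1);        % index of the last inked position

disp([runStart; runEnd]);             % one column per detected line or word
```

Like the loop version, this only finds runs that have a blank position on both sides, so text touching the image border would need extra handling.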




