Tuesday 26 May 2015

Getting better OCR results from mazes or puzzles - Method 1.

The ocr() function in MATLAB supports a few helpful options.
The following three (roi, TextLayout and CharacterSet) can improve OCR accuracy when reading characters from puzzles, or from any image where you have a rough idea of the spacing between characters.

Step 1: Set up a grid of regions of interest (roi).
I am working on a Sudoku-solving program, so I know the size of the Sudoku I am feeding it. This gives me a rough idea of the dimensions of each small cell. Or you can find them in a less rigid way:
1. Find the size of the processed and cropped puzzle image.
2. Divide the rows and columns by the Sudoku dimensions.

Moreover, your region of interest is not the complete cell. The character of interest lies in the middle of the cell, so adjust the rectangle dimensions as necessary and set up an array with the xmin, ymin, width and height of each rectangle.

[row, column] = size(out);      % out is my image-processed array
r_round = round(row/9);         % approximate cell height
c_round = round(column/9);      % approximate cell width
roi = zeros(81, 4);             % one [xmin ymin width height] row per cell
u = 1;
for i = 0:8                     % column index
    for j = 0:8                 % row index
        % The +20 and +35 offsets shift the corner from the theoretical cell
        % origin toward the central part of the cell, where the digit lies.
        xinit = i*c_round + 20;
        yinit = j*r_round + 35;
        roi(u,1:4) = [xinit yinit 125 125];
        u = u + 1;
    end
end
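
The fixed 125-pixel box and hand-tuned offsets above suit one particular image size. As a rough sketch of an alternative, the box can be derived from the cell size itself. The image size and the 15% margin here are assumptions chosen for illustration, not values from my program; in practice imgSize would come from size(out).

```matlab
imgSize = [1620 1620];          % hypothetical [rows cols] of the cropped puzzle
n = 9;                          % Sudoku dimension
margin = 0.15;                  % assumed fraction of a cell trimmed from each side

cellH = imgSize(1) / n;         % cell height in pixels
cellW = imgSize(2) / n;         % cell width in pixels
mx = round(margin * cellW);     % horizontal margin in pixels
my = round(margin * cellH);     % vertical margin in pixels

roi = zeros(n*n, 4);            % one [xmin ymin width height] row per cell
u = 1;
for i = 0:n-1                   % column index
    for j = 0:n-1               % row index
        x = floor(i*cellW) + mx + 1;   % +1 because pixel coordinates start at 1
        y = floor(j*cellH) + my + 1;
        w = floor(cellW) - 2*mx;
        h = floor(cellH) - 2*my;
        roi(u, :) = [x y w h];
        u = u + 1;
    end
end
```

Because the width and height shrink by the same margin on both sides, every rectangle stays inside the image, which ocr() requires of its roi input.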

Step 2: This is probably the final step, and it is fairly simple. Just call the built-in ocr() function and provide the parameters correctly.

ocrtext=ocr(out,roi,'TextLayout','Block','CharacterSet','123456789');

% Here the 'TextLayout' option specifies the layout of the characters to be read. I have set it to
% 'Block' as my characters do not share a continuous background and, moreover, they look like blocks.
% Check the MATLAB documentation for other values. 'CharacterSet' specifies which characters
% are considered for matching. I have put 1 through 9 as my Sudoku puzzle has only digits.

----------------------------------------------------------------------------------------------------------------------
ocrtext is an array of ocrText objects. Each object has properties such as Text, Words and WordConfidences, among others.
ocrtext will contain 81 ocrText objects, each corresponding to one cell of the Sudoku puzzle.
Store the recognized values in an array and use it for further Sudoku solving.
Be careful about cell arrays and ordinary arrays: the Words property, for example, is a cell array. Know the difference between them and convert cells to matrices with cell2mat() where needed.
Refer to the MATLAB documentation by typing "doc" or "doc ocr" (without quotes) in the Command Window.
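
As a minimal sketch of that last storing step, the 81 recognized strings can be turned into a 9x9 numeric matrix. To keep it runnable without an image, the texts variable below is a made-up stand-in for the Text values returned by ocr(); with real results it could probably be built as texts = {ocrtext.Text}. The names cellValues and board are mine, and treating unreadable or empty cells as 0 is an assumption.

```matlab
% Stand-in for the recognized text of each of the 81 ROIs (hypothetical data).
texts = repmat({''}, 81, 1);    % empty string = nothing recognized in that cell
texts{1}  = sprintf('5\n\n');   % recognized text may carry trailing newlines
texts{2}  = '3';
texts{10} = ' 6 ';

cellValues = zeros(81, 1);      % 0 marks an empty or unreadable cell
for k = 1:81
    d = str2double(strtrim(texts{k}));   % NaN if the text is not a number
    if ~isnan(d) && d >= 1 && d <= 9
        cellValues(k) = d;
    end
end

% The ROI loop in Step 1 walks down each column before moving to the next,
% which matches MATLAB's column-major order, so reshape gives board(row, col).
board = reshape(cellValues, 9, 9);
```

With this ordering, board(2,1) is the cell directly below the top-left corner and board(1,2) is the cell to its right.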
