OCR: How to recognize MICR E-13B characters from an image in .NET
VintaSoft Imaging .NET SDK with VintaSoft OCR .NET Plug-in can recognize text in images using the Tesseract OCR engine. The dictionaries created for the Tesseract OCR engine make text recognition possible in more than 100 languages.
Several authors on the Internet offer free dictionaries for recognizing MICR E-13B symbols with the Tesseract OCR engine.
We have tested some of them and found that the "e13b.traineddata" dictionary recognizes MICR E-13B symbols well, with quality comparable to that of professional MICR E-13B recognizers.
The "e13b.traineddata" dictionary is offered under the BSD-3 license, which allows free use and redistribution of this file.
We make the "e13b.traineddata" dictionary available for download from our site; the dictionary can also be downloaded from other Internet resources.
The "e13b.traineddata" dictionary has been included in the list of supported dictionaries (the MICR item of the Vintasoft.Imaging.Ocr.OcrLanguage enumeration) since version 11.0.5.1 of VintaSoft OCR .NET Plug-in.
More detailed information about MICR symbols is available on Wikipedia:
https://en.wikipedia.org/wiki/Magnetic_ink_character_recognition
Here is an image from Wikipedia that shows the MICR E-13B symbols:
Here is C#/VB.NET code that shows how to recognize MICR E-13B symbols in an image using the Tesseract OCR engine:
/// <summary>
/// Recognizes MICR E-13B characters from image using Tesseract OCR engine.
/// </summary>
/// <param name="filename">The name of file, which stores images with MICR E-13B characters.</param>
public static void RecognizeMicrE13BCharactersUsingTesseractOCR(string filename)
{
    // create an image collection
    using (Vintasoft.Imaging.ImageCollection images =
        new Vintasoft.Imaging.ImageCollection())
    {
        // add images from file to the image collection
        images.Add(filename);
        System.Console.WriteLine("Create Tesseract OCR engine...");
        // create the Tesseract OCR engine
        using (Vintasoft.Imaging.Ocr.Tesseract.TesseractOcr tesseractOcr =
            new Vintasoft.Imaging.Ocr.Tesseract.TesseractOcr())
        {
            System.Console.WriteLine("Initialize OCR engine...");
            // init the Tesseract OCR engine for recognition of MICR E-13B characters
            tesseractOcr.Init(
                new Vintasoft.Imaging.Ocr.OcrEngineSettings(
                    Vintasoft.Imaging.Ocr.OcrLanguage.MICR));
            // for each image in image collection
            foreach (Vintasoft.Imaging.VintasoftImage image in images)
            {
                System.Console.WriteLine("Recognize the image...");
                // recognize text in image
                Vintasoft.Imaging.Ocr.Results.OcrPage ocrResult =
                    tesseractOcr.Recognize(image);
                // output the recognized text
                System.Console.WriteLine("Page Text:");
                System.Console.WriteLine(ocrResult.GetText());
                System.Console.WriteLine();
            }
            // shutdown the Tesseract OCR engine
            tesseractOcr.Shutdown();
        }
        // free images
        images.ClearAndDisposeItems();
    }
}
''' <summary>
''' Recognizes MICR E-13B characters from image using Tesseract OCR engine.
''' </summary>
''' <param name="filename">The name of file, which stores images with MICR E-13B characters.</param>
Public Shared Sub RecognizeMicrE13BCharactersUsingTesseractOCR(filename As String)
    ' create an image collection
    Using images As New Vintasoft.Imaging.ImageCollection()
        ' add images from file to the image collection
        images.Add(filename)
        System.Console.WriteLine("Create Tesseract OCR engine...")
        ' create the Tesseract OCR engine
        Using tesseractOcr As New Vintasoft.Imaging.Ocr.Tesseract.TesseractOcr()
            System.Console.WriteLine("Initialize OCR engine...")
            ' init the Tesseract OCR engine for recognition of MICR E-13B characters
            tesseractOcr.Init(New Vintasoft.Imaging.Ocr.OcrEngineSettings(Vintasoft.Imaging.Ocr.OcrLanguage.MICR))
            ' for each image in image collection
            For Each image As Vintasoft.Imaging.VintasoftImage In images
                System.Console.WriteLine("Recognize the image...")
                ' recognize text in image
                Dim ocrResult As Vintasoft.Imaging.Ocr.Results.OcrPage = tesseractOcr.Recognize(image)
                ' output the recognized text
                System.Console.WriteLine("Page Text:")
                System.Console.WriteLine(ocrResult.GetText())
                System.Console.WriteLine()
            Next
            ' shutdown the Tesseract OCR engine
            tesseractOcr.Shutdown()
        End Using
        ' free images
        images.ClearAndDisposeItems()
    End Using
End Sub
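The method above can be called from a console application roughly as follows. This is only a minimal sketch: the MicrRecognitionDemo class, its Run method and the default file name "check.png" are hypothetical names introduced for illustration and are not part of VintaSoft Imaging .NET SDK. The recognition method is passed in as a delegate so that the sketch compiles without referencing the SDK.

```csharp
using System;
using System.IO;

/// <summary>
/// Hypothetical console helper (not part of the VintaSoft SDK) that checks
/// the input file and then invokes a recognition method.
/// </summary>
public static class MicrRecognitionDemo
{
    /// <summary>
    /// Runs recognition for the file given in the command-line arguments.
    /// Returns 0 if the recognition method was called, 1 if the file was not found.
    /// </summary>
    public static int Run(string[] args, Action<string> recognize)
    {
        if (recognize == null)
            throw new ArgumentNullException("recognize");

        // "check.png" is an assumed default file name, used for illustration only
        string filename = (args != null && args.Length > 0) ? args[0] : "check.png";

        // report a missing file before the OCR engine is created
        if (!File.Exists(filename))
        {
            Console.Error.WriteLine("Image file is not found: " + filename);
            return 1;
        }

        // call the recognition method, for example
        // RecognizeMicrE13BCharactersUsingTesseractOCR from the code above
        recognize(filename);
        return 0;
    }
}
```

Assuming the Main method is declared in the same class as RecognizeMicrE13BCharactersUsingTesseractOCR, it can return MicrRecognitionDemo.Run(args, RecognizeMicrE13BCharactersUsingTesseractOCR).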