VintaSoft OCR .NET Plug-in - Development History

Product Release Notes

This page describes the development history of the VintaSoft OCR .NET Plug-in. The history of the Plug-in API is available in the documentation.

  • .NET development:
    • Updated the Tesseract OCR engine to version 5.3.
  • .NET development:
    • Added support for .NET 7 on Windows, Linux and macOS.
    • Added the ability to recognize text on Linux.
    • Improved the algorithm that detects regions of recognized symbols.
    • Demo applications:
      • OCR Demo: Added the ability to create a searchable PDF document in text-over-image mode.
    • Fixed several minor bugs.
  • .NET development
    • Supported platforms:
      • Added support for .NET 6 on Windows.
    • Supported development environments:
      • Added compatibility with Visual Studio 2022.
    • Supported operating systems:
      • Added compatibility with Windows 11.
      • Discontinued compatibility with Windows Server 2003.
    • Updated the Tesseract OCR engine to version 5.0. Our tests have shown that Tesseract OCR 5 and Tesseract OCR 4 provide similar text recognition results, but Tesseract OCR 5 is up to 2 times faster than Tesseract OCR 4.
    • Added the ability to convert an OcrPage object to a TextRegion object (OcrDocument.Create and OcrPage.Create methods).
    • Demo applications:
      • Added new functionality to the OCR Demo application:
        • Added the ability to load OCR results as text from a PDF document.
    • Fixed several minor bugs.
  • Web development
    • Demo applications:
      • Improved the code of the ASP.NET OCR Demo (ASP.NET Core Angular OCR Demo, ASP.NET MVC OCR Demo, ASP.NET WebForms OCR Demo). The demo now allows you to:
        • view the document before text recognition
        • preprocess document pages before text recognition
        • recognize text from the whole document, a single page, or a page region.
    • Fixed several minor bugs in the OCR web service.
  • .NET development
    • Supported platforms:
      • Added support for .NET 5 on Windows.
  • .NET development
    • Supported platforms:
      • Added support (without UI controls) for .NET Core 3 on Windows.
        The following .NET Core assemblies were created:
        • Vintasoft.Imaging.Ocr.dll
        • Vintasoft.Imaging.Ocr.Tesseract.dll
      • Discontinued support for .NET Framework 2.0. The SDK now supports .NET Framework 3.5 and 4+.
    • Updated the Tesseract OCR engine to version 4.1.0.
  • Updated the Tesseract OCR engine to version 4.0:
    • improved the quality and performance of text recognition
    • extended the list of supported languages for recognition
  • Added the ability to recognize text in several languages using the functionality of the Tesseract OCR engine. In previous versions, text in several languages was recognized using the SDK functionality.
  • Added the ability to select several languages for text recognition in the OCR Demo application.
  • Updated the Tesseract OCR engine to version 3.04:
    • improved the quality of text recognition
    • extended the list of supported languages for recognition
  • Added the ability to use the Tesseract OCR engine in a multithreaded environment.
  • Improved the quality of text recognition in color images.
  • Reduced peak memory usage during text recognition in color images.
  • Added the ability to import and export the tree of recognition results in hOCR format.
  • Many minor fixes and improvements.
  • Added the ability to specify the orthogonal rotation of a text region before text recognition. In previous versions, all text was recognized as non-rotated.
  • OCR Demo now can create searchable PDF documents with MRC compression.
  • Some minor fixes.
  • Improved the code of the OCR Demo application.
  • Renamed the assemblies and changed the structure of namespaces.
  • Updated the Tesseract OCR engine to version 3.02:
    • Improved OCR quality.
    • New supported languages: Afrikaans, Albanian, Azerbaijani, Belarusian, Bengali, Estonian, Basque, Frankish, Galician, Croatian, Icelandic, Malayalam, Macedonian, Maltese, Malay, Swahili, Tamil, Telugu.
  • Some minor fixes.
  • Base OCR .NET interface created (Vintasoft.Ocr.dll):
    • The ability to recognize text in an image or an image collection.
    • The ability to recognize text in a region of interest of an image.
    • The ability to receive the progress of recognition.
    • The ability to apply image segmentation before optical character recognition starts and to set the recognition parameters for each image region.
    • The ability to obtain the recognition result in a hierarchy: Document, Page, Region, Paragraph, Line, Symbol.
    • The ability to browse through the recognition result.
    • The ability to edit the recognition result.
    • The ability to save the recognition result into a text (TXT) document.
  • Tesseract OCR interface created (Vintasoft.Ocr.Tesseract.dll):
    • The interface provides access to the Tesseract OCR engine.
    • The ability to recognize text in an image.
    • The ability to recognize text in a region of interest of an image.
    • Supported languages: Arabic, Bulgarian, Catalan, Czech, Cherokee, ChineseSimplified, ChineseTraditional, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovakian, Slovenian, Spanish, Swedish, Tagalog, Thai, Turkish, Ukrainian, Vietnamese.
    • The ability to receive the progress of recognition.
    • The ability to get/set Tesseract OCR variables.
    • The ability to use user-defined dictionaries.
  • Searchable PDF generation interface created (Vintasoft.Pdf.Ocr.dll):
    • The ability to save the recognition result into a searchable PDF document as text.
    • The ability to save the recognition result into a searchable PDF document as a rasterized image with hidden text.