VintaSoft OCR .NET Plug-in - Development History

Product Release Notes

This page describes the development history of the VintaSoft OCR .NET Plug-in. The history of the Plug-in API is available in the documentation.

  • Added support for .NET 8.0 in Windows, Linux and macOS.
  • The Tesseract OCR engine used by the plug-in has been updated to version 5.3.3.
  • All text blocks returned by the image segmentation command are now marked as blocks of RecognizeSingleColumn type. Previously, these blocks were marked as blocks of RecognizeSingleBlocks type. This change improved the quality of text recognition for complex text without reducing the overall performance of text recognition.
  • .NET development:
    • The Tesseract OCR engine used by the plug-in has been updated to version 5.3.
  • .NET development:
    • Added support for .NET 7 in Windows, Linux and macOS.
    • Added the ability to recognize text in Linux.
    • Improved the algorithm that detects regions of recognized symbols.
    • Demo applications:
      • OCR Demo: Added the ability to create a searchable PDF document in text-over-image mode.
    • Fixed several minor bugs.
  • .NET development
    • Supported platforms:
      • Added support for .NET 6 on Windows.
    • Supported development environments:
      • Added compatibility with Visual Studio 2022.
    • Supported operating systems:
      • Added compatibility with Windows 11.
      • Discontinued compatibility with Windows Server 2003.
    • The Tesseract OCR engine used by the plug-in has been updated to version 5.0. Our tests have shown that Tesseract OCR 5 and Tesseract OCR 4 provide similar text recognition results, but Tesseract OCR 5 is up to 2 times faster than Tesseract OCR 4.
    • Added the ability to convert an OcrPage object to a TextRegion object (OcrDocument.Create and OcrPage.Create methods).
    • Demo applications:
      • Added new functionality to the OCR Demo application:
        • Added the ability to load OCR results as text from a PDF document.
    • Fixed several minor bugs.
  • Web development
    • Demo applications:
      • Improved the code of the ASP.NET OCR Demo (ASP.NET Core Angular OCR Demo, ASP.NET MVC OCR Demo, ASP.NET WebForms OCR Demo). The demo now allows you to:
        • view document before text recognition
        • preprocess document pages before text recognition
        • recognize text from the whole document, separate page or page region.
    • Fixed several minor bugs in the OCR web service.
  • .NET development
    • Supported platforms:
      • Added support for .NET 5 on Windows.
  • .NET development
    • Supported platforms:
      • Added support (without UI controls) for .NET Core 3 on Windows.
        Created the following .NET Core assemblies:
        • Vintasoft.Imaging.Ocr.dll
        • Vintasoft.Imaging.Ocr.Tesseract.dll
      • Discontinued support for .NET Framework 2.0. The SDK now supports .NET Framework 3.5 and 4+.
    • The Tesseract OCR engine used by the plug-in has been updated to version 4.1.0.
  • The Tesseract OCR engine used by the plug-in has been updated to version 4.0:
    • improved the quality and performance of text recognition
    • extended the list of supported languages for recognition
  • Added the ability to recognize text in several languages using the functionality of the Tesseract OCR engine. Previous versions recognized text in several languages using the SDK's own functionality.
  • Added to the OCR Demo application the ability to select several languages for text recognition.
  • The Tesseract OCR engine used by the plug-in has been updated to version 3.04:
    • improved the quality of text recognition
    • extended the list of supported languages for recognition
  • Added the ability to use the Tesseract OCR engine in a multithreaded environment.
  • Improved the quality of text recognition in color images.
  • Reduced peak memory usage during recognition of text in color images.
  • Added the ability to import and export the tree of recognition results in hOCR format.
  • Many minor fixes and improvements.
  • Added the ability to specify the orthogonal rotation of a text region before text recognition. In previous versions, all text was recognized as non-rotated.
  • OCR Demo now can create searchable PDF documents with MRC compression.
  • Some minor fixes.
  • Improved the code of the OCR Demo application.
  • Assemblies were renamed and the structure of namespaces was changed.
  • The Tesseract OCR engine used by the plug-in has been updated to version 3.02:
    • Improved OCR quality.
    • New supported languages: Afrikaans, Albanian, Azerbaijani, Belarusian, Bengali, Estonian, Basque, Frankish, Galician, Croatian, Icelandic, Malayalam, Macedonian, Maltese, Malay, Swahili, Tamil, Telugu.
  • Some minor fixes.
  • Base OCR .NET interface created (Vintasoft.Ocr.dll):
    • The ability to recognize text in an image or image collection.
    • The ability to recognize text in a region of interest of an image.
    • The ability to receive recognition progress.
    • The ability to apply image segmentation before optical character recognition starts and to set the recognition parameters for each image region.
    • The ability to obtain the recognition result as a hierarchy: Document, Page, Region, Paragraph, Line, Symbol.
    • The ability to browse through the recognition result.
    • The ability to edit the recognition result.
    • The ability to save the recognition result into a text (TXT) document.
  • Tesseract OCR interface created (Vintasoft.Ocr.Tesseract.dll):
    • The interface provides access to the Tesseract OCR engine.
    • The ability to recognize text in an image.
    • The ability to recognize text in a region of interest of an image.
    • Supported languages: Arabic, Bulgarian, Catalan, Czech, Cherokee, ChineseSimplified, ChineseTraditional, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovakian, Slovenian, Spanish, Swedish, Tagalog, Thai, Turkish, Ukrainian, Vietnamese.
    • The ability to receive recognition progress.
    • The ability to get/set Tesseract OCR variables.
    • The ability to use user-defined dictionaries.
  • Searchable PDF generation interface created (Vintasoft.Pdf.Ocr.dll):
    • The ability to save the recognition result into a searchable PDF document as text.
    • The ability to save the recognition result into a searchable PDF document as a rasterized image with hidden text.