Using Rectangle Annotation To Read text from PDF
Posted: Fri Feb 28, 2020 3:18 am
Hi,
I am evaluating the VintaSoft libraries (at the moment, the annotation features for PDF). Our goal is to let users mark text in a PDF with rectangle annotations and then read the marked text from those rectangles. The first code listing below is the method that I found in the documentation pages.
The second code listing below shows how I am calling this method.
There, _annotationViewer.AnnotationDataCollection[0] is the rectangle annotation that I use to mark text in the PDF. The method always returns text that lies below and to the left of the text I want to read. Does anyone know what I am missing here? Thanks.
Code:
public string GetRegionTextPage(
    Vintasoft.Imaging.Pdf.Tree.PdfPage page,
    Vintasoft.Imaging.UI.ImageViewer imageViewer,
    System.Drawing.Rectangle selectedRegion)
{
    // convert the rectangle from the control coordinates to the image coordinates
    System.Drawing.RectangleF imageCoordinateSystemRectangle =
        imageViewer.RectangleToImage(selectedRegion);

    // get left-top point of the rectangle
    System.Drawing.PointF pdfPageCoordinateSystemPoint1 =
        imageCoordinateSystemRectangle.Location;
    // get right-bottom point of the rectangle
    System.Drawing.PointF pdfPageCoordinateSystemPoint2 =
        new System.Drawing.PointF(
            imageCoordinateSystemRectangle.Right,
            imageCoordinateSystemRectangle.Bottom);

    // get resolution of the image
    Vintasoft.Imaging.Resolution resolution = imageViewer.Image.Resolution;

    // convert points from the image coordinate space to the page coordinate space
    page.PointToUnit(ref pdfPageCoordinateSystemPoint1, resolution);
    page.PointToUnit(ref pdfPageCoordinateSystemPoint2, resolution);

    // create rectangle in the page's coordinate space
    System.Drawing.RectangleF rectangle = new System.Drawing.RectangleF(
        new System.Drawing.PointF(
            pdfPageCoordinateSystemPoint1.X,
            pdfPageCoordinateSystemPoint1.Y),
        new System.Drawing.SizeF(
            pdfPageCoordinateSystemPoint2.X - pdfPageCoordinateSystemPoint1.X,
            pdfPageCoordinateSystemPoint2.Y - pdfPageCoordinateSystemPoint1.Y));

    // get text region of the page
    Vintasoft.Imaging.Text.TextRegion textRegion = page.TextRegion.GetSubregion(
        rectangle,
        Vintasoft.Imaging.Text.TextSelectionMode.Rectangle);

    string textContent = string.Empty;
    // if text region is found
    if (textRegion != null)
        textContent = textRegion.TextContent;
    return textContent;
}
Code:
// the first annotation in the collection is the rectangle drawn over the text
var annotation = _annotationViewer.AnnotationDataCollection[0];
GetRegionTextPage(
    page,
    _annotationViewer,
    new Rectangle(
        new Point((int)annotation.Location.X, (int)annotation.Location.Y),
        new Size((int)annotation.Size.Width, (int)annotation.Size.Height)));
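
For reference, the coordinate arithmetic behind this kind of conversion can be sketched without any library. The method above expects selectedRegion in control (viewer) coordinates, while the call passes the annotation's own Location and Size, so it may be worth checking which coordinate space and which anchor point those values use. The helpers below are hypothetical and are not part of the VintaSoft API. They only illustrate three things that commonly cause an offset: a location that denotes the center rather than the top-left corner (an assumption to verify against the VintaSoft documentation), a difference in units per inch, and a Y axis that runs in the opposite direction.

```csharp
using System;

// Hypothetical helpers, not part of VintaSoft: plain arithmetic only.
public static class RegionMath
{
    // Builds a (left, top, width, height) rectangle from a center point and a size.
    // Assumption: the input location is the CENTER of the rectangle.
    public static (float Left, float Top, float Width, float Height) FromCenter(
        float centerX, float centerY, float width, float height)
    {
        return (centerX - width / 2f, centerY - height / 2f, width, height);
    }

    // Converts a length between two unit systems given as units per inch,
    // for example 96 (device-independent pixels) to 72 (PDF points).
    public static float ConvertUnits(float value, float fromUnitsPerInch, float toUnitsPerInch)
    {
        return value * toUnitsPerInch / fromUnitsPerInch;
    }

    // Converts a Y coordinate measured from the top edge of a page
    // into one measured from the bottom edge (same units for both arguments).
    public static float FlipY(float yFromTop, float pageHeight)
    {
        return pageHeight - yFromTop;
    }
}
```

With these helpers, RegionMath.FromCenter(200f, 100f, 80f, 40f) gives a rectangle whose top-left corner is (160, 80). If a center point were passed as a top-left corner instead, the region would shift by half the annotation's width and height, which is the kind of constant offset described above.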