In a VB.NET application, I need to read the text content from a specific area of a particular page in PDF files.
I tried the sample below, but it does not work correctly.
My problem is finding the exact coordinates. I tried the PdfReaderDemo sample, but the coordinates and the resolution do not correspond well.
Any suggestions on how to read text from a particular area of a PDF page?
Thanks.
Code:
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim x1 As Int16, x2 As Int16, y1 As Int16, y2 As Int16
    Dim vTesto As String = ""
    Dim vArea As SizeF
    Dim vRect As RectangleF
    Try
        _fileStream = New FileStream(edPdf1.Text, FileMode.Open, FileAccess.Read)
        _document = PdfDocumentController.OpenDocument(_fileStream)
        ' Text of the whole first page
        vTesto = _document.Pages(0).TextRegion.TextContent
        edTxt1.Text = vTesto
        ' Page size in pixels at the page's default resolution
        vArea = _document.Pages(0).GetPageSizeInPixels(_document.Pages(0).DefaultResolution)
        ' Corner coordinates of the area, typed in by the user
        x1 = Convert.ToInt16(edX1.Text)
        x2 = Convert.ToInt16(edX2.Text)
        y1 = Convert.ToInt16(edY1.Text)
        y2 = Convert.ToInt16(edY2.Text)
        If x1 <> 0 Or x2 <> 0 Or y1 <> 0 Or y2 <> 0 Then
            ' RectangleF takes (x, y, width, height), not two corner points
            vRect = New RectangleF(x1, y1, x2 - x1, y2 - y1)
            vTesto = _document.Pages(0).TextRegion.GetSubregion(vRect).TextContent
            ' Show the text of the selected area
            edTxt1.Text = vTesto
        End If
    Catch ex As Exception
        ' Report the error instead of silently swallowing it
        MessageBox.Show(ex.Message)
    End Try
End Sub
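
One possible cause of the mismatch between coordinates and resolution is a difference in units. PDF user space is normally measured in points (1/72 inch) with the origin at the bottom-left of the page, while on-screen coordinates are in pixels with the origin at the top-left. The following is only a sketch of that conversion. It assumes (this is not confirmed for this library) that GetSubregion expects a rectangle in points with a bottom-left origin; the helper PixelCornersToPoints and its parameters are hypothetical names, not part of any library.

```vbnet
Imports System
Imports System.Drawing

Module PdfCoordinateSketch
    ' PDF user space uses points: 72 points per inch.
    Private Const PointsPerInch As Single = 72.0F

    ' Hypothetical helper: converts a rectangle given by two pixel corners
    ' (top-left origin) into a RectangleF in points (bottom-left origin).
    ' dpi is the resolution the pixel coordinates were measured at,
    ' pageHeightPx is the page height in pixels at that same resolution.
    Function PixelCornersToPoints(x1 As Single, y1 As Single,
                                  x2 As Single, y2 As Single,
                                  dpi As Single, pageHeightPx As Single) As RectangleF
        Dim factor As Single = PointsPerInch / dpi
        Dim xPt As Single = Math.Min(x1, x2) * factor
        Dim widthPt As Single = Math.Abs(x2 - x1) * factor
        Dim heightPt As Single = Math.Abs(y2 - y1) * factor
        ' Flip the Y axis: the lower pixel edge becomes the rectangle's Y in points.
        Dim yPt As Single = (pageHeightPx - Math.Max(y1, y2)) * factor
        Return New RectangleF(xPt, yPt, widthPt, heightPt)
    End Function
End Module
```

If the library turns out to use a top-left origin, the Y flip should be dropped and only the pixel-to-point scaling kept.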