Debenu Quick PDF Library - PDF SDK Community Forum Homepage
Having difficulty with steps to extract text

Author: sdumont (Beginner, joined 29 Nov 18, Buffalo, NY, US)
Posted: 29 Nov 18 at 4:49PM

I am trying to extract the text from a sample document I created with just a couple of plainly typed sentences, but every time I execute my code I get an empty string.

I think I'm missing a step, but it's difficult to tell exactly what order the steps should be completed in, and much of the documentation refers to functions which don't exist in my version.

I am using the .dll file loaded into a VB.NET desktop application, and the version seems to be 7.xx. Here's the code I'm using (when I create a new PDFBuilder, it creates a PDFLibrary and unlocks it):
 

Dim pdf As String = "C:\Users\sdumont\Desktop\testpdf.pdf"
Dim pdftester As New PDFBuilder()
Dim result As String = ""

Try
    Select Case pdftester.QPD.LoadFromFile(pdf)
        Case 1
            MsgBox("The file was loaded successfully!")
        Case 0
            MsgBox("The file could not be read or processed")
            MsgBox(pdftester.QPD.LastErrorCode)
    End Select

    Select Case pdftester.QPD.SelectPage(1)
        Case 1
            MsgBox("The page was selected successfully!")
        Case 0
            MsgBox("The page could not be found")
            MsgBox(pdftester.QPD.LastErrorCode)
    End Select

    pdftester.QPD.SetOrigin(1)
    MsgBox(pdftester.QPD.GetPageText(7))
Catch ex As Exception
    MsgBox(ex.Message)
End Try

Author: Ingo (Moderator Group, joined 29 Oct 05)
Posted: 29 Nov 18 at 7:26PM
Hi S,

In my old archive I've found version 7.26 ;-)
In this old version, LoadFromFile works without entering a password... okay.
In this old version, GetPageText doesn't offer option 7. Option 6 is the last one, so this could be the problem.
Checking LastErrorCode after GetPageText would help ;-)
Calling Decrypt after LoadFromFile will help, too.
Are you aware that your version of the library doesn't support the current PDF specifications?
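
The two checks suggested above (stay within the option range the old version offers, and read LastErrorCode right after GetPageText) could be sketched roughly like this. It is only a sketch under assumptions: IPageTextSource and FakeSource are hypothetical stand-ins introduced here so the example runs without the library, and the upper limit of 6 comes from this thread, not from the reference guide.

```vbnet
Imports System

' Hypothetical wrapper around the two library calls discussed in this thread.
Public Interface IPageTextSource
    Function GetPageText(extractOptions As Integer) As String
    Function LastErrorCode() As Integer
End Interface

Public Module TextExtraction
    ' Highest GetPageText option in 7.2x according to this thread (assumption).
    Public Const MaxOptionV7 As Integer = 6

    ' Falls back to option 0 when the requested option is out of range and
    ' hands back the error code read immediately after the call.
    Public Function ExtractPageText(source As IPageTextSource,
                                    requestedOption As Integer,
                                    ByRef errorCode As Integer) As String
        Dim opt As Integer = requestedOption
        If opt < 0 OrElse opt > MaxOptionV7 Then opt = 0
        Dim text As String = source.GetPageText(opt)
        errorCode = source.LastErrorCode()
        Return If(text, "")
    End Function
End Module

' Fake source for a dry run; it imitates a library that only knows options 0 to 6.
Public Class FakeSource
    Implements IPageTextSource

    Public Function GetPageText(extractOptions As Integer) As String _
        Implements IPageTextSource.GetPageText
        If extractOptions >= 0 AndAlso extractOptions <= 6 Then
            Return "text via option " & extractOptions.ToString()
        End If
        Return ""
    End Function

    Public Function LastErrorCode() As Integer _
        Implements IPageTextSource.LastErrorCode
        Return 0
    End Function
End Class

Module Program
    Sub Main()
        Dim code As Integer
        ' Option 7 is requested, but the helper falls back to option 0.
        Console.WriteLine(ExtractPageText(New FakeSource(), 7, code))
        Console.WriteLine("LastErrorCode: " & code.ToString())
    End Sub
End Module
```

In a real project, a small class implementing IPageTextSource would forward both calls to pdftester.QPD.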

Cheers and welcome here,
Ingo



Edited by Ingo - 29 Nov 18 at 7:27PM

Author: sdumont
Posted: 29 Nov 18 at 7:49PM
Thanks for the welcome. I'm glad to see there is documentation and a community for this code even all these years later.

I've been spending some more time on this since I posted and have a few updates to report.

The version I have reports that it is 7.25.
I called GetPageText with option 0 and it worked! At least for basic text on a plain white background. It doesn't produce any results for a more complicated PDF, however.
I had put a LastErrorCode call after GetPageText, but it came back as 0, which isn't in the LastErrorCode documentation. I assumed it meant no error.

Could you clarify what you meant with your last point? Does this version lack some text extraction features which later versions offer? I noticed some functions have been added since then, but I wonder whether any of the missing features are essential to this task.

Author: Ingo
Posted: 29 Nov 18 at 8:10PM
Hi S,

If you're working with newer PDF documents, they can be encrypted with AES 256, and this standard isn't supported by library version 7.25.
For YOUR GetPageText you can use options 0 through 6.
If you're missing a developer and reference guide, I can send you the ones belonging to version 7.26.
If you want them, I can send the PDFs to the email address you entered here in the forum.
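
For documents that return nothing, one thing that can be checked without the library is whether the file is encrypted at all. Here is a rough sketch, assuming the usual layout where an encrypted PDF carries an /Encrypt entry in its trailer. Scanning the raw bytes for that name is only a heuristic: it cannot tell AES 256 from older schemes, and it could misfire if the same bytes appear elsewhere in the file. LooksEncrypted is a helper name made up for this example, not a library function.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Public Module EncryptionCheck
    ' Heuristic check: looks for the /Encrypt name anywhere in the raw file bytes.
    Public Function LooksEncrypted(pdfBytes As Byte()) As Boolean
        ' ISO-8859-1 maps every byte to exactly one character,
        ' so binary content survives the conversion to a string.
        Dim raw As String = Encoding.GetEncoding("ISO-8859-1").GetString(pdfBytes)
        Return raw.Contains("/Encrypt")
    End Function

    ' Convenience overload that reads the file from disk first.
    Public Function LooksEncrypted(fileName As String) As Boolean
        Return LooksEncrypted(File.ReadAllBytes(fileName))
    End Function
End Module

Module Program
    Sub Main()
        ' Hand-written trailer fragments, used here instead of a real PDF file.
        Dim encrypted As Byte() = Encoding.ASCII.GetBytes("trailer << /Root 1 0 R /Encrypt 5 0 R >>")
        Dim plain As Byte() = Encoding.ASCII.GetBytes("trailer << /Root 1 0 R >>")
        Console.WriteLine(LooksEncrypted(encrypted))
        Console.WriteLine(LooksEncrypted(plain))
    End Sub
End Module
```

If the check reports an encrypted file and extraction still returns an empty string, unsupported encryption in the old library version is a plausible suspect.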

Cheers,
Ingo

Author: sdumont
Posted: 29 Nov 18 at 8:42PM

Sure, I'll take copies of those documents. There's always a chance that they will help.

Thanks for your help with troubleshooting this. I think I understand what I can and can't do at this point.

Author: Ingo
Posted: 29 Nov 18 at 11:06PM
You've got it... now ;-)

Cheers,
Ingo
