ExtractFilePageText hangs on many PDFs
Franciscus | Beginner | Joined: 17 Dec 19 | Posted: 17 Dec 19 at 10:45AM
I am batch-processing 24 TB of PDFs (1.6 million files) using v. 16.12. The problem is that all the functions I have used so far sometimes hang indefinitely on legitimate PDFs, making Quick PDF Library unusable. I have purchased a Delphi Source license, but no sources were supplied, so I am unable to pinpoint the bug or fix it myself.

ExtractFilePageText hangs on many PDFs, including this one: http://fdg.am/UNTITLED1029.pdf

So this hangs forever:

s := PDFLibrary.ExtractFilePageText('UNTITLED1029.pdf', '', 1, 0);

I don't expect anyone to have a solution, but who knows. Thanks for your suggestions (a workaround would be greatly appreciated).

Also, what does the Delphi source code license entail? The actual sources for this function, in Delphi?
Franciscus | Beginner | Joined: 17 Dec 19
FIXED! The problem was that the PDFs were encrypted.

if PDFLibrary.EncryptionStatus > 0 then PDFLibrary.Decrypt;

Of course, the function should do this by itself, automatically, instead of hanging...

Edited by Franciscus - 17 Dec 19 at 11:32AM
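The post does not show how the file was loaded before the EncryptionStatus check, so the following is only a minimal sketch of one way the fix could be wired up. It assumes the document is loaded with LoadFromFile and read with SelectPage and GetPageText instead of ExtractFilePageText, that LoadFromFile returns 0 on failure, and that the unit and class names follow a version-suffixed pattern (DebenuPDFLibrary1612, TDebenuPDFLibrary1612). The license key is a placeholder. Check all of these against the library's function reference before relying on them.

```delphi
program ExtractEncryptedDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  DebenuPDFLibrary1612; // ASSUMPTION: unit name for the 16.12 Delphi edition

// Returns the text of page 1, or '' if the file cannot be loaded.
// A fresh library instance is created and freed for every file.
function ExtractFirstPageText(const FileName: string): string;
var
  PDF: TDebenuPDFLibrary1612; // ASSUMPTION: class name matches the unit name
begin
  Result := '';
  PDF := TDebenuPDFLibrary1612.Create;
  try
    PDF.UnlockKey('YOUR-LICENSE-KEY'); // placeholder, not a real key
    // Empty password, as in the original ExtractFilePageText call.
    // ASSUMPTION: a return value of 0 means the load failed.
    if PDF.LoadFromFile(FileName, '') = 0 then
      Exit;
    // The fix from the post: decrypt before extracting text.
    if PDF.EncryptionStatus > 0 then
      PDF.Decrypt;
    PDF.SelectPage(1);
    // ASSUMPTION: option 0 corresponds to the 0 passed in the original call.
    Result := PDF.GetPageText(0);
  finally
    PDF.Free;
  end;
end;

begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: ExtractEncryptedDemo <file.pdf>');
    Halt(1);
  end;
  WriteLn(ExtractFirstPageText(ParamStr(1)));
end.
```

Creating and freeing the instance per file keeps each document isolated, at the cost of repeating the unlock step for every PDF.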
Ingo | Moderator Group | Joined: 29 Oct 05
Hi Franciscus,

If the owner of a PDF encrypts it, then they want it to stay encrypted and untouched. So it's good to have this functionality kept separate. BTW: if you had looked at the samples beforehand, you would have known ;-)

With so many PDFs to check, you should think about using the DA functions... Here's the developer guide: https://www.debenu.com/products/development/debenu-pdf-library/help/developer-guide/

Cheers and welcome here,
Ingo
Franciscus | Beginner | Joined: 17 Dec 19
Hi Ingo,

Thanks. I am amazed by the ability of the library to hack encryption, especially on the fly! How does it guess the password used to encrypt the PDF? Wow. Chapeau. Chinese hackers are good, obviously. Alternatively, I do not understand what PDF encryption means.

Yet I disagree: the library works for me, the buyer of the library, not for the author of the PDF. When I ask the library to get me text from a PDF and it is able to do so, it should do it and not fail silently when it sees that it needs to "decrypt" (whatever that means here) first.

Anyway, that is not the most important thing here. More serious is that there are many issues with the library that become apparent when processing 1,000,000+ PDFs made over the past decades by tens of thousands of people and dozens to hundreds of PDF generators, so perhaps I will not be able to use it for my purposes. They should give me the Delphi source code, which I paid for but never received.

A memory leak that grew all the way to half a TB was fixable by instantiating and freeing the library for every PDF (fortunately that is quick), but access violations (AVs) and hangs on malformed and also proper PDFs are causing serious delays and countless restarts of hung processes, so I'll end up writing code to extract what I need myself, I'm afraid... Again a reminder of why I strongly adhere to the "not invented here" principle whenever possible and reasonable.

I'd love to have the sources. Do you know whether the library was written in Delphi?

I have the offending PDFs saved for Foxit, so I'll send them to them.

Edited by Franciscus - 18 Dec 19 at 7:47AM
Ingo | Moderator Group | Joined: 29 Oct 05
Hi,

Why not check the PDFs before processing? Directly after loading, you can check "LastErrorCode" to see whether the load was okay. Additionally, you can check whether there is text content with "HasFontResources", and whether a user password is set... E-books can cause problems as well. All of these things can be checked before processing.

If there are performance issues while processing, you should change your code to use the DA functions.

The library source is written in pure Delphi.
Cheers,
Ingo
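The pre-checks suggested above could be sketched roughly as follows. This is an illustration under stated assumptions, not the library's documented recipe: the unit and class names (DebenuPDFLibrary1612, TDebenuPDFLibrary1612) are assumed, LoadFromFile is assumed to return 0 on failure, HasFontResources is assumed to return 0 when the currently selected page has no fonts, and the PreCheck helper and the license key are hypothetical. The user-password check is left out because the thread does not name the function for it.

```delphi
program PreCheckDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  DebenuPDFLibrary1612; // ASSUMPTION: unit name for the 16.12 Delphi edition

// Hypothetical helper: returns '' if the file looks safe to process,
// otherwise a short reason why it needs special handling.
function PreCheck(PDF: TDebenuPDFLibrary1612; const FileName: string): string;
begin
  Result := '';
  // ASSUMPTION: a return value of 0 means the load failed.
  if PDF.LoadFromFile(FileName, '') = 0 then
    Result := Format('load failed, LastErrorCode=%d', [PDF.LastErrorCode])
  else if PDF.EncryptionStatus > 0 then
    Result := 'encrypted, decrypt before extracting'
  // ASSUMPTION: 0 means the selected page has no font resources, hence no text.
  else if PDF.HasFontResources = 0 then
    Result := 'no font resources on the selected page';
end;

var
  I: Integer;
  PDF: TDebenuPDFLibrary1612; // ASSUMPTION: class name matches the unit name
  Reason: string;
begin
  for I := 1 to ParamCount do
  begin
    // One instance per file, as in the memory-leak workaround from the thread.
    PDF := TDebenuPDFLibrary1612.Create;
    try
      PDF.UnlockKey('YOUR-LICENSE-KEY'); // placeholder, not a real key
      Reason := PreCheck(PDF, ParamStr(I));
      if Reason = '' then
        WriteLn(ParamStr(I), ': ok')
      else
        WriteLn(ParamStr(I), ': ', Reason);
    finally
      PDF.Free;
    end;
  end;
end.
```

Returning a reason string rather than a Boolean makes it easy to log why a file was skipped, which helps when triaging a batch of more than a million documents.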
Franciscus | Beginner | Joined: 17 Dec 19
Thank you!