DPL.ExtractFilePageText's error |
Jimmy Wu.
Beginner Joined: 16 Feb 19 Location: Taiwan Status: Offline Points: 3 |
Posted: 16 Feb 19 at 9:04AM |
Dear Sir :
I'm coding in Delphi and need to convert a PDF to a text file using DPL.ExtractFilePageText, but I only get an empty text file. Please kindly help me. The source is as follows:

// Initialise DPL
DPL := TDebenuPDFLibrary1411.Create;
// Check the license key. Note: "not X = 1" parses as "(not X) = 1" in Delphi,
// so the result is compared explicitly instead.
if DPL.UnlockKey(edtLicenseKey.Text) <> 1 then
begin
  ShowMessage('The key is invalid, exit now!');
  Exit;
end;
DPL.LoadFromFile(PDF_File, '');
iNumPages := DPL.PageCount;
GetDir(0, Cur_Path);
TextF := Cur_Path + '\' + 'PDF2TextF.txt';
AssignFile(f, TextF);
if FileExists(TextF) then
  Erase(f);
Rewrite(f);
for nPage := 1 to iNumPages do
begin
  // Extract one page at a time, so earlier pages are not written again
  strText := DPL.ExtractFilePageText(PDF_File, '', nPage, 3);
  // Write this page's data to the file
  Writeln(f, strText);
end;
CloseFile(f);
|
Ingo
Moderator Group Joined: 29 Oct 05 Status: Offline Points: 3524 |
Hi Jimmy,
you should read the description of ExtractFilePageText:
https://www.debenu.com/docs/pdf_library_reference/ExtractFilePageText.php
"...internally use of DAfunctionality..."

LoadFromFile needs a lot of memory; ExtractFilePageText does not. You lose the advantages of ExtractFilePageText when you also use LoadFromFile. Please read more about the DA (direct access) functionality in QuickPDF:
https://www.debenu.com/docs/pdf_library_reference/DirectAccessFunctionality.php

The reason for your issue could be an encrypted PDF file. Before loading and processing a PDF document you should decrypt it. Here is some sample code. You should use it directly after LoadFromFile:

// . . .
QP := TDebenuPDFLibrary1611.Create;
try
  QP.UnlockKey('my_reg_key_i_have_to_insert_here');
  QP.LoadFromFile(fnew, '');
  // Decrypt only if the document is actually encrypted
  if QP.EncryptionStatus > 0 then
    QP.Decrypt;
  QP.SaveToFile(fnew + '.save.pdf');
finally
  QP.Free;
end;
// . . .

For extracting text from a PDF you can also use code like this:

// . . .
for i := 1 to QP.PageCount do
begin
  QP.SelectPage(i);
  QP.SetOrigin(1);
  QP.CombineContentStreams;
  STR := STR + Trim(QP.GetPageText(8));
end;
// . . .

Cheers and welcome here,
Ingo
|
Jimmy Wu.
Beginner Joined: 16 Feb 19 Location: Taiwan Status: Offline Points: 3 |
Yes, it was just the encryption error; that is no longer a problem! The problem now is that one string, "聖品脂肪抹醬(16KG 彩鐵 金黃蓋", is split into 4 string fields by DPL.ExtractFilePageText, as:
聖品脂肪抹醬(統清)G 彩鐵 金黃蓋 1 6 K
Is there any way to solve it? Thanks!
PS: After ExtractFilePageText, the string should still be one field: "聖品脂肪抹醬(16KG 彩鐵 金黃蓋"
|
Ingo
Moderator Group Joined: 29 Oct 05 Status: Offline Points: 3524 |
I can't read your post; it seems to use an Asian character set... But I know what you mean. Extraction works in the reverse order of insertion: what was inserted last is extracted first. So if a short piece of text was inserted after the rest of the text had been created, that last insertion is extracted first. There are several options for text extraction; try option 0, 7 or 8 to get a human-readable result.
|
Cheers,
Ingo |
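For anyone finding this thread later: the suggestion above (try option 0, 7 or 8 instead of 3) could be sketched roughly as follows. This is only an illustration built from the calls already shown in this thread; the procedure name DumpWithOptions, the output file names, and the unit name in the comment are assumptions, and the PDF is assumed to be already decrypted. TEncoding.UTF8 is used so that the Chinese text survives in the output file; it needs a Unicode-capable Delphi (2009 or later).

```delphi
// Sketch only. Assumes SysUtils and Classes are in the uses clause, plus the
// Debenu unit that declares TDebenuPDFLibrary1411 (unit name not shown in this
// thread), and that DPL has already been unlocked with UnlockKey.
procedure DumpWithOptions(DPL: TDebenuPDFLibrary1411; const PDF_File: string);
const
  // The extraction options suggested above
  Options: array[0..2] of Integer = (0, 7, 8);
var
  i, nPage, iNumPages: Integer;
  Lines: TStringList;
begin
  // LoadFromFile is used here only to read the page count, as in the
  // original post; it costs memory, as noted earlier in the thread.
  DPL.LoadFromFile(PDF_File, '');
  iNumPages := DPL.PageCount;

  for i := Low(Options) to High(Options) do
  begin
    Lines := TStringList.Create;
    try
      // One entry per page, extracted with the current option
      for nPage := 1 to iNumPages do
        Lines.Add(DPL.ExtractFilePageText(PDF_File, '', nPage, Options[i]));
      // Hypothetical output name, e.g. PDF2TextF_opt7.txt
      Lines.SaveToFile(Format('PDF2TextF_opt%d.txt', [Options[i]]), TEncoding.UTF8);
    finally
      Lines.Free;
    end;
  end;
end;
```

Comparing the three output files side by side should show which option keeps a string such as the one in the post above in one piece for a given PDF.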
|
Copyright © 2017 Debenu. Debenu Quick PDF Library is a PDF SDK. All rights reserved.