Debenu Quick PDF Library - PDF SDK Community Forum Homepage
DPL.ExtractFilePageText's error

Jimmy Wu. (Beginner; joined 16 Feb 19; Location: Taiwan)
Posted: 16 Feb 19 at 9:04AM
Dear Sir,

I'm coding in Delphi and need to convert a PDF to a text file using DPL.ExtractFilePageText,
but I just get an empty text file. Please kindly help me. The source is as follows:
 
// Initialize DPL
   DPL := TDebenuPDFLibrary1411.Create;
   try
      // Check the license key. "not X = 1" parses as "(not X) = 1" in Delphi,
      // so the invalid-key branch would never run; compare with <> instead.
      if DPL.UnlockKey(edtLicenseKey.Text) <> 1 then
      begin
         ShowMessage('The key is invalid, exit now!');
         Exit;
      end;
      DPL.LoadFromFile(PDF_File, '');
      iNumPages := DPL.PageCount;
      GetDir(0, Cur_Path);
      TextF := Cur_Path + '\' + 'PDF2TextF.txt';
      AssignFile(f, TextF);
      Rewrite(f);   // creates the file, or truncates it if it already exists
      try
         for nPage := 1 to iNumPages do
         begin
            // Write only the current page's text; accumulating into strText
            // would write every earlier page again on each pass.
            strText := DPL.ExtractFilePageText(PDF_File, '', nPage, 3);
            Writeln(f, strText);
         end;
      finally
         CloseFile(f);
      end;
   finally
      DPL.Free;
   end;
 
 
Ingo (Moderator Group; joined 29 Oct 05)
Posted: 16 Feb 19 at 2:54PM
Hi Jimmy,

you should read the description of ExtractFilePageText:
https://www.debenu.com/docs/pdf_library_reference/ExtractFilePageText.php
"...internally use of DAfunctionality..."
LoadFromFile needs a lot of memory; ExtractFilePageText does not.
You lose the advantages of ExtractFilePageText if you also use LoadFromFile.
Please read more about the DA (direct access) functionality in QuickPDF:
https://www.debenu.com/docs/pdf_library_reference/DirectAccessFunctionality.php
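
As a rough illustration of the point above (a sketch, not taken from the Debenu documentation): the page count can probably be read through the direct access functions as well, so the document is never loaded into memory with LoadFromFile. The sketch assumes an already unlocked TDebenuPDFLibrary1411 instance, SysUtils in the uses clause, and that DAOpenFileReadOnly returns 0 on failure. ExtractAllPagesDA and its parameter names are made up for this example.

```pascal
// Sketch only: extract every page without calling LoadFromFile.
// ExtractAllPagesDA is a hypothetical helper, not part of the library.
procedure ExtractAllPagesDA(DPL: TDebenuPDFLibrary1411;
  const PdfFile, OutFile: string);
var
  FileHandle, PageCount, PageNo: Integer;
  F: TextFile;
begin
  // Open through direct access only long enough to read the page count.
  FileHandle := DPL.DAOpenFileReadOnly(PdfFile, '');
  if FileHandle = 0 then   // assumption: 0 means the file could not be opened
    raise Exception.Create('Could not open ' + PdfFile);
  try
    PageCount := DPL.DAGetPageCount(FileHandle);
  finally
    DPL.DACloseFile(FileHandle);
  end;

  AssignFile(F, OutFile);
  Rewrite(F);
  try
    // ExtractFilePageText opens the file itself for each call.
    for PageNo := 1 to PageCount do
      Writeln(F, DPL.ExtractFilePageText(PdfFile, '', PageNo, 3));
  finally
    CloseFile(F);
  end;
end;
```

Note that ExtractFilePageText takes a password argument, so for an encrypted file the empty string used here would have to be replaced by the real password.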

The reason for your issue could be an encrypted PDF file.
Before loading and processing a PDF document, you should decrypt it.
Here's some sample code. Use the decryption check directly after LoadFromFile:
// . . .
   QP := TDebenuPDFLibrary1611.Create;
   try
      QP.UnlockKey('my_reg_key_i_have_to_insert_here');
      QP.LoadFromFile(fnew, '');
      If ( QP.EncryptionStatus > 0 ) Then
         QP.Decrypt;
      QP.SaveToFile(fnew + '.save.pdf');
   finally
      QP.Free;
   end;
// . . .

For extracting text from pdf you can use code like this as well:
// . . .
       for i := 1 to QP.PageCount Do
       begin
          QP.SelectPage(i);
          QP.SetOrigin(1);
          QP.CombineContentStreams;
          STR := STR + Trim(QP.GetPageText(8));
       end;
// . . .

Cheers and welcome here,
Ingo



Jimmy Wu. (Beginner; joined 16 Feb 19; Location: Taiwan)
Posted: 18 Feb 19 at 7:26AM
Yes, it was just the encryption error; that is no longer a problem.
The problem now is one string: "聖品脂肪抹醬(16KG 彩鐵 金黃蓋"
     which DPL.ExtractFilePageText splits into 4 string fields:
 
     as :  聖品脂肪抹醬(統清)G 彩鐵 金黃蓋
            1
              
            K
 
     Is there any way to solve this?
 
                       Thanks!
 
   PS: After ExtractFilePageText, the string should still be a single field:
              "聖品脂肪抹醬(16KG 彩鐵 金黃蓋"

Ingo (Moderator Group; joined 29 Oct 05)
Posted: 18 Feb 19 at 8:02AM
I can't read your post; it seems to use an Asian character set...
But I know what you mean.
Extraction works in the reverse order of insertion:
what was inserted last is extracted first.
So if a short piece of text was inserted after the rest of the text had been created, that last insertion is extracted first.
There are several options for text extraction. Try option 0, 7 or 8 to get a human-readable result.
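
One way to compare those options might look like the sketch below. It extracts the same page once per option and saves each result to its own UTF-8 file so the CJK text survives. It assumes Delphi 2009 or later (for TEncoding), SysUtils and Classes in the uses clause, and an already unlocked library instance. CompareExtractOptions and the output file naming are made up for this example.

```pascal
// Sketch only: dump one page with each suggested option for comparison.
// CompareExtractOptions is a hypothetical helper, not part of the library.
procedure CompareExtractOptions(DPL: TDebenuPDFLibrary1411;
  const PdfFile: string; PageNo: Integer);
const
  OptionsToTry: array[0..2] of Integer = (0, 7, 8);
var
  i: Integer;
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    for i := Low(OptionsToTry) to High(OptionsToTry) do
    begin
      Lines.Text := DPL.ExtractFilePageText(PdfFile, '', PageNo,
        OptionsToTry[i]);
      // Writes e.g. "input.pdf.opt7.txt" next to the source file.
      Lines.SaveToFile(Format('%s.opt%d.txt', [PdfFile, OptionsToTry[i]]),
        TEncoding.UTF8);
    end;
  finally
    Lines.Free;
  end;
end;
```

Opening the three files side by side should show which option keeps the string in one piece for this particular PDF.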

Cheers,
Ingo

Copyright © 2017 Debenu. Debenu Quick PDF Library is a PDF SDK. All rights reserved.