
GetPageText() problem with Acrobat 4.x

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=267
Printed Date: 12 May 24 at 8:17PM


Topic: GetPageText() problem with Acrobat 4.x
Posted By: ixm7
Subject: GetPageText() problem with Acrobat 4.x
Date Posted: 16 Jan 06 at 11:41AM
By mistake, I posted this as a follow-up to a previous thread. I figured I should start a new thread for it:

Using GetPageText(3) works perfectly with documents that are PDF Version 1.2 (Acrobat 3.x).

However, the CSV information returned for documents that are PDF Version 1.3 (Acrobat 4.x) does not include the full text content. It seems to contain only the first 1 or 2 characters of each text object.

Is this a known issue? Any solution?

Many Thanks!
- Ido



Replies:
Posted By: ixm7
Date Posted: 16 Jan 06 at 3:53PM
By the way, using the SaveToFile / LoadFromFile approach (that was suggested in an earlier thread describing a similar problem) doesn't fix the issue for me.

- Ido


Posted By: swb1
Date Posted: 16 Jan 06 at 4:55PM

Posted in the other thread as well...

I am using GetPageText(3) in v1.3 docs with no issues at all. In fact, GetPageText(3) was the single most compelling reason for my purchase of QuickPDF.

How are your PDFs created? Can you post an example?

sb



Posted By: ixm7
Date Posted: 16 Jan 06 at 5:05PM
They are created through Crystal Reports.

The 1.2 (Acrobat 3) version is from Crystal 9. Here's a sample that works perfectly with GetPageText(3):
http://www.milletsoftware.com/Download/visual_cut_9.pdf

The 1.3 (Acrobat 4) version is from Crystal XI. Here's a sample that fails to return the full text with GetPageText(3):
http://www.milletsoftware.com/Download/visual_cut_11.pdf

Doing LoadFromFile(originalfile)... SaveToFile(workfile)... LoadFromFile(workfile)... to try to work around the problem doesn't make a difference. GetPageText() still returns only the first 1-2 characters from each text object.

- Ido


Posted By: swb1
Date Posted: 16 Jan 06 at 6:00PM

I am getting the same results as you: sometimes the first letter and sometimes no text at all. I am, however, getting the proper bounding rectangle of the text.

This problem likely has more to do with the file creation utility than with the fact that it is v1.3.

I’m sorry that I do not have more to offer.

sb


Posted By: ixm7
Date Posted: 16 Jan 06 at 6:07PM
Thanks for the confirmation - at least I know I'm not dreaming... :o)

- Ido


Posted By: DELBEKE
Date Posted: 17 Jan 06 at 12:56AM

I have also had some problems with this function. Not the same ones, but try this thread:

http://www.quickpdf.org/forum/forum_posts.asp?TID=264&PN=1



Posted By: swb1
Date Posted: 17 Jan 06 at 10:02AM

This code...

 quickPDF.SplitPageText(0);
 // Remove layers until only one is left
 while quickPDF.LayerCount > 1 do
 begin
   quickPDF.SelectLayer(1);
   quickPDF.DeleteLayer;
 end;
 // Mode 3 returns the page text as CSV
 Memo1.Lines.Text := quickPDF.GetPageText(3);

...gives the same results as before - only a few of the first characters.



Posted By: ixm7
Date Posted: 17 Jan 06 at 10:45AM
Thanks! So we know that trying to play with Layers doesn't fix the issue.

It's strange that the text is clearly there in the PDF, and the function finds it, but then fails to return all of it.

Many thanks for the detective work...

- Ido


