QP.GetPageText(3) works much better, why?
Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=873
Topic: QP.GetPageText(3) works much better, why?
Posted By: PGU_78
Date Posted: 05 Mar 08 at 10:07AM
Let's look at a PDF file that I uploaded here:
http://depositfiles.com/files/3945338
Let's select the 3rd page and call GetPageText twice.
GetPageText(0) returns an empty string, but GetPageText(3) works much better. It returns:
"TimesNewRoman",#000000,12.00,46.7520,755.6305,46.7520,773.9953,33.4680,773.9953,33.4680,755.6305,"3/4 "
"Century",#000000,10.50,545.6880,56.6906,545.6880,59.6096,533.0775,59.6096,533.0775,56.6906," "
"DEDPCC+Arial",#000000,12.00,72.4920,73.9706,72.4920,77.3066,59.1000,77.3066,59.1000,73.9706," "
"DEDPCC+Arial",#000000,12.00,90.4920,61.9706,90.4920,532.1882,77.1000,532.1882,77.1000,61.9706,"4. Correction for speaker loudness to maximum sometimes when turning on the power. "
"DEDPCC+Arial",#000000,12.00,108.4920,73.9106,108.4920,77.2466,95.1000,77.2466,95.1000,73.9106," "
"DEDPCC+Arial",#000000,12.00,126.4920,61.9706,126.4920,514.2554,113.1000,514.2554,113.1000,61.9706,"5. Correction for even if the SIG-Center value of 7009 is edited in the service menu, "
"DEDPCC+Arial",#000000,12.00,144.4920,67.9706,144.4920,354.2066,131.1000,354.2066,131.1000,67.9706," the set value is not reflected when powered on next. "
"DEDPCC+Arial",#000000,12.00,162.4920,73.9106,162.4920,123.4754,149.1000,123.4754,149.1000,73.9106," "
"DEDPCC+Arial",#000000,12.00,180.4920,61.9706,180.4920,65.3066,167.1000,65.3066,167.1000,61.9706," "
"DEDPCC+Arial",#000000,12.00,198.4920,56.6906,198.4920,99.3782,185.1000,99.3782,185.1000,56.6906,"<Note> "
"DEDPCC+Arial",#000000,12.00,216.4920,56.6906,216.4920,620.6726,203.1000,620.6726,203.1000,56.6906,"In this particular model, PW IC features CPU function so only PW software update is needed and enough. "
"DEDPCC+Arial",#000000,12.00,234.4920,56.6906,234.4920,761.6174,221.1000,761.6174,221.1000,56.6906,"The user setting will be getting back to shipping condition by software update, however the Lamp timer and adjusted data won’t be "
"DEDPCC+Arial",#000000,12.00,252.4920,56.6906,252.4920,118.1258,239.1000,118.1258,239.1000,56.6906,"changed. "
"DEDPCC+Arial",#000000,12.00,270.4920,56.6906,270.4920,60.0266,257.1000,60.0266,257.1000,56.6906," "
"DEDPCC+Arial",#000000,12.00,288.4920,56.6906,288.4920,175.9286,275.1000,175.9286,275.1000,56.6906,"<Supplemental Note> "
"DEDPCC+Arial",#000000,12.00,306.4920,56.6906,306.4920,554.5022,293.1000,554.5022,293.1000,56.6906," The software is posted on the service department web site and will be available for 6 months. "
"DEDPCC+Arial",#000000,12.00,324.4920,56.6906,324.4920,60.0266,311.1000,60.0266,311.1000,56.6906," "
Why does GetPageText(0) give such poor results?
Many thanks in advance.
------------- Don't be afraid. Be very afraid (The Fly, 1986)
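For readers who want to work with the GetPageText(3) output quoted in the post above, here is a minimal Python sketch that splits it into records. The field layout is an assumption inferred only from that sample: a quoted font name, a colour, a font size, eight numbers (apparently four coordinate pairs), and a quoted text string, with one record per line. The names TextBlock, parse_blocks and plain_text are hypothetical helpers and are not part of the library.

```python
import re
from dataclasses import dataclass

# A decimal number such as 12.00 or 755.6305.
NUM = r"-?\d+(?:\.\d+)?"

# Assumed layout (inferred from the sample in this thread, not from the docs):
# "font",#RRGGBB,size,<eight numbers>,"text"
RECORD = re.compile(
    r'^"(?P<font>[^"]*)",(?P<color>#[0-9A-Fa-f]{6}),(?P<size>' + NUM + r'),'
    r'(?P<coords>' + NUM + r'(?:,' + NUM + r'){7}),'
    r'"(?P<text>.*)"$'
)


@dataclass
class TextBlock:
    font: str
    color: str
    size: float
    coords: tuple  # eight floats, kept in the order they appear
    text: str


def parse_blocks(output):
    """Parse one record per line; lines that do not match are skipped."""
    blocks = []
    for line in output.splitlines():
        match = RECORD.match(line.strip())
        if match is None:
            continue
        coords = tuple(float(v) for v in match.group("coords").split(","))
        blocks.append(TextBlock(match.group("font"), match.group("color"),
                                float(match.group("size")), coords,
                                match.group("text")))
    return blocks


def plain_text(blocks):
    """Join the text of all non-blank records, one record per line."""
    return "\n".join(b.text.rstrip() for b in blocks if b.text.strip())


# Two records copied from the output quoted above.
SAMPLE = (
    '"TimesNewRoman",#000000,12.00,46.7520,755.6305,46.7520,773.9953,'
    '33.4680,773.9953,33.4680,755.6305,"3/4 "\n'
    '"DEDPCC+Arial",#000000,12.00,126.4920,61.9706,126.4920,514.2554,'
    '113.1000,514.2554,113.1000,61.9706,'
    '"5. Correction for even if the SIG-Center value of 7009 is edited in the service menu, "'
)

if __name__ == "__main__":
    print(plain_text(parse_blocks(SAMPLE)))
```

The text field is matched greedily up to the final quote on the line, so commas inside the text (as in the second sample record) do not split the record. How the library escapes quotes inside the text is not shown in the sample, so that case is not handled here.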
|
Replies:
Posted By: Ingo
Date Posted: 05 Mar 08 at 10:28AM
Hi! GetPageText(0) is faster, but it is older and buggier. GetPageText(3) is slower but less buggy, and it returns more information, so use "3". That's just how it is ;-) Best regards, Ingo
|
Posted By: PGU_78
Date Posted: 07 Mar 08 at 3:47AM
Thank you, Ingo (our PDF expert).
I have a PDF file for which GetPageText(3) returns an empty string, but GetPageText(4) extracts the text. Is option 4 better than option 3?
Is there a guaranteed method of text extraction? As I understand it, GetPageText(0) isn't recommended.
------------- Don't be afraid. Be very afraid (The Fly, 1986)
|
Posted By: Ingo
Date Posted: 07 Mar 08 at 5:14AM
Hi!
I'm not the expert ... but (it seems to me) I'm the only one who looks in here regularly ;-) I've never used option 4, but you can see that the development of options 0-4 went in different directions ;-) I prefer option 3.
best regards, Ingo
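Since no single option worked for every file in this thread, one practical approach is to try the options in order of preference and keep the first non-empty result. The Python sketch below illustrates only that idea. The get_page_text argument is a hypothetical callable standing in for however your binding calls GetPageText on the selected page, and the default order (3, then 4, then 0) simply reflects the preferences stated in this thread, not any official recommendation.

```python
def extract_with_fallback(get_page_text, options=(3, 4, 0)):
    """Try each extraction option in turn; return (option, text).

    get_page_text is a hypothetical callable taking the option number and
    returning the extracted string for the currently selected page.
    Returns (None, "") when every option yields an empty result.
    """
    for option in options:
        text = get_page_text(option)
        # Treat whitespace-only output as a failed extraction as well.
        if text and text.strip():
            return option, text
    return None, ""


if __name__ == "__main__":
    # Stand-in for a page where option 3 fails but option 4 succeeds,
    # as described in the post above.
    fake_results = {3: "", 4: "some extracted text", 0: ""}
    option, text = extract_with_fallback(lambda opt: fake_results[opt])
    print(option, text)
```

The function returns the option that succeeded together with the text because the output format can differ between options (option 3 returns the structured records quoted in the first post), so the caller needs to know which one it received.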