ExtractFilePageText - Underscores |
masterofdesaster
Beginner
Joined: 28 Jun 10 Location: Switzerland Status: Offline Points: 3 |
Topic: ExtractFilePageText - Underscores
Posted: 28 Jun 10 at 9:04PM
Hi, I'm using the DLL version. I have a PDF that contains "OBET_2007", but the ExtractFilePageText method splits it into two words, OBET and 2007, which is not what I want. Is there a setting or dictionary I can use to get the whole word?
Thank you very much,
Hanspeter Stutz
Ingo
Moderator Group
Joined: 29 Oct 05 Status: Offline Points: 3530 |
Posted: 28 Jun 10 at 10:46PM |
Hi!
Which options are you using for extraction? If you extract string by string and the "2007" was inserted later, then that string sits in a completely different part of the file content (but with the correct position data). What I mean is: try the different options and see whether the results differ. You should try option 0. This has nothing to do with the library. First in, first out and so on... you'll know what I mean ;-)
Cheers and welcome here,
Ingo
Edited by Ingo - 28 Jun 10 at 10:47PM
masterofdesaster
Beginner
Joined: 28 Jun 10 Location: Switzerland Status: Offline Points: 3 |
Posted: 29 Jun 10 at 4:19PM |
Hi Ingo,
Thanks for the welcome, appreciated! First I have to say that I have absolutely no experience with how this library, or this kind of thing in general, works, so I apologize for any dumb questions :-) I tried different options, and for my use 3 or 4 works best. I have to identify each page based on a number that is always on the same line, except for this OBET_2007. I can easily work around this, but I am curious why it happens. Can you recommend something to read so I can understand it better?
Cheers,
Hanspeter
Ingo
Moderator Group
Joined: 29 Oct 05 Status: Offline Points: 3530 |
Posted: 29 Jun 10 at 8:03PM |
Hi HP!
If you want to use option 3/4 then there's nothing you can do about it. Did you see the described behavior with other underscores, too?
Perhaps the page was created some time ago with "OBET_2006", and the extracted string could have had (as an example) the position line 5, column 6, Arial, 10, "OBET_2006". The page was complete, but then the "2006" had to be replaced by "2007". Our sample string is now line 5, column 6, Arial, 10, "OBET_". At the end of the text content there's a new string with line 5, column 11, Arial, 10, "2007". First in, first out / last in, last out.
When displaying a page, a PDF reader collects all the strings of the page and uses their position data to put them into the correct sequence. If you're using option 3/4 for extraction, the sequence of the strings can differ from the visual order. With option 0, QuickPDF does the thinking for you and puts the strings in the correct sequence, but that has other disadvantages. It's your choice.
Cheers,
Ingo
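The reordering a reader does can be sketched roughly as follows. This is only an illustration in Python, not Quick PDF Library code: the `Fragment` class, its `line`/`column` fields, and the helper names are hypothetical stand-ins for whatever position data the extraction returns, and the sample values are taken from the "OBET_" / "2007" example above.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    # Hypothetical record: one text string plus its position on the page.
    line: int
    column: int
    text: str


def reading_order(fragments):
    """Sort fragments by position (line, then column) instead of file order."""
    return sorted(fragments, key=lambda f: (f.line, f.column))


def join_lines(fragments):
    """Rebuild each line; fragments that touch are merged without a space."""
    lines = {}
    for frag in reading_order(fragments):
        parts = lines.setdefault(frag.line, [])
        end = frag.column + len(frag.text)
        if parts and parts[-1][0] == frag.column:
            # The previous fragment ends exactly where this one starts.
            _, previous_text = parts.pop()
            parts.append((end, previous_text + frag.text))
        else:
            parts.append((end, frag.text))
    return {line: " ".join(text for _, text in parts)
            for line, parts in lines.items()}


# File order: "2007" was appended at the end of the text content.
stream = [
    Fragment(5, 6, "OBET_"),
    Fragment(6, 1, "Page 1"),
    Fragment(5, 11, "2007"),
]
result = join_lines(stream)
print(result[5])
```

Real position data is in page coordinates rather than whole columns, so a practical version would probably need a tolerance when deciding whether two strings touch.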
masterofdesaster
Beginner
Joined: 28 Jun 10 Location: Switzerland Status: Offline Points: 3 |
Posted: 06 Jul 10 at 4:29PM |
Hi Ingo,
OK, I understand now, and I have it working. Thanks for your help.
Cheers,
Hanspeter