Debenu Quick PDF Library - PDF SDK Community Forum Homepage

ExtractFilePageText - Underscores

masterofdesaster (Beginner; Joined: 28 Jun 10; Location: Switzerland)
Posted: 28 Jun 10 at 9:04PM
Hi, I'm using the DLL Version.

I have a PDF that contains "OBET_2007" - the ExtractFilePageText method splits this into two words, OBET and 2007, which is not what I want.

Is there a setting/dictionary so I get the whole word?

thank you very much
Hanspeter Stutz 

Ingo (Moderator Group; Joined: 29 Oct 05)
Posted: 28 Jun 10 at 10:46PM
Hi!

Which extraction option are you using?
If you're extracting string by string and the "2007" was inserted later, then this string sits in a completely different part of the file content (but with the correct position data).
What I want to say: try different options and see whether there are differences. You should try option 0.
This has nothing to do with the library. First in, first out and so on... you'll know what I mean ;-)

Cheers and welcome here, Ingo


Edited by Ingo - 28 Jun 10 at 10:47PM
masterofdesaster
Posted: 29 Jun 10 at 4:19PM
Hi Ingo,

Thanks for the welcome, appreciated!

First I have to say I have absolutely no experience with how this library, or stuff like this in general, works - so I apologize for dumb questions :-)

I tried different options, and for my usage 3 or 4 is best. I have to identify each single page based on a number which is always on the same line - except for this OBET_2007.

I can easily work around this but I am just curious why it happens. Can you recommend something I should read to understand this better?

cheers
Hanspeter

Ingo
Posted: 29 Jun 10 at 8:03PM
Hi HP!

If you want to use option 3/4 then there is nothing you can do about it.
Did you see the described behavior with other underscores, too?
Perhaps the page was created some time ago with "OBET_2006",
and the extracted string had the position (as an example) line 5,
column 6, Arial, 10, "OBET_2006".
The page was finished, but later the "2006" had to be replaced
by "2007". Our sample string is now line 5, column 6, Arial, 10, "OBET_".
At the end of the text content there is a new string with line 5,
column 11, Arial, 10, "2007".
First in - first out / last in - last out.
When a PDF reader displays a page, it collects all strings of the page
and puts them into the correct sequence using the position data.
If you're using option 3/4 for extraction, the sequence of the strings can be
different.
With option 0, QuickPDF thinks for you and puts the strings in the correct sequence,
but then there are other disadvantages. It's your choice.
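
To illustrate the re-sequencing idea, here is a minimal Python sketch. It does not call the QuickPDF API and does not parse the real option 3/4 output; the `Fragment` structure, the `merge_fragments` helper, and the tolerance values are all assumptions made up for this example. It only shows how strings stored out of order can be put back into reading order by their position data, and how two pieces that touch on the same line could be rejoined into one word.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One positioned piece of text (hypothetical structure, not the library's format)."""
    text: str
    x: float      # left edge of the string
    y: float      # baseline; larger values are higher on the page
    width: float  # horizontal extent of the string


def merge_fragments(fragments, line_tol=2.0, gap_tol=1.0):
    """Sort fragments into reading order and join pieces that touch on one line.

    line_tol and gap_tol are illustrative thresholds, not library settings.
    The sort assumes fragments on the same line share the same baseline value.
    """
    # Reading order: top to bottom (descending y), then left to right.
    ordered = sorted(fragments, key=lambda f: (-f.y, f.x))
    words = []
    prev = None
    for frag in ordered:
        same_line = prev is not None and abs(frag.y - prev.y) <= line_tol
        # The gap between the end of the previous piece and the start of this one.
        touching = same_line and frag.x - (prev.x + prev.width) <= gap_tol
        if touching:
            words[-1] += frag.text   # e.g. "OBET_" + "2007"
        else:
            words.append(frag.text)
        prev = frag
    return words


# Fragments in stored order: "2007" was added last, so it sits at the end.
fragments = [
    Fragment("Report", x=50, y=700, width=40),
    Fragment("OBET_", x=100, y=650, width=30),
    Fragment("page 1", x=50, y=600, width=35),
    Fragment("2007", x=130, y=650, width=24),
]

print(merge_fragments(fragments))
```

In this sketch the stored order keeps "OBET_" and "2007" far apart, while sorting by position places them next to each other, and the gap check then joins them. Real pages would need thresholds tuned to the font size in use.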

Cheers, Ingo
masterofdesaster
Posted: 06 Jul 10 at 4:29PM
Hi Ingo,

Ok I understand now. I have it working now - thanks for your help

cheers
Hanspeter
