
Encryption in GetPageText?

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=2885
Printed Date: 12 Jun 25 at 12:19PM


Topic: Encryption in GetPageText?
Posted By: jwinkl
Subject: Encryption in GetPageText?
Date Posted: 01 May 14 at 4:53PM
I'm completely at a loss with GetPageText. When I call it with parameter 2, the first line I get (for example, on page 1 of the "GettingStarted.pdf" document from Debenu itself) is

250.53,715.64,#000000,12,"GBONLS+Verdana [Bold]","Delphi Edition"

which is fairly clear to me. With parameter 3, the same line comes back as

"GBONLS+Verdana [Bold]",#000000,12,250.5258,128.7699,344.7498,128.7699,344.7498,114.1899,250.5258,114.1899,"'HOSKL (GLWLRQ"

so it seems that "Delphi Edition" is somehow (not very intelligently) encrypted to "'HOSKL (GLWLRQ".
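Comparing the two strings character by character suggests that this is not encryption but a constant offset: every non-space character in the garbled text sits 29 code points below its clear-text counterpart (for example, "D" is 68 and "'" is 39). The sketch below only checks that observation and reverses it. The offset of 29 is inferred from this single sample, the helper names `infer_shift` and `unshift` are made up for illustration and are not part of the Quick PDF Library, and other fonts or documents may well need a different mapping.

```python
def infer_shift(garbled, clear):
    """Return the single code-point offset that maps garbled to clear, or None.

    Whitespace in the garbled string is skipped, because the sample shows a
    plain space where the shifted value would be a control character.
    """
    if len(garbled) != len(clear):
        return None
    shifts = {ord(c) - ord(g) for g, c in zip(garbled, clear) if not g.isspace()}
    return shifts.pop() if len(shifts) == 1 else None


def unshift(garbled, shift):
    """Add the offset back to every non-whitespace character."""
    return "".join(ch if ch.isspace() else chr(ord(ch) + shift) for ch in garbled)


sample_garbled = "'HOSKL (GLWLRQ"   # text field returned with parameter 3
sample_clear = "Delphi Edition"     # text field returned with parameter 2

shift = infer_shift(sample_garbled, sample_clear)
print(shift)
print(unshift(sample_garbled, shift))
```

At best this is a workaround for this one font in this one file. A constant offset like this would be consistent with glyph indices of a subset font being returned instead of character codes, but that is a guess.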

Now I would be obliged if anyone could tell me why this is so and how the clear text could be retrieved. For my purposes parameter 3 is mandatory, because I need the bounds rectangle of the text as well as the text itself.
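To illustrate what I need from parameter 3, here is a minimal sketch that splits one such line into its fields and derives a bounding rectangle. The field layout (font name, color, size, four corner points as x,y pairs, then the quoted text) is read off the sample line above and is an assumption, not taken from the library documentation. `parse_option3_line` is a hypothetical helper, not a library function.

```python
import csv


def parse_option3_line(line):
    """Split one parameter-3 output line into font, color, size, bounds and text.

    Assumed layout: font, color, size, x1,y1,x2,y2,x3,y3,x4,y4, text.
    The csv module handles the double-quoted fields, including commas inside them.
    """
    fields = next(csv.reader([line]))
    if len(fields) != 12:
        raise ValueError("expected 12 fields, got %d" % len(fields))
    coords = [float(v) for v in fields[3:11]]
    xs, ys = coords[0::2], coords[1::2]
    return {
        "font": fields[0],
        "color": fields[1],
        "size": float(fields[2]),
        # Axis-aligned rectangle enclosing the four corner points.
        "bounds": (min(xs), min(ys), max(xs), max(ys)),
        "text": fields[11],
    }


sample = ('"GBONLS+Verdana [Bold]",#000000,12,'
          '250.5258,128.7699,344.7498,128.7699,'
          '344.7498,114.1899,250.5258,114.1899,'
          '"\'HOSKL (GLWLRQ"')

result = parse_option3_line(sample)
print(result["bounds"])
print(result["text"])
```

The rectangle comes out fine this way; it is only the text field that is unusable as returned.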

I'm using version 10.13.



Replies:
Posted By: AndrewC
Date Posted: 02 May 14 at 9:59AM
jwinkl,

Can you please send the PDF file to me at support@debenu.com?

Text extraction is complex, and it looks like this PDF is using encoding tables. It is strange that Option 2 is working better than Option 3, as it is usually the other way around. Option 3 does a lot more work and can extract text from fonts with quite complex encodings.

Andrew.


