
GetPageText get repeat char

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=2577
Printed Date: 17 Oct 25 at 6:41AM
Software Version: Web Wiz Forums 11.01 - http://www.webwizforums.com


Topic: GetPageText get repeat char
Posted By: purple
Subject: GetPageText get repeat char
Date Posted: 25 Mar 13 at 10:41AM

Hi, I've got a file from a customer. When I use qp.GetPageText to extract the text, every character in each word is repeated 3 times, but copying the text from Adobe Reader works fine. Any ideas?




Replies:
Posted By: AndrewC
Date Posted: 25 Mar 13 at 11:03PM
Some PDF libraries simulate a bold font by drawing the text in a normal font 3 or 4 times, each at a small offset. This is the most likely reason for the repeated characters.

You should try calling either or both of the following:

  QP.SetTextExtractionOptions(7, 1);

and / or

  QP.SetTextExtractionOptions(8, 1);


Andrew.
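
To illustrate the fake-bold effect described in the reply above, here is a minimal Python sketch of how overdrawn glyphs could be collapsed after extraction. It is not the library's algorithm, and the thread does not say what options 7 and 8 do internally. The `Glyph` tuple, the `drop_overdrawn_glyphs` function, and the `tolerance` value are all hypothetical names chosen for this example; it assumes you can obtain each character together with its x/y position.

```python
from typing import List, Tuple

# Hypothetical glyph record: (character, x position, y position).
Glyph = Tuple[str, float, float]


def drop_overdrawn_glyphs(glyphs: List[Glyph], tolerance: float = 1.0) -> str:
    """Keep a glyph only if no identical glyph was already kept
    within `tolerance` units of it on both axes."""
    kept: List[Glyph] = []
    for ch, x, y in glyphs:
        duplicate = any(
            ch == k_ch and abs(x - k_x) <= tolerance and abs(y - k_y) <= tolerance
            for k_ch, k_x, k_y in kept
        )
        if not duplicate:
            kept.append((ch, x, y))
    return "".join(ch for ch, _, _ in kept)


# "Hi" drawn three times, each pass shifted by 0.3 units to simulate bold.
fake_bold = [
    (ch, x + shift, 100.0)
    for ch, x in (("H", 10.0), ("i", 18.0))
    for shift in (0.0, 0.3, 0.6)
]

raw = "".join(ch for ch, _, _ in fake_bold)  # what a naive extractor sees
clean = drop_overdrawn_glyphs(fake_bold)     # overdrawn copies removed
print(raw, "->", clean)
```

Genuine double letters (the "ll" in "hello") sit a full character width apart, so they survive as long as the tolerance stays well below that width.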



Posted By: purple
Date Posted: 27 Mar 13 at 6:07AM
The version I use is 8.16. It seems options 7 and 8 are a new feature in 9.xx?
I've got another file where the extracted text is garbled, like 'ÁÃÃÖäÕã âäÔÔÁÙè'. The original text is 'ACCOUNT SUMMARY', and copying it from Adobe Reader works fine.
Can this be fixed?
Thanks
purple.
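
An observation on the garbled sample above, which nobody in the thread confirms: its characters, read as Latin-1 byte values, match the EBCDIC codes for 'ACCOUNT SUMMARY' (for example 'Á' is 0xC1, which is 'A' in EBCDIC). The Python sketch below re-decodes such text as a workaround. It assumes the extractor returned the raw font bytes as Latin-1, that code page 037 is the right EBCDIC variant, and that the space was inserted by the extractor rather than taken from the font data; `redecode_words` is a hypothetical helper, not part of any library.

```python
def redecode_words(garbled: str, codec: str = "cp037") -> str:
    """Re-interpret each word's Latin-1 byte values as EBCDIC.

    Words are split on spaces first because an ASCII space (0x20)
    is not a space in EBCDIC, so it must not be re-decoded.
    """
    return " ".join(
        word.encode("latin-1").decode(codec) for word in garbled.split(" ")
    )


sample = "ÁÃÃÖäÕã âäÔÔÁÙè"  # text as posted in the thread
decoded = redecode_words(sample)
print(decoded)
```

This only repairs text after extraction. Whether the library can be configured to handle such fonts itself is not answered in the thread.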


