Extracting text from different charsets
miguele | Beginner | Joined: 21 May 12 | Location: Brazil
Posted: 21 May 12 at 11:20AM
Hi. I'm using v7.26 under Delphi XE to extract text from PDF files that use different character sets (usually with accented characters). "ExtractFilePageText" with option 0 usually reads the text accurately, but for some older PDF files the accented characters are extracted incorrectly. Is there a way to prevent this? Can you provide some sample code?
Thanks!
AndrewC | Moderator Group | Joined: 08 Dec 10 | Location: Geelong, Aust
Can you try using ExtractPageText(3) or ExtractPageText(7) to see if the text is extracted correctly? Option 0 is a very fast extraction method, but it is not aware of font encodings.
In QPL 8.xx we added option 8, which outputs the text in the same format as option 0 but can handle various font encodings and mappings.
Andrew.
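To see why an extraction mode that ignores font encodings can mangle accented characters, here is a minimal sketch in Python. It does not use the Quick PDF Library API at all; the function names and the choice of Mac Roman versus Windows-1252 are assumptions picked purely for illustration. The point is only that the same stored byte maps to different characters depending on which encoding the reader assumes.

```python
# Illustration only: this is NOT Quick PDF Library code.
# Assumption: an older PDF stores text bytes using a Mac Roman font
# encoding, while the reader blindly assumes Windows-1252.

def decode_naive(raw: bytes) -> str:
    """Encoding-unaware read: always assume Windows-1252."""
    return raw.decode("cp1252", errors="replace")

def decode_with_font_encoding(raw: bytes, font_encoding: str) -> str:
    """Encoding-aware read: use the encoding the font declares."""
    return raw.decode(font_encoding)

# Bytes as they might be stored for the word "café" in Mac Roman.
raw = "café".encode("mac_roman")

naive = decode_naive(raw)
aware = decode_with_font_encoding(raw, "mac_roman")

print(repr(naive))  # accented letter comes out as a different character
print(repr(aware))  # accented letter is preserved
```

Plain ASCII letters survive either way, which would be consistent with option 0 looking fine on most files and failing only on accented characters in some older ones.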
Copyright © 2017 Debenu. Debenu Quick PDF Library is a PDF SDK. All rights reserved.