GetPageText(4) returns cryptic characters
christoph81
Beginner
Joined: 08 Aug 11 Location: Germany Status: Offline Points: 1
Posted: 08 Aug 11 at 11:14AM
Hello everybody,
I'm trying your SDK, and with most of my PDF files I get good results. Now I have tried another PDF file with embedded text, and I get cryptic results for the words. I then tried to copy the text out of the PDF with the Foxit PDF viewer, and I got the same strange results. One of our customers said it could be related to the fonts used in the PDF file.
Best regards,
Christoph
Rowan
Moderator Group
Joined: 10 Jan 09 Status: Offline Points: 398
Posted: 08 Aug 11 at 2:38PM
Hi Christoph,
There's a small chance that the text you're trying to extract contains Unicode characters that aren't being decoded when you read the results. The result is encoded as UTF-8 in the Delphi and DLL editions of the library. However, if you cannot copy the text out of the PDF using Foxit PDF Viewer either, then there's probably an issue with the CMap (the character lookup map). That would mean the PDF is somewhat corrupt, and you won't be able to extract text from it until it is repaired (using Acrobat or a similar tool that can repair PDFs).
Cheers,
- Rowan.
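To rule out the first possibility (UTF-8 results read with the wrong encoding), a check along the following lines may help. This is only a sketch in plain Python and does not call the library: the `raw` bytes stand in for whatever the DLL edition returns, and the helper names `decode_extracted_text` and `repair_mojibake` are hypothetical, not part of Quick PDF Library.

```python
def decode_extracted_text(raw: bytes) -> str:
    """Decode extracted text bytes, assuming they are UTF-8 encoded."""
    return raw.decode("utf-8")


def repair_mojibake(garbled: str) -> str:
    """Try to undo an accidental Latin-1 decode of UTF-8 bytes.

    Returns the input unchanged if it cannot be round-tripped.
    """
    try:
        return garbled.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return garbled


# Stand-in for bytes returned by a text-extraction call (assumption).
raw = "Grüße aus Köln".encode("utf-8")

wrong = raw.decode("latin-1")        # wrong codec: text looks cryptic
right = decode_extracted_text(raw)   # correct codec: text is readable

print(repr(wrong))
print(right)
print(repair_mojibake(wrong) == right)
```

If decoding as UTF-8 still yields garbage, or if a viewer such as Foxit also copies garbage, the encoding of the result is probably not the cause and the character map inside the PDF is the more likely suspect, as described above.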