Debenu Quick PDF Library - PDF SDK Community Forum : Extracting text from different charsets

http://www.quickpdf.org/forum/extracting-text-from-different-charsets_topic2271_post9628.html#9628
Author: AndrewC
Subject: 2271
Posted: 22 May 12 at 12:21PM

Can you try using ExtractPageText(3) or ExtractPageText(7) to see if the text is extracted correctly? Option 0 is a very fast extraction method, but it is not aware of font encodings.
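To illustrate what "not aware of font encodings" can mean in practice, here is a small Python sketch. It is not Quick PDF Library code and makes no claim about how option 0 works internally; the sample string and the choice of the cp1252 and mac_roman codecs are assumptions picked only for the demonstration. It shows the same raw bytes read once with the right encoding and once with the wrong one: plain ASCII letters survive, while accented characters come out wrong.

```python
# Illustration only: this is NOT Quick PDF Library code.
# The same raw bytes are decoded under two different single-byte encodings.

original = "café señor"

# Bytes as a font using a Windows-style (cp1252) encoding might store them.
# The encoding choice here is an assumption made for the demonstration.
raw = original.encode("cp1252")

# A decoder that knows the font's encoding recovers the text exactly.
aware = raw.decode("cp1252")

# A decoder that assumes a different encoding keeps the ASCII letters
# but maps the accented bytes to other characters.
unaware = raw.decode("mac_roman")

print("encoding-aware:  ", aware)
print("encoding-unaware:", unaware)
```

An extraction mode that takes the font encoding into account corresponds to the first decode; the symptom described in this thread looks like the second.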

In QPL 8.xx we have added option 8, which outputs the text in the same format as option 0 but can handle various font encodings and mappings.

Andrew.

http://www.quickpdf.org/forum/extracting-text-from-different-charsets_topic2271_post9621.html#9621
Author: miguele
Subject: 2271
Posted: 21 May 12 at 11:20AM

Hi. I'm using v.7.26 under Delphi XE to extract text from PDF files in different charsets (usually containing accented characters). "ExtractFilePageText" with option 0 usually reads the text accurately, but for some older PDF files the accented characters are extracted incorrectly. Is there a way to prevent this? Can you provide some sample code?

Thanks!
