<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="RSS_xslt_style.asp" version="1.0" ?>
<rss version="2.0" xmlns:WebWizForums="http://syndication.webwiz.co.uk/rss_namespace/">
 <channel>
  <title>Debenu Quick PDF Library - PDF SDK Community Forum : Extract Text (2)</title>
  <link>http://www.quickpdf.org/forum/</link>
  <description><![CDATA[This is an XML content feed of: Debenu Quick PDF Library - PDF SDK Community Forum : I need help - I can help : Extract Text (2)]]></description>
  <copyright>Copyright (c) 2006-2013 Web Wiz Forums - All Rights Reserved.</copyright>
  <pubDate>Sun, 05 Apr 2026 19:21:19 +0000</pubDate>
  <lastBuildDate>Wed, 12 Jul 2006 13:27:35 +0000</lastBuildDate>
  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  <generator>Web Wiz Forums 11.01</generator>
  <ttl>360</ttl>
  <WebWizForums:feedURL>www.quickpdf.org/forum/RSS_post_feed.asp?TID=440</WebWizForums:feedURL>
  <image>
   <title><![CDATA[Debenu Quick PDF Library - PDF SDK Community Forum]]></title>
   <url>http://www.quickpdf.org/forum/forum_images/QPDF_Forum_Title.png</url>
   <link>http://www.quickpdf.org/forum/</link>
  </image>
  <item>
   <title><![CDATA[Extract Text (2) : Hi tren,  did you ever try 'ExtractFilePageText'...]]></title>
   <link>http://www.quickpdf.org/forum/extract-text-2_topic440_post2056.html#2056</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=361">ECPVFR</a><br /><strong>Subject:</strong> 440<br /><strong>Posted:</strong> 12 Jul 06 at 1:27PM<br /><br />Hi tren,<br /><br />did you ever try 'ExtractFilePageText' with option '4'?<br />It extracts every piece of text on the page, and option '4' also returns the associated information for each piece (font, color, text size, position, and the text itself).<br />]]>
   </description>
   <pubDate>Wed, 12 Jul 2006 13:27:35 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/extract-text-2_topic440_post2056.html#2056</guid>
  </item> 
  <item>
   <title><![CDATA[Extract Text (2) : Chicks, you're a legend.  I've...]]></title>
   <link>http://www.quickpdf.org/forum/extract-text-2_topic440_post1952.html#1952</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=276">tren</a><br /><strong>Subject:</strong> 440<br /><strong>Posted:</strong> 13 Jun 06 at 11:51PM<br /><br />Chicks, you're a legend.<br /><br />I've done a bit of XSLT in the past, so I should be OK with this. To get the absolute position of each word, I imagine I'm going to have to calculate the width of each word on the line based on the font and font size?<br /><br />Thanks again for the help -- btw quickpdf rules.]]>
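    <![CDATA[<br /><br />A rough sketch of that word-width idea in Python, in case it helps anyone following along. It assumes each line of text comes with its left edge and total width, and it pretends every character (spaces included) is equally wide, which is only true for monospaced fonts. For proportional fonts you would need per-glyph widths from the font metrics instead. The function name and parameters are made up for illustration.<br />

```python
def estimate_word_boxes(text, left, width):
    """Estimate (word, x, word_width) for each word on one line of text.

    Assumption: every character, spaces included, takes the same
    horizontal space. This is a crude approximation, not real font metrics.
    """
    if not text:
        return []
    char_w = width / len(text)  # average width of one character
    boxes = []
    pos = 0  # character offset of the current word within the line
    for word in text.split(" "):
        if word:  # skip the empty strings produced by repeated spaces
            boxes.append((word, left + pos * char_w, len(word) * char_w))
        pos += len(word) + 1  # +1 for the space that followed the word
    return boxes


if __name__ == "__main__":
    for word, x, w in estimate_word_boxes("Hello world", 50, 110):
        print(word, x, w)
```

The vertical position needs no estimating, since every word on a line shares that line's top and height.]]>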
   </description>
   <pubDate>Tue, 13 Jun 2006 23:51:59 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/extract-text-2_topic440_post1952.html#1952</guid>
  </item> 
  <item>
   <title><![CDATA[Extract Text (2) : http://pdftohtml.sourceforge.net/  This...]]></title>
   <link>http://www.quickpdf.org/forum/extract-text-2_topic440_post1951.html#1951</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=115">chicks</a><br /><strong>Subject:</strong> 440<br /><strong>Posted:</strong> 13 Jun 06 at 11:14PM<br /><br />http://pdftohtml.sourceforge.net/<br /><br />This generally does an excellent job, and it's free.  Command-line only, unless you know C really well...<br /><br />Use the XML output option to get the text, font and positional info.  You will probably need to do an XSL transform to get the output into a really usable form. If you need help, I can share some examples.<br />]]>
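    <![CDATA[<br />If XSLT is not your thing, the same reshaping can be done with a short script. Below is a minimal Python sketch (not an XSL transform) that flattens the XML into one record per text line. The element and attribute names (page, fontspec, text, top, left, width, height, font) are my assumption about what the XML output looks like, and the sample document is made up, so check both against what the tool actually writes for your files.<br />

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shaped the way the XML output is assumed to look.
SAMPLE = """<pdf2xml>
  <page number="1" height="1188" width="918">
    <fontspec id="0" size="12" family="Times" color="#000000"/>
    <text top="100" left="50" width="120" height="14" font="0">Hello <b>world</b></text>
    <text top="120" left="50" width="60" height="14" font="0">again</text>
  </page>
</pdf2xml>"""


def extract_lines(xml_text):
    """Return one dict per <text> element with its position and font info."""
    root = ET.fromstring(xml_text)
    # Collect font definitions from the whole document, since a font
    # declared on one page may be referenced again on a later page.
    fonts = {f.get("id"): f.attrib for f in root.iter("fontspec")}
    lines = []
    for page in root.iter("page"):
        for t in page.iter("text"):
            font = fonts.get(t.get("font"), {})
            lines.append({
                "page": int(page.get("number")),
                "left": int(t.get("left")),
                "top": int(t.get("top")),
                "width": int(t.get("width")),
                "height": int(t.get("height")),
                "font_family": font.get("family"),
                "font_size": font.get("size"),
                # itertext() also picks up text nested in <b>, <i>, etc.
                "text": "".join(t.itertext()),
            })
    return lines


if __name__ == "__main__":
    for line in extract_lines(SAMPLE):
        print(line)
```

Each record carries the line's bounding box, so anything that works per line (sorting into reading order, grouping into columns) can start from this list.]]>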
   </description>
   <pubDate>Tue, 13 Jun 2006 23:14:08 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/extract-text-2_topic440_post1951.html#1951</guid>
  </item> 
  <item>
   <title><![CDATA[Extract Text (2) : Hi There,  I was hoping someone...]]></title>
   <link>http://www.quickpdf.org/forum/extract-text-2_topic440_post1950.html#1950</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=276">tren</a><br /><strong>Subject:</strong> 440<br /><strong>Posted:</strong> 13 Jun 06 at 10:24PM<br /><br />Hi There,<br /><br />I was hoping someone could help me with a problem I'm having, or at least point me in the right direction. I'm trying to extract the co-ordinates of every word on a page of a PDF programmatically. I've tried to use GetPageText(4) in conjunction with GetPageText(1), because GetPageText(4) on its own gives extremely corrupted results. In some cases I've got it to work, but in others, words are merged into one another or letters appear in the wrong order, making it hard to compare the two results.<br /><br />Does anyone know of another package that could let me do this? The price of the product doesn't matter too much, as long as it's reliable.<br /><br />Thank you]]>
   </description>
   <pubDate>Tue, 13 Jun 2006 22:24:43 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/extract-text-2_topic440_post1950.html#1950</guid>
  </item> 
 </channel>
</rss>