<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="RSS_xslt_style.asp" version="1.0" ?>
<rss version="2.0" xmlns:WebWizForums="http://syndication.webwiz.co.uk/rss_namespace/">
 <channel>
  <title>Debenu Quick PDF Library - PDF SDK Community Forum : Extract non-formatted Tabular Text</title>
  <link>http://www.quickpdf.org/forum/</link>
  <description><![CDATA[This is an XML content feed of; Debenu Quick PDF Library - PDF SDK Community Forum : I need help - I can help : Extract non-formatted Tabular Text]]></description>
  <copyright>Copyright (c) 2006-2013 Web Wiz Forums - All Rights Reserved.</copyright>
  <pubDate>Sun, 05 Apr 2026 05:07:11 +0000</pubDate>
  <lastBuildDate>Wed, 04 Feb 2015 10:17:24 +0000</lastBuildDate>
  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  <generator>Web Wiz Forums 11.01</generator>
  <ttl>360</ttl>
  <WebWizForums:feedURL>www.quickpdf.org/forum/RSS_post_feed.asp?TID=3057</WebWizForums:feedURL>
  <image>
   <title><![CDATA[Debenu Quick PDF Library - PDF SDK Community Forum]]></title>
   <url>http://www.quickpdf.org/forum/forum_images/QPDF_Forum_Title.png</url>
   <link>http://www.quickpdf.org/forum/</link>
  </image>
  <item>
   <title><![CDATA[Extract non-formatted Tabular Text : Sorry Andrew I was too quick with...]]></title>
   <link>http://www.quickpdf.org/forum/extract-nonformatted-tabular-text_topic3057_post12289.html#12289</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=2357">chrisreed</a><br /><strong>Subject:</strong> 3057<br /><strong>Posted:</strong> 04 Feb 15 at 10:17AM<br /><br />Sorry Andrew I was too quick with my reply.<DIV>&nbsp;</DIV><DIV>Yes if I use Option 7 it matches very well what is on the PDF file - thanks for your help.</DIV><DIV>&nbsp;</DIV><DIV>Chris</DIV>]]>
   </description>
   <pubDate>Wed, 04 Feb 2015 10:17:24 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/extract-nonformatted-tabular-text_topic3057_post12289.html#12289</guid>
  </item> 
  <item>
   <title><![CDATA[Extract non-formatted Tabular Text : Hi Andrew, Sorry for the lateness...]]></title>
   <link>http://www.quickpdf.org/forum/extract-nonformatted-tabular-text_topic3057_post12288.html#12288</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=2357">chrisreed</a><br /><strong>Subject:</strong> 3057<br /><strong>Posted:</strong> 04 Feb 15 at 7:10AM<br /><br />Hi Andrew,<DIV>Sorry for the lateness in&nbsp;my reply, but I never received an e-mail that you had posted a reply <img src="http://www.quickpdf.org/forum/smileys/smiley6.gif" height="17" width="17" border="0" alt="Unhappy" title="Unhappy" /></DIV><DIV>&nbsp;</DIV><DIV>Believe me I tried all the Extraction Options (from 1 to 11) and none of them were any good.&nbsp; So instead of having the fields/values go across the page I just had them going down the page as follows:</DIV><DIV>&nbsp;</DIV><DIV>&lt;Field Name&gt; &lt;Field Value&gt;</DIV><DIV>Surname:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Tester</DIV><DIV>Firstname:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Kenneth</DIV><DIV>DOB:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 29 Mar 1928</DIV><DIV>Exam Date:&nbsp;&nbsp;&nbsp;&nbsp; 30 Jan 2015 07:46</DIV><DIV>Site ID:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; RPH&nbsp;&nbsp;&nbsp; etc....</DIV><DIV>&nbsp;</DIV><DIV>and used the Extraction Option (5) - Sort text blocks based on top left position.</DIV><DIV>&nbsp;</DIV><DIV>This worked a lot better, in that this option returned most of the &lt;Field Names&gt; first and then the &lt;Field Values&gt; next, but some still got mixed up so that I couldn't associate all the correct &lt;Field Name&gt; with the matching &lt;Field Value&gt;.</DIV>]]>
   </description>
   <pubDate>Wed, 04 Feb 2015 07:10:50 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/extract-nonformatted-tabular-text_topic3057_post12288.html#12288</guid>
  </item> 
  <item>
   <title><![CDATA[Extract non-formatted Tabular Text : Chris, PDF files do not have...]]></title>
   <link>http://www.quickpdf.org/forum/extract-nonformatted-tabular-text_topic3057_post12280.html#12280</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1483">AndrewC</a><br /><strong>Subject:</strong> 3057<br /><strong>Posted:</strong> 27 Jan 15 at 10:18AM<br /><br />Chris,<div><br></div><div>PDF files do not have TAB characters, words, sentences or paragraphs. &nbsp;Text is drawn at a specific x and y location. &nbsp;Extraction attempts to collect all the drawn text but is not always perfect.</div><div><br></div><div>GetPageText or DAExtractPageText using option 7 will be your best chance.</div><div><br></div><div>Andrew.</div>]]>
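    <![CDATA[<div><br></div><div>To illustrate the point about text being drawn at x and y positions, here is a minimal sketch in plain Python. It does not use the Quick PDF Library API. It assumes you already have text fragments as hypothetical (x, y, text) tuples, with y increasing upward as in default PDF user space. The function names, the y_tolerance parameter and the sample coordinates are all made up for illustration. It groups fragments that share roughly the same y into rows, orders each row by x, and then pairs each &lt;Field Name&gt;: with the value on the same row.</div>
<pre>
```python
def group_into_rows(fragments, y_tolerance=2.0):
    """Group (x, y, text) fragments into rows, top to bottom, left to right.

    Fragments whose y is within y_tolerance of the first fragment of the
    current row are treated as being on the same line.
    """
    rows = []  # each entry: (row_y, [(x, text), ...])
    # y grows upward in PDF user space, so a larger y is higher on the page.
    for x, y, text in sorted(fragments, key=lambda f: -f[1]):
        if rows and abs(rows[-1][0] - y) <= y_tolerance:
            rows[-1][1].append((x, text))
        else:
            rows.append((y, [(x, text)]))
    # Within each row, order the fragments left to right by x.
    return [[text for _, text in sorted(items)] for _, items in rows]


def pair_fields(rows):
    """Pair a leading 'Name:' fragment with the remaining text on its row."""
    fields = {}
    for row in rows:
        if len(row) >= 2 and row[0].endswith(":"):
            fields[row[0].rstrip(":")] = " ".join(row[1:])
    return fields


# Hypothetical fragments, deliberately listed out of reading order,
# the way a PDF content stream may draw them.
fragments = [
    (150.0, 700.0, "Tester"),
    (50.0, 680.5, "Firstname:"),
    (50.0, 700.0, "Surname:"),
    (150.0, 681.0, "Kenneth"),
    (150.0, 660.0, "29 Mar 1928"),
    (50.0, 660.0, "DOB:"),
]

rows = group_into_rows(fragments)
fields = pair_fields(rows)
for name, value in fields.items():
    print(f"{name}: {value}")
```
</pre>
<div>The tolerance matters because a name and its value on the same visual line are often drawn at slightly different y values. A tolerance that is too large will merge neighbouring lines, so it needs tuning per document.</div>]]>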
   </description>
   <pubDate>Tue, 27 Jan 2015 10:18:19 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/extract-nonformatted-tabular-text_topic3057_post12280.html#12280</guid>
  </item> 
  <item>
   <title><![CDATA[Extract non-formatted Tabular Text : Can't find any site to upload...]]></title>
   <link>http://www.quickpdf.org/forum/extract-nonformatted-tabular-text_topic3057_post12266.html#12266</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=2357">chrisreed</a><br /><strong>Subject:</strong> 3057<br /><strong>Posted:</strong> 21 Jan 15 at 10:16AM<br /><br />I can't find any site to upload the example PDF that I'm trying to process without our firewall blocking it (tried docdroid, scribd, dropbox), so the best I can do is upload an image.<DIV>&nbsp;</DIV><DIV><a href="http://s5.postimg.org/5hncgugsn/Example_PDF.jpg" target="_blank" rel="nofollow">http://s5.postimg.org/5hncgugsn/Example_PDF.jpg</a></DIV><DIV>&nbsp;</DIV><DIV>The text "looks" like it is separated by TABs, but there is no formatting.&nbsp; When I try to use the DAExtractPageText and DAExtractBlockText functions,&nbsp;instead of the &lt;Field Name&gt;: &lt;Field Value&gt; aligning with each other, they are all over the place.</DIV><DIV>&nbsp;</DIV><DIV>I also tried all the different options in DASetTextExtractionOptions to no avail.</DIV><DIV>&nbsp;</DIV><DIV>How can I extract this unformatted&nbsp;text so the &lt;Field Name&gt;: &lt;Field Value&gt; align with each other?</DIV><DIV>e.g. &nbsp;Surname: TEST etc.</DIV><DIV><BR>Thanks, Chris.</DIV>]]>
   </description>
   <pubDate>Wed, 21 Jan 2015 10:16:41 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/extract-nonformatted-tabular-text_topic3057_post12266.html#12266</guid>
  </item> 
 </channel>
</rss>