<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="RSS_xslt_style.asp" version="1.0" ?>
<rss version="2.0" xmlns:WebWizForums="http://syndication.webwiz.co.uk/rss_namespace/">
 <channel>
  <title>Debenu Quick PDF Library - PDF SDK Community Forum : DASetTextExtractionArea with different origin</title>
  <link>http://www.quickpdf.org/forum/</link>
  <description><![CDATA[This is an XML content feed of: Debenu Quick PDF Library - PDF SDK Community Forum : I need help - I can help : DASetTextExtractionArea with different origin]]></description>
  <copyright>Copyright (c) 2006-2013 Web Wiz Forums - All Rights Reserved.</copyright>
  <pubDate>Sun, 12 Apr 2026 04:36:16 +0000</pubDate>
  <lastBuildDate>Wed, 19 Mar 2014 11:03:48 +0000</lastBuildDate>
  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  <generator>Web Wiz Forums 11.01</generator>
  <ttl>360</ttl>
  <WebWizForums:feedURL>www.quickpdf.org/forum/RSS_post_feed.asp?TID=2850</WebWizForums:feedURL>
  <image>
   <title><![CDATA[Debenu Quick PDF Library - PDF SDK Community Forum]]></title>
   <url>http://www.quickpdf.org/forum/forum_images/QPDF_Forum_Title.png</url>
   <link>http://www.quickpdf.org/forum/</link>
  </image>
  <item>
   <title><![CDATA[DASetTextExtractionArea with different origin : AndrewC wrote: You need to adjust...]]></title>
   <link>http://www.quickpdf.org/forum/dasettextextractionarea-with-different-origin_topic2850_post11572.html#11572</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=2542">Cirunz</a><br /><strong>Subject:</strong> 2850<br /><strong>Posted:</strong> 19 Mar 14 at 11:03AM<br /><br /><table width="99%"><tr><td class="BBquote"><img src="forum_images/quote_box.png" title="Originally posted by AndrewC" alt="Originally posted by AndrewC" style="vertical-align: text-bottom;" /> <strong>AndrewC wrote:</strong><br /><br /><div><span style="line-height: 1.4;">You need to adjust the Y position by calling&nbsp;</span></div><div><br></div><div>YPos := QP.DAGetPageHeight(dahandle, dapageref) - YPos;</div><div><br></div><div>Andrew.</div></td></tr></table><div><br></div><div>Thank you Andrew, this is really helpful.</div><div>I have a mixed scenario, so I will use this function to adjust the coordinates, depending on the case.</div><div><br></div><div>Thanks again.</div><div>Fabio.</div>]]>
   </description>
   <pubDate>Wed, 19 Mar 2014 11:03:48 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/dasettextextractionarea-with-different-origin_topic2850_post11572.html#11572</guid>
  </item> 
  <item>
   <title><![CDATA[DASetTextExtractionArea with different origin : Cirunz, Yes. It is a complex...]]></title>
   <link>http://www.quickpdf.org/forum/dasettextextractionarea-with-different-origin_topic2850_post11570.html#11570</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=1483">AndrewC</a><br /><strong>Subject:</strong> 2850<br /><strong>Posted:</strong> 19 Mar 14 at 10:58AM<br /><br /><div>Cirunz,</div><div>Yes. &nbsp;It is a complex thing to explain. &nbsp;The DA functions do not support SetOrigin because SetOrigin is not a DA function. &nbsp;You cannot normally mix DA and non-DA functions because they use different code to process the file. The exception to this rule is that most of the Extract* functions use the DA code rather than the non-DA functions.</div><div><br></div><div>You need to adjust the Y position by calling&nbsp;</div><div><br></div><div>YPos := QP.DAGetPageHeight(dahandle, dapageref) - YPos;</div><div><br></div><div>Andrew.</div>]]>
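    <![CDATA[<div><br></div><div>A minimal sketch of that adjustment in C#, for an extraction area that was defined in millimetres from the top-left corner. It rests on assumptions not confirmed in this thread: that the DA functions work in points, that DAGetPageHeight returns the page height in points, and that Top still means the top edge of the area once it is measured from the bottom of the page. ExtractionAreaConverter and ToBottomLeftPoints are hypothetical names, not part of the library.</div>

```csharp
using System;

// Hypothetical helper, not part of Quick PDF Library.
public static class ExtractionAreaConverter
{
    // 1 inch = 72 PDF points = 25.4 mm.
    public const double PointsPerMillimetre = 72.0 / 25.4;

    // Converts an area given in millimetres from the top-left corner
    // into points measured from the bottom-left corner.
    // Assumption: pageHeightPoints is the page height in points
    // (for example the value returned by DAGetPageHeight).
    public static (double Left, double Top, double Width, double Height) ToBottomLeftPoints(
        double leftMm, double topMm, double widthMm, double heightMm, double pageHeightPoints)
    {
        double left = leftMm * PointsPerMillimetre;
        // The Y flip from the post above: YPos := PageHeight - YPos.
        double top = pageHeightPoints - topMm * PointsPerMillimetre;
        // Width and height only change units, not direction.
        double width = widthMm * PointsPerMillimetre;
        double height = heightMm * PointsPerMillimetre;
        return (left, top, width, height);
    }
}
```

<div>For a page 792 points high, an area whose top edge sits 25.4 mm below the top of the page ends up with a Top of roughly 720 points. The four returned values would then be passed to DASetTextExtractionArea in place of the top-left millimetre values.</div>]]>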
   </description>
   <pubDate>Wed, 19 Mar 2014 10:58:43 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/dasettextextractionarea-with-different-origin_topic2850_post11570.html#11570</guid>
  </item> 
  <item>
   <title><![CDATA[DASetTextExtractionArea with different origin : Hi, I'm trying to extract...]]></title>
   <link>http://www.quickpdf.org/forum/dasettextextractionarea-with-different-origin_topic2850_post11568.html#11568</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.quickpdf.org/forum/member_profile.asp?PF=2542">Cirunz</a><br /><strong>Subject:</strong> 2850<br /><strong>Posted:</strong> 18 Mar 14 at 2:32PM<br /><br />Hi, I'm trying to extract text from a specific area in a large number of PDF files.<div>My first approach is to loop over the files, open each one, select the page and extract the text with GetPageText:</div><div><br></div><div>//Code to initialize dll reference DPDF</div><div><table width="99%"><tr><td><pre class="BBcode">
int i = 0;
int mode = 7;
List&lt;string&gt; foundlines = new List&lt;string&gt;();
for (; i &lt; pdffiles.Length; i++)
{
    if (DPDF.LoadFromFile(pdffiles[i], "") != 0)
    {
        if (DPDF.SelectPage(1) != 0)//I always search the first page
        {
            DPDF.SetMeasurementUnits(1);//Millimeters
            DPDF.SetOrigin(1);//Left-Top margin

            //field contains extraction area data
            if (DPDF.SetTextExtractionArea(field.Left, field.Top, field.Width, field.Height) == 1)
            {
                foundlines.Add(DPDF.GetPageText(mode).ToString().Trim());
            }

            DPDF.RemoveDocument(DPDF.SelectedDocument());
        }
        else
        {
            errormessage = "SelectPage: " + pdffiles[i];
            break;
        }
    }
    else
    {
        errormessage = "LoadFromFile: " + pdffiles[i];
        break;
    }
}//Extraction loop ends here

if (string.IsNullOrEmpty(errormessage))
{
    if (foundlines != null &amp;&amp; foundlines.Count &gt; 0)
    {
        File.WriteAllLines(@"C:\resultlines.txt", foundlines.ToArray());
        result = true;
    }
}
</pre></td></tr></table></div><div><br></div><div>It works, but it is not very fast and it uses a lot of memory.</div><div>Worried by these results, I chose to try ExtractFilePageText instead, to keep CPU and memory usage low.</div><div>So I changed the loop above as follows:</div><div><table width="99%"><tr><td><pre class="BBcode">
int i = 0;
int mode = 7;
List&lt;string&gt; foundlines = new List&lt;string&gt;();
DPDF.SetMeasurementUnits(1);//Millimeters
DPDF.SetOrigin(1);//Left-Top margin
for (; i &lt; pdffiles.Length; i++)
{
    //field contains extraction area data
    if (DPDF.DASetTextExtractionArea(field.Left, field.Top, field.Width, field.Height) == 1)
    {
        foundlines.Add(DPDF.ExtractFilePageText(pdffiles[i], "", 1, mode).ToString().Trim());
    }
}//Extraction loop ends here

if (foundlines != null &amp;&amp; foundlines.Count &gt; 0)
{
    File.WriteAllLines(@"C:\resultlines.txt", foundlines.ToArray());
    result = true;
}
</pre></td></tr></table></div><div><br></div><div>This does not find anything.</div><div>There is a simple explanation for this: the documentation says&nbsp;<a href="http://www.debenu.com/docs/pdf_library_reference/DASetTextExtracti&#111;nArea.php" target="_blank" rel="nofollow">DASetTextExtractionArea</a>&nbsp;is relative to the bottom-left corner of the page, and does not mention a way to make SetOrigin (or SetMeasurementUnits) affect this function.</div><div><br></div><div>Is there no way to do so? Can ExtractFilePageText only be used with the default origin?</div><div><br></div><div>Thank you.</div><div><br></div>]]>
   </description>
   <pubDate>Tue, 18 Mar 2014 14:32:53 +0000</pubDate>
   <guid isPermaLink="true">http://www.quickpdf.org/forum/dasettextextractionarea-with-different-origin_topic2850_post11568.html#11568</guid>
  </item> 
 </channel>
</rss>