full line text extraction

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=1651
Printed Date: 08 Nov 25 at 4:26PM


Topic: full line text extraction
Posted By: alinux
Date Posted: 25 Nov 10 at 10:39AM
Hi,

I'm using the text extraction functions (ActiveX v7.22 b2) to extract words with their coordinates.
I need to extract full lines of text with line coordinates. Is there a way to do this?

Alin



Replies:
Posted By: Ingo
Date Posted: 25 Nov 10 at 3:34PM
Hi Alin!

QuickPDF doesn't have this option built in, but it gives you what you need to build it.
You can extract complete pages, strings as they were inserted, and single words.
Use the word option and concatenate the words into complete lines based on each word's position data.
If you get this algorithm working, please post it here in the samples section, because I need it too ;-)

Cheers and thanks in advance,
Ingo
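
A minimal sketch of the concatenation idea, in Python rather than through the ActiveX interface. It assumes the word data has already been parsed into (text, x1, y1, x2, y2) tuples; that tuple layout, the sample values, and the function name words_to_lines are illustrative assumptions, not the library's output format. It also assumes the page origin is at the bottom-left, so a larger Y means higher on the page.

```python
from collections import defaultdict

def words_to_lines(words):
    """Group (text, x1, y1, x2, y2) word boxes into lines keyed by y1."""
    rows = defaultdict(list)
    for word in words:
        # Words sharing the same (rounded) y1 are treated as one line.
        rows[round(word[2], 1)].append(word)

    lines = []
    # Largest y first: top of the page, assuming a bottom-left origin.
    for y in sorted(rows, reverse=True):
        row = sorted(rows[y], key=lambda w: w[1])  # left to right
        text = " ".join(w[0] for w in row)
        # Line coordinates are the union of the word boxes.
        box = (min(w[1] for w in row), min(w[2] for w in row),
               max(w[3] for w in row), max(w[4] for w in row))
        lines.append((text, box))
    return lines

# Hypothetical word boxes, deliberately out of reading order.
words = [
    ("world", 60.0, 700.0, 95.0, 712.0),
    ("Hello", 20.0, 700.0, 55.0, 712.0),
    ("Second", 20.0, 680.0, 70.0, 692.0),
]
lines = words_to_lines(words)
for text, box in lines:
    print(text, box)
```

Each result pairs the assembled line text with a bounding box covering all of its words, which gives the line coordinates asked for in the first post.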



Posted By: alinux
Date Posted: 26 Nov 10 at 8:11AM
Hi Ingo,

Thanks for your answer. I'll post the algorithm soon; I'm having some problems with tables (the OCR engine seems to "read" tables by line or by column according to some seemingly random criteria).

Alin


Posted By: alinux
Date Posted: 26 Nov 10 at 8:04PM
Hi Ingo,

I've posted a basic sample (see the code samples section) that assembles lines of text from the Y1 or Y2 coordinate of each word (using the GetPageText(4) function). For a more accurate result, I think it will need a tolerance variable to handle words on the same line that have slightly different Y values.

Cheers,
Alin
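
One possible way to add the tolerance mentioned above, sketched in Python. This is not the sample posted to the code samples section; the (text, x1, y1, x2, y2) tuple layout, the function name group_by_baseline, and the default tolerance of 2.0 units are all assumptions chosen for illustration. A word joins the current line when its y1 is within the tolerance of the first (highest) word of that line.

```python
def group_by_baseline(words, tolerance=2.0):
    """Cluster (text, x1, y1, x2, y2) words whose y1 values are close together."""
    lines = []
    # Walk the words from the largest y1 to the smallest.
    for word in sorted(words, key=lambda w: -w[2]):
        # Compare against the first word of the current line so that
        # small differences cannot chain into one ever-growing line.
        if lines and abs(lines[-1][0][2] - word[2]) <= tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    # Order each line left to right and join its words.
    return [" ".join(w[0] for w in sorted(line, key=lambda w: w[1]))
            for line in lines]

# Hypothetical words: the first three share a line but differ slightly in y1.
sample = [
    ("total", 120.0, 699.2, 150.0, 711.2),
    ("Invoice", 20.0, 700.6, 70.0, 712.6),
    ("net", 80.0, 700.0, 110.0, 712.0),
    ("Page", 20.0, 680.0, 50.0, 692.0),
]
grouped = group_by_baseline(sample)
print(grouped)
```

The right tolerance depends on font size and line spacing, so a fraction of the word height may work better than a fixed value for documents with mixed text sizes.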



