How to get x,y lines coordinates around a text
eddypi | Beginner | Joined: 02 Jul 20 | Posted: 15 Sep 20 at 5:10AM
I am looking for code to get the (x,y) coordinates of the line(s) that surround a text. E.g. in the picture below, if we know the coordinates of the text “WEST LAKES LIBRARY”, which function from Quick PDF Library can provide the (x,y) coordinates of the Top, Bottom, Left & Right lines?

                             Top Line
                   ___________________________
                  |                           |
                  |    WEST LAKES LIBRARY     | <--Right Line
    Left Line --> |                           |
                  |                           |
                   ___________________________
                            Bottom Line

Edited by eddypi - 15 Sep 20 at 5:12AM
Ingo | Moderator Group | Joined: 29 Oct 05
Good morning Eddy,
the extraction functions should meet your needs: https://www.debenu.com/docs/pdf_library_reference/Extraction.php
The ExtractOptions 2, 3, 4 and 5 return a CSV string, and this string includes the relevant rectangle coordinates and font data.
Cheers and welcome here,
Ingo
eddypi | Beginner | Joined: 02 Jul 20
Hi Ingo,
Thank you for the prompt reply. I have no problem finding text box coordinates with GetTextBlockBound or a similar function from the Debenu library. The problem is how, knowing the text box coordinates, to find the x,y coordinates of the line(s) on each side of the text.
E.g. referring to the original picture and using
QP.SetOrigin (7) 'Top left of media box
QP.SetMeasurementUnits (1) 'Set the measurement unit in millimetres.
Now we want to find out the coordinates of the line drawn above this text (marked with "Top Line"). Visually we know that "Top Line" will be at x1=500-x & y1=600-y, but I am looking for a function which will tell us that "Top Line" starts at x1=450 & y1=650 and ends at x2=550 & y2=650.
Edited by eddypi - 15 Sep 20 at 10:08AM
tfrost | Senior Member | Joined: 06 Sep 10 | Location: UK
I assume that you are working with a PDF that you did not draw yourself, otherwise you would know where the "lines" are. And it follows that they might be drawn in dozens of different ways, such as on an image, together as a 'box', as four individual lines, or somewhat as you have done, with underlines and single vertical bar characters.
The simplest way of cutting through all this is to render the page to a bitmap, start at the known position of your text, and walk outwards through the scan lines in four directions until you find a change of pixel colour. Yes, it is tedious to do this, but it is simple and reliable, unless your text overlays an image, of course.
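As a rough illustration of the walk-outwards idea, here is a minimal Python sketch. It assumes the rendered page has already been loaded into a plain grid of RGB tuples (how that is done depends on your language and imaging library) and that the text box is already known in pixel coordinates. The names `find_surrounding_lines`, `Bitmap` and `box` are hypothetical and not part of Quick PDF Library. For brevity it only scans along the horizontal and vertical centre lines of the text box, so a real version would probably check several scan lines.

```python
from typing import Callable, Dict, List, Optional, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Bitmap = List[List[Pixel]]     # bitmap[row][col], row 0 is the top scan line


def find_surrounding_lines(
    bitmap: Bitmap,
    box: Tuple[int, int, int, int],       # text box in pixels: (left, top, right, bottom)
    background: Pixel = (255, 255, 255),  # assumed page colour
) -> Dict[str, Optional[int]]:
    """Walk outwards from the text box until a non-background pixel is found.

    Returns the pixel row (top/bottom) or column (left/right) of the first
    change of colour in each direction, or None if the bitmap edge is
    reached first.
    """
    left, top, right, bottom = box
    height, width = len(bitmap), len(bitmap[0])
    mid_x = (left + right) // 2   # column used for the vertical walks
    mid_y = (top + bottom) // 2   # row used for the horizontal walks

    def scan(start: int, stop: int, step: int,
             pixel_at: Callable[[int], Pixel]) -> Optional[int]:
        for pos in range(start, stop, step):
            if pixel_at(pos) != background:
                return pos
        return None

    return {
        "top": scan(top - 1, -1, -1, lambda y: bitmap[y][mid_x]),
        "bottom": scan(bottom + 1, height, 1, lambda y: bitmap[y][mid_x]),
        "left": scan(left - 1, -1, -1, lambda x: bitmap[mid_y][x]),
        "right": scan(right + 1, width, 1, lambda x: bitmap[mid_y][x]),
    }
```

The result is in bitmap pixels, so it still has to be converted back to page coordinates using the scale and origin that were used for rendering.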
Ingo | Moderator Group | Joined: 29 Oct 05
There isn't a solution inside QuickPDF for this.
You have to create your own algorithm for this ... perhaps with the help of QuickPDF ;-)
Please keep in mind that this is a user forum (from user to user). There is no official Debenu/Foxit support here. For technical questions like this you should use the official contact page.
Cheers,
Ingo
eddypi | Beginner | Joined: 02 Jul 20
tfrost is right: I am working with PDFs created by various draftsmen in AutoCAD. I checked with FoxitPhantom, and the PDF has each line as a separate object with x, y, width and height parameters.
Here are sample files: https://www.dropbox.com/sh/7ywhw6fysswbaa7/AADQLIRWMeUYRCRhByQgMGtza?dl=0
tfrost, are you able to point me to a function in QuickPDF, or to sample code, that I should look at to implement the “…bitmap walk through …. Pixel color” approach?
Also, I do realise that this is not a straightforward problem, but I trust that, if solved, it can be of great benefit to the wider QuickPDF community.
Let me explain the full problem in detail, based on the sample PDF files at the above link.
a) 1003.pdf and 1004.pdf are original files provided by a draftsman (normally, a full set of PDF files may have 20-100 files similar to 1003 & 1004).
b) All files have the same title block layout, which in this example is located at the bottom right corner of each page (other draftsmen may use a different layout and position for the title block).
Task: Read all supplied PDFs and extract e.g. “DRAWING TITLE”, “DRAWING NUMBER”, REVISION, etc.
Note: in our files the “DRAWING TITLE” header is not even provided by the draftsman, but it can be seen under the text “CHARLES STREET”.
Proposed solution:
1) Use QuickPDF to number all text positions inside one of the PDF files (i.e. the “1004.pdf text blocks.pdf” file), then open it in Adobe Reader for the user to review.
2) The user enters 224, 227 & 228, which correspond to “DRAWING TITLE”, “DRAWING NUMBER” & REVISION, back into the software.
Note: 224, 227 & 228 at the same time represent the (x1,y1) coordinates of the bottom left corner of the text, e.g. 224=(788,572).
3) Based on that info, the software runs through the rest of the files and extracts the values for “DRAWING TITLE”, “DRAWING NUMBER”, REVISION, etc.
The code for the 3 steps works if all text is left-aligned, but if it is centred then the software does not pick up the text at the expected position, because the text moves left and right depending on the number of characters. Implementing something like "look between y1-5mm and y1+5mm" would not work when the text is too long and close to the text on its left: y1-5mm may then overlap with the text to the left of 224 and give an incorrect result.
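One way to make step 3 independent of text alignment, assuming the rectangle of the title block cell has been found from its surrounding lines, is to select text by the cell it falls in instead of by its start position. The Python sketch below is only an illustration of that idea: `TextBlock`, `text_in_cell` and the coordinate layout (origin at the top left, y increasing downwards) are assumptions, and the block rectangles would have to come from whatever text extraction you already use.

```python
from typing import List, NamedTuple, Tuple


class TextBlock(NamedTuple):
    """Hypothetical record for one extracted piece of text and its bounds."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


def text_in_cell(blocks: List[TextBlock],
                 cell: Tuple[float, float, float, float]) -> str:
    """Join the text of every block whose centre point lies inside the cell.

    cell is (left, top, right, bottom) in the same units as the blocks.
    Using the centre point means left-aligned and centred text are treated
    the same way, and text in the neighbouring cell is ignored.
    """
    c_left, c_top, c_right, c_bottom = cell
    hits = []
    for block in blocks:
        centre_x = (block.left + block.right) / 2
        centre_y = (block.top + block.bottom) / 2
        if c_left <= centre_x <= c_right and c_top <= centre_y <= c_bottom:
            hits.append(block)
    # Reading order: top to bottom, then left to right.
    hits.sort(key=lambda b: (b.top, b.left))
    return " ".join(b.text for b in hits)
```

With this approach the user would pick a cell once on the reference file, and the same cell rectangle would then be applied to the remaining files, which only works as long as they share the same title block layout.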
tfrost | Senior Member | Joined: 06 Sep 10 | Location: UK
The QPDF functions are DARenderPageToFile or RenderPageToFile, or another function in the Rendering section of the reference guide. For example, I use Delphi, so I would choose RenderPageToStream and open the bitmap from the stream, to avoid using a file.
Once you have the bitmap you are operating outside QPDF, and you need whatever your language provides for working with bitmaps. Remember that the rendering scale and origin you use in QPDF will require conversion between the PDF co-ordinates of your found text and the BMP co-ordinates.
I agree it is not at all straightforward, but not that it is of much general interest. If it were my problem, I think I would tell the originator either to encode the title, number and revision in the filename, or to hide them in a fixed position in the page margin, if necessary in "white ink"!
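The conversion between page co-ordinates and bitmap co-ordinates mentioned above can be sketched in a few lines of Python. This is a simplified illustration, not Quick PDF code: it assumes the whole page was rendered at a known, uniform DPI with no rotation or cropping, that page positions are in millimetres, and that both the page origin and the bitmap origin are at the top left with y increasing downwards. The function names are hypothetical.

```python
import math
from typing import Tuple

MM_PER_INCH = 25.4


def mm_to_pixel(x_mm: float, y_mm: float, dpi: float) -> Tuple[int, int]:
    """Map a page position in mm to the bitmap pixel (col, row) containing it."""
    return (math.floor(x_mm * dpi / MM_PER_INCH),
            math.floor(y_mm * dpi / MM_PER_INCH))


def pixel_to_mm(col: int, row: int, dpi: float) -> Tuple[float, float]:
    """Map a bitmap pixel (col, row) to the page position of its centre in mm."""
    return ((col + 0.5) * MM_PER_INCH / dpi,
            (row + 0.5) * MM_PER_INCH / dpi)
```

A line found at a given pixel row or column can then be reported in millimetres with `pixel_to_mm`. The result is only accurate to about one pixel, so a higher rendering DPI gives finer positions at the cost of a larger bitmap.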