Debenu Quick PDF Library - PDF SDK Community Forum Homepage

GetPageText

Ingo (Moderator Group), posted 23 Feb 20 at 2:05PM
Hi,

I think reading the reference should make it clear.
You pass in a color component, for example 1 (for the red value), and you get back a float value between 0.0 and 1.0.
A value of 1.0 is identical to the maximum value 255 that you know from RGB color values, so white text in RGB should give 1.0 for each component:
https://www.debenu.com/docs/pdf_library_reference/GetTextBlockColor.php
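
To illustrate the scaling described above, here is a minimal Python sketch that converts component values in the 0.0 to 1.0 range into the familiar 0 to 255 range and into a hex string such as #FFFFFF. It assumes an RGB color with three components. The helper names are invented for this example and are not part of Quick PDF Library; the three input values would come from separate GetTextBlockColor calls, which are not shown.

```python
def component_to_byte(value):
    """Scale a color component from 0.0..1.0 to an integer 0..255."""
    # Clamp first so slightly out-of-range floats cannot overflow a byte.
    clamped = min(max(value, 0.0), 1.0)
    return round(clamped * 255)


def rgb_floats_to_hex(red, green, blue):
    """Build a #RRGGBB string from three 0.0..1.0 components (RGB assumed)."""
    return "#{:02X}{:02X}{:02X}".format(
        component_to_byte(red),
        component_to_byte(green),
        component_to_byte(blue),
    )


# White text: every component is at its maximum of 1.0.
white = rgb_floats_to_hex(1.0, 1.0, 1.0)
# Black text: every component is 0.0.
black = rgb_floats_to_hex(0.0, 0.0, 0.0)
```

With these helpers, white comes out as #FFFFFF, which matches the color column in the CSV output quoted further down the thread.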

On your text extraction question:
There are several well-documented options that control the type of extraction and the form of the results.
I think option 8 should be the best fit for your question:
https://www.debenu.com/docs/pdf_library_reference/GetPageText.php

Cheers,
Ingo

eranhadad (Beginner), posted 23 Feb 20 at 6:51AM
In addition, what is the expected result from the GetTextBlockColor function when the text color is white?

[I don't understand what the ColorComponent parameter of this function is for, or what the result means.]

eranhadad (Beginner), posted 22 Feb 20 at 2:11PM
Thanks :)
You wrote that there is a way to extract the naked text from a PDF file.
How?

Ingo (Moderator Group), posted 12 Feb 20 at 6:11PM
Hi Eran,

if you need additional functionality, your search should start here:
https://www.debenu.com/docs/pdf_library_reference/Search.php
The results come from the knowledge base, the online reference, or the forum threads.

The other point:
It's not possible to remove font style attributes that are already in the PDF.
What you can try is to extract the naked text and compose a new PDF page using the coordinates from the extraction.

One important hint from me:
You've written DAGetTextBlockText and GetTextBlockFontSize.
Using those functions together as described won't work properly.
NEVER mix DA functions with non-DA functions. They use different techniques behind the scenes, and mixing them will trip you up ;-)

Cheers and welcome here,
Ingo

eranhadad (Beginner), posted 12 Feb 20 at 3:09PM
I found a solution that helps me identify the "placeholders" above by using
DAGetTextBlockText
GetTextBlockFontSize
GetTextBlockFontName

Is there any way to get the text color using the text block functions?
If needed, I can upload my code.


eranhadad (Beginner), posted 12 Feb 20 at 7:11AM
Hey :)

After calling GetPageText with option 3 (to get CSV), I get this result:

"WYOIAS+David,Bold",#FFFFFF,11,349.5,38.6744,354.78,38.6744,354.78,49.4984,349.5,49.4984,"{"
"WYOIAS+David,Bold",#FFFFFF,11,479.25,38.6744,484.53,38.6744,484.53,49.4984,479.25,49.4984,"}"
"WYOIAS+David,Bold",#FFFFFF,11,71.25,38.6744,77.85,38.6744,77.85,49.4984,71.25,49.4984,"<"
"WYOIAS+David,Bold",#FFFFFF,11,182.25,38.6744,188.85,38.6744,188.85,49.4984,182.25,49.4984,">"

Is there any way to remove the Bold attribute from the PDF?
Is there any function to remove BOLD, ITALIC, or UNDERLINE from these specific characters?

I ask because I create an object from these properties by splitting each line on commas.

Thanks,
Eran Hadad
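
The comma inside the font name ("WYOIAS+David,Bold") sits within a quoted CSV field, so a CSV-aware parser keeps it in one piece where a plain split on commas would not. The Python sketch below parses the sample rows above without touching the PDF. The field layout (font, color, size, four corner points, text) is an assumption inferred from those rows, and the names TextPiece, parse_page_text_csv, and split_font_name are invented for this example; none of them belong to Quick PDF Library.

```python
import csv
import io
from typing import NamedTuple


class TextPiece(NamedTuple):
    font: str       # raw font name, e.g. "WYOIAS+David,Bold"
    color: str      # hex color, e.g. "#FFFFFF"
    size: float     # font size
    corners: tuple  # four (x, y) pairs, in the order they appear in the row
    text: str       # the extracted text


def parse_page_text_csv(csv_text):
    """Parse CSV rows shaped like the sample above (layout is assumed)."""
    pieces = []
    for row in csv.reader(io.StringIO(csv_text, newline="")):
        if len(row) != 12:
            continue  # skip blank or unexpected rows
        coords = [float(v) for v in row[3:11]]
        corners = tuple(zip(coords[0::2], coords[1::2]))
        pieces.append(TextPiece(row[0], row[1], float(row[2]), corners, row[11]))
    return pieces


def split_font_name(font):
    """Split a raw font name into (family, style), dropping a subset prefix."""
    name = font.split("+", 1)[-1]         # "WYOIAS+David,Bold" -> "David,Bold"
    family, _, style = name.partition(",")
    return family, style


sample = (
    '"WYOIAS+David,Bold",#FFFFFF,11,349.5,38.6744,354.78,38.6744,'
    '354.78,49.4984,349.5,49.4984,"{"\n'
    '"WYOIAS+David,Bold",#FFFFFF,11,71.25,38.6744,77.85,38.6744,'
    '77.85,49.4984,71.25,49.4984,"<"\n'
)
pieces = parse_page_text_csv(sample)
family, style = split_font_name(pieces[0].font)
```

Here the style suffix is separated only in the parsed data; the font attributes stored in the PDF itself are left unchanged.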