ExtractFilePageText - what is the color in CMYK
edvoigt, Senior Member (Joined: 26 Mar 11, Location: Berlin, Germany, Status: Offline, Points: 111)
Posted: 28 Sep 11 at 9:54AM
I would like to use ExtractFilePageText on an existing PDF to get all the text on a page that is written in one color, which the user of my program can choose. For example, I want to filter the CSV so that I get only the yellow text.
In the CSV string I find color codes like #231F20 for pure black or #EC008C for pure magenta. I know which colors were used (I wrote the PDF myself with QuickPDF using SetTextColorCMYK, so I'm sure about this). What I am missing is a way to get from the color codes QuickPDF puts in the CSV back to the real color definition in CMYK or RGB, depending on the operator used in the PDF.
edvoigt, Senior Member (Joined: 26 Mar 11, Location: Berlin, Germany, Status: Offline, Points: 111)
The color code looks like RGB in hex notation, as in HTML, CSS ...
And it is an RGB value. But how do I identify this #231F20 as the CMYK value (0 0 0 1)? It must go wrong, at the latest when there are several different gray colors.
AndrewC, Moderator Group (Joined: 08 Dec 10, Location: Geelong, Aust, Status: Offline, Points: 841)
The CMYK values are converted to RGB during text extraction and are currently not stored anywhere as CMYK values.
By the sounds of things, you need either an RGB to CMYK conversion routine or, if you know all the colours that you have used, your own RGB -> CMYK conversion table.
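The conversion-table idea could look roughly like the sketch below, written in Python rather than Delphi for brevity. The two entries are the codes quoted earlier in this thread; the quadruple (0, 1, 0, 0) for "pure magenta" is an assumption, and the names `KNOWN_COLORS` and `cmyk_for` are hypothetical, not part of Quick PDF Library. It only helps for colors you have registered yourself.

```python
from typing import Optional, Tuple

CMYK = Tuple[float, float, float, float]

# Hand-maintained table: RGB hex code found in the CSV -> CMYK used when writing.
KNOWN_COLORS = {
    "#231F20": (0.0, 0.0, 0.0, 1.0),  # pure black (K only), as reported in this thread
    "#EC008C": (0.0, 1.0, 0.0, 0.0),  # pure magenta (assumed quadruple)
}

def cmyk_for(rgb_hex: str) -> Optional[CMYK]:
    """Return the registered CMYK value, or None if the code is unknown."""
    return KNOWN_COLORS.get(rgb_hex.strip().upper())
```

An unknown code returns `None`, so the caller can tell "not in the table" apart from a real color.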
edvoigt, Senior Member (Joined: 26 Mar 11, Location: Berlin, Germany, Status: Offline, Points: 111)
Thanks,
but that is impossible: the PDFs normally come from customers. Only in the test case do I know the colors used. The best solution would be a separate function that does no color conversion, because the conversion is not 1:1. That means more than one CMYK quadruple gives the same RGB value. For example, you can compose a gray using only K or using a mixture of CMY. You get two different colors that look the same to the human eye and in RGB, but are not. Therefore there is no way to be sure. But thanks for your idea.
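The ambiguity is easy to demonstrate with the textbook CMYK to RGB formula. This is a Python sketch under an assumption: the naive formula R = 255 (1 - C)(1 - K) is not necessarily what Quick PDF Library uses (it evidently maps K = 1 to #231F20 rather than #000000), but any mapping from four channels to three loses information in the same way.

```python
def naive_cmyk_to_rgb_hex(c, m, y, k):
    """Textbook conversion without color management (an assumption, see above)."""
    channels = [round(255 * (1 - x) * (1 - k)) for x in (c, m, y)]
    return "#{:02X}{:02X}{:02X}".format(*channels)

gray_from_k = naive_cmyk_to_rgb_hex(0.0, 0.0, 0.0, 0.5)    # black ink only
gray_from_cmy = naive_cmyk_to_rgb_hex(0.5, 0.5, 0.5, 0.0)  # mixture of C, M and Y

# Two different ink recipes collapse to a single RGB code.
print(gray_from_k, gray_from_cmy, gray_from_k == gray_from_cmy)
```

Once both grays have become the same hex code, no routine working from the RGB value alone can tell which recipe the PDF actually used.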
edvoigt, Senior Member (Joined: 26 Mar 11, Location: Berlin, Germany, Status: Offline, Points: 111)
The problem is solved.
Version 8.12 beta 2 opens a new way to do this. Using

    QP.LoadFromFile(filename, '');
    QP.SetTextExtractionOptions(4, 1);
    CSV := QP.GetPageText(3);

where CSV is a string variable, we get lines like this:

    "MicrosoftSansSerif",FF000000,17.01,85.0394,47....4054,"CYAN"
    "MicrosoftSansSerif",00FF0000,17.01,198.4252,4....4054,"MAGENTA"
    "MicrosoftSansSerif",0000FF00,17.01,311.811,47....4054,"YELLOW"
    "MicrosoftSansSerif",000000FF,17.01,425.1969,4....4054,"KEY"

That is, each CMYK component is stored in one byte, where $FF means 1.0.

Andrew, thanks for this solution.
Werner
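A sketch of how the user-selected color filter could work on this output, in Python rather than Delphi. It assumes the layout seen in the sample lines: the color is the second comma-separated field, eight hex digits in C, M, Y, K order, and the text is the last field. The function names and the shortened sample rows are hypothetical; real rows carry more coordinate fields.

```python
import csv
import io

def parse_cmyk(code):
    """'0000FF00' -> (0.0, 0.0, 1.0, 0.0); each byte is one channel, FF = 1.0."""
    if len(code) != 8:
        raise ValueError("expected 8 hex digits, got %r" % code)
    return tuple(int(code[i:i + 2], 16) / 255 for i in range(0, 8, 2))

def text_in_color(csv_text, wanted):
    """Return the text of every row whose color field equals `wanted`."""
    rows = csv.reader(io.StringIO(csv_text))
    return [row[-1] for row in rows
            if len(row) >= 3 and parse_cmyk(row[1]) == wanted]

# Shortened, made-up rows in the same shape as the extracted lines.
sample = (
    '"MicrosoftSansSerif",FF000000,17.01,"CYAN"\n'
    '"MicrosoftSansSerif",0000FF00,17.01,"YELLOW"\n'
)
yellow = text_in_color(sample, (0.0, 0.0, 1.0, 0.0))
```

Exact float comparison is safe here because only 0.0 and 1.0 occur; for arbitrary colors it would be more robust to compare the raw hex string or to allow a small tolerance per channel.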