Get Text only from specific Layer/ContentStream

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=3511
Printed Date: 19 Apr 24 at 4:33PM
Software Version: Web Wiz Forums 11.01 - http://www.webwizforums.com


Topic: Get Text only from specific Layer/ContentStream
Posted By: mLipok
Subject: Get Text only from specific Layer/ContentStream
Date Posted: 05 Oct 17 at 8:31PM
Normally I use this:
$oQP.GetPageText($iExtractOptions)

to get the page text.

But in some cases I have PDFs with a few content streams / layers.

I know how to enumerate them; I use this function:

; Returns a 2D array with one row per optional content group (layer):
; [n][0] = ID, [n][1] = name, [n][2] = printable flag, [n][3] = visible flag.
; The number of groups is also returned in @extended.
Func _QPDF_GetOptionalContentGroupInformation(ByRef $oQP)
    Local $iContentGroupCount = $oQP.OptionalContentGroupCount()
    Local $aResult[$iContentGroupCount][4]
    ; Group indexes are 1-based, array rows are 0-based.
    For $iGroup_idx = 1 To $iContentGroupCount
        $aResult[$iGroup_idx - 1][0] = $oQP.GetOptionalContentGroupID($iGroup_idx)
        $aResult[$iGroup_idx - 1][1] = $oQP.GetOptionalContentGroupName($aResult[$iGroup_idx - 1][0])
        $aResult[$iGroup_idx - 1][2] = $oQP.GetOptionalContentGroupPrintable($aResult[$iGroup_idx - 1][0])
        $aResult[$iGroup_idx - 1][3] = $oQP.GetOptionalContentGroupVisible($aResult[$iGroup_idx - 1][0])
    Next
    Return SetExtended($iContentGroupCount, $aResult)
EndFunc   ;==>_QPDF_GetOptionalContentGroupInformation


My question is: how can I get the text from only one specific layer / content stream?

Regards,
mLipok



-------------
Here you can find a description of how to test my examples:
http://www.quickpdf.org/forum/forum_posts.asp?TID=2932&PID=12600&title=drawcapturedpagematrix-matrix-howto#12600



Replies:
Posted By: Ingo
Date Posted: 06 Oct 17 at 6:45PM
Hi :)

Perhaps this is the link you're looking for:
http://www.quickpdf.org/forum/how-to-preview-selected-content-stream_topic2305.html
BTW: You can't delete text from a selected content stream/layer, but you can select and delete a content stream together with the text content on it.
If you have a PDF with four layers, I would make four copies.
In the first copy I would delete the first layer, in the second copy the second layer, and so on.
Then I would run text extraction on the four copies.
Then I would compare the four text results: text that is missing from one copy but present in the others belongs to the layer deleted in that copy.
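
The comparison step could be sketched roughly like this. It is only an illustration in plain AutoIt, not part of the library: the function name _Example_TextOnlyInLayer and its parameters are made up for this example. As a simplification it compares the text of the untouched document against the text of one copy with a layer deleted, and it assumes both texts come from $oQP.GetPageText() with the same extract options, so that unchanged lines come out identical.

```autoit
; Hypothetical helper: returns the lines that appear in $sFullText
; but no longer appear in $sTextWithoutLayer (text of a copy in which
; one layer was deleted). Those lines are attributed to that layer.
Func _Example_TextOnlyInLayer($sFullText, $sTextWithoutLayer)
    ; Count how often each line occurs in the reduced text,
    ; so that repeated lines are matched one to one.
    Local $oRemaining = ObjCreate("Scripting.Dictionary")
    Local $aWithout = StringSplit(StringStripCR($sTextWithoutLayer), @LF)
    For $i = 1 To $aWithout[0]
        If $oRemaining.Exists($aWithout[$i]) Then
            $oRemaining.Item($aWithout[$i]) = $oRemaining.Item($aWithout[$i]) + 1
        Else
            $oRemaining.Add($aWithout[$i], 1)
        EndIf
    Next

    ; Walk the full text; every line without a remaining match
    ; is collected as text of the deleted layer.
    Local $sLayerText = ""
    Local $aFull = StringSplit(StringStripCR($sFullText), @LF)
    For $i = 1 To $aFull[0]
        If $oRemaining.Exists($aFull[$i]) And $oRemaining.Item($aFull[$i]) > 0 Then
            $oRemaining.Item($aFull[$i]) = $oRemaining.Item($aFull[$i]) - 1
        Else
            $sLayerText &= $aFull[$i] & @CRLF
        EndIf
    Next
    Return $sLayerText
EndFunc   ;==>_Example_TextOnlyInLayer

; Usage with made-up sample strings instead of real extracted page text:
ConsoleWrite(_Example_TextOnlyInLayer("A" & @LF & "B" & @LF & "C", "A" & @LF & "C"))
```

A line-by-line comparison only works if deleting a layer does not change how the remaining text is split into lines, so it is worth checking the result on a known PDF first.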

Cheers and a nice weekend to you :)
Ingo






Posted By: mLipok
Date Posted: 06 Oct 17 at 9:50PM
Your advice sounds reasonable.
I will follow that path.

Thanks, and have a nice weekend... unfortunately, it has been raining in our country so far.

Cheers,
mLipok




