Help with detecting visible OCGs and removing OCGs
smleino
Beginner
Joined: 03 Aug 15
Posted: 03 Aug 15 at 6:41PM
I am working with QuickPDF Library 11.15 using the ActiveX C# interface on a PDF that has both visible and non-visible Optional Content Groups (OCGs) and my goal is:
1. Determine the OCGs present and which ones are visible and non-visible
2. Remove the non-visible OCGs and their content from the document
3. Save the PDF which now has just the visible OCGs remaining

Based on the Library documentation, this seemed simple enough, but I am running into various problems. The source PDF I am testing with has one page and two OCGs: a visible OCG containing English text and a non-visible OCG containing French text. Based on the code snippet below, I expect that QuickPDF will tell me that the first OCG is visible and the second one non-visible:

    string sourceDoc = @"..\..\Test Files\samplepdfwithlayers.pdf";
    qp.LoadFromFile(sourceDoc, "");

    // Count OCGs
    int OCGCount = qp.OptionalContentGroupCount();

    // Loop through each OCG and check whether it is visible
    for (int i = 1; i <= OCGCount; i++)
    {
        int OCGID = qp.GetOptionalContentGroupID(i);
        int visible = qp.GetOptionalContentGroupVisible(OCGID);
    }

Okay, so far, so good: QuickPDF does indeed indicate one visible and one non-visible OCG.
Next, I want to add a call to remove the non-visible OCG and then save what should now be a PDF with just the single remaining visible OCG (the English text):

    string sourceDoc = @"..\..\Test Files\samplepdfwithlayers.pdf";
    string destfileName = @"..\..\Test Files\convertedPDFA.pdf";
    qp.LoadFromFile(sourceDoc, "");

    // Count OCGs
    int OCGCount = qp.OptionalContentGroupCount(); // should be 2 initially

    // Loop through each OCG and delete any that are non-visible
    for (int i = 1; i <= OCGCount; i++)
    {
        int OCGID = qp.GetOptionalContentGroupID(i);
        int visible = qp.GetOptionalContentGroupVisible(OCGID);
        if (visible == 0) // if invisible, delete the OCG
        {
            qp.DeleteOptionalContentGroup(OCGID);
        }
    }

    // Count OCGs again
    OCGCount = qp.OptionalContentGroupCount(); // should be 1 now
    qp.SaveToFile(destfileName);

So, here is the problem: the new PDF does have just one remaining OCG, but instead of only showing the English text from the original visible OCG, it shows both the English AND French text on top of each other! Can someone tell me why the French text is even still in the PDF and why it has been put into the remaining OCG? How do I make sure that content from a removed OCG is removed from the PDF? Thanks!
smleino
Beginner
Joined: 03 Aug 15
Posted: 04 Aug 15 at 2:25PM
Does anyone know if this is working as designed or if this is a bug?
Rowan
Moderator Group
Joined: 10 Jan 09
Posted: 05 Aug 15 at 9:14PM
Optional Content Groups don't contain any content themselves; rather, they are a way of grouping content streams in the document. So removing one OCG just means that the remaining OCG has to contain all of the content streams. Or, to put it another way, the text that was in the deleted OCG is shown on the page because it no longer belongs to an OCG telling it to be invisible.
What you want to do is delete the OCG and its associated content streams. However, this can be fraught with danger, as not all content streams are safe to delete (i.e. deleting one might mess up your document in unexpected ways), so it requires testing with your documents. I will put together some sample code for you tomorrow. The function you use to delete content streams is DeleteContentStream, but the trick is determining which content stream the OCG is assigned to (if you already know this then it's obviously easier and you can probably work it out yourself).
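To make the point about deleted OCGs concrete, here is a minimal toy model in plain C#. It is only an illustration: `OcgToyModel`, `MarkedContent` and `Render` are hypothetical names invented for this sketch and are not part of Quick PDF Library, and a real PDF viewer does far more than this. The assumption it encodes is the one described above: content is hidden only while it refers to an OCG that still exists and is switched off.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: these types are hypothetical and are NOT part of Quick PDF Library.
public static class OcgToyModel
{
    // A piece of page content, optionally tagged with the ID of the OCG that controls it.
    public sealed class MarkedContent
    {
        public string Text { get; }
        public int? OcgId { get; }

        public MarkedContent(string text, int? ocgId)
        {
            Text = text;
            OcgId = ocgId;
        }
    }

    // Content is drawn unless it is tagged with an OCG that still exists and is hidden.
    public static List<string> Render(
        IEnumerable<MarkedContent> content, IDictionary<int, bool> ocgVisible)
    {
        var drawn = new List<string>();
        foreach (MarkedContent item in content)
        {
            bool hidden = item.OcgId.HasValue
                && ocgVisible.TryGetValue(item.OcgId.Value, out bool visible)
                && !visible;
            if (!hidden)
            {
                drawn.Add(item.Text);
            }
        }
        return drawn;
    }
}
```

With OCG 1 visible and OCG 2 hidden, `Render` returns only the English item. After removing OCG 2 from the dictionary (the toy equivalent of deleting the group but not its content), it returns both items, which mirrors the overlapping English and French text in the saved file.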
smleino
Beginner
Joined: 03 Aug 15
Posted: 06 Aug 15 at 2:20PM
I have tried to see whether the content from the layers is in separate content streams using the following code:
    int result;
    int xPageCount = qp.PageCount();

    // Go through each page and read each of its content streams
    for (int i = 1; i <= xPageCount; i++)
    {
        result = qp.SelectPage(i);
        int xContentStreamCount = qp.ContentStreamCount();
        for (int x = 1; x <= xContentStreamCount; x++)
        {
            result = qp.SelectContentStream(x);
            byte[] streamBytes = (byte[])qp.GetContentStreamToVariant();
            string contentString = Encoding.UTF8.GetString(streamBytes, 0, streamBytes.Length);
        }
    }

The source doc has a single page with two OCGs: the first, visible one contains English text; the second, non-visible one contains French text. Unfortunately, even though the code above does find two content streams, it seems that the first stream has all the text, both English and French, while the second stream appears to be empty. Thoughts?
Rowan
Moderator Group
Joined: 10 Jan 09
Posted: 07 Sep 15 at 6:53AM
There is just the one content stream and it has this form:
    /OC /MC0 BDC [French content] EMC /OC /MC1 BDC [English content] EMC

So it should be possible to split the page content just by deleting from the /OC to the EMC tag. We've tried doing that and the first part works okay. But when we try the same thing with the second part, it results in an invalid page content stream and Acrobat gives an error when rendering the page. It looks like when Acrobat hides marked content it still processes all of the commands but just doesn't make any output on the page. So to split the content we would need to:

1. Identify all the OCG parts of the content stream, looking for BDC and EMC tags.
2. Process all of the page commands between BDC and EMC.
3. Delete or otherwise disable any command that causes output.

As you can see, it should be possible to accomplish what you are trying to do, but because each OCG doesn't have its own content stream, it becomes complicated.
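As a rough illustration of the first step only (identifying the OCG parts), here is a sketch in plain C# that scans a content stream string for `/OC /Name BDC ... EMC` sections. `OcgSpanFinder`, `Span` and `Find` are hypothetical names made up for this example, not Quick PDF Library functions. It assumes the stream has already been decoded to text, splits on whitespace only, and ignores cases a real parser must handle (string literals containing spaces or operator names, inline images, inline property dictionaries). It only locates the sections; as noted above, simply deleting them can leave an invalid content stream.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of Quick PDF Library.
public static class OcgSpanFinder
{
    // One optional-content section: the name after /OC (for example "MC0")
    // and the character range from "/OC" through the matching "EMC".
    public sealed class Span
    {
        public string PropertyName;
        public int Start;
        public int End; // exclusive
    }

    public static List<Span> Find(string content)
    {
        var spans = new List<Span>();
        // Open marked-content sections; null entries are sections that are not "/OC" ones.
        var open = new Stack<Span>();
        MatchCollection tokens = Regex.Matches(content, @"\S+");

        for (int i = 0; i < tokens.Count; i++)
        {
            string token = tokens[i].Value;
            if (token == "BMC" || token == "BDC")
            {
                Span span = null;
                // A BDC opens optional content when its two operands are "/OC /Name".
                if (token == "BDC" && i >= 2
                    && tokens[i - 2].Value == "/OC"
                    && tokens[i - 1].Value.StartsWith("/"))
                {
                    span = new Span
                    {
                        PropertyName = tokens[i - 1].Value.Substring(1),
                        Start = tokens[i - 2].Index
                    };
                }
                open.Push(span);
            }
            else if (token == "EMC" && open.Count > 0)
            {
                // EMC closes the most recently opened section, so nesting is respected.
                Span span = open.Pop();
                if (span != null)
                {
                    span.End = tokens[i].Index + tokens[i].Length;
                    spans.Add(span);
                }
            }
        }
        return spans;
    }
}
```

The names found this way (MC0, MC1) are resource names rather than OCG IDs; in a PDF they are looked up in the page's Properties resource dictionary to find the actual group, so mapping a span back to a visible or non-visible OCG is a further step this sketch does not attempt.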
smleino
Beginner
Joined: 03 Aug 15
Posted: 08 Sep 15 at 12:55PM
Thanks for the information - I had noticed some of what you reported but did not have enough PDF internals knowledge to fully explain and diagnose the problem.
Will Quick PDF be able to handle this now or will it require changes to the library? Or do I need to look at handling this some other way? Thanks!