
Add text content and remove later

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=2991
Printed Date: 05 May 24 at 8:48AM
Software Version: Web Wiz Forums 11.01 - http://www.webwizforums.com


Topic: Add text content and remove later
Posted By: mLipok
Subject: Add text content and remove later
Date Posted: 14 Oct 14 at 6:20PM
I want to add text and some graphics to an existing PDF.
I know how to do this.

But I also want to be able to delete the added content in the future.
So I found NewContentStream, SelectContentStream and DeleteContentStream.

My question: when I use GetPageText, how can I get the text from only one selected content stream?
I need to control which content streams are used when I get the page text.
Then I will be able to delete the one I need.






-------------
Here you can find description how to test my examples:
http://www.quickpdf.org/forum/forum_posts.asp?TID=2932&PID=12600&title=drawcapturedpagematrix-matrix-howto#12600



Replies:
Posted By: erico
Date Posted: 14 Oct 14 at 6:49PM
Try looking at ContentStreamSafe();

If this returns true, the content stream was created by the QuickPDF library and can be safely manipulated/deleted


Posted By: mLipok
Date Posted: 14 Oct 14 at 7:39PM
I think I see how to do this.
1. ContentStreamCount()
2. In a loop, get all content streams into an array
3. Delete all content streams
4. Add them back one by one and analyze the text each time

This way I think I will be able to check which content stream contains the text I added before.
Then I will be able to delete that content stream.

What do you think?

EDIT:
I have not checked yet
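
A minimal sketch of the selection logic in the steps above, written in plain Python so it runs without the library. It assumes the raw content streams are already available as strings (for example from the array in step 2). `toy_extract_text` and `streams_containing` are hypothetical names, and the toy extractor only picks up literal strings shown with the `Tj` operator; real text extraction is considerably more involved. Note also that a page's content streams are processed as one concatenated sequence, so a stream tested in isolation may depend on state set by an earlier stream.

```python
import re
from typing import Callable, List, Sequence


def toy_extract_text(streams: Sequence[str]) -> str:
    """Stand-in for real text extraction: collect literal strings shown with Tj."""
    joined = "\n".join(streams)
    return " ".join(re.findall(r"\(([^()]*)\)\s*Tj", joined))


def streams_containing(
    streams: Sequence[str],
    needle: str,
    extract_text: Callable[[Sequence[str]], str] = toy_extract_text,
) -> List[int]:
    """Return the 1-based positions of streams whose own text contains needle."""
    hits = []
    for position, stream in enumerate(streams, start=1):
        # Test each stream on its own, as in step 4 of the list above.
        if needle in extract_text([stream]):
            hits.append(position)
    return hits


page = [
    "BT /F1 12 Tf 72 700 Td (Scanned invoice) Tj ET",
    "BT /F1 10 Tf 72 50 Td (APPROVED 2014-10-14) Tj ET",
    "0 0 m 100 100 l S",
]
print(streams_containing(page, "APPROVED"))
```

The returned positions are the candidates for deletion; the function itself deletes nothing.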





Posted By: mLipok
Date Posted: 14 Oct 14 at 7:48PM
By the way, I read the reference guide entry for ContentStreamSafe()

and I have a question about this part:

Only content stream parts created by Quick PDF Library should be considered "safe" to draw on.
If a content stream part is not safe it would be best to combine all the content stream parts using
the CombineContentStreams function before drawing on the page to prevent later errors in the
document.

Does this mean that it is good practice to always create my own content stream when I want to, for example, put a stamp on a scanned + OCR (ABBYY) PDF?

I also have other PDFs created from Word by NovaPDF v7.5.

And a second question:
Can you give an example of the "disasters" that result from not following this rule?





Posted By: erico
Date Posted: 14 Oct 14 at 7:50PM
Will you always delete the entire content you have added?

If so, then rather than delete every stream and re-analyze, use ContentStreamSafe() or EditableContentStream() to determine whether it is original content or was added by QuickPDF.

Only delete those that you add.

You could make your job even easier: before you draw use this

   streamId = NewContentStream();   // you will draw only in this stream
   MoveContentStream(streamId,1);    // make it the first on the page

This way, you always draw in content stream 1 and you do not need to loop at all.

If ContentStreamSafe(1) is false, then you did not draw on this page.

I did not see the end of your post. As you can see from my "points", I'm pretty new. I would suspect that whatever is drawn into an unsafe stream may be applied in an unknown context, with unpredictable results.


Posted By: mLipok
Date Posted: 14 Oct 14 at 9:12PM
My question was theoretical and practical at the same time, because I will have to implement this in practice.
In theory, you can assume that the PDF file has been processed several times, including at least twice with QuickPDF. In that case I will need to verify whether a content stream contains the desired text.

But I think it is best to check each content stream using ContentStreamSafe(), and if more than one is safe, then use my method.

My first thought was that this should be quick and at the same time reliable.

But then I considered another case.

Imagine a plain scanned PDF to which I added a content stream and some text. Then someone else added their text, but to my content stream. Both texts were added after checking ContentStreamSafe(), but both ended up in the same content stream.

So in my practical case, unfortunately, I have to analyze the text content of the content stream. And after further reflection I realized that I still cannot be sure that the content stream contains only my data.





Posted By: erico
Date Posted: 14 Oct 14 at 9:34PM
If this is under your or your company's control, set a policy to always add a new content stream and move it to the first position. That way all QPDF streams occur at the beginning of a page, and each edit cycle is separate.

If this is not possible, it becomes much harder to remove things. Perhaps examine how ExtractPageTextBlocks() might help. But I don't see an easy means to change the blocks without manipulating the content stream at a lower level, such as with Get/Set ContentStreamFromString.

Suggestion: Place an invisible text string at a known location on the page which identifies your content stream, then use SetTextExtractionArea() before GetPageText(). If your invisible key is there, this is your content stream. As before, use "Safe" and stop looking when the streams become original content.
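
The "stop looking when the streams become original content" rule can be sketched in plain Python. This assumes the policy above is followed (library-created streams are always moved to the front of the page) and that the per-stream safe flags have already been collected into a list, in page order. `leading_safe_positions` is a hypothetical helper, not a library function.

```python
from itertools import takewhile
from typing import List, Sequence


def leading_safe_positions(safe_flags: Sequence[bool]) -> List[int]:
    """Return the 1-based positions of the unbroken run of safe streams at the front."""
    numbered = enumerate(safe_flags, start=1)
    # Stop at the first stream that is not safe: from there on it is original content.
    run = takewhile(lambda pair: pair[1], numbered)
    return [position for position, _ in run]


# Two streams added by the library, then original content, then a later safe stream.
flags = [True, True, False, True]
print(leading_safe_positions(flags))
```

Only the leading run is returned; a safe stream that appears after original content is ignored, because under this policy it cannot be one of your own edit cycles.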


-------------
Eric O


Posted By: AndrewC
Date Posted: 16 Oct 14 at 7:04AM
Have you thought about adding the extra commands into an OCG (optional content group, i.e. a layer)?  You can retrieve it later by name and then delete it.

You may need to do a little more research.  You may also be able to add a PDF comment to the start of the content stream so you can locate it later on.  A PDF comment starts with a % sign and terminates with a line feed.  You can use the SetContentStreamFromString function after NewContentStream to set the initial text of the stream to something you can parse later.

  char LineFeed = '\n';   // 0x0A; a PDF comment ends at the line feed
  QP.NewContentStream();
  QP.SetContentStreamFromString(" % CustomContentStringID#1" + LineFeed);
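
A minimal sketch of the "parse later" half of this idea, in plain Python so it runs without the library. It assumes the text of each content stream has already been read back into a string, and that the marker comment survives saving and reloading, which should be checked against a real file. `leading_comment` and `find_marked_stream` are hypothetical names.

```python
from typing import Optional, Sequence

MARKER = "CustomContentStringID#1"  # same tag as in the snippet above


def leading_comment(stream: str) -> Optional[str]:
    """Return the text of a PDF comment that opens the stream, or None."""
    stripped = stream.lstrip(" \t\r\n\f\0")
    if not stripped.startswith("%"):
        return None
    # A comment runs from the % sign to the end of the line.
    first_line = stripped.replace("\r", "\n").split("\n", 1)[0]
    return first_line[1:].strip()


def find_marked_stream(streams: Sequence[str], marker: str = MARKER) -> Optional[int]:
    """Return the 1-based position of the first stream tagged with marker."""
    for position, stream in enumerate(streams, start=1):
        if leading_comment(stream) == marker:
            return position
    return None


streams = [
    "q 1 0 0 1 0 0 cm BT /F1 12 Tf (original) Tj ET Q\n",
    " % CustomContentStringID#1\n0 0 m 100 100 l S\n",
]
print(find_marked_stream(streams))
```

Only the first line of each stream is inspected, so a % sign inside a text string further down the stream is never mistaken for the marker.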

Here is some sample C# code for working with OCGs.

            QP.NewDocument();

            // Create four new optional content groups

            int OCG1 = QP.NewOptionalContentGroup("Layer 1");
            int OCG2 = QP.NewOptionalContentGroup("Layer 2");
            int OCG3 = QP.NewOptionalContentGroup("Layer 3");
            int OCG4 = QP.NewOptionalContentGroup("Layer 4");

            // Select the page that you want the layers to be
            // associated with.

            QP.SelectPage(1);

            // Specify top left corner for starting point
            // of all drawing functions.

            QP.SetOrigin(1);

            // Add OCG 1
            //QP.NewContentStream();
            QP.SelectContentStream(1);
            QP.DrawText(100, 100, "Layer 1");
            QP.SetContentStreamOptional(OCG1);
            QP.SetOptionalContentGroupVisible(OCG1, 1);

            // Add OCG 2
            QP.NewContentStream();
            QP.SelectContentStream(2);
            QP.DrawText(200, 100, "Layer 2");
            QP.SetContentStreamOptional(OCG2);
            QP.SetOptionalContentGroupVisible(OCG2, 1);

            // Add OCG 3
            QP.NewContentStream();
            QP.SelectContentStream(3);
            QP.DrawText(300, 100, "Layer 3");
            QP.SetContentStreamOptional(OCG3);
            QP.SetOptionalContentGroupVisible(OCG3, 1);

            // Add OCG 4
            QP.NewContentStream();
            QP.SelectContentStream(4);
            QP.DrawText(400, 100, "Layer 4");
            QP.SetContentStreamOptional(OCG4);
            QP.SetOptionalContentGroupVisible(OCG4, 1);

            // Lock layers 1 and 3 in the first configuration and
            // mark layers 2 and 3 as not printable.

            int configCount = QP.GetOptionalContentConfigCount();
            if (configCount == 1)
            {
                QP.SetOptionalContentConfigLocked(1, OCG1, 1);
                QP.SetOptionalContentConfigLocked(1, OCG3, 1);

                QP.SetOptionalContentGroupPrintable(OCG2, 0);
                QP.SetOptionalContentGroupPrintable(OCG3, 0);
            }

            // Save file to disk with new layers and open it
            // (Process.Start needs "using System.Diagnostics;").

            QP.SaveToFile("out.pdf");
            Process.Start(@"out.pdf");
            QP.RemoveDocument(QP.SelectedDocument());

2nd example 

            QP.LoadFromFile("PDFParameters.pdf", "");

            int OCG1 = QP.NewOptionalContentGroup("Layer 1");
            QP.NewContentStream();

            int cs = QP.ContentStreamCount();
            QP.SetOrigin(1);
            QP.SelectContentStream(cs);   // Select the last content stream.

            int id = QP.AddImageFromFile("layer_eye.jpg", 0);
            QP.SelectImage(id);
            QP.DrawImage(10, 10, 200, 200);

            QP.SetContentStreamOptional(OCG1);
            QP.SetOptionalContentGroupVisible(OCG1, 1);

            QP.MoveContentStream(cs, 1);        // Move to the bottom layer

            QP.SaveToFile("out.pdf");
            Process.Start("out.pdf");
            QP.RemoveDocument(QP.SelectedDocument());





Posted By: mLipok
Date Posted: 14 Nov 14 at 8:58PM

Why, after
$oQP.DeleteContentStream()
$oQP.SaveToFile('Z:\test.2.pdf')

does Z:\test.2.pdf still contain the drawn line?

And why does
$oQP.SetOrigin(1);
not work in this case?

What am I doing wrong here?
  

$oQP.LoadFromFile("z:\original.pdf", '')

$oQP.SetMeasurementUnits($__eQPDF_MUNITS_Milimeters)
$oQP.SetOrigin(1);

Local $iIDContentStream = $oQP.NewContentStream()
_QPDF_LastErrorCode($oQP)
$oQP.SelectContentStream($iIDContentStream)
_QPDF_LastErrorCode($oQP)
$oQP.SetMeasurementUnits($__eQPDF_MUNITS_Milimeters)
_QPDF_LastErrorCode($oQP)
$oQP.SetOrigin(1);
_QPDF_LastErrorCode($oQP)
$oQP.SetLineWidth(4)
_QPDF_LastErrorCode($oQP)
$oQP.DrawLine(0, 0, 100, 100)
_QPDF_LastErrorCode($oQP)
$oQP.SaveToFile('Z:\test.1.pdf')
$oQP.SelectContentStream($iIDContentStream)
_QPDF_LastErrorCode($oQP)
$oQP.DeleteContentStream()
$oQP.SaveToFile('Z:\test.2.pdf')
_QPDF_LastErrorCode($oQP)





