Enumerating text objects

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=450
Printed Date: 19 May 24 at 1:43AM
Software Version: Web Wiz Forums 11.01 - http://www.webwizforums.com


Topic: Enumerating text objects
Posted By: beki
Subject: Enumerating text objects
Date Posted: 24 Jun 06 at 1:33AM
Does the QuickPDF library have a way to enumerate all text objects on a page and modify them?

Tomislav



Replies:
Posted By: beki
Date Posted: 25 Jun 06 at 12:57AM
OK, GetPageText(2) will do the job. But is there a way to modify text directly in the PDF through QuickPDF?

Tomislav


Posted By: chicks
Date Posted: 26 Jun 06 at 7:11PM
There is, but it's not easy. PDF really isn't meant to be edited this way. There can be all sorts of markup within a word or phrase, making it very difficult to pick out.

Here's a simple example (VBScript) that changes the word "spouse" to "mate" in the IRS f1040ez.pdf document. Note that gaps are left in a couple of places, because each chunk of text carries its own positioning information; closing those gaps would be nearly impossible without a huge amount of effort.

Try to change "Spouse" to "Mate", and you'll begin to see the difficulties.

Dim QP, i, p, content
Dim inFileName, outFileName, QuickPDFKey


'-- Init iSedQuickPDF
Set QP = CreateObject("iSED.QuickPDF")

'-- Get QuickPDF Key
QuickPDFKey = WScript.Arguments(0)

'-- Set file names
inFileName = "f1040ez.pdf"
outFileName = "out.pdf"

'-- Unlock the PDF library
Call QP.UnlockKey(QuickPDFKey)

'-- Load the input PDF
Call QP.LoadFromFile(inFileName)

For p = 1 to QP.PageCount()
    Call QP.SelectPage(p)
    WScript.Echo "Page: " & p
    For i = 0 to QP.LayerCount()-1
        Call QP.SelectLayer(i)
        WScript.Echo "Layer: " & i
        content = QP.GetPageContent()
        '-- InStr returns 0 when not found; the match is case-sensitive by default
        If InStr(content, "spouse") > 0 Then
            WScript.Echo "Found"
            content = Replace(content, "spouse", "mate")
            Call QP.SetPageContent(content)
        End If
    Next
Next

Call QP.SaveToFile(outFileName)

'-- Clean up
Set QP = Nothing
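
To illustrate the "Spouse" difficulty mentioned above, here is a minimal sketch in Python that does not use the Quick PDF Library at all. It assumes a hypothetical content stream fragment (not copied from f1040ez.pdf) in which one word is stored as a single string and another is split into kerned pieces inside a TJ array, which is one common way a PDF producer may write text. The helper names naive_replace and visible_text are made up for this example.

```python
import re

# Hypothetical fragment of a page content stream (an assumption for
# illustration only). Line 1 stores its text as one literal string;
# line 2 stores one word as kerned pieces inside a TJ array.
content = (
    "BT /F1 9 Tf 72 700 Td (your spouse) Tj ET\n"
    "BT /F1 9 Tf 72 680 Td [(S) 20 (pou) -15 (se)] TJ ET\n"
)


def naive_replace(stream, old, new):
    """Plain substring replacement, the same idea as the VBScript Replace call."""
    return stream.replace(old, new)


def visible_text(stream):
    """Return the text shown by each Tj/TJ operation, ignoring kerning numbers.

    Simplified on purpose: escaped parentheses and hex strings are not handled.
    """
    shown = []
    for operands in re.findall(r"(\[.*?\]|\(.*?\))\s*T[jJ]", stream):
        shown.append("".join(re.findall(r"\((.*?)\)", operands)))
    return shown


if __name__ == "__main__":
    print(visible_text(content))
    print(visible_text(naive_replace(content, "spouse", "mate")))
    # The capitalised word is never contiguous in the stream, so nothing changes.
    print(naive_replace(content, "Spouse", "Mate") == content)
```

With this fragment, replacing "spouse" changes the first line, while replacing "Spouse" leaves the stream untouched even though the page visibly shows that word. A real replacement would have to rebuild the TJ array and its spacing, which is where the effort goes.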




Posted By: beki
Date Posted: 27 Jun 06 at 12:55AM
Thanks chicks, this makes your point clear.

So finding the text items on the page is something I will not attempt, but your code is very useful for understanding the page structure.

Tomislav


Posted By: Ingo
Date Posted: 27 Jun 06 at 2:16AM
Hi Chicks!

Your sample shows how to go through each layer ...
I thought that with CombineLayers I would get the content of all layers together at once, without selecting each layer...?

Best regards,
Ingo


Posted By: beki
Date Posted: 30 Jun 06 at 9:46AM
Do you know of a method to remove parts of the text on a specific page, based on a rectangle?

Tomislav


Posted By: Ingo
Date Posted: 30 Jun 06 at 6:09PM
Hi Tomislav!

Here's a code snippet that should help:

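// Note: this covers the area with a filled white box; the text underneath stays in the file.
// "pixels" is the height of the strip to cover and must be defined by the caller.
QP.SetMeasurementUnits(0);
QP.SetFillColor(1,1,1);   // white
for i := 1 to QP.PageCount do
    begin
       QP.SelectPage(i);
       QP.CombineLayers;
       QP.DrawBox(1, QP.PageHeight - pixels, QP.PageWidth, pixels, 1);
//     . . .
    end;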

Best regards,
Ingo


