Determine page type?

Sam (Beginner, joined 21 Aug 09, location: MN)
Posted: 28 Aug 09 at 8:43PM
Greetings,

I'm working on a PDF redaction product that will be used by our clients.  The redaction must be secure, and the only way I know to achieve that is to render each PDF page to an image, draw on the image, then save it back to PDF.  This works, but it's incredibly slow.  I have a few questions I'm hoping someone here can answer.

I've tried handling documents differently depending on their type.  I try to determine the makeup of the document by finding images: if the number of images equals the number of pages, I assume these are scanned pages and extract each image as a TIFF.  Otherwise, I render the document to file and then convert the rendered pages to TIFF.  My redaction software creates the thumbnails and the preview image to draw on from TIFF files (I'm using Imageman).

I'm using an old version of QuickPDF (4.49) and am wondering if some of the bugs and slowness have been improved in the latest version.

FindImages seems to extract JPEGs into the PDF's directory, which is quite annoying.  Does anyone know anything about that?

FindImages is also really slow.  Is there a better/faster way to determine the page type (text/image or scanned image)?

Lastly, has PDF rendering been improved?  I've found the only acceptable text -> image rendering is png at 150dpi.  Anything less than that and there's too much detail lost in the text.
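
To put the DPI trade-off in numbers: PDF page sizes are measured in points (72 per inch by default), so the rendered pixel size is points / 72 x dpi, and the pixel count grows with the square of the DPI. The sketch below is Python rather than VB, purely to show the arithmetic. The US Letter page size (612 x 792 points) and the 24-bit colour depth are assumptions for illustration, not details from the post.

```python
def rendered_size(width_pt, height_pt, dpi):
    """Return (width_px, height_px) for a page rendered at the given DPI."""
    # Default PDF user space is 72 points per inch.
    return round(width_pt / 72 * dpi), round(height_pt / 72 * dpi)


def raw_megabytes(width_px, height_px, bytes_per_pixel=3):
    """Uncompressed bitmap size in MiB (assumes 24-bit RGB by default)."""
    return width_px * height_px * bytes_per_pixel / (1024 * 1024)


# Assumed page: US Letter, 612 x 792 points (8.5 x 11 inches).
for dpi in (72, 100, 150, 300):
    w, h = rendered_size(612, 792, dpi)
    print(f"{dpi:>3} dpi: {w} x {h} px, ~{raw_megabytes(w, h):.1f} MiB uncompressed")
```

Under these assumptions a Letter page at 150 dpi is 1275 x 1650 pixels, and dropping to 100 dpi cuts the pixel count to roughly 44% of that, which is why a lower DPI shrinks files and speeds up rendering so noticeably.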

Here is the code I'm using (yes, it's vb5):
    'QP is the QuickPDF ActiveX component

    If QP.LoadFromFile(sPDFFileName) = 1 Then    'loads document
        lTotal = QP.PageCount

        notTiff = True
        rendered = False
        'Strip the 4-character extension (".pdf") to get the base name
        sPDFShortName = Left$(sPDFFileName, Len(sPDFFileName) - 4)
        If lTotal > 0 Then
            images = QP.FindImages
            'One image per page: assume scanned pages and try to extract them
            If images = lTotal Then
                notTiff = False
                For page = 1 To images
                    imgID = QP.ImageID(page)
                    QP.SelectImage imgID
                    If QP.ImageType = 3 And QP.ImageWidth > 400 And QP.ImageHeight > 600 Then
                        QP.SaveImageToFile sPDFShortName & page & ".tif"
                    Else
                        'Wrong type or too small: fall back to rendering everything
                        notTiff = True
                        Exit For
                    End If
                Next page
            End If
            If notTiff Then
                'Render all pages to PNG at 150 dpi, then convert each PNG to TIFF
                QP.RenderDocumentToFile 150, 1, lTotal, 5, sPDFShortName & ".png"
                For iC = 1 To lTotal
                    frmUploadDocs.imPreview.Picture = sPDFShortName & iC & ".png"
                    sTempFile = sPDFShortName & iC & ".tif"
                    'Remove any TIFF left over from the extraction attempt
                    If FileExist(sTempFile) Then Kill sTempFile
                    frmUploadDocs.imPreview.SaveAs sTempFile
                    Kill sPDFShortName & iC & ".png"
                Next iC
                rendered = True
            End If
        End If
        ConvertPDFtoTIF = lTotal ' success
    End If

Thanks for any help

Sam

Michel_K17 (Newbie, joined 25 Jan 03, www.exp-systems.com)
Posted: 29 Aug 09 at 2:38PM
Hello Sam,

   I can answer a few of your questions (but not the one about finding images - sorry).

   The library has improved dramatically since v4.49, in two ways. First, we are now at v7.15, which reflects the last of the improvements from iSed, the improvements from the user community, and the work done by Debenu and their programmers. By "improvements", I mean that a large number of bugs have been fixed and that compatibility with PDF content is better. You mentioned rendering in particular, and yes, that portion of the code is far better, with rendering quality that now matches Adobe Reader's.

   Second, Debenu is steadily adding new features, for which I am very thankful, as this brings the library back in line with the new technology being added to the PDF format. For example, this includes the ability to digitally sign documents, and much more.

   To be sure, it's a never-ending task, but Debenu has been really proactive about releasing regular updates and addressing specific requests from users when they can.

   Hopefully, someone else can address your image question.

   But, there is no doubt that you should upgrade, as a minimum, to the last version that iSed published with the modifications from the users. This would be a free upgrade for you. It's available [here].

   There is a list [here] of all the improvements by Debenu since v5.11 that you should take a look at. I believe that the offer to upgrade to the v7.xx series is still available to the users of the old version (you will need to provide proof of ownership). As I recall, they offer a $100 discount. The purchase page is [here].

   I hope that helps.

   Cheers!

Michel

Shotgun Tom (Senior Member, joined 14 Aug 09, location: Phoenix, AZ)
Posted: 29 Aug 09 at 5:32PM
A couple of thoughts for you, Sam.
 
1.  HasFontResources is a fairly quick way to determine if the entire document consists of images.  From the QuickPDF Manual: Determines if the selected document has font resources.  If the document does not it can be assumed to be an image only PDF. 
 
2.  I'm not all that familiar with Imageman... however there is an ActiveX component called GdPicture Imaging SDK at www.gdpicture.com.  At one point it directly supported the iSED library.  This component has a method that quickly converts a PDF (and PDF/A) to multipage TIFF and also multipage TIFF to PDF or PDF/A.  The package includes a viewer that renders PDF and multipage TIFF very quickly.  In combination with the latest QuickPDF library you would have a very powerful PDF/TIFF toolbox.

Sam
Posted: 03 Sep 09 at 5:40PM
Thanks for the help.  I upgraded to 7.15 and it is indeed quite a bit faster.  I was also able to reduce DPI which made the resulting image files smaller.

Tom, if I have time I may play with HasFontResources.  I don't know how that would work, though, if a page is made up of multiple images, or whether that's even common enough to worry about.

Ingo (Moderator Group, joined 29 Oct 05)
Posted: 03 Sep 09 at 6:56PM
Hi!

An image-only PDF does not necessarily lack font resources. I have a sample that really is image-only and still contains a Helvetica font resource.

What you can do instead is extract the text content. If there is no text content and there are embedded images (function FindImages), then you can be pretty sure that it's a scanned or image-only PDF.
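
The heuristic above can be written as a small decision function. This is a language-neutral sketch in Python, not QuickPDF code: `PageFacts`, `classify_page`, `is_image_only`, and the field names are hypothetical, and the caller is assumed to have already collected the extracted text and the image count for each page with whatever library calls are available. Font resources are deliberately not consulted, since an image-only PDF can still carry one.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageFacts:
    """Facts gathered per page by the caller (hypothetical structure)."""
    text: str          # text extracted from the page, possibly empty
    image_count: int   # number of images found on the page


def classify_page(facts: PageFacts) -> str:
    """Classify a page as 'text', 'scanned', or 'empty'."""
    has_text = bool(facts.text.strip())   # whitespace alone does not count
    if has_text:
        return "text"        # real text present: render the page to an image
    if facts.image_count > 0:
        return "scanned"     # no text, one or more images: likely a scan
    return "empty"           # nothing extractable: render to be safe


def is_image_only(pages: List[PageFacts]) -> bool:
    """True if the document has pages and every one is 'scanned'."""
    return bool(pages) and all(classify_page(p) == "scanned" for p in pages)
```

Testing `image_count > 0` rather than `image_count == 1` means a page assembled from several image strips still counts as scanned, which is the case a simple "images = pages" comparison misses.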

Cheers, Ingo
