Determine page type?
Sam
Beginner Joined: 21 Aug 09 Location: MN Status: Offline Points: 2
Posted: 28 Aug 09 at 8:43PM
Greetings,
I'm working on a PDF redaction product that will be used by our clients. The redaction has to be secure, and the only way I know to do that is to render each PDF page to an image, draw on the image, then save it back to PDF. This is working fine, but it's incredibly slow, so I've got a few questions I'm hoping someone here can answer.

I've tried handling documents differently depending on their type. I try to determine the makeup of the document by finding images: if the number of images equals the number of pages, I assume these are scanned pages and extract the TIFFs. Otherwise, I render the document to file and then convert the rendered pages to TIFF. My redaction software creates the thumbnails and the preview image to draw on from TIFF files (I'm using Imageman).

I'm using an old version of QuickPDF (4.49) and am wondering whether some of the bugs and slowness have been fixed in the latest version. My questions:

1. FindImages seems to extract JPEGs into the PDF's directory, which is quite annoying. Does anyone know anything about that?
2. FindImages is really slow. Is there a better/faster way to determine page type (text/image or scanned image)?
3. Has PDF rendering been improved? The only acceptable text -> image rendering I've found is PNG at 150 dpi. Anything less than that and too much detail is lost in the text.
Here is the code I'm using (yes, it's vb5):

    'QP is the QuickPDF ActiveX component
    If QP.LoadFromFile(sPDFFileName) = 1 Then 'loads document
        lTotal = QP.PageCount
        notTiff = True
        rendered = False
        sPDFShortName = Mid$(sPDFFileName, 1, Len(sPDFFileName) - 4)
        If lTotal > 0 Then
            images = QP.FindImages
            If images = lTotal Then
                notTiff = False
                For page = 1 To images
                    imgID = QP.ImageID(page)
                    QP.SelectImage (imgID)
                    If QP.ImageType = 3 And QP.ImageWidth > 400 And QP.ImageHeight > 600 Then
                        QP.SaveImageToFile sPDFShortName & page & ".tif"
                    Else
                        notTiff = True
                    End If
                Next page
            End If
            If notTiff Then
                QP.RenderDocumentToFile 150, 1, lTotal, 5, sPDFShortName & ".png"
                For iC = 1 To lTotal
                    frmUploadDocs.imPreview.Picture = sPDFShortName & iC & ".png"
                    sTempFile = sPDFShortName & iC & ".tif"
                    If FileExist(sTempFile) Then Kill sTempFile
                    frmUploadDocs.imPreview.SaveAs sPDFShortName & iC & ".tif"
                    Kill sPDFShortName & iC & ".png"
                Next iC
                rendered = True
            End If
        End If
        ConvertPDFtoTIF = lTotal ' success
    End If

Thanks for any help
Sam
Michel_K17
Newbie www.exp-systems.com Joined: 25 Jan 03 Status: Offline Points: 297
Hello Sam,
I can answer a few of your questions (but not the one about finding images - sorry). The library has improved dramatically since v4.49, in two ways. Today we are at v7.15, which reflects the last of the improvements from iSed, the improvements from the user community, and the work done by Debenu and their programmers.

First, a large number of bugs have been fixed and compatibility with PDF content has improved. You mentioned rendering in particular, and yes, that portion of the code is far better, with rendering that now matches Adobe's Reader in terms of quality.

Second, Debenu is steadily adding new features, for which I am very thankful, as it brings the library back in line with the new technology being brought to the PDF format. For example, this includes the ability to digitally sign documents, and much more. To be sure, it's a never-ending task, but Debenu has been really proactive about regular updates and about addressing specific requests from users when they can.

Hopefully, someone else can address your image question. But there is no doubt that you should upgrade, at a minimum, to the last version that iSed published with the modifications from the users. This would be a free upgrade for you. It's available [here]. There is a list [here] of all the improvements by Debenu since v5.11 that you should take a look at.

I believe that the offer to upgrade to the v7.xx series is still available to users of the old version (you will need to provide proof of ownership). As I recall, they offer a $100 discount. The purchase page is [here].

I hope that helps. Cheers!
Michel
Shotgun Tom
Senior Member Joined: 14 Aug 09 Location: Phoenix, AZ Status: Offline Points: 53
A couple of thoughts for you, Sam.
1. HasFontResources is a fairly quick way to determine if the entire document consists of images. From the QuickPDF Manual: Determines if the selected document has font resources. If the document does not it can be assumed to be an image only PDF.
2. I'm not all that familiar with Imageman. However, there is an ActiveX component called GdPicture Imaging SDK at www.gdpicture.com. At one point it directly supported the iSED library. This component has a method that quickly converts a PDF (and PDF/A) to multipage TIFF, and also multipage TIFF to PDF or PDF/A. The package includes a viewer that renders PDF and multipage TIFF very quickly. In combination with the latest QuickPDF library you would have a very powerful PDF/TIFF toolbox.
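A minimal sketch of the HasFontResources check from point 1, in the same VB dialect as Sam's code. It assumes QP is the QuickPDF ActiveX object and that HasFontResources returns 0 when the document has no font resources and 1 when it does; please check the return values against the manual for your version. IsProbablyImageOnly and ChooseConversionPath are hypothetical names used only for illustration.

```vb
' Hypothetical helper (not part of QuickPDF): decides from two numbers
' whether the document can be assumed to be image-only.
' hasFonts  - value returned by QP.HasFontResources (assumed 0 = none, 1 = present)
' pageTotal - value returned by QP.PageCount
Function IsProbablyImageOnly(ByVal hasFonts As Long, ByVal pageTotal As Long) As Boolean
    IsProbablyImageOnly = (pageTotal > 0) And (hasFonts = 0)
End Function

' Hypothetical caller: QP is assumed to be the QuickPDF ActiveX component.
Sub ChooseConversionPath(ByVal sPDFFileName As String)
    If QP.LoadFromFile(sPDFFileName) = 1 Then
        If IsProbablyImageOnly(QP.HasFontResources, QP.PageCount) Then
            ' No font resources: likely scanned pages, try extracting the images
        Else
            ' Font resources present: render the pages instead
        End If
    End If
End Sub
```

Note that this is a document-level test, so it cannot separate text pages from scanned pages inside a mixed document.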
Sam
Beginner Joined: 21 Aug 09 Location: MN Status: Offline Points: 2
Thanks for the help. I upgraded to 7.15 and it is indeed quite a bit faster. I was also able to reduce DPI which made the resulting image files smaller.
Tom, if I have time I may play with HasFontResources. I don't know how that would work, though, if a page is made up of multiple images, or whether that's even common enough to worry about.
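As a rough illustration of why lowering the DPI shrinks the rendered files: the pixel dimensions of a rendered page scale linearly with DPI, so the pixel count scales with its square. The sketch below assumes a US Letter page (8.5 x 11 inches) and uses 100 dpi purely as an example value; neither is stated in the thread. PixelsAtDpi is a hypothetical helper, not a QuickPDF function.

```vb
' Hypothetical helper: pixels along one side of a page rendered at a given DPI.
Function PixelsAtDpi(ByVal inches As Double, ByVal dpi As Long) As Long
    PixelsAtDpi = CLng(inches * dpi)
End Function

Sub ShowRenderSizes()
    ' Assumed page size: US Letter, 8.5 x 11 inches
    ' At 150 dpi the page is 1275 by 1650 pixels
    Debug.Print PixelsAtDpi(8.5, 150); "x"; PixelsAtDpi(11, 150)
    ' At 100 dpi (example value) the page is 850 by 1100 pixels
    Debug.Print PixelsAtDpi(8.5, 100); "x"; PixelsAtDpi(11, 100)
End Sub
```

Going from 150 to 100 dpi cuts the pixel count to (100/150)^2, about 44% of the original. The actual file size also depends on how well the image format compresses the page.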
Ingo
Moderator Group Joined: 29 Oct 05 Status: Offline Points: 3524
Hi!
An image-only PDF does not necessarily lack font resources. I have a sample that really is an image-only PDF, yet it includes Helvetica. What you can do is extract the text content. If there isn't any text content and there are embedded images (function FindImages), then you can be pretty sure that it's a scanned or image-only PDF.
Cheers, Ingo
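The rule of thumb above could be sketched roughly as follows, in the same VB dialect as Sam's code. The thread does not name the text-extraction function, so the extracted text is simply passed in as a string here. CountVisibleChars and LooksImageOnly are hypothetical helper names, and the image count is assumed to be the value returned by QP.FindImages, as in Sam's code.

```vb
' Hypothetical helper: counts characters other than spaces, tabs and line breaks,
' so that extracted text consisting only of whitespace counts as "no text".
Function CountVisibleChars(ByVal s As String) As Long
    Dim i As Long
    Dim n As Long
    For i = 1 To Len(s)
        Select Case Mid$(s, i, 1)
            Case " ", vbTab, vbCr, vbLf
                ' whitespace: ignore
            Case Else
                n = n + 1
        End Select
    Next i
    CountVisibleChars = n
End Function

' Hypothetical helper implementing the rule of thumb:
' no text content plus at least one embedded image => scanned / image-only.
' sText      - text extracted from the document (extraction call not shown)
' imageCount - value returned by QP.FindImages
Function LooksImageOnly(ByVal sText As String, ByVal imageCount As Long) As Boolean
    LooksImageOnly = (CountVisibleChars(sText) = 0) And (imageCount > 0)
End Function
```

A call would then look like `If LooksImageOnly(sText, QP.FindImages) Then ...`, where sText holds whatever your version of the library returns as the extracted text.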