
Extracting images (png/tiff issues)

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=1732
Printed Date: 25 Apr 25 at 11:19AM


Topic: Extracting images (png/tiff issues)
Posted By: samb
Date Posted: 09 Feb 11 at 1:11AM
Environment:
C# .NET 2.0
Quick PDF 7.23 DLL

I am attempting to create and read image-only PDF files.
Creating the files seems to work fine with every image format I have tried (PNG, JPEG, and TIFF). I can open the files in Acrobat Reader and everything looks fine, but reading them through code is giving me some trouble.

Code for writing the PDF files (formatting and non-relevant code removed):

foreach (byte[] imageContents in imageList)
{
    // Add the raw image bytes to the document; 0 signals failure.
    int imageID = pdf.AddImageFromString(imageContents, 0);
    if (imageID == 0) throw new Exception("Error on AddImageFromString");

    // Draw the image at the page's width and height.
    if (pdf.DrawImage(0, 0, (float)pageWidth, (float)pageHeight) == 0) throw new Exception("Error on DrawImage");

    pdf.NewPage();
}
pdf.SaveToFile(filePath);
pdf.RemoveDocument(outputFileHandle);


Current code for extracting the images:

int pdfHandle = pdf.LoadFromFile(filePath);
pdf.SetFindImagesMode(1);
int imageCount = pdf.FindImages();
for (int i = 1; i <= imageCount; i++)
{
    int imageID = pdf.GetImageID(i);
    pdf.SelectImage(imageID);
    byte[] img = pdf.SaveImageToString();
    File.WriteAllBytes(@"C:\Testing\out" + i + fileExt, img);
}


If the PDF file contains only JPEG images, I receive the images as expected.
If the PDF file contains only TIFF (CCITT Group 4) images, I receive a large number of ~35 pixel high slices of each image.
If the PDF file contains only PNG or TIFF (LZW) images, I receive the images as 24-bit bitmaps (even if the original PNG was monochrome).

I see that PNG is not defined in ImageType, which would explain why it doesn't work.

So, my question:
  1. Is it possible to retrieve the original PNG images using some other method? 

    I saw that the renderpage function can return a PNG image, but because I may be scaling the original images when drawing them on the pages, I don't know that it would be an accurate representation of the original image.

    PNG is my preferred image format. If there is no way, I may consider using JPEG for color images and CCITT Group 4 TIFF for black and white images, provided I can figure out how to retrieve full images instead of slices for the TIFFs.
Thanks,



Replies:
Posted By: samb
Date Posted: 09 Feb 11 at 10:04PM
I've made some progress.

Somehow I missed the image extraction code sample:
http://www.quickpdf.org/forum/extract-text-and-images-and-insert-into-new-pdf_topic1308.html

From that example, it looks like I may be better off importing and exporting as bitmap and letting Quick PDF deal with the compression. Is this the preferred method?

Some new issues:
  1. In all cases, the retrieved images seem to have their original DPI setting stripped. When adding images to a document, I use the DPI to calculate the page size. I should be able to get around this by manually setting the DPI based on the size of the image on the PDF page.
    I assume that a PDF file has no use for an image's DPI, which would explain why it is absent, but figured I would see if there was a way to get the original value.
    [Edit]Found in the docs that DPI is only available for some image types[/Edit]

  2. If I generate a PDF file from a monochrome bitmap, I can extract it as a monochrome bitmap.  If I generate a PDF file from a 24 bit color bitmap, I can extract it as a 24 bit color bitmap.
    Anything in between (256 color, 8 bit grayscale, etc) gets extracted as a 24 bit color bitmap.
    Is there some way to either extract an image with the original color depth, or at least query the image to see what the color depth is and manually convert it?
Thanks again,
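
One way to approach the "query the color depth and convert manually" idea in point 2 is to inspect the extracted 24-bit pixel data directly. The sketch below is only an illustration and uses no Quick PDF calls: DepthProbe, PixelClass and Classify are hypothetical names, and it assumes the extracted bitmap has already been decoded into tightly packed 3-byte pixels with no row padding.

```csharp
using System;

public enum PixelClass { Monochrome, Grayscale, Color }

public static class DepthProbe
{
    // rgb: tightly packed 24-bit pixels, 3 bytes per pixel.
    // Channel order (RGB or BGR) does not matter for this check.
    public static PixelClass Classify(byte[] rgb)
    {
        if (rgb == null || rgb.Length % 3 != 0)
            throw new ArgumentException("Expected packed 24-bit pixel data.");

        bool onlyBlackAndWhite = true;
        for (int i = 0; i < rgb.Length; i += 3)
        {
            byte a = rgb[i], b = rgb[i + 1], c = rgb[i + 2];
            if (a != b || b != c)
                return PixelClass.Color;       // channels differ: real color
            if (a != 0 && a != 255)
                onlyBlackAndWhite = false;     // a gray level between black and white
        }
        return onlyBlackAndWhite ? PixelClass.Monochrome : PixelClass.Grayscale;
    }
}
```

A result of Monochrome or Grayscale suggests the image can be converted down to 1 or 8 bits per pixel without loss. The check cannot tell a 256-color palette image from true color; that would need a count of distinct colors.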


Posted By: vinod_pathak
Date Posted: 16 May 11 at 7:24AM
Hi!

How can I extract an image from a layer of a PDF file using QP in C#?


Posted By: Ingo
Date Posted: 16 May 11 at 9:48AM
Hi Vinod!

I've already said that it's not possible to extract images by layer.
You can extract the images, but which layer they are on isn't taken into account during extraction.

Cheers, Ingo
 


Posted By: samb
Date Posted: 16 May 11 at 1:23PM
As to the original issue:
I've learned more about PDF and image formats, and have settled on JPEG for color. I found methods to perform lossless rotation on JPEG images, so that format has become acceptable. I have still not been able to get striped CCITT G4 images (which .NET defaults to on Windows 7) to work correctly, but it looks like the Quick PDF team has been making some serious changes to CCITT handling since 7.23, so I'm hoping to get some free time to try out the latest version. In the meantime, using black and white bitmaps with Quick PDF compression has been working fine.

I have been using some DA (direct access) methods to get the image's drawn size and calculate the DPI from it, but again it looks like some of the new features may be able to do this with an in-memory file.
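
The DPI calculation from the drawn size is plain arithmetic and can be sketched as below. This is a minimal illustration, not Quick PDF API: DpiEstimator and Estimate are hypothetical names, and it assumes the drawn size is measured in points (1/72 inch), so a different measurement unit would need its own conversion factor.

```csharp
using System;

public static class DpiEstimator
{
    // Assumption: drawn sizes are in points, 72 per inch.
    private const double PointsPerInch = 72.0;

    // pixels: image width (or height) in pixels.
    // drawnPoints: width (or height) the image occupies on the page, in points.
    public static double Estimate(int pixels, double drawnPoints)
    {
        if (pixels <= 0) throw new ArgumentOutOfRangeException("pixels");
        if (drawnPoints <= 0) throw new ArgumentOutOfRangeException("drawnPoints");

        double inches = drawnPoints / PointsPerInch;
        return pixels / inches;
    }
}
```

For example, an image 2550 pixels wide drawn across a 612 point (8.5 inch) page gives DpiEstimator.Estimate(2550, 612) = 300. Horizontal and vertical values can differ if the image was stretched, so it may be worth computing both.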


Posted By: AndrewC
Date Posted: 01 Jul 11 at 3:06PM
We have been changing the import options for TIFF images. If a TIFF image is compressed with Group 3 Fax, Group 4 Fax or LZW, we now import the data into the PDF as is, without decompressing and recompressing it. This means that G4 TIFFs are stored efficiently using G4 compression in the PDF.

Image extraction is an area we need to do some more work on, as we currently only export 24-bit images regardless of the pixel depth.

The 35 pixel high image strips come from PDFs that were created from striped TIFF images. QPL supports this type of format, as do other PDF libraries. It would take a little bit of work to put all the strips back together again. It might be something we will do in a future version.

Andrew.
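
Until the library reassembles strips itself, putting them back together by hand could look roughly like the sketch below. It is an illustration under stated assumptions and not Quick PDF API: StripStitcher and Stitch are hypothetical names, each strip is assumed to be already decoded to raw pixel rows, all strips share the same row length in bytes (stride), and the strips arrive in top-to-bottom order.

```csharp
using System;
using System.Collections.Generic;

public static class StripStitcher
{
    // strips: decoded strips in top-to-bottom order, each holding whole rows.
    // stride: bytes per pixel row, identical for every strip.
    // totalRows: receives the height of the stitched image in rows.
    public static byte[] Stitch(IList<byte[]> strips, int stride, out int totalRows)
    {
        if (strips == null) throw new ArgumentNullException("strips");
        if (stride <= 0) throw new ArgumentOutOfRangeException("stride");

        int totalBytes = 0;
        foreach (byte[] strip in strips)
        {
            if (strip == null || strip.Length % stride != 0)
                throw new ArgumentException("Each strip must hold whole rows.");
            totalBytes += strip.Length;
        }

        // Copy the strips one after another into a single buffer.
        byte[] full = new byte[totalBytes];
        int offset = 0;
        foreach (byte[] strip in strips)
        {
            Buffer.BlockCopy(strip, 0, full, offset, strip.Length);
            offset += strip.Length;
        }

        totalRows = totalBytes / stride;
        return full;
    }
}
```

The combined buffer and row count can then be written out as a single bitmap. Whether the extraction order matches the strip order on the page is something to verify for each document.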


