Extracting images (png/tiff issues)

samb (Beginner, joined 09 Feb 11), posted 09 Feb 11 at 1:11AM:
Environment:
C# .NET 2.0
Quick PDF 7.23 DLL

I am attempting to create and read image-only PDF files.
Creating the files seems to work fine with every image format I have tried (PNG, JPEG, and TIFF). I can open the files in Acrobat Reader and everything looks fine, but reading them back through code is giving me some trouble.

Code for writing the PDF files (formatting and non-relevant code removed):

foreach (byte[] imageContents in imageList)
{
    int imageID = pdf.AddImageFromString(imageContents, 0);
    if (imageID == 0) throw new Exception("Error on AddImageFromString");

    if (pdf.DrawImage(0, 0, (float)pageWidth, (float)pageHeight) == 0) throw new Exception("Error on DrawImage");

    pdf.NewPage();
}
pdf.SaveToFile(filePath);
pdf.RemoveDocument(outputFileHandle);


Current code for extracting the images:

int pdfHandle = pdf.LoadFromFile(filePath);
pdf.SetFindImagesMode(1);
int imageCount = pdf.FindImages();
for (int i = 1; i <= imageCount; i++)
{
    int imageID = pdf.GetImageID(i);
    pdf.SelectImage(imageID);
    byte[] img = pdf.SaveImageToString();
    File.WriteAllBytes(@"C:\Testing\out" + i + fileExt, img);
}


If the PDF file contains only JPEG images, I receive the images as expected.
If the PDF file contains only TIFF (CCITT Group 4) images, I receive a large number of ~35-pixel-high slices of the images.
If the PDF file contains only PNG or TIFF (LZW) images, I receive the images as 24-bit bitmaps (even if the original PNG was monochrome).

I see that PNG is not defined in ImageType, which would explain why it doesn't work.

So, my question:
  1. Is it possible to retrieve the original PNG images using some other method? 

    I saw that the renderpage function can return a PNG image, but because I may be scaling the original images when drawing them on the pages, I don't know that it would be an accurate representation of the original image.

    PNG is my preferred image format. If there is no way, I may consider using JPEG for color images and CCITT Group 4 TIFF for black and white images, provided I can figure out how to retrieve full images instead of slices for the TIFFs.
Thanks,

samb (Beginner, joined 09 Feb 11), posted 09 Feb 11 at 10:04PM:
I've made some progress.

Somehow I missed the image extraction code sample:
http://www.quickpdf.org/forum/extract-text-and-images-and-insert-into-new-pdf_topic1308.html

From that example, it looks like I may be better off importing and exporting as bitmap and letting Quick PDF deal with the compression. Is this the preferred method?

Some new issues:
  1. In all cases, the retrieved images seem to have their original DPI setting stripped. When adding images to a document, I use the DPI to calculate the page size. I should be able to get around this by manually setting the DPI based on the size of the image on the PDF page.
    I assume that a PDF file has no use for an image's DPI, which would explain why it is absent, but I figured I would ask whether there is a way to get the original value.
    [Edit]Found in the docs that DPI is only available for some image types[/Edit]

  2. If I generate a PDF file from a monochrome bitmap, I can extract it as a monochrome bitmap. If I generate a PDF file from a 24-bit color bitmap, I can extract it as a 24-bit color bitmap.
    Anything in between (256 color, 8-bit grayscale, etc.) gets extracted as a 24-bit color bitmap.
    Is there some way to either extract an image with the original color depth, or at least query the image to see what the color depth is so I can convert it manually?
Thanks again,


Edited by samb - 14 Feb 11 at 9:54PM
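
On the color-depth question above: when an image comes back as 24-bit data, one workaround is to inspect the pixels and decide whether they are really black and white, grayscale, or color, and then convert accordingly. The following is only a minimal sketch under stated assumptions. It expects the pixels already decoded into a tightly packed buffer of 3 bytes per pixel with any row padding removed, and the names EffectiveDepth, DepthProbe and Classify are made up for illustration; none of them are part of Quick PDF Library.

```csharp
using System;

// Hypothetical names for illustration only; not part of Quick PDF Library.
public enum EffectiveDepth { Monochrome, Grayscale, Color }

public static class DepthProbe
{
    // rgb: decoded 24-bit pixels, 3 bytes per pixel, no row padding (assumption).
    // Channel order (RGB or BGR) does not matter for this check.
    public static EffectiveDepth Classify(byte[] rgb)
    {
        if (rgb == null || rgb.Length % 3 != 0)
            throw new ArgumentException("Expected 3 bytes per pixel.", "rgb");

        bool onlyBlackAndWhite = true;
        for (int i = 0; i < rgb.Length; i += 3)
        {
            byte a = rgb[i], b = rgb[i + 1], c = rgb[i + 2];
            if (a != b || b != c)
                return EffectiveDepth.Color;   // channels differ, so this is real color
            if (a != 0 && a != 255)
                onlyBlackAndWhite = false;     // a gray level between black and white
        }
        return onlyBlackAndWhite ? EffectiveDepth.Monochrome : EffectiveDepth.Grayscale;
    }
}
```

A result of Monochrome or Grayscale suggests the image can be converted down to 1-bit or 8-bit without loss. A 256-color palette image would still be reported as Color by this check, since telling it apart would need a count of distinct colors.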

vinod_pathak (Beginner, joined 14 May 11, location: India), posted 16 May 11 at 7:24AM:
Hi!

How can I extract an image from a layer of a PDF file using QP in C#?

Ingo (Moderator Group, joined 29 Oct 05), posted 16 May 11 at 9:48AM:
Hi Vinod!

As I've already said, it's not possible to extract images by layer.
You can extract the images, but which layer they are on plays no part in the extraction.

Cheers, Ingo
 

samb (Beginner, joined 09 Feb 11), posted 16 May 11 at 1:23PM:
As to the original issue:
I've learned more about PDF and image formats, and have settled on JPEG for color. I found methods to perform lossless rotation on JPEG images, so that format has become acceptable. I have still not been able to get striped CCITT G4 images (which .NET defaults to in Windows 7) to work correctly, but it looks like the Quick PDF team has made some serious changes to CCITT handling since 7.23, so I'm hoping to get some free time to try out the latest version. In the meantime, using black and white bitmaps with Quick PDF compression has been working fine.

I have been using some direct access (DA) methods to get the image's drawn size to calculate the DPI, but again it looks like some of the new features may be able to do this with an in-memory file.
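
The DPI calculation mentioned above is plain arithmetic once the pixel dimensions and the drawn size are known. This is a rough sketch, assuming the drawn size is measured in default PDF user-space units of 72 points per inch (no scaling applied). DpiMath and EffectiveDpi are hypothetical names used for illustration, not library functions.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Quick PDF Library.
public static class DpiMath
{
    // Default PDF user space uses 72 points per inch (assumes no scaling is in effect).
    private const double PointsPerInch = 72.0;

    // pixels: image width (or height) in pixels.
    // drawnPoints: width (or height) of the image as drawn on the page, in points.
    public static double EffectiveDpi(int pixels, double drawnPoints)
    {
        if (pixels <= 0) throw new ArgumentOutOfRangeException("pixels");
        if (drawnPoints <= 0) throw new ArgumentOutOfRangeException("drawnPoints");
        return pixels * PointsPerInch / drawnPoints;
    }
}
```

For example, an image 2550 pixels wide drawn across 612 points (8.5 inches) works out to 300 DPI. Computing the value separately for width and height catches images that were stretched unevenly when drawn.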

AndrewC (Moderator Group, joined 08 Dec 10, location: Geelong, Aust), posted 01 Jul 11 at 3:06PM:
We have been changing the import options for TIFF images. If a TIFF image is compressed with Group 3 Fax, Group 4 Fax or LZW, then we import the data into the PDF as is, without having to decompress and recompress it. This means that G4 TIFFs are stored efficiently using G4 compression in the PDF.

Image extraction is an area we need to do some more work on, as we currently only export 24-bit images regardless of the pixel depth.

The 3-pixel-high image strips come from PDFs whose images were imported from striped TIFF files. QPL supports this type of format, as do other PDF libraries. It would take a little bit of work to put all the strips back together again. It might be something we will do in a future version.

Andrew.
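
Until strips are reassembled by the library, putting them back together by hand amounts to stacking them vertically. The sketch below is only an illustration under several assumptions: each extracted strip has already been decoded to tightly packed 24-bit pixels with rows stored top to bottom, all strips share one width, and the strips are supplied in top-to-bottom page order (whether the extraction order matches that is not confirmed here). RawImage and StripStitcher are hypothetical names, not library types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; not part of Quick PDF Library.
public sealed class RawImage
{
    public readonly int Width;
    public readonly int Height;
    public readonly byte[] Pixels; // 3 bytes per pixel, rows top to bottom, no padding

    public RawImage(int width, int height, byte[] pixels)
    {
        if (width <= 0) throw new ArgumentOutOfRangeException("width");
        if (height <= 0) throw new ArgumentOutOfRangeException("height");
        if (pixels == null || pixels.Length != width * height * 3)
            throw new ArgumentException("Pixel buffer does not match dimensions.", "pixels");
        Width = width;
        Height = height;
        Pixels = pixels;
    }
}

public static class StripStitcher
{
    // Stacks the strips in list order, first strip at the top.
    public static RawImage StitchVertically(IList<RawImage> strips)
    {
        if (strips == null || strips.Count == 0)
            throw new ArgumentException("At least one strip is required.", "strips");

        int width = strips[0].Width;
        int totalHeight = 0;
        foreach (RawImage strip in strips)
        {
            if (strip.Width != width)
                throw new ArgumentException("All strips must share the same width.", "strips");
            totalHeight += strip.Height;
        }

        // With unpadded rows of equal width, stacking is a straight concatenation.
        byte[] merged = new byte[width * totalHeight * 3];
        int offset = 0;
        foreach (RawImage strip in strips)
        {
            Buffer.BlockCopy(strip.Pixels, 0, merged, offset, strip.Pixels.Length);
            offset += strip.Pixels.Length;
        }
        return new RawImage(width, totalHeight, merged);
    }
}
```

The width check is there to catch strips that belong to different images, which matters when a document holds more than one striped image.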