Debenu Quick PDF Library - PDF SDK Community Forum

get images and all sizes?

johnny (Beginner, joined 08 May 19, Location: Earth)
Posted: 18 Jun 19 at 5:31PM
Hi all,

I want to determine whether a PDF uses images as a background or, more generally, go through all of its images and check their sizes.
With FindImages I get the count.
After that, which function lets me loop over the found images and check their sizes?

I am using C#.

Thank you.



Ingo (Moderator Group, joined 29 Oct 05)
Posted: 18 Jun 19 at 9:58PM
Hi Johnny,

This sample could be a good starting point for your code:
https://www.debenu.com/kb/extract-images-from-pdf-files-as-the-appropriate-image-type/

GetImageListItemIntProperty is the most important function for this:
https://www.debenu.com/docs/pdf_library_reference/GetImageListItemIntProperty.php

Good luck and much success :)

Cheers,
Ingo

johnny
Posted: 19 Jun 19 at 10:30AM
Thanks so much, Ingo.

The tricky part was finding these property codes:
401 = Width in pixels
402 = Height in pixels

There are no properly named properties. You have to pass integer code numbers, and that makes them hard to find when searching.
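
One way to make these codes searchable in your own code base is to wrap them in named constants. This is only a minimal sketch: the enum ImageListProperty and the helper ReadSize are hypothetical names, not part of the library, and only the values 401 and 402 come from this thread. The delegate parameter stands in for a call such as DPL.GetImageListItemIntProperty(il, i, code).

```csharp
using System;

// Hypothetical names; only the codes 401 and 402 are taken from the thread.
public enum ImageListProperty
{
    WidthInPixels = 401,
    HeightInPixels = 402
}

public static class ImageListProperties
{
    // getIntProperty stands in for GetImageListItemIntProperty(listId, index, code).
    public static (int Width, int Height) ReadSize(
        Func<int, int, int, int> getIntProperty, int listId, int index)
    {
        int width = getIntProperty(listId, index, (int)ImageListProperty.WidthInPixels);
        int height = getIntProperty(listId, index, (int)ImageListProperty.HeightInPixels);
        return (width, height);
    }
}
```

With the real library this might be called as ImageListProperties.ReadSize(DPL.GetImageListItemIntProperty, il, i), assuming that method takes three int arguments and returns an int as it is used in this thread.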

johnny
Posted: 19 Jun 19 at 11:27AM
By the way, why is GetPageImageList() usually called with 0 and not 1?
In other functions the pages of a PDF document are numbered from 1, not from index 0, but this function seems to start from 0.
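
The thread leaves this question open. One possible explanation, offered as an assumption to check against the GetPageImageList reference page: the argument may be an options value rather than a page number, with the list built for the currently selected page. If that is the case, inspecting every page means selecting each page in turn. The sketch below shows such a loop against a hypothetical IPdfImageSource interface. Its members mirror library calls by name, but the interface, the exact signatures, and the FakePdf stand-in are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over the PDF library; signatures are assumptions.
public interface IPdfImageSource
{
    int PageCount();
    int SelectPage(int pageNumber);       // pages assumed to be numbered from 1
    int GetPageImageList(int options);    // assumed: list for the selected page
    int GetImageListCount(int imageListId);
    int GetImageListItemIntProperty(int imageListId, int index, int propertyCode);
}

public static class ImageSizeScanner
{
    private const int WidthInPixels = 401;   // codes quoted in this thread
    private const int HeightInPixels = 402;

    // Returns (page, width, height) for every image on every page.
    public static List<(int Page, int Width, int Height)> CollectSizes(IPdfImageSource pdf)
    {
        var sizes = new List<(int Page, int Width, int Height)>();
        for (int page = 1; page <= pdf.PageCount(); page++)
        {
            pdf.SelectPage(page);
            int list = pdf.GetPageImageList(0);
            int count = pdf.GetImageListCount(list);
            for (int i = 1; i <= count; i++)  // list items start at 1, as in the thread's code
            {
                int width = pdf.GetImageListItemIntProperty(list, i, WidthInPixels);
                int height = pdf.GetImageListItemIntProperty(list, i, HeightInPixels);
                sizes.Add((page, width, height));
            }
        }
        return sizes;
    }
}

// In-memory stand-in used only to demonstrate the loop without the real library.
public sealed class FakePdf : IPdfImageSource
{
    private readonly (int W, int H)[][] pages;
    private int selected = 1;

    public FakePdf((int W, int H)[][] pages) { this.pages = pages; }

    public int PageCount() => pages.Length;
    public int SelectPage(int pageNumber) { selected = pageNumber; return 1; }
    public int GetPageImageList(int options) => selected;  // list id = page number
    public int GetImageListCount(int imageListId) => pages[imageListId - 1].Length;
    public int GetImageListItemIntProperty(int imageListId, int index, int propertyCode)
    {
        var image = pages[imageListId - 1][index - 1];
        return propertyCode == 401 ? image.W : image.H;
    }
}
```

A thin adapter class implementing IPdfImageSource around the real library object would let the same loop run on actual files, provided the assumed signatures hold.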

johnny
Posted: 19 Jun 19 at 3:23PM
Since sharing is caring, here is my function, in case it helps someone with a similar question in the future.

// Requires: using System; using System.Globalization;
// DPL is the Quick PDF Library instance and ValidationErrorMsg is a string member of the containing class.
private bool IsValidPDF(string filePath)
{
    // Force consistent handling of digit grouping and decimal separators (el-GR style, e.g. 1.234,56).
    CultureInfo.DefaultThreadCurrentCulture = new CultureInfo("el-GR", false);

    ValidationErrorMsg = string.Empty;

    // Fail if the file is corrupted, in use, or cannot be loaded for any other reason.
    if (DPL.LoadFromFile(filePath, "") != 1)
    {
        ValidationErrorMsg = string.Format("Error code {0} while loading file {1}", DPL.LastErrorCode(), filePath);
        return false;
    }

    // No font resources means no text at all, so the file is a pure image/photo/scan.
    if (DPL.HasFontResources() == 0)
    {
        ValidationErrorMsg = "This file doesn't contain any text!";
        return false;
    }

    // Reject files with an image big enough to be a page background: such a file
    // is not a proper text PDF and should be handled by an OCR scan instead.
    // Note: only one image list is examined here; multi-page files may need a loop over pages.
    int il = DPL.GetPageImageList(0);
    int lc = DPL.GetImageListCount(il);

    for (int i = 1; i <= lc; i++)
    {
        int imgWidth = DPL.GetImageListItemIntProperty(il, i, 401);   // 401 = width in pixels
        int imgHeight = DPL.GetImageListItemIntProperty(il, i, 402);  // 402 = height in pixels

        if (imgWidth > 1500 || imgHeight > 1500)
        {
            ValidationErrorMsg = "This file seems like an image!";
            return false;
        }
    }

    return true;
}

Ingo
Posted: 19 Jun 19 at 7:21PM
A small hint:
I have test PDFs from architects with a lot of text and numbers on the pages and embedded CAD images with extreme dimensions.
A text PDF with standard page dimensions can contain an image that looks small on the page but has huge pixel dimensions after extraction.

Cheers,
Ingo
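
Ingo's hint suggests that pixel dimensions alone say little about how large an image appears on the page. A possible refinement, sketched here under stated assumptions, is to compare the area an image occupies on the page with the page area instead of looking at pixel counts. How to obtain the placed size of an image from the library is not covered in this thread, so the placed width and height (in points) are plain inputs. The class name, the method names, and the 0.8 threshold are hypothetical.

```csharp
using System;

public static class ImageCoverage
{
    // Fraction of the page area covered by an image drawn at the given size.
    // All four values must use the same unit (for example points).
    public static double CoverageRatio(double placedWidth, double placedHeight,
                                       double pageWidth, double pageHeight)
    {
        if (pageWidth <= 0 || pageHeight <= 0)
            throw new ArgumentException("Page size must be positive.");

        // Clamp to the page so an oversized placement counts as at most 100%.
        double w = Math.Min(Math.Max(placedWidth, 0), pageWidth);
        double h = Math.Min(Math.Max(placedHeight, 0), pageHeight);
        return (w * h) / (pageWidth * pageHeight);
    }

    // Pixels per inch of an image at its placed width (72 points = 1 inch).
    public static double EffectiveDpi(int widthInPixels, double placedWidthInPoints)
    {
        if (placedWidthInPoints <= 0)
            throw new ArgumentException("Placed width must be positive.");
        return widthInPixels / (placedWidthInPoints / 72.0);
    }

    // Hypothetical rule: treat the page as a scan when one image covers most of it.
    public static bool LooksLikeFullPageScan(double placedWidth, double placedHeight,
                                             double pageWidth, double pageHeight,
                                             double threshold = 0.8)
    {
        return CoverageRatio(placedWidth, placedHeight, pageWidth, pageHeight) >= threshold;
    }
}
```

For example, on an A4 page of roughly 595 x 842 points, a 6000 x 4000 pixel CAD drawing placed at 144 x 96 points covers under 3% of the page and would not be flagged, although its pixel width is far above 1500. An image placed at the full 595 x 842 points would be flagged regardless of its pixel count.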

johnny
Posted: 19 Jun 19 at 7:49PM
That is OK, because I am only interested in PDFs that are A4 or similar-sized invoices for my app. There is no chance a CAD drawing will be used for this, just invoices that are scanned and imported automatically into many popular accounting applications :)