Remove Header/Footer from GetPageText()?

cdoan (Beginner)
Posted: 13 Sep 11 at 7:21PM
Hello,

   I have a document with headers and footers on each page (e.g., page number, chapter title).  I've noticed that GetPageText() returns the header AND footer text first, followed by the body of the page.  Is there any way for QuickPDF to identify the header and footer so I can ignore them when using GetPageText()?

   If I can't easily identify the header/footer from GetPageText(), I guess the next best thing would be to physically remove the header and footer from the PDF itself, then run GetPageText().  In that case, can someone point me in the right direction for manipulating the headers/footers?  I looked around for something that resembled a header, but maybe they are called something else in PDFland.  Thanks!


cdoan

AndrewC (Moderator Group)
Posted: 14 Sep 11 at 8:41AM
Most PDF files have no concept of headers, footers, paragraphs or even words.  PDFs are built from a series of drawing commands, which can appear in any order, like the pieces of a jigsaw puzzle.

If the header and footer always occupy the same areas of the page, you could remove all the text objects on the page whose bounding boxes fall inside those pre-defined areas.
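
As a rough sketch of that idea (plain Python, not Quick PDF Library code): assume you can already get each text fragment on a page together with its bounding box. The names Rect, TextBlock and drop_blocks_inside are made up for illustration, the coordinates are assumed to be in points with the origin at the bottom-left of the page, and the 72-point header and footer bands are arbitrary values you would tune for your own document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in page points, bottom-left origin (assumed)."""
    left: float
    bottom: float
    right: float
    top: float

    def contains(self, other: "Rect") -> bool:
        """True if `other` lies entirely inside this rectangle."""
        return (self.left <= other.left and self.bottom <= other.bottom
                and self.right >= other.right and self.top >= other.top)


@dataclass(frozen=True)
class TextBlock:
    """Hypothetical text fragment paired with its bounding box."""
    text: str
    box: Rect


def drop_blocks_inside(blocks, excluded_areas):
    """Return the blocks whose bounding box is not fully inside any excluded area."""
    return [b for b in blocks
            if not any(area.contains(b.box) for area in excluded_areas)]


# Example page: US Letter (612 x 792 points) with made-up 72-point bands.
header_area = Rect(0, 720, 612, 792)
footer_area = Rect(0, 0, 612, 72)

blocks = [
    TextBlock("Chapter 3", Rect(72, 750, 200, 762)),       # inside the header band
    TextBlock("Body text line", Rect(72, 600, 540, 612)),  # in the page body
    TextBlock("Page 12", Rect(290, 30, 320, 42)),          # inside the footer band
]

kept = drop_blocks_inside(blocks, [header_area, footer_area])
print([b.text for b in kept])  # ['Body text line']
```

A block that only partly overlaps a band is kept here, so a tall heading that pokes into the header area would survive. Whether that is the right call depends on the document, which is where the corner cases are likely to come from.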

cdoan (Beginner)
Posted: 14 Sep 11 at 4:28PM
AndrewC,

    That (unfortunately) makes a lot of sense.  I'll give the fixed bounding box idea a go... there might be some corner cases, but I'll deal with them individually.  Thanks!


cdoan.