Debenu Quick PDF Library - PDF SDK Community Forum Homepage
text extraction

rajeev
Beginner
Joined: 09 Nov 10
Location: INDIA
Posted: 10 Nov 10 at 2:49PM
Hi,
I used PHP to successfully read lines from the PDF file of a newspaper. The problem is that it reads only character by character or word by word. I would like to read the file paragraph by paragraph. Is there any way to do this?

I could also extract images from the PDF, but I also need the coordinates where each image was placed.
Any help will be appreciated.

Ingo
Moderator Group
Joined: 29 Oct 05
Posted: 10 Nov 10 at 7:32PM
Hi!

With QuickPDF you can extract text from a PDF word by word, string by string, and/or page by page. Have a look at the online reference, accessible via www.quickpdf.org.
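
Since the extraction modes listed above stop at words, strings, and pages, paragraphs have to be rebuilt from the word positions. The following is only a rough sketch in Python, not a QuickPDF feature: it assumes the word-level extraction gives you each word's bounding box in PDF points with the origin at the bottom-left of the page. The `Word` record, `group_lines`, `group_paragraphs`, and the two tolerance values are hypothetical names and guesses made up for illustration. It also ignores multi-column layouts, which a newspaper page would need to have split first.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical record: one extracted word and its bounding box (PDF points,
    # origin at the bottom-left of the page, so larger y means higher on the page).
    text: str
    left: float
    bottom: float
    right: float
    top: float


def group_lines(words, y_tol=2.0):
    """Cluster words into lines by similar bottom edge, top of page first."""
    lines = []
    for w in sorted(words, key=lambda w: (-w.bottom, w.left)):
        if lines and abs(lines[-1][0].bottom - w.bottom) <= y_tol:
            lines[-1].append(w)
        else:
            lines.append([w])
    for line in lines:
        line.sort(key=lambda w: w.left)  # reading order within the line
    return lines


def group_paragraphs(lines, gap_factor=1.5):
    """Start a new paragraph when the drop to the next line is unusually large."""
    paragraphs = []
    prev_bottom = None
    for line in lines:
        height = max(w.top - w.bottom for w in line)
        bottom = line[0].bottom
        if prev_bottom is not None and (prev_bottom - bottom) <= gap_factor * height:
            paragraphs[-1].append(line)
        else:
            paragraphs.append([line])
        prev_bottom = bottom
    return [" ".join(w.text for line in p for w in line) for p in paragraphs]


if __name__ == "__main__":
    sample = [
        Word("New", 72, 660, 95, 670),
        Word("world", 105, 700, 135, 710),
        Word("again.", 72, 688, 105, 698),
        Word("Hello", 72, 700, 100, 710),
        Word("paragraph.", 100, 660, 150, 670),
    ]
    for text in group_paragraphs(group_lines(sample)):
        print(text)
```

The thresholds (`y_tol`, `gap_factor`) depend on the font size and leading of the document, so expect to tune them per newspaper.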

You can get the image coordinates via the relevant media boxes. Read the PDF with QuickPDF, decrypt it, and then read the raw page content (like looking into the PDF with Notepad).
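
To illustrate what "reading the real content" can look like: in a page's content stream, an image is painted with the `Do` operator, and its position and size come from the transformation matrix set by the preceding `cm` operators (an image occupies the unit square scaled and moved by that matrix). Below is a simplified Python sketch, assuming you already have the decoded (decrypted and decompressed) content stream as text; how to obtain it from the library is not shown here. The function names are made up for illustration, and the whitespace tokenizer will break on streams that contain text strings with spaces or inline images. Note also that `Do` paints form XObjects as well as images, so the names found still have to be checked against the page resources.

```python
from collections import deque

IDENTITY = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def concat(m, ctm):
    """Return m x ctm for PDF matrices given as (a, b, c, d, e, f)."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = ctm
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def xobject_placements(content):
    """List (name, x_min, y_min, x_max, y_max) for every `Do` in the stream."""
    ctm = IDENTITY
    stack = []                   # graphics states saved by q, restored by Q
    operands = deque(maxlen=6)   # only the most recent tokens matter here
    found = []
    for tok in content.split():
        if tok == "q":
            stack.append(ctm)
        elif tok == "Q":
            if stack:
                ctm = stack.pop()
        elif tok == "cm":
            if len(operands) == 6:
                ctm = concat(tuple(float(x) for x in operands), ctm)
        elif tok == "Do":
            if operands:
                a, b, c, d, e, f = ctm
                # Corners of the unit square after applying the matrix.
                xs = (e, a + e, c + e, a + c + e)
                ys = (f, b + f, d + f, b + d + f)
                found.append((operands[-1], min(xs), min(ys), max(xs), max(ys)))
        else:
            operands.append(tok)
            continue
        operands.clear()
    return found


if __name__ == "__main__":
    print(xobject_placements("q 200 0 0 100 72 500 cm /Im1 Do Q"))
```

The resulting box is in PDF user space, which by default is measured in points from the bottom-left corner. The page's MediaBox gives the page extents, so it is needed if you want to convert to coordinates measured from the top-left instead.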

Cheers and welcome here,
Ingo