I need help - I can help - OpenOffice PDF-files



Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=464
Printed Date: 11 Jul 26 at 11:47AM

Topic: OpenOffice PDF-files

Posted By: Ingo
Subject: OpenOffice PDF-files
Date Posted: 13 Jul 06 at 4:04AM

Hi!

I've learned from Ulrich that QuickPDF can't extract text content from the newer OpenOffice PDFs...
My thoughts on this: either start from scratch with the Adobe specs, or look at other components that can also extract text content from OpenOffice PDFs. Can anybody here tell me whether iText is the kind of component I mean?

Best regards,
Ingo

Replies:

Posted By: chicks
Date Posted: 13 Jul 06 at 12:09PM

No, iText can't extract text.

The best open-source tool is pdftohtml (which can also extract in XML format). It's based on pdftotext, which is one of the command-line tools available in the xpdf distribution.

http://pdftohtml.sourceforge.net/
http://www.foolabs.com/xpdf/
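
For anyone who wants to try this from a script, here is a minimal sketch of calling pdftotext as an external process. It assumes the xpdf tools are installed and that pdftotext is on the PATH; the function names build_pdftotext_cmd and extract_text are made up for this example and are not part of xpdf or QuickPDF. Passing "-" as the output file tells pdftotext to write to stdout.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, keep_layout=False):
    """Build the argument list for xpdf's pdftotext (hypothetical helper).

    "-enc UTF-8" selects the output encoding, "-layout" tries to keep the
    physical page layout, and "-" as the text file sends output to stdout.
    """
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if keep_layout:
        cmd.append("-layout")
    cmd += [pdf_path, "-"]
    return cmd


def extract_text(pdf_path, keep_layout=False):
    """Run pdftotext and return the extracted text as a string."""
    # Assumption: pdftotext is installed and reachable via PATH.
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        build_pdftotext_cmd(pdf_path, keep_layout),
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Calling extract_text("openoffice-export.pdf") on a test file (the filename is only a placeholder) should show quickly whether the OpenOffice PDFs that fail elsewhere come through readable here.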

Posted By: Ingo
Date Posted: 13 Jul 06 at 5:41PM

Thanks a lot!
I'll try both components... perhaps there's something we can use in QuickPDF.

Best regards,
Ingo