
OpenOffice PDF-files

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=464
Printed Date: 18 May 24 at 11:49PM
Software Version: Web Wiz Forums 11.01 - http://www.webwizforums.com


Topic: OpenOffice PDF-files
Posted By: Ingo
Subject: OpenOffice PDF-files
Date Posted: 13 Jul 06 at 4:04AM
Hi!

From Ulrich I've learned that QuickPDF can't extract text content from new OpenOffice PDFs...
My thoughts on this: either work from scratch from the Adobe specs, or look at other components that can also extract text content from OpenOffice PDFs. Can anybody here say whether iText is the kind of component I mean?

Best regards,
Ingo



Replies:
Posted By: chicks
Date Posted: 13 Jul 06 at 12:09PM
No, iText can't extract text.

The best open-source tool is pdftohtml (which can also extract in XML format). It's based on pdftotext, one of the command-line tools in the xpdf distro.

http://pdftohtml.sourceforge.net/
http://www.foolabs.com/xpdf/
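
For anyone who wants to try pdftotext from their own code, here is a rough sketch in Python that just shells out to the command-line tool. It assumes pdftotext is installed and on the PATH; the helper names (build_pdftotext_cmd, extract_text) are made up for this example and are not part of xpdf or QuickPDF. The options used (-enc, -layout, and "-" as the output file for stdout) are taken from the xpdf pdftotext man page.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, layout=False):
    """Build the argument list for pdftotext; '-' sends the text to stdout."""
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if layout:
        # Ask pdftotext to keep the physical page layout as far as it can.
        cmd.append("-layout")
    cmd += [pdf_path, "-"]
    return cmd


def extract_text(pdf_path, layout=False):
    """Return the text of pdf_path, or None if pdftotext is not installed."""
    if shutil.which("pdftotext") is None:
        return None
    result = subprocess.run(
        build_pdftotext_cmd(pdf_path, layout),
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext reports a failure
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Calling extract_text("openoffice-export.pdf") (a placeholder filename) should give back the text as one string, which makes it easy to compare against what QuickPDF returns for the same file.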


Posted By: Ingo
Date Posted: 13 Jul 06 at 5:41PM
Thanks a lot!
I'll try both components... perhaps there's something we can use in QuickPDF.

Best regards,
Ingo


