
ExtractText stopped working

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=193
Printed Date: 03 May 24 at 1:47AM
Software Version: Web Wiz Forums 11.01 - http://www.webwizforums.com


Topic: ExtractText stopped working
Posted By: bernhardz
Subject: ExtractText stopped working
Date Posted: 30 Nov 05 at 5:38PM

Hi,

I originally posted this message on the QuickPDF forum, but decided to post it here as well.

I switched from PDF documents generated by Crystal 8.5 to ones generated by Crystal 10.0, and GetPageText no longer retrieves the correct text.

There must have been some changes to the PDF output in Crystal 10.0. I have sample documents available. Has anybody else experienced this and possibly found a solution?

TIA

Bernhard




Replies:
Posted By: chicks
Date Posted: 30 Nov 05 at 6:01PM
Don't know for sure, but here are some possibilities:

1. The PDFs now get created as images instead of text streams (are the files considerably larger than before?)

2. The new PDF creation engine uses a newer compression algorithm (zbig?) that QuickPDF may not be able to uncompress.

To test, see if the standard free command-line tools pdftk and pdftotext (the latter is part of xpdf) can uncompress the files and extract text from them.

Also, after uncompressing with pdftk, view the PDFs in a text editor; you should see the text inside unless it's in an image.
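
The test described above can be scripted. What follows is only a minimal sketch in Python, assuming pdftk and pdftotext are installed and on the PATH. The function names are made up for illustration, and it has not been run against Crystal 10.0 output.

```python
import shutil
import subprocess


def uncompress_command(src: str, dst: str) -> list[str]:
    # pdftk rewrites src to dst with its page streams uncompressed
    return ["pdftk", src, "output", dst, "uncompress"]


def extract_text_command(src: str) -> list[str]:
    # "-" as the output name makes pdftotext write to stdout
    return ["pdftotext", src, "-"]


def has_text(extracted: str) -> bool:
    # Image-only pages typically yield nothing but whitespace and form feeds
    return bool(extracted.strip())


def check_pdf(src: str, dst: str) -> bool:
    """Uncompress src into dst, then report whether pdftotext finds any text."""
    for tool in ("pdftk", "pdftotext"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    subprocess.run(uncompress_command(src, dst), check=True)
    result = subprocess.run(
        extract_text_command(src), check=True, capture_output=True
    )
    return has_text(result.stdout.decode("utf-8", errors="replace"))
```

Calling check_pdf("crystal10.pdf", "crystal10_uncompressed.pdf") (both file names are placeholders) leaves an uncompressed copy that can be opened in a text editor. A False result would point towards possibility 1, text stored as images.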



Posted By: Ingo
Date Posted: 01 Dec 05 at 1:43AM
Hi Bernhard,

Try this workaround: LoadFromFile(originalfile)... SaveToFile(workfile)... LoadFromFile(workfile)... GetPageText...
SaveToFile writes the PDF content back to disk using QuickPDF's own routines. Loading the newly saved file a second time should solve your problem, I think.
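
The call order of this workaround can be sketched in Python as below. This is only an illustration under assumptions: the pdf object is whatever binding exposes the library in your language, the parameter lists are guesses (the real LoadFromFile and GetPageText signatures may differ by version, so check the function reference), and RecordingStub is a hypothetical stand-in that only logs calls so the sequence can be dry-run without the library.

```python
class RecordingStub:
    """Hypothetical stand-in for the library object; it only logs calls."""

    def __init__(self):
        self.calls = []

    def LoadFromFile(self, filename):
        self.calls.append(("LoadFromFile", filename))

    def SaveToFile(self, filename):
        self.calls.append(("SaveToFile", filename))

    def GetPageText(self, options):
        self.calls.append(("GetPageText", options))
        return "stub text"


def get_text_via_resave(pdf, original, workfile, options=0):
    """Load, re-save, reload, then extract, in the order given above."""
    pdf.LoadFromFile(original)  # read the Crystal-generated file
    pdf.SaveToFile(workfile)    # let the library rewrite it to disk
    pdf.LoadFromFile(workfile)  # reopen the rewritten copy
    # The 'options' argument is an assumption, not a confirmed signature
    return pdf.GetPageText(options)
```

With the stub, get_text_via_resave(RecordingStub(), "original.pdf", "work.pdf") just records the four calls in order; with the real library object the same sequence would be applied to actual files.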



-------------
Cheers,
Ingo



Posted By: bernhardz
Date Posted: 01 Dec 05 at 11:10AM

Thanks for the help. The LoadFromFile / SaveToFile / LoadFromFile sequence did the trick in getting me back into a format that QuickPDF could read.

Bernhard



