Testing QuickPDF for text extraction performance
pcunite (Beginner) Joined: 15 Feb 12, Location: USA, Status: Offline, Points: 4
Posted: 15 Feb 12 at 8:40PM
I am evaluating the QuickPDF library (.dll version) for use in a C++ application. The only functionality I need is text extraction. The entire PDF's text will be placed in memory and then I'll search it for keyword terms.
Is QuickPDF suitable for this type of work, and does it offer good performance?
Ingo (Moderator Group) Joined: 29 Oct 05, Status: Offline, Points: 3529
So you didn't read the documentation on the publisher's original support pages ;-)
Hi! The searching can be done in your programming language, and the text extraction can be done with QuickPDF, which offers several kinds of options. As for your performance question: it always depends on ... Try it ;-)
http://www.quickpdflibrary.com/help/quickpdf/Extraction.php
Cheers and welcome here, Ingo
pcunite (Beginner) Joined: 15 Feb 12, Location: USA, Status: Offline, Points: 4
Well, yes, I've read some of the materials. I'm looking at about 5 different solutions and wanted my hand held a little :)
I know how to use your library, I just wanted a fuzzy feeling that it is up to the task for my requirements. Some PDF libraries are more for creation or editing ... I just want the text as fast as I can get it. Is QuickPDF optimized for this?
P.S. I did not find it referenced anywhere, but can the .LIB version work with C++ Builder 2007, or is that only for Visual Studio? The .DLL version is fine, just wondering.
Ingo (Moderator Group) Joined: 29 Oct 05, Status: Offline, Points: 3529
Hi!
This library offers over 500 functions for a low price. Text extraction was already in the first versions many years ago, so it should be stable, but it isn't specially optimized for text extraction. Personal opinions will differ, so you have to try it.
Cheers, Ingo
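One way to "try it" is to time the extraction call on a few representative PDFs. Below is a minimal sketch of a timing helper in standard C++. `averageMillis` is a hypothetical name, not part of QuickPDF, and the work being timed is passed in as a callable so that any of the candidate libraries can be plugged in.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not a QuickPDF function): runs `work` `runs` times
// and returns the average wall-clock time per run in milliseconds.
double averageMillis(const std::function<void()>& work, int runs) {
    assert(runs > 0);  // averaging over zero runs is meaningless
    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = Clock::now();
    for (int i = 0; i < runs; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count() / runs;
}
```

To compare libraries, wrap each one's "open file, extract all text, close file" sequence in a lambda and pass it to `averageMillis` with the same input files. Repeated runs on the same file will partly measure the operating system's file cache, so the first run may be the more realistic number.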
pcunite (Beginner) Joined: 15 Feb 12, Location: USA, Status: Offline, Points: 4
Thank you for your help. I'm testing the sample function below. Is this the fastest way? I just want to make sure I'm doing all I can. Also, I don't understand DASetTextExtractionOptions ... should I use it to optimize anything?
Edited by pcunite - 15 Feb 12 at 10:21PM
Ingo (Moderator Group) Joined: 29 Oct 05, Status: Offline, Points: 3529
sTxt += QP.DAExtractPageText(FH, PR, 0);
Option 0 should be the fastest. Whether 0 is useful for you depends on what you want to do with the text.
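When the goal is only to find a keyword, one possible saving is to search each page as it is extracted and stop at the first hit, instead of appending every page to one large string. The sketch below assumes this is acceptable for the use case. `extractPage` is a hypothetical callback standing in for a call such as `QP.DAExtractPageText(FH, PR, 0)`; how the page reference is obtained and which string type the library wrapper returns are not shown in this thread, so both are assumptions here.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical callback type: returns the text of the given 1-based page.
// In real code this would wrap the library's page-extraction call.
using PageExtractor = std::function<std::string(int)>;

// Returns the first 1-based page whose text contains `keyword`,
// or 0 if no page contains it. Stops extracting after the first hit.
int firstPageContaining(const PageExtractor& extractPage,
                        int pageCount,
                        const std::string& keyword) {
    assert(pageCount >= 0);
    if (keyword.empty()) {
        return 0;  // nothing meaningful to search for
    }
    for (int page = 1; page <= pageCount; ++page) {
        const std::string text = extractPage(page);
        if (text.find(keyword) != std::string::npos) {
            return page;  // early exit: later pages are never extracted
        }
    }
    return 0;
}
```

The trade-off is that a keyword split across a page boundary would be missed, which the single-buffer approach would catch.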
pcunite (Beginner) Joined: 15 Feb 12, Location: USA, Status: Offline, Points: 4
I only want to know if the word "blah" appears in the PDF file. I understand that I can't do this with image-only PDF files ... that is okay. So I load all the strings into a buffer and then search it myself for "blah".
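For the search step itself, `std::string::find` is enough for an exact match. If "Blah" and "BLAH" should also count, a case-insensitive search over the buffer can be sketched as follows. This assumes the extracted text has been placed in a narrow `std::string` and that ASCII case folding is sufficient; `containsIgnoreCase` is a hypothetical helper name, and wide or non-ASCII text would need different handling.

```cpp
#include <algorithm>
#include <cassert>  // only needed for the example checks in the comment below
#include <cctype>
#include <string>

// Hypothetical helper: true if `needle` occurs in `haystack`,
// ignoring ASCII letter case. An empty needle always matches.
bool containsIgnoreCase(const std::string& haystack, const std::string& needle) {
    if (needle.empty()) {
        return true;
    }
    // Compare as unsigned char so std::tolower is never given a negative value.
    auto sameChar = [](unsigned char a, unsigned char b) {
        return std::tolower(a) == std::tolower(b);
    };
    return std::search(haystack.begin(), haystack.end(),
                       needle.begin(), needle.end(),
                       sameChar) != haystack.end();
}

// Example checks:
//   assert(containsIgnoreCase("Some Blah text", "blah"));
//   assert(!containsIgnoreCase("no match here", "blah"));
```

This matches substrings, so "blah" would also be found inside "blahblah"; whole-word matching would need an extra boundary check.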
AndrewC (Moderator Group) Joined: 08 Dec 10, Location: Geelong, Aust, Status: Offline, Points: 841
Andrew.