
Extract text from PDF in Database

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=3527
Printed Date: 28 Mar 24 at 3:24PM
Software Version: Web Wiz Forums 11.01 - http://www.webwizforums.com


Topic: Extract text from PDF in Database
Posted By: chrisreed
Subject: Extract text from PDF in Database
Date Posted: 21 Nov 17 at 7:51AM
I have stored the contents of PDF files in a varbinary(max) column in an MS SQL Database.
Is there a function in QuickPDF that allows me to extract text from the string data returned by an SQL query on this column while it is in memory? i.e. operate directly on the string data rather than on a PDF file?
 
That is, once I have the data from the database, I want to avoid having to:
1) Save it as a PDF file.
2) Open the file using, say, the DAOpenFile function.
3) Extract the text using, say, the DAExtractPageText function.
4) Close the file and then delete it.
 
I have over 14,000 PDF files to check, so having to save, open and delete a PDF file for each record in the database would take a lot of time.
 
Chris



Replies:
Posted By: Ingo
Date Posted: 21 Nov 17 at 9:59PM
Hi Chris,

In any case you need LoadFromString to build the internal PDF structure.
Then you can work on it with one of the text extraction functions (there is no way to avoid this step).
Optionally, if you have made changes, you can call SaveToString and write the result back into your database, which saves your changes without creating a new PDF file.
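A minimal sketch of that flow in Python, for illustration only. It assumes `qp` is an already created and unlocked Quick PDF Library object; how that object is obtained is not shown. The function name `extract_text_in_memory` is hypothetical. The library calls used (LoadFromString, SelectedDocument, PageCount, SelectPage, GetPageText, RemoveDocument) are taken from the library reference, but their exact parameters, such as the password argument and the meaning of extract option 0, may differ between versions and editions, so check them against your own version.

```python
def extract_text_in_memory(qp, pdf_bytes, password=""):
    """Return the text of all pages of a PDF that is held in memory.

    `qp` is assumed to be an unlocked Quick PDF Library object (assumption).
    """
    # Build the internal PDF structure straight from the data, no temp file.
    # Assumption: LoadFromString returns 1 on success.
    if qp.LoadFromString(pdf_bytes, password) != 1:
        raise ValueError("Quick PDF Library could not load the PDF data")

    doc_id = qp.SelectedDocument()
    try:
        pages = []
        for page in range(1, qp.PageCount() + 1):
            qp.SelectPage(page)
            # Assumption: option 0 requests plain text output.
            pages.append(qp.GetPageText(0))
        return "\n".join(pages)
    finally:
        # Release the document so memory does not grow across many records.
        qp.RemoveDocument(doc_id)
```

Removing the document after each extraction matters when thousands of records are processed in one run.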



-------------
Cheers,
Ingo



Posted By: chrisreed
Date Posted: 22 Nov 17 at 5:12AM
Thanks Ingo, but I could not find the LoadFromString method. I have only the following:
LoadFromCanvasDC, LoadFromFile, LoadFromVariant, LoadState
 
I am running version 9.16, so perhaps this is a method in a newer version?
 
Chris


Posted By: mLipok
Date Posted: 23 Nov 17 at 10:51AM
If you are using ActiveX then you should use
LoadFromVariant
SaveToVariant

That is what I do in many projects when reading PDF content from, or saving it to, MS SQL databases.
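A rough sketch of how a batch run over the database records could look with the ActiveX edition, written in Python for illustration. The helper name `extract_all` and the `(record_id, blob)` row shape are hypothetical. `qp` is assumed to be an already created and unlocked ActiveX object, and `rows` any iterable of query results. The second argument to LoadFromVariant (a password) and extract option 0 for GetPageText are assumptions that should be checked against the reference for your version.

```python
def extract_all(qp, rows):
    """Yield (record_id, text) per row; text is None if the PDF fails to load.

    `rows` is any iterable of (record_id, blob) pairs, for example the
    result of a query on the varbinary(max) column (assumed shape).
    """
    for record_id, blob in rows:
        # Assumption: LoadFromVariant returns 1 on success.
        if qp.LoadFromVariant(blob, "") != 1:
            yield record_id, None
            continue

        doc_id = qp.SelectedDocument()
        try:
            texts = []
            for page in range(1, qp.PageCount() + 1):
                qp.SelectPage(page)
                texts.append(qp.GetPageText(0))  # 0 = plain text (assumed)
            text = "\n".join(texts)
        finally:
            # Drop the document before moving on to the next record.
            qp.RemoveDocument(doc_id)
        yield record_id, text
```

Because the function is a generator, records are processed one at a time and no file is ever written to disk.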



-------------
Here you can find description how to test my examples:
http://www.quickpdf.org/forum/forum_posts.asp?TID=2932&PID=12600&title=drawcapturedpagematrix-matrix-howto#12600


Posted By: Ingo
Date Posted: 23 Nov 17 at 9:52PM
http://www.debenu.com/docs/pdf_library_reference/LoadFromString.php



-------------
Cheers,
Ingo



