
GetPageText returns no string

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=1762
Printed Date: 12 May 25 at 11:20PM


Topic: GetPageText returns no string
Posted By: essex
Date Posted: 09 Mar 11 at 7:28AM
I am trying to use ASP code to extract text from a PDF file: http://61.153.6.9/qs.pdf

Dim QP, ss

Set QP = Server.CreateObject("QuickPDFAX0724.PDFLibrary")
Call QP.UnlockKey("j59XXXXXXXX9b18gf9wu9xj4y")

' Stop if the file could not be loaded
If QP.LoadFromFile(Server.MapPath("qs.pdf")) = 0 Then
   Response.Write "cannot load file"
   Response.End
End If

Call QP.SelectPage(2)    ' the first page is an image
ss = QP.GetPageText(4)
Response.Write "text:" & ss
 
The PDF file contains Chinese characters. I guess Quick PDF cannot detect the correct font type or encoding.
 
What can I do?
I hope you can help, thanks!
 
 



Replies:
Posted By: Ingo
Date Posted: 09 Mar 11 at 10:24AM
Hi Essex!

I could extract the content without any problems.
To get the correct Chinese characters you should
try a Unicode conversion (UTF-8) during extraction.

Cheers and welcome here,
Ingo
 


Posted By: essex
Date Posted: 09 Mar 11 at 11:15AM
Thanks, Ingo!
However, it seems it returned some characters, but not all of them.
 
 


Posted By: Ingo
Date Posted: 09 Mar 11 at 12:02PM
So do the conversion ;-)



Posted By: essex
Date Posted: 10 Mar 11 at 2:29AM
I checked the result later and found that what the code returned was not the text content, but the font names and location info.


