Copying/Searching Unicode chars in a PDF
stakon
Team Player Joined: 09 Oct 09 Status: Offline Points: 22 |
Posted: 08 Jan 10 at 11:49AM |
Good day and happy new year!
Today I came across a problem I had never noticed before: when I select some Unicode (Greek) text in a PDF document and copy and paste it into a Word or TXT document, I get unreadable characters. I guess this is also why a search within the PDF for Greek characters returns no results. As a note, I used embedded fonts when creating my documents:
cour_gr = pdf_dll->AddTrueTypeFont("Courier New {1253}",1); // add a Greek Courier font and embed it
pdf_dll->SelectFont(cour_gr);
Any ideas on the matter would be great! Thanks in advance, stakon
Ingo
Moderator Group Joined: 29 Oct 05 Status: Offline Points: 3524 |
Hi Stakon!
The PDF uses Unicode to display these characters. Where do you paste the copied Unicode characters? Is that field or text component Unicode-capable? If you select a Unicode filename (one that shows Japanese characters in Explorer) with an old Delphi component, you only get a placeholder (perhaps a box or a ?) in place of each Japanese character. Another test: copy some Unicode text from an Arabic PDF document into Notepad and save it as Unicode. Open it again and the characters are still the same. Now save the Notepad document as ANSI. Open it again and the characters are no longer the same, only ugly placeholders. So if you want to check the copied content of a Unicode PDF, you need components that can handle Unicode content. Have a look at WideString ;-) Cheers, Ingo
stakon
Hello Ingo,
thanks for the info. Unfortunately nothing I tried works (saving the TXT in different formats, etc.). As for your question, I am pasting the PDF text into TXT files and Word files. Even if I paste it here in this reply text box, the same weird text is displayed.
The text in the PDF appears like this: "ΔΙΑΣΤΑΣΙΟΛΟΓΗΣΗ ΔΟΚΩΝ ΣΤΑΘΜΗΣ"
After copying it from the PDF and pasting it anywhere: "ÄÉÁÓÔÁÓÉÏËÏÃÇÓÇ ÄÏÊÙÍ ÓÔÁÈÌÇÓ"
PS: I am using the DLL version of QuickPDF and developing in Visual C++.
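For what it's worth, the garbled paste above is consistent with Greek text stored as single bytes in Windows code page 1253 (presumably what the {1253} in the font name refers to) and then displayed as if it were a Western code page: Δ is byte 0xC4 in code page 1253, and 0xC4 shows up as Ä in code page 1252. The following is only a minimal sketch of that reading and is not part of the QuickPDF API. decodeCp1253 is a hypothetical helper, it assumes a C++11 compiler, and it handles only ASCII plus the 0xC0 to 0xFE letter block of code page 1253.

```cpp
#include <string>

// Hypothetical helper (not a QuickPDF function): interpret each byte as
// Windows code page 1253 and return the matching Unicode code points.
// Simplification: only ASCII and the Greek letter block 0xC0..0xFE are
// mapped; every other byte becomes U+FFFD (the replacement character).
std::u32string decodeCp1253(const std::string& bytes) {
    std::u32string out;
    for (char ch : bytes) {
        const unsigned char b = static_cast<unsigned char>(ch);
        if (b < 0x80) {
            out.push_back(static_cast<char32_t>(b));          // plain ASCII
        } else if (b >= 0xC0 && b <= 0xFE && b != 0xD2) {
            // In this block the Greek letters sit at a fixed offset from
            // the byte value; 0xD2 is unassigned in code page 1253.
            out.push_back(static_cast<char32_t>(b + 0x2D0));
        } else {
            out.push_back(static_cast<char32_t>(0xFFFD));     // not mapped here
        }
    }
    return out;
}
```

Feeding it the bytes 0xC4 0xC9 0xC1, which display as ÄÉÁ in a Western code page, gives U+0394 U+0399 U+0391, i.e. ΔΙΑ, the first three letters of the original text.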
manuel76413
Beginner Joined: 31 Dec 09 Status: Offline Points: 4 |
Unicode characters are very difficult to work with in the QuickPDF library.
I have the same problem.
Ingo
Hi!
Put the values returned by QuickPDF into WideString fields and it will work. Which version of VC++ are you using? If you select a file whose name contains Cyrillic or Asian characters in your VC++ app and show the filename in an edit field (in your app), what do you see? If you don't see the correct filename, then the problem is your IDE and not QuickPDF. I'm working with Delphi 2007 (no Unicode support) and Free Pascal/Lazarus (with Unicode support). Calling the QuickPDF routines from Free Pascal with WideStrings works fine for me. Cheers, Ingo
stakon
Hi again!
I am using Visual C++ 2005 + SP. What exactly do you mean by WideString fields? If I simply paste text from my PDF into any text box, edit box, etc., it isn't displayed correctly. Selecting files with Cyrillic, Greek, etc. names and displaying them in edit fields works fine.
Ingo
Hi!
Do you extract the text, or do you copy and paste from a PDF reader? By WideString I mean WideString and not String, because the String type can't hold Unicode content. If the edit fields of your app can show (for example) Cyrillic characters, you should use fields of the same type to receive the result of the text extraction from QuickPDF, and it'll work. If not, please ask on the official support page (general section... first steps...). BTW: you should use the latest QuickPDF version. Cheers, Ingo
Wheeley
Senior Member Joined: 30 Oct 05 Location: United States Status: Offline Points: 146 |
The DLL edition does NOT have wide strings, so your solution will not work, Ingo. It does have UTF-8 ANSI strings. So hypothetically, if you convert the UTF-8 string to a wide string, you should see the correct text. So maybe you need to paste your text into an editor to convert it to Unicode.
Wheeley
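The UTF-8 to wide string conversion suggested above could be sketched as follows. This is a portable illustration and not QuickPDF code: utf8ToUtf16 is a hypothetical helper, it assumes a C++11 compiler (for std::u16string), and it assumes the string handed back by the DLL really is UTF-8, which the thread does not confirm. On Windows the usual route is the Win32 MultiByteToWideChar function with CP_UTF8; the hand-written decoder below just shows what such a conversion does.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: convert UTF-8 bytes to UTF-16 code units (the
// wide-string form used on Windows). Malformed input is replaced with
// U+FFFD, one replacement character per rejected lead byte.
std::u16string utf8ToUtf16(const std::string& utf8) {
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::size_t len = 1;       // bytes in this sequence
        char32_t cp = 0xFFFD;      // default for an invalid lead byte
        if (lead < 0x80)                       { cp = lead; }
        else if (lead >= 0xC2 && lead <= 0xDF) { len = 2; cp = lead & 0x1F; }
        else if (lead >= 0xE0 && lead <= 0xEF) { len = 3; cp = lead & 0x0F; }
        else if (lead >= 0xF0 && lead <= 0xF4) { len = 4; cp = lead & 0x07; }

        // Fold in the continuation bytes, each of the form 10xxxxxx.
        bool ok = (i + len <= utf8.size());
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            ok = (c & 0xC0) == 0x80;
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong forms, surrogates, and values past U+10FFFF.
        if (len == 3 && (cp < 0x800 || (cp >= 0xD800 && cp <= 0xDFFF))) ok = false;
        if (len == 4 && (cp < 0x10000 || cp > 0x10FFFF)) ok = false;
        if (!ok) { cp = 0xFFFD; len = 1; }

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above U+FFFF need a UTF-16 surrogate pair.
            const char32_t v = cp - 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

For example, the two UTF-8 bytes 0xCE 0x94 come back as the single code unit U+0394 (Δ). With a compiler that predates char16_t, such as Visual C++ 2005, std::wstring would be the natural substitute for std::u16string.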
Ingo
Hi Wheeley!
I know that QuickPDF works internally with AnsiString and PAnsiString. If you make an external call to a QuickPDF function and (for example) a filename is needed, then you should hold this filename (if it contains Asian or other such characters) in a WideString field. I've tested it long enough. I'm out of the office now. I'll post a code sample later... Cheers, Ingo