Codepoints in Subsetted Type1 font
Franciscus
Beginner Joined: 17 Dec 19 Status: Offline Points: 9 |
Posted: 24 Sep 20 at 1:00PM
Dear colleagues,
This is a tricky PDF-to-text problem. I have a chemistry document that contains mathematical glyphs embedded in subsetted fonts, MathematicalPi-One among others. Inside the PDF, the glyphs are referenced not by their actual code points in the font but by their code points in the embedded subset (#1, #2, etc.). I need to convert those glyphs to an equivalent "known" representation, for example their real code points in the full font (MathematicalPi-One etc.).
I searched the forum and the API documentation but am stuck. I have a source code license and looked at the DrawText function, hoping I could draw the glyph onto a Delphi Canvas and run OCR on it, but drawing text onto a PDF does not involve rendering the actual glyph. If anyone has an idea how to solve this, it would be much appreciated!
Edited by Franciscus - 24 Sep 20 at 1:26PM
Franciscus
Beginner Joined: 17 Dec 19 Status: Offline Points: 9 |
It looks like RenderPageToDC could be a solution. Debenu gives me the bounding box of all text, so I can render the page and then selectively OCR only those glyphs that are not identified as any known character.
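A minimal sketch of the selection step, written in Python for brevity rather than Delphi. It does not call the Debenu API: `TextItem`, `needs_ocr`, and `to_pixel_rect` are hypothetical names, and the sketch assumes the extracted text has already been parsed into characters with bounding boxes in PDF points (origin at the bottom-left of the page). It flags characters that look like raw subset codes (control, private-use, or unassigned code points) and converts their boxes into pixel rectangles on a bitmap rendered at a given DPI, ready to be cropped and passed to an OCR engine.

```python
import math
import unicodedata
from dataclasses import dataclass


@dataclass
class TextItem:
    """Hypothetical record for one extracted piece of text and its box (PDF points)."""
    text: str
    left: float
    bottom: float
    right: float
    top: float


def is_unidentified(ch: str) -> bool:
    """Heuristic (an assumption): treat control, private-use, and unassigned
    code points as glyphs that have no usable Unicode mapping."""
    if ch in " \t\r\n":
        return False  # ordinary whitespace is not a candidate for OCR
    return unicodedata.category(ch) in ("Cc", "Co", "Cn")


def needs_ocr(item: TextItem) -> bool:
    """True if any character in the item looks like a raw subset code."""
    return any(is_unidentified(ch) for ch in item.text)


def to_pixel_rect(item: TextItem, page_height_pt: float, dpi: float, pad: int = 2):
    """Convert a bottom-left-origin box in points to a top-left-origin pixel
    rectangle (x0, y0, x1, y1), padded by a few pixels and clamped at zero."""
    scale = dpi / 72.0  # 72 points per inch
    x0 = max(0, math.floor(item.left * scale) - pad)
    y0 = max(0, math.floor((page_height_pt - item.top) * scale) - pad)
    x1 = math.ceil(item.right * scale) + pad
    y1 = math.ceil((page_height_pt - item.bottom) * scale) + pad
    return x0, y0, x1, y1


if __name__ == "__main__":
    items = [
        TextItem("H", 72, 700, 80, 710),     # ordinary letter, no OCR needed
        TextItem("\x01", 80, 700, 90, 710),  # subset code #1, send to OCR
    ]
    for item in items:
        if needs_ocr(item):
            print(repr(item.text), to_pixel_rect(item, page_height_pt=792, dpi=144))
```

The category test is only a heuristic. A subset code can coincide with a printable ASCII character, in which case it would have to be detected by font name instead, for example by treating every character set in MathematicalPi-One as a candidate.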
Copyright © 2017 Debenu. Debenu Quick PDF Library is a PDF SDK. All rights reserved.