Codepoints in Subsetted Type1 font

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=3848
Printed Date: 16 Apr 24 at 3:41PM


Topic: Codepoints in Subsetted Type1 font
Posted By: Franciscus
Subject: Codepoints in Subsetted Type1 font
Date Posted: 24 Sep 20 at 1:00PM
Dear colleagues,

This is a tricky PDF-to-Text problem.

I have a chemistry document that contains mathematical glyphs from embedded, subsetted fonts such as MathematicalPi-One. Inside the PDF, the glyphs are referenced not by their actual code points in the font but by their embedded code points (#1, #2, etc.).

My problem is that I need to convert those glyphs in the PDF to an equivalent "known" representation, for example their real code points in the full font (MathematicalPi-One etc.).
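
One possible starting point, assuming the subsetted font's encoding dictionary carries a /Differences array: in PDF, that array assigns glyph names to the embedded codes, and the glyph names could then be looked up in a table built for the full font. The sketch below only parses an array that has already been split into tokens. The function name and the sample glyph names are hypothetical, and reading the array out of the PDF is not shown.

```python
def parse_differences(tokens):
    """Map character codes to glyph names from a tokenized /Differences array.

    An integer sets the current code. Each name that follows is assigned
    the current code, which then increases by one.
    """
    mapping = {}
    code = None
    for tok in tokens:
        if isinstance(tok, int):
            code = tok
        else:
            if code is None:
                raise ValueError("glyph name appears before any starting code")
            mapping[code] = tok.lstrip("/")
            code += 1
    return mapping


# Hypothetical sample data: the glyph names are made up for illustration.
sample = [1, "/glyphA", "/glyphB", 40, "/glyphC"]
print(parse_differences(sample))
```

Whether the resulting glyph names are meaningful depends on the font. A subset may keep the original names or replace them with generic ones, in which case this approach does not help.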

I searched the forum and the API documentation but am stuck.

I have a source code license and looked at the DrawText function, hoping I could draw the glyph onto a Delphi Canvas and run OCR on it, but drawing text onto a PDF does not involve rendering the actual glyph.

If someone has an idea how to solve this, that would be much appreciated!



Replies:
Posted By: Franciscus
Date Posted: 24 Sep 20 at 2:21PM
It looks like RenderPageToDC could be a solution. Debenu gives me the bounding box of all text so I can then selectively OCR those glyphs that are not identified as any known character.
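
A minimal sketch of the selection step, assuming the extracted text is already available as (text, bounding box) pairs. The record layout and function names are hypothetical and are not Debenu's output format. The test for "not identified as any known character" is also an assumption: it flags control, private-use, and unassigned characters plus U+FFFD, which is what raw embedded codes such as #1 and #2 tend to look like after extraction.

```python
import unicodedata


def needs_ocr(text):
    """Return True if any non-whitespace character looks unmapped."""
    for ch in text:
        if ch.isspace():
            continue  # skip whitespace such as spaces, tabs, and newlines
        if ch == "\ufffd":
            return True  # Unicode replacement character
        # Cc = control, Co = private use, Cn = unassigned
        if unicodedata.category(ch) in ("Cc", "Co", "Cn"):
            return True
    return False


def boxes_to_ocr(records):
    """Return the bounding boxes of records whose text should be OCRed."""
    return [bbox for text, bbox in records if needs_ocr(text)]


# Hypothetical records: (extracted text, (left, top, right, bottom))
records = [
    ("H2O", (0, 0, 10, 10)),
    ("\x01", (10, 0, 20, 10)),
    ("\ue000", (20, 0, 30, 10)),
]
print(boxes_to_ocr(records))
```

Only the returned boxes would then need to be cropped from the rendered page and passed to OCR.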


