
Render document and keep embedded text

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=3912
Printed Date: 05 May 24 at 12:11PM
Software Version: Web Wiz Forums 11.01 - http://www.webwizforums.com


Topic: Render document and keep embedded text
Posted By: Henjinx
Subject: Render document and keep embedded text
Date Posted: 04 May 21 at 1:51PM
Hi everyone,
I believe I have a somewhat unusual problem:

I created a PDF document with a few pages that contain embedded text. At this point, I can open the file with Acrobat Reader and select/copy text from those pages.
After creating the file, I want to be able to print it using an arbitrary driver (including things like Microsoft Print to PDF).

With a driver that produces a paper print-out everything works fine, but with a driver like Microsoft Print to PDF, the embedded text information is lost.

I understand that this happens because rendering turns every page into an image, but is there some way to work around it?



Replies:
Posted By: Sopracenery
Date Posted: 22 May 21 at 11:24PM
Hi,

I checked this out. If I print a Debenu PDF via the Microsoft driver to a new PDF, all the text information is lost; the file contains only something like polygons, which are good enough for printing or browsing in Acrobat.

If I print the Debenu PDF via the FreePDF printer to a new PDF, all the text information is kept.

So it depends on the system you are using. We are still far from standards that let you pass a PDF through a chain of readers and get out at the end exactly what went in at the beginning...


(I tried both Arial, a typical Microsoft font, and Helvetica, a built-in PDF font.)


To me it seems that every reader or system has its own dialect and they understand each other only poorly. So there is no way to pass the text through Microsoft Print to PDF.


Martin


Supplement:

Test done with Windows 10 Enterprise 1809 and Acrobat Reader DC 2015.017 on a source file generated with Quick PDF Library 18.11

(QuickPDF -> Acrobat -> MS print to PDF)



Posted By: tfrost
Date Posted: 23 May 21 at 3:05PM
I understand it's a limitation (or bug, if you prefer) in some versions of Microsoft Print to PDF. It's an issue with how the PDF is created, not with PDF readers or the portability of PDFs. If there is only a rendered image of the text in the file, you would need OCR to recover the text. I don't know when this started or when it was fixed, but if you are using Windows 10, make sure it is fully up to date.

I can print to Microsoft Print to PDF and select text in the output in Windows 10 20H2 and 21H1. Google will find you reports of the text being lost in some earlier releases, though not in some older ones.
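
To check which case a given output file falls into without opening it in a viewer, something like the following rough Python sketch might help. It is only a heuristic and not part of Quick PDF Library: the helper name `has_text_operators` is made up for this example, it only understands uncompressed and Flate-compressed streams, and it does not look inside object streams or encrypted files. It scans each stream for a `BT ... ET` text object containing a text-showing operator (`Tj` or `TJ`). A file where the glyphs were converted to outlines or images, as described above, would normally have none. Note that finding such operators does not guarantee the text can be copied correctly, since that also depends on the font encoding.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of each PDF stream.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)
# A text object (BT ... ET) that contains a text-showing operator (Tj or TJ).
TEXT_OP_RE = re.compile(rb"BT\b.*?(?:Tj|TJ)\b.*?ET\b", re.DOTALL)


def has_text_operators(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any stream appears to draw text with Tj/TJ."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # Most content streams use /FlateDecode (zlib) compression.
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not zlib data (uncompressed or another filter): scan as-is.
            data = raw
        if TEXT_OP_RE.search(data):
            return True
    return False
```

To use it, read the printed file in binary mode and pass the bytes in, for example `has_text_operators(open("output.pdf", "rb").read())`, where `output.pdf` stands for whatever file the print driver produced. Because image data is also scanned when it cannot be decompressed, a false positive is possible, so treat the result as a hint only.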


