Creation of a PDF/A1-b document from an image scan

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=2703
Printed Date: 20 May 24 at 2:59AM
Software Version: Web Wiz Forums 11.01 - http://www.webwizforums.com


Topic: Creation of a PDF/A1-b document from an image scan
Posted By: dsch
Subject: Creation of a PDF/A1-b document from an image scan
Date Posted: 09 Aug 13 at 10:55AM
Hi there,

I have to create a PDF/A-1b document from a TIFF file that comes from a scanner. I've read in the developer guide how to create such a document from scratch using SetPDFAMode(2) and DrawText, but I cannot find a description of how to include the image file.

Of course, I ran OCR on the image, so I know the characters and words of the scan and also their positions. But what now?

Thanks for help!

Dirk




Replies:
Posted By: Ingo
Date Posted: 09 Aug 13 at 11:05AM
Hi Dirk!
 
You should read here:
AddImageFromFile:
http://www.debenu.com/docs/pdf_library_reference/AddImageFromFile.php
With this function you can write all TIFF pages into a PDF using a loop.
 
Cheers and welcome here,
Ingo


Posted By: dsch
Date Posted: 09 Aug 13 at 12:28PM
Hi Ingo,

thanks for the warm welcome! I have already read about that, but I'm still not sure how to proceed. Is the following correct? And is it best practice?

1. Create a PDF document
2. Call SetPDFAMode(2)
3. Add the scanned page with "AddImageFromFile"
4. Add all words (or all characters? What's best?) to the page using "DrawText"
5. Repeat from step 3 until there are no pages left

Thank you,

Dirk
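
For step 4 above, the OCR positions are usually in image pixels while PDF page coordinates are in points (1 pt = 1/72 inch), so a conversion is needed unless the page is sized to the pixel dimensions. The following is only a minimal sketch of that arithmetic, not part of the Quick PDF Library: the record shape `{ text, left, top, width, height }`, the function names, and the 300 dpi scan resolution are assumptions made for illustration. It also assumes a top-left origin and treats the bottom edge of the word box as the text baseline, which is an approximation.

```javascript
// Convert a length from image pixels to PDF points (1 pt = 1/72 inch).
function pixelsToPoints(px, dpi) {
  if (!(dpi > 0)) throw new RangeError("dpi must be positive");
  return (px * 72) / dpi;
}

// `word` is a hypothetical OCR record with a pixel bounding box:
// { text, left, top, width, height }, measured from the top-left corner.
function wordToPagePosition(word, dpi) {
  return {
    text: word.text,
    x: pixelsToPoints(word.left, dpi),
    // Approximation: use the bottom edge of the box as the baseline.
    y: pixelsToPoints(word.top + word.height, dpi),
    // Approximation: use the box height as the text size.
    size: pixelsToPoints(word.height, dpi),
  };
}

// Example: an A4 scan at an assumed 300 dpi (2480 x 3508 pixels).
const page = {
  width: pixelsToPoints(2480, 300),
  height: pixelsToPoints(3508, 300),
};
const placed = wordToPagePosition(
  { text: "Hallo", left: 300, top: 550, width: 200, height: 50 },
  300
);
console.log(page, placed);
```

The resulting `x`, `y`, and `size` values are what one would pass to the text-size and DrawText calls for each word. Placing whole words rather than single characters keeps the number of calls small and lets a search match complete words.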


Posted By: Ingo
Date Posted: 09 Aug 13 at 12:40PM
Hi Dirk!
 
Don't use visible DrawText, because as I read the PDF/A-1b
specs the original layout must not be changed, and it
will be very difficult to recreate absolutely identical
pages with drawn text.
So use your steps 1-3 to create the visible pages.
But as I read them, the PDF/A-1b specs want searchable
text content, too.
So I would use DrawText with the same colour as the
background colour on a separate layer/content group.
Another idea: use DrawText with 100% transparency.
I don't know whether either would fully conform to the PDF/A-1b specs ;-)
 
Perhaps there are experts dealing with PDF/A here who
can post more professional help ;-)
 
Cheers, Ingo
 


Posted By: dsch
Date Posted: 09 Aug 13 at 2:50PM
Hi Ingo,

it seems to me that "SetTextMode(3)" will do the job for invisible text. But I still don't know whether the result will really be PDF/A-1b compliant. One question, for example, is whether I need a special layer (content stream/group) for the text data or not.

It would be nice to have a short (pseudo-)code snippet for creating this type of document. Strange that no one from Quick PDF deals with this subject...

Thank you,

Dirk


Posted By: dsch
Date Posted: 09 Aug 13 at 3:16PM
I just tried some code that seems correct to me. It produces a searchable PDF, but none of the online validators accept the output file; they all agree that it is NOT a PDF/A-1b file. Any idea what's wrong?

    PDFLib.SetPDFAMode(2);

    iImgID := PDFLib.AddImageFromFile(ImageFileName, 1);
    PDFLib.SelectImage(iImgID);

    iWidth := PDFLib.ImageWidth;
    iHeight := PDFLib.ImageHeight;

    PDFLib.SetPageDimensions(iWidth, iHeight);
    PDFLib.SetOrigin(1);
    PDFLib.DrawImage(0, 0, iWidth, iHeight);

    PDFLib.SetTextMode(3);
    PDFLib.SetTextSize(20);
    PDFLib.DrawText(100, 200, 'Hallo');

    PDFLib.SaveToFile('xxx.pdf');


Posted By: dsch
Date Posted: 09 Aug 13 at 4:02PM
One more interesting thing: the sample code for PDF/A-1b creation provided in the Quick PDF developer guide also produces a file that is NOT PDF/A-1b compliant (according to various online validator tools):

/* Create a new PDF/A document */
// Call SetPDFAMode to tell Quick PDF Library that you want to
// create a PDF/A document. Specify which version of PDF/A it
// should use.
QP.SetPDFAMode(2);
// Add some simple text
QP.DrawText(100, 600, "This document is PDF/A-1b compliant.");
// Save the PDF/A compliant doc to disk
QP.SaveToFile("my_pdf-a_doc.pdf");
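
When a validator rejects a file, one quick first check is whether the file even claims PDF/A-1b in its XMP metadata, where the PDF/A identification entries `pdfaid:part` and `pdfaid:conformance` are stored. The sketch below is not a validator and not part of the Quick PDF Library; the function names are made up for illustration, and it only inspects a metadata string that is assumed to have already been read from the file. A file that carries the claim can still fail validation for other reasons, such as fonts that are not embedded.

```javascript
// Read the PDF/A identification claim from an XMP metadata packet.
// Handles element form (<pdfaid:part>1</pdfaid:part>) and
// attribute form (pdfaid:part="1").
function readPdfaClaim(xmp) {
  const grab = (name) => {
    const re = new RegExp(
      `pdfaid:${name}\\s*(?:=\\s*["']|>)\\s*([^"'<\\s]+)`
    );
    const m = re.exec(xmp);
    return m ? m[1] : null;
  };
  const part = grab("part");
  const conformance = grab("conformance");
  return part === null ? null : { part, conformance };
}

// True only if the metadata declares part 1, conformance level B.
function claimsPdfA1b(xmp) {
  const claim = readPdfaClaim(xmp);
  return (
    claim !== null &&
    claim.part === "1" &&
    String(claim.conformance).toUpperCase() === "B"
  );
}

// Example with a hand-written metadata fragment.
const sampleXmp =
  "<rdf:Description xmlns:pdfaid='http://www.aiim.org/pdfa/ns/id/'>" +
  "<pdfaid:part>1</pdfaid:part>" +
  "<pdfaid:conformance>B</pdfaid:conformance>" +
  "</rdf:Description>";
console.log(claimsPdfA1b(sampleXmp));
```

In Node.js the input could come from something like `fs.readFileSync(path, "latin1")`, on the assumption that the metadata stream is stored uncompressed in the file; if it is compressed, the entries will not be found this way.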



Posted By: Ingo
Date Posted: 10 Aug 13 at 9:35PM
It seems there's no other way than to read the specs regarding attached files ;-)



Posted By: AndrewC
Date Posted: 12 Aug 13 at 5:12AM
Dirk,

The 14 base fonts cannot be used as they are in a PDF/A-1b file, because all fonts need to be embedded in a PDF/A-1b file. The following code change should improve things.

    PDFLib.SetPDFAMode(2);

    iImgID := PDFLib.AddImageFromFile(ImageFileName, 1);
    PDFLib.SelectImage(iImgID);

    iWidth := PDFLib.ImageWidth;
    iHeight := PDFLib.ImageHeight;

    PDFLib.SetPageDimensions(iWidth, iHeight);
    PDFLib.SetOrigin(1);
    PDFLib.DrawImage(0, 0, iWidth, iHeight);

    PDFLib.SetTextMode(3);                          // Invisible text mode

    iFontID := PDFLib.AddTrueTypeFont('Arial', 1);  // ADDED : Embed the Arial font.
    PDFLib.SelectFont(iFontID);                     // Make the embedded font the current font

    PDFLib.SetTextSize(20);
    PDFLib.DrawText(100, 200, 'Hallo');

    PDFLib.SaveToFile('xxx.pdf');


Andrew.



Posted By: dsch
Date Posted: 12 Aug 13 at 5:51PM
Thank you, Andrew and Ingo. It seems that version 8.15 was unable to produce correct PDF/A-1b files. It works with 9.15.

Regards,

Dirk


