
[Bug or what?] Some symbols have disappeared...

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=604
Printed Date: 24 May 24 at 8:01PM


Topic: [Bug or what?] Some symbols have disappeared...
Posted By: Dmitry
Date Posted: 17 Jan 07 at 2:59AM

Hi, Ingo!
Here are the picture and the extracted text showing the error.

First

http://img126.imagevenue.com/img.php?image=15942_error1_122_333lo.JPG - Picture

Text:
"VFNVHB+CMR12~38",#000000,11.95,91.9180,216.7194,440.0074,216.7194,440.0074,228.6865,91.9180,228.6865,"was approximated by a Landau-type of polynomial as a function of"
"MNKVHB+CMMI12~3e",#000000,11.95,444.3273,216.7313,449.3604,216.7313,449.3604,228.6865,444.3273,228.6865,"c"
"VFNVHB+CMR12~38",#000000,11.95,453.5603,216.7194,486.1798,216.7194,486.1798,228.6865,453.5603,228.6865,"and ."

Why is there an empty character after the "and "?

Second

http://img131.imagevenue.com/img.php?image=16866_error2_122_477lo.JPG - Picture


Text:
"PDMFNG+CMR10",#000000,10.90,132.5893,355.2330,413.0833,355.2330,413.0833,366.1421,132.5893,366.1421,"0 14. Experiments were performed for indentation loads of"
"JGBFNG+CMMI10",#000000,10.90,137.9784,355.2330,140.9675,355.2330,140.9675,366.1421,137.9784,366.1421,"."
"JGBFNG+CMMI10",#000000,10.90,416.9232,355.2330,423.8723,355.2330,423.8723,366.1421,416.9232,366.1421,"F"
"PDMFNG+CMR10",#000000,10.90,428.7922,355.2330,512.4212,355.2330,512.4212,366.1421,428.7922,366.1421,"= 65N, 100N and"

Why is there a "." character after "loads of" and before "F"?



Replies:
Posted By: Ingo
Date Posted: 17 Jan 07 at 3:47AM
Hi Dmitry!

I only have a general answer... ;-)
Text extraction doesn't return the text content line by line (the way you read it). It returns the strings in the order in which they were inserted into the page. If the header and footer were inserted first, you'll extract the header first, then the footer (from the end of the page), and only after that the content from the middle. With text in several columns the extracted result is hard to read, because all rows of the first column are extracted before anything from the next column.
 
The last two values give the start position (column and row) of each string in page units. In the second picture you can see that all strings are in the same row (366), but the columns are very different. The point isn't directly before the F. It is far away on the left, outside of the picture. Normally you'll extract longer strings, but if this point was inserted on its own later, it appears as a single string.

Best regards,
Ingo
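
Ingo's point about the coordinates can be checked with a short script. The following is only a sketch in Python, not part of the library: it assumes each extracted line has exactly the layout shown in the second example (font, colour, size, four x/y corner pairs, quoted text), and the names parse_fragments, reading_order and inside_another are made up for illustration. It sorts the strings by position and reports any string that sits inside the horizontal span of another string in the same row.

```python
import csv

# The four lines from the second example, copied from the post above.
SAMPLE = [
    '"PDMFNG+CMR10",#000000,10.90,132.5893,355.2330,413.0833,355.2330,413.0833,366.1421,132.5893,366.1421,"0 14. Experiments were performed for indentation loads of"',
    '"JGBFNG+CMMI10",#000000,10.90,137.9784,355.2330,140.9675,355.2330,140.9675,366.1421,137.9784,366.1421,"."',
    '"JGBFNG+CMMI10",#000000,10.90,416.9232,355.2330,423.8723,355.2330,423.8723,366.1421,416.9232,366.1421,"F"',
    '"PDMFNG+CMR10",#000000,10.90,428.7922,355.2330,512.4212,355.2330,512.4212,366.1421,428.7922,366.1421,"= 65N, 100N and"',
]

def parse_fragments(lines):
    """Turn extracted lines into dicts (assumed layout: 12 comma-separated fields)."""
    fragments = []
    for row in csv.reader(lines):
        if len(row) != 12:
            continue  # skip anything that does not match the assumed layout
        coords = [float(v) for v in row[3:11]]
        xs = coords[0::2]  # the four x values of the corner pairs
        fragments.append({
            "font": row[0],
            "left": min(xs),
            "right": max(xs),
            "row": coords[7],  # last value, the "row" Ingo refers to
            "text": row[11],
        })
    return fragments

def reading_order(fragments):
    """Sort by row, then left edge. Whether rows run up or down the page
    depends on the coordinate origin, which is not assumed here."""
    return sorted(fragments, key=lambda f: (f["row"], f["left"]))

def inside_another(fragments, row_tolerance=0.5):
    """Find strings lying completely within another string's span in the same row."""
    hits = []
    for f in fragments:
        for g in fragments:
            same_row = abs(f["row"] - g["row"]) <= row_tolerance
            if g is not f and same_row and g["left"] < f["left"] and f["right"] < g["right"]:
                hits.append((f["text"], g["text"]))
    return hits

fragments = parse_fragments(SAMPLE)
for f in reading_order(fragments):
    print(f'{f["left"]:9.4f}  {f["text"]!r}')
print(inside_another(fragments))
```

With the sample data the "." starts at 137.9784, inside the span of the first string (132.5893 to 413.0833), which matches the explanation above: the point belongs near the "0 14" at the left, not in front of the "F".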
 




Posted By: Dmitry
Date Posted: 17 Jan 07 at 4:30AM
Thanks, Ingo!



