PageContent - Tj and Tm Operators

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=2900
Printed Date: 28 Jun 25 at 1:47PM
Software Version: Web Wiz Forums 11.01 - http://www.webwizforums.com


Topic: PageContent - Tj and Tm Operators
Posted By: mLipok
Date Posted: 16 May 14 at 10:41PM
I am using GetPageContentToVariant()
and I get this content:

=====================================
0 Tr
/QuickPDFF49c3e5cb 10 Tf
0 0 0 rg
100 Tz
0 Ts
0 Tw 0 Tc
BT
1 0 0 1 28.3465 451.8425 Tm
(Hello world from AutoIt \(Line 1\))Tj
ET
/QuickPDFF49c3e5cb 10 Tf
0 0 0 rg
BT
1 0 0 1 56.6929 395.1496 Tm
(Hello world from AutoIt \(Line 2\))Tj
ET
/QuickPDFF49c3e5cb 10 Tf
0 0.5 1 rg
BT
1 0 0 1 85.0394 338.4567 Tm
(Hello world from AutoIt \(Line 3\))Tj
ET
=====================================

QUESTION: 

Do the Tm and Tj operators always follow directly one after the other?

I'm trying to build a solution similar to:
http://www.debenu.com/kb/replace-text-pdf/
except that it should apply only to a given limited portion of a page.

For this reason, knowing how these operators are used is essential to me.

Best regards 
mLipok




Replies:
Posted By: mLipok
Date Posted: 17 May 14 at 4:55AM
QUESTION 2:
=====================================
BT
1 0 0 1 56.6929 395.1496 Tm
(Hello world from AutoIt \(Line 2\))Tj
ET
=====================================

I understand why there is a \ character before ( and ),
but:
Q: which other characters need to be preceded by a \ character?


EDIT:
In PDF32000_2008.pdf I found these:
=====================================
Table 351 – Characters with special meaning in destinations and fields and their byte values

Character       Byte value   Escape sequence
(nul)           0x00         \0 (0x5c 0x30)
. (PERIOD)      0x2e         \p (0x5c 0x70)
\ (backslash)   0x5c         \\ (0x5c 0x5c)
=====================================
Table 3 – Escape sequences in literal strings

Sequence   Meaning
\n         LINE FEED (0Ah) (LF)
\r         CARRIAGE RETURN (0Dh) (CR)
\t         HORIZONTAL TAB (09h) (HT)
\b         BACKSPACE (08h) (BS)
\f         FORM FEED (FF)
\(         LEFT PARENTHESIS (28h)
\)         RIGHT PARENTHESIS (29h)
\\         REVERSE SOLIDUS (5Ch) (Backslash)
\ddd       Character code ddd (octal)
=====================================


Q: But are there any other special characters that must be preceded by a \ character?
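
A note on the question above: in a literal string, the only characters that have to be escaped are the backslash itself and any unbalanced parenthesis; the other sequences in Table 3 are optional conveniences. The Python sketch below shows one way to escape text before placing it between ( and ). The function name is hypothetical, and it assumes every character maps to a single byte whose code matches the font's encoding, which may not hold for your fonts.

```python
def escape_pdf_literal(text: str) -> str:
    """Escape text for use inside a PDF literal string, i.e. between ( and )."""
    # Sequences taken from Table 3 of PDF 32000-1:2008.
    escapes = {
        "\\": "\\\\",
        "(": "\\(",
        ")": "\\)",
        "\n": "\\n",
        "\r": "\\r",
        "\t": "\\t",
        "\b": "\\b",
        "\f": "\\f",
    }
    out = []
    for ch in text:
        code = ord(ch)
        if ch in escapes:
            out.append(escapes[ch])
        elif code < 32 or code > 126:
            # Assumption: one character == one byte in the font's encoding.
            if code > 255:
                raise ValueError("character does not fit in a single byte: %r" % ch)
            out.append("\\%03o" % code)  # three-digit octal form, \ddd
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print("(" + escape_pdf_literal("Hello world from AutoIt (Line 1)") + ")Tj")
```

Escaping every parenthesis, balanced or not, is always safe, which is why the content shown earlier has \( and \) around "Line 1".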



Posted By: Ingo
Date Posted: 17 May 14 at 11:42AM
Hi,

You've already found most of them ;-)
Additionally, there are special characters (like trademark or copyright) that you have to write using their three-digit octal value (like \174).
You should read the section "String Objects" in the official PDF Reference (third edition, version 1.4, from Adobe), starting at section 3.2.3 (page 29)...

Cheers, Ingo
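
To make the octal form concrete, here is a hedged Python sketch that decodes the body of a literal string (the part between the outer parentheses) back into character codes. The function name is hypothetical. It follows the rules in the "String Objects" section: \ddd takes one to three octal digits, a backslash followed by an end-of-line is a line continuation, and a backslash before any other character is simply dropped.

```python
def unescape_pdf_literal(body: str) -> str:
    """Decode the body of a PDF literal string into one character per byte code."""
    simple = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f",
              "(": "(", ")": ")", "\\": "\\"}
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        i += 1                      # skip the backslash
        if i >= len(body):
            break                   # lone trailing backslash: nothing to decode
        nxt = body[i]
        if nxt in simple:
            out.append(simple[nxt])
            i += 1
        elif nxt in "01234567":
            j = i
            while j < len(body) and j < i + 3 and body[j] in "01234567":
                j += 1
            # High-order overflow is ignored, so keep only the low 8 bits.
            out.append(chr(int(body[i:j], 8) & 0xFF))
            i = j
        elif nxt in "\r\n":
            # Backslash + end-of-line: the string continues on the next line.
            i += 1
            if nxt == "\r" and i < len(body) and body[i] == "\n":
                i += 1
        else:
            out.append(nxt)         # unknown escape: the backslash is ignored
            i += 1
    return "".join(out)
```

With these rules, \174 decodes to byte 124, which is | in ASCII-based encodings, while \251 is byte 169, the copyright sign in Latin-1 style encodings. Which glyph is actually drawn depends on the encoding of the font selected by Tf.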






Posted By: mLipok
Date Posted: 17 May 14 at 12:04PM
Thanks.
I think I will be able to use this part:

BT
1 0 0 1 85.0394 338.4567 Tm
(Hello world from AutoIt \(Line 3\))Tj
ET


and use RegExp

to create a function which will ultimately change the text (including removing it) located in a specified area.

Of course, based on:
http://www.debenu.com/kb/replace-text-pdf/
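
The RegExp idea can be sketched in Python as follows. This is only an illustration: it assumes the content has exactly the shape shown in this thread (a six-number Tm immediately followed by a single literal string and Tj, with every parenthesis inside the string escaped). The names NUM, TM_TJ and find_text_runs are hypothetical, and the same pattern should carry over to other regex engines with minor changes.

```python
import re

# A PDF number: optional sign, digits with an optional fractional part.
NUM = r"[-+]?(?:\d+\.?\d*|\.\d+)"

# "a b c d x y Tm" followed directly by "(text)Tj".
TM_TJ = re.compile(
    r"(?<![\w.])(?:" + NUM + r"\s+){4}"
    r"(?P<x>" + NUM + r")\s+(?P<y>" + NUM + r")\s+Tm\s*"
    r"\((?P<text>(?:\\.|[^\\()])*)\)\s*Tj",
    re.S,
)


def find_text_runs(content):
    """Return (x, y, raw_text) for every Tm/Tj pair found in the content."""
    return [
        (float(m.group("x")), float(m.group("y")), m.group("text"))
        for m in TM_TJ.finditer(content)
    ]


if __name__ == "__main__":
    sample = (
        "BT\n"
        "1 0 0 1 85.0394 338.4567 Tm\n"
        "(Hello world from AutoIt \\(Line 3\\))Tj\n"
        "ET\n"
    )
    for x, y, text in find_text_runs(sample):
        print(x, y, text)
```

The x and y values are the last two operands of Tm, which give the text origin only when no other transformation (for example a cm operator) is in effect. The returned text is still in escaped form.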



Posted By: Ingo
Date Posted: 17 May 14 at 8:00PM
In this case it really is possible to change the text.
But the new text should have the same length.
If not, you have to change the stream length and the object offsets of the whole document, too.
PDF editing is very hard stuff - you should start smart with QuickPDF... at first you should forget things like this ;-)

Cheers, Ingo





Posted By: mLipok
Date Posted: 17 May 14 at 11:43PM
I do not care about keeping the length and size of the document.
I just have to delete some data, because by law it cannot be stored in the database.

In that case, can I just replace the text with an empty string?

The customer can process this data, but only within a different part of the database.
Unfortunately, the PDF from the bank contains too much data on one page.
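
Replacing the string operand with an empty string, so that the operator becomes ()Tj, could look like the Python sketch below. It is a minimal illustration under the same assumptions as the content shown in this thread: each Tj directly follows a six-number Tm, and the Tm origin is already in page coordinates. The function name and rectangle parameters are hypothetical, and only the text origin is tested against the rectangle, not the full width of the drawn text.

```python
import re

NUM = r"[-+]?(?:\d+\.?\d*|\.\d+)"

# head = "a b c d x y Tm", then the literal string, then tail = "Tj".
TM_TJ = re.compile(
    r"(?P<head>(?<![\w.])(?:" + NUM + r"\s+){4}"
    r"(?P<x>" + NUM + r")\s+(?P<y>" + NUM + r")\s+Tm\s*)"
    r"\((?:\\.|[^\\()])*\)"
    r"(?P<tail>\s*Tj)",
    re.S,
)


def blank_text_in_region(content, x_min, y_min, x_max, y_max):
    """Empty every string whose Tm origin lies inside the given rectangle."""
    def repl(m):
        x, y = float(m.group("x")), float(m.group("y"))
        if x_min <= x <= x_max and y_min <= y <= y_max:
            # Keep the positioning and the operator, drop only the text.
            return m.group("head") + "()" + m.group("tail")
        return m.group(0)

    return TM_TJ.sub(repl, content)
```

Called with a rectangle such as (50, 390, 60, 400), this would blank "Line 2" in the content from the first post and leave the other lines untouched. The modified content still has to be written back to the page through the library, and the original text may survive elsewhere in the file if the old objects are not discarded on save.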


Posted By: AndrewC
Date Posted: 18 May 14 at 9:21AM
mLipok,

There are many other variations and ways to draw text, but if you are only processing documents from one source, you may be able to get a solution working for your particular PDF files.

Many documents draw one character at a time and can also draw them out of order. Then there are XObjects, which can be scaled, rotated and translated and require further processing to position them on the page.

If you only have one type of document to process and it is easy, then you are very lucky.

Andrew.
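
To illustrate a few of those variations: Tm and Tj do not have to be adjacent. Text can also be positioned with Td or TD and drawn with TJ, which takes an array of strings and spacing adjustments. The Python sketch below is a deliberately small tokenizer that tracks the text line origin across Tm, Td and TD and collects the strings shown by Tj and TJ. All names are hypothetical. It assumes a well-formed stream and ignores the cm matrix, the T*, ' and " operators, hex strings as text, nested unescaped parentheses, inline images, comments, XObjects and the advance of the position after each string.

```python
import re

TOKEN = re.compile(
    r"\((?:\\.|[^\\()])*\)"          # literal string
    r"|<[0-9A-Fa-f\s]*>"             # hex string (kept as an operand, not decoded)
    r"|\[|\]"                        # array delimiters
    r"|[-+]?(?:\d+\.?\d*|\.\d+)"     # number
    r"|/[^\s()\[\]<>/%]*"            # name such as /F1
    r"|[A-Za-z'\"*]+",               # operator
    re.S,
)


def text_runs(content):
    """Return (x, y, raw_text) for each Tj/TJ, using the text line origin."""
    runs = []
    operands = []
    a, b, c, d, e, f = 1.0, 0.0, 0.0, 1.0, 0.0, 0.0
    for m in TOKEN.finditer(content):
        tok = m.group(0)
        if tok[0] in "([]/<" or tok[0] in "+-.0123456789":
            operands.append(tok)
            continue
        if tok == "BT":
            a, b, c, d, e, f = 1.0, 0.0, 0.0, 1.0, 0.0, 0.0
        elif tok == "Tm" and len(operands) >= 6:
            a, b, c, d, e, f = (float(v) for v in operands[-6:])
        elif tok in ("Td", "TD") and len(operands) >= 2:
            tx, ty = float(operands[-2]), float(operands[-1])
            # Translate the line matrix by (tx, ty) in text space.
            e, f = tx * a + ty * c + e, tx * b + ty * d + f
        elif tok == "Tj" and operands and operands[-1].startswith("("):
            runs.append((e, f, operands[-1][1:-1]))
        elif tok == "TJ":
            parts = [t[1:-1] for t in operands if t.startswith("(")]
            runs.append((e, f, "".join(parts)))
        operands = []  # every operator consumes the operands before it
    return runs
```

For content like the sample in the first post, this reports the same positions as a plain regular expression would. For documents that draw one character at a time, it returns many short runs that would still have to be sorted and merged by position.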


Posted By: mLipok
Date Posted: 18 May 14 at 12:51PM
It sounds promising.
It is true that I ultimately have dozens of interested customers, but they mostly use the same banking institutions, so basically I can narrow it down to about 5 different types of documents.

Best regards.
mLipok



