Find text by font size?
Skylla
Beginner
Joined: 21 May 13 Status: Offline Points: 4
Posted: 21 May 13 at 7:57PM
I am trying to find out how to extract text with a specific font size from a PDF file in C#. For example: search a PDF file, find the 22pt text and extract it. Is there a way to accomplish this with Quick PDF? Any ideas or sample code? I need help from the gurus! Thank you!
Ingo
Moderator Group
Joined: 29 Oct 05 Status: Offline Points: 3530
Posted: 21 May 13 at 8:12PM
Hi Skylla!
That's easy stuff, so you should be able to manage on your own ;-)
Start from the beginner's sample code: http://www.quickpdflibrary.com/help/getting-started-activex.php
Insert a LoadFromFile call, then a PageCount call. Then create a loop that runs up to PageCount and, inside it, call ExtractFilePageText with option 3.
Option 3 is documented here, and once you have read it you will know what to do: http://www.quickpdflibrary.com/help/quickpdf/ExtractFilePageText.php
Cheers and welcome here, Ingo
Skylla
Posted: 21 May 13 at 9:51PM
Thank you for your good starting points!
AndrewC
Moderator Group
Joined: 08 Dec 10 Location: Geelong, Aust Status: Offline Points: 841
Posted: 23 May 13 at 5:25AM
Skylla,

    QP.LoadFromFile("99pages.pdf", "");
    for (int i = 1; i <= QP.PageCount(); i++)
    {
        QP.SelectPage(i);
        int id = QP.ExtractPageTextBlocks(3);
        for (int w = 1; w <= QP.GetTextBlockCount(id); w++)
        {
            double size = QP.GetTextBlockFontSize(id, w);
            if (Math.Round(size) == 22)
                MessageBox.Show("Page :" + i.ToString() + " Word:" + w.ToString() + "'" + QP.GetTextBlockText(id, w) + "'");
        }
        QP.ReleaseTextBlocks(id);
    }

Andrew.
Skylla
Posted: 23 May 13 at 8:38AM
Hi Andrew.
Thank you for your sample. I tried that code, and at runtime I got the following results: id = 1476395009 and qp.GetTextBlockCount(id) = 0, so the inner loop for (int w = 1; w <= qp.GetTextBlockCount(id); w++) never executes. Do you have any idea what is happening?

    var qp = new PDFLibrary("C:\\DebenuPDFLibraryDLL0914.dll");
    const string licenseKey = "licencekey";
    var result = qp.UnlockKey(licenseKey);
    if (qp.LibraryLoaded())
    {
        if (result == 1)
        {
            qp.LoadFromFile("aaa.pdf", "");
            for (int i = 1; i <= qp.PageCount(); i++)
            {
                qp.SelectPage(i);
                int id = qp.ExtractPageTextBlocks(3);
                for (int w = 1; w <= qp.GetTextBlockCount(id); w++)
                {
                    double size = qp.GetTextBlockFontSize(id, w);
                    if (Math.Round(size) == 22)
                        Response.Write("Page :" + i.ToString(CultureInfo.InvariantCulture) + " Word:" + w.ToString(CultureInfo.InvariantCulture) + "'" + qp.GetTextBlockText(id, w) + "'" + "<br>");
                }
                qp.ReleaseTextBlocks(id);
            }
        }
    }
AndrewC
Posted: 23 May 13 at 9:01AM
If it returned 0 then it is not finding any text on the page. I would need to see the PDF file before I could make any further comments.
Andrew.
Skylla
Posted: 23 May 13 at 2:05PM
It's actually a basic PDF that I created for testing.
It contains just 24, 23, 22 and 20 pt text. Created with Word, saved as PDF.
AndrewC
Posted: 24 May 13 at 8:25AM
My code is working correctly with your PDF and is returning the 22pt text from both pages.
Is LoadFromFile returning 1 in your case? Does QP.PageCount return 1 or 2? It should be 2 for your PDF. It could be a permissions problem. I suspect LoadFromFile is failing. By default, QPL always has a single blank page allocated in memory, which could be the reason nothing is being extracted. You could then try

    string s = QP.GetPageText(7);
    MessageBox.Show(s);

to make sure the text is actually being extracted.
Andrew.
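The checks suggested in this reply could be sketched roughly as follows. This is only a sketch under assumptions, not a confirmed fix: the `using DebenuPDFLibraryDLL0914;` namespace is assumed to match the DLL wrapper used earlier in the thread, the DLL path and license key are the placeholders from the earlier post, and the `pdfPath` value is hypothetical. The `File.Exists` check is an extra step not mentioned in the thread; it is there because a relative path such as "aaa.pdf" resolves against the process working directory, which in a web application may not be the folder you expect.

```csharp
using System;
using System.IO;
using DebenuPDFLibraryDLL0914; // assumption: namespace of the DLL import wrapper

class LoadCheck
{
    static void Main()
    {
        // Hypothetical absolute path; replace with the real location of the test PDF.
        string pdfPath = @"C:\path\to\aaa.pdf";

        // Plain .NET check: confirm the file is where we think it is.
        Console.WriteLine("Full path: " + Path.GetFullPath(pdfPath));
        Console.WriteLine("File exists: " + File.Exists(pdfPath));

        var qp = new PDFLibrary("C:\\DebenuPDFLibraryDLL0914.dll");
        if (!qp.LibraryLoaded())
        {
            Console.WriteLine("Library DLL could not be loaded.");
            return;
        }
        if (qp.UnlockKey("licencekey") != 1)
        {
            Console.WriteLine("UnlockKey did not return 1.");
            return;
        }

        // Per the reply above, 1 is the value to expect from a successful load.
        int loaded = qp.LoadFromFile(pdfPath, "");
        Console.WriteLine("LoadFromFile returned: " + loaded);
        if (loaded != 1)
        {
            // Stop here: the library would still hold its default blank page,
            // so any extraction would find no text.
            return;
        }

        // Should be 2 for the test PDF discussed in this thread.
        Console.WriteLine("PageCount: " + qp.PageCount());

        // Confirm that text can be extracted at all from the first page.
        qp.SelectPage(1);
        string s = qp.GetPageText(7);
        Console.WriteLine("Page 1 text: " + s);
    }
}
```

If LoadFromFile does not return 1, the path and the file permissions of the account running the code are the first things to look at before revisiting the font size loop.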
Copyright © 2017 Debenu. Debenu Quick PDF Library is a PDF SDK. All rights reserved.