Search text and get the bound boxes
pinozzy
Beginner
Joined: 22 Mar 16 Status: Offline Points: 10
Topic: Search text and get the bound boxes
Posted: 22 Mar 16 at 1:50PM
Hello All,
I'm using the Viewer SDK (with C#) to search for some text in a document. I need to locate that text and get the coordinates of each result. My current approach is to call SearchPDFText("my string"), which returns the number of occurrences. Now I need the bounding rectangles of these occurrences. How can I do that? I tried looping with NextSearchResult() and calling GetSelectedTextBlockBound(), without any success. I believe there is a better way; what am I missing? Many thanks.
Ingo
Moderator Group
Joined: 29 Oct 05 Status: Offline Points: 3530
Posted: 22 Mar 16 at 6:30PM
Hi Pinozzy,
I don't know how similar the Viewer SDK is to the Quick PDF Library... Perhaps this KB post can help?
http://www.debenu.com/kb/extract-text-from-pdfs-as-a-text-block-list/
Cheers and welcome here,
Ingo
pinozzy
Beginner
Joined: 22 Mar 16 Status: Offline Points: 10
Posted: 23 Mar 16 at 8:42AM
Oh, I see. I switched to the Quick PDF Library; I wonder why that search feature is missing there too. Thanks for your advice, buddy. I'll start from there and post the code, in case someone else has the same need.
pinozzy
Beginner
Joined: 22 Mar 16 Status: Offline Points: 10
Posted: 26 Mar 16 at 3:11PM
This is how I solved my problem.
This method returns the results of a regex search. Bye!
----
    // Requires: using System.Collections.Generic; using System.Drawing;
    //           using System.Text.RegularExpressions;
    public struct FindResult
    {
        public string Text { get; set; }
        public RectangleF Rectangle { get; set; }
        public int Page { get; set; }
    }

    // pageIndex is 1-based; pass 0 to search every page.
    public override List<FindResult> SearchPattern(int pageIndex, string pattern)
    {
        var retVal = new List<FindResult>();
        var dpl = Document.DPL;
        dpl.SetTextExtractionWordGap(1);
        dpl.SetTextExtractionOptions(3, 0);
        var regex = new Regex(pattern);
        for (var i = 0; i < Pages; i++)
        {
            if (pageIndex > 0 && (pageIndex - 1) != i) continue;
            // Select the page before extracting its text blocks
            // (page numbers are 1-based, the loop index is 0-based).
            dpl.SelectPage(i + 1);
            var id = dpl.ExtractPageTextBlocks(4);
            for (var f = 1; f <= dpl.GetTextBlockCount(id); f++)
            {
                var text = dpl.GetTextBlockText(id, f);
                var match = regex.Match(text);
                if (!match.Success) continue;
                // Build the rectangle from the block's corner coordinates.
                var res = new FindResult
                {
                    Rectangle = new RectangleF(
                        (float)dpl.GetTextBlockBound(id, f, 7),
                        (float)dpl.GetTextBlockBound(id, f, 8),
                        (float)dpl.GetTextBlockBound(id, f, 5) - (float)dpl.GetTextBlockBound(id, f, 7),
                        (float)dpl.GetTextBlockBound(id, f, 6) - (float)dpl.GetTextBlockBound(id, f, 4)
                    ),
                    Page = i + 1,
                    Text = text
                };
                retVal.Add(res);
            }
            dpl.ReleaseTextBlocks(id);
        }
        return retVal;
    }
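The rectangle in the method above is built from specific corner indices, which only works if the corners arrive in the expected order. As a minimal sketch, assuming the eight bound values of a text block are four x/y corner pairs, the same rectangle could be derived from their minimum and maximum so that corner order no longer matters. `BoundingBox` and `BoundHelper` are hypothetical names for illustration and are not part of Quick PDF Library.

```csharp
using System;
using System.Linq;

// Hypothetical value type for illustration; not part of Quick PDF Library.
public readonly struct BoundingBox
{
    public BoundingBox(double left, double bottom, double width, double height)
    {
        Left = left;
        Bottom = bottom;
        Width = width;
        Height = height;
    }

    public double Left { get; }
    public double Bottom { get; }
    public double Width { get; }
    public double Height { get; }
}

public static class BoundHelper
{
    // corners: x1, y1, x2, y2, x3, y3, x4, y4 (assumed layout), in any corner order.
    public static BoundingBox FromCorners(double[] corners)
    {
        if (corners == null || corners.Length != 8)
            throw new ArgumentException("Expected 8 values (four x/y pairs).", nameof(corners));

        // Even positions hold x values, odd positions hold y values.
        var xs = corners.Where((_, i) => i % 2 == 0).ToArray();
        var ys = corners.Where((_, i) => i % 2 == 1).ToArray();

        // The axis-aligned box spans the smallest to the largest coordinate.
        return new BoundingBox(xs.Min(), ys.Min(), xs.Max() - xs.Min(), ys.Max() - ys.Min());
    }
}
```

In the method above, the array could be filled by calling GetTextBlockBound(id, f, n) for n from 1 to 8, on the assumption that all eight indices return corner coordinates. Note that this yields the smallest y as the box origin, whereas the posted code passes the value of bound 8 as the rectangle's Y.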
Ingo
Moderator Group
Joined: 29 Oct 05 Status: Offline Points: 3530
Posted: 27 Mar 16 at 1:46PM
Hi!
Thanks a lot for sharing your code. I've put it into the sample section. Perhaps it will help others, too :)
Cheers,
Ingo