How to get a list of all urls

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=2812
Printed Date: 05 May 24 at 1:04AM
Software Version: Web Wiz Forums 11.01 - http://www.webwizforums.com


Topic: How to get a list of all urls
Posted By: ToniSanta
Date Posted: 15 Jan 14 at 2:06PM
Hi,
Is there a way to get a list of all URLs contained in a PDF document?
best regards
Toni
 





Replies:
Posted By: Ingo
Date Posted: 20 Jan 14 at 7:11AM
Hi Toni,
 
there's something in the FAQ:
http://www.quickpdflibrary.com/faq/how-do-i-retrieve-all-urls-and-related-text-from-a-pdf.php
 
Cheers and welcome here,
Ingo
 





Posted By: AndrewC
Date Posted: 21 Jan 14 at 12:07AM

Nice find, Ingo.  Most URLs are actually set up as annotation links, so this will work well on most PDFs.

Not all PDF documents are created the same way, and various tricks are sometimes used to implement URL links.  If the URLs are not annotations but are part of the normal page text, then you will need to use the Debenu Quick PDF Library text extraction routines and then parse the raw text results, looking for http:// and www.

  QP.GetPageText(8) should work well.
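
For the parsing step, something along these lines might do as a starting point. It is only a rough sketch in Python and assumes you already have the extracted page text as a plain string; the names find_urls and URL_PATTERN are made up for this example and are not part of the library, and the pattern is deliberately simple, so it will miss links that are split across lines in the extracted text.

```python
import re

# Assumption: a URL is any run of non-space characters that starts with
# http://, https:// or www. This is a heuristic, not a full URL grammar.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[^\s<>\"')\]]+", re.IGNORECASE)


def find_urls(page_text):
    """Return the unique URLs found in page_text, in order of first appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.finditer(page_text):
        # Drop sentence punctuation that tends to stick to the end of a link.
        url = match.group(0).rstrip(".,;:!?")
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Hypothetical sample standing in for the text returned by the extraction call.
sample = "See http://www.quickpdf.org/forum/ or www.example.com. Thanks."
print(find_urls(sample))
```

You would call find_urls once per page on whatever the text extraction returns and merge the results.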

Try Ingo's link first, though, as it will most likely do what you need.

Andrew.


Posted By: ToniSanta
Date Posted: 30 Jan 14 at 4:30PM
Thanks Ingo, thanks Andrew.
 
best regards
Toni
 




