ExtractFilePages only odd number pages

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=2789
Printed Date: 19 Apr 24 at 8:00AM


Topic: ExtractFilePages only odd number pages
Posted By: carmined58
Subject: ExtractFilePages only odd number pages
Date Posted: 25 Nov 13 at 7:20AM
Hi,
 
I'm using ExtractFilePages to extract document pages from a large PDF. The document is initially 25,000 pages, but I'm breaking it up into 100-page files.
 
Each document to be extracted is two pages. I'm then reading the text from the extracted pages.
 
I need to extract two pages and save them as one document. I then need to read the text from page one only.
 
Does anyone know of a way I can use ExtractFilePageText to extract text from only odd numbered pages?
 
Thank you!



Replies:
Posted By: Ingo
Date Posted: 25 Nov 13 at 8:15AM
ExtractFilePages accepts page ranges, so you can extract exactly the ranges you want and then use ExtractFilePageText on the result.
Another idea: determine the pages you want, put them into a string list, and then call ExtractFilePageText in a loop.
ExtractFilePages:
http://www.quickpdflibrary.com/help/quickpdf/ExtractFilePages.php
ExtractFilePageText:
http://www.quickpdflibrary.com/help/quickpdf/ExtractFilePageText.php
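
A minimal sketch of the loop idea, in VB since that is what the question uses. The module and function names here are hypothetical, and the commented ExtractFilePageText call assumes the parameter order (file name, password, page, options); please check it against the reference linked above before relying on it.

```vb
Imports System
Imports System.Collections.Generic

Module OddPages
    ' Returns the odd page numbers 1, 3, 5, ... up to pageCount.
    Function OddPageNumbers(pageCount As Integer) As List(Of Integer)
        Dim pages As New List(Of Integer)
        For page As Integer = 1 To pageCount Step 2
            pages.Add(page)
        Next
        Return pages
    End Function

    Sub Main()
        For Each page As Integer In OddPageNumbers(6)
            ' Assumed call shape, not verified here:
            ' Dim strText As String = DPL.ExtractFilePageText(strFileName, "", page, 0)
            Console.WriteLine(page)
        Next
    End Sub
End Module
```

Keeping the page selection in its own function makes it easy to swap in a different rule (for example every third page) without touching the extraction call.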


-------------
Cheers,
Ingo



Posted By: carmined58
Date Posted: 25 Nov 13 at 9:07AM

Hi Ingo,

First, thank you for the quick response.

Second, please forgive my ignorance, but I'm not quite following you.

I'm going to try to explain in detail what I'm doing, what I need. I'm extracting pages from two separate PDFs. I've created one application that will extract pages and read text from both. Theoretically, at least.

I have the first working fine whereby each individual page is extracted and is a standalone document. The second, however, is turning out to be more difficult.

For the second type of document I'm extracting pages from a 100-page document. The first and second pages need to be extracted as one document, the third and fourth pages need to be extracted as one document, the fifth and sixth pages need to be extracted as one document, and so on.

To extract the pages for the first document I use:

FuncReturnResult = DPL.ExtractFilePages (strFileName, "", strPathName + "\" + strPage, CStr(intPageno))

I do an initial page count, then loop through while the page number is <= the page count:

Do While intPageno <= intPageCount...For i As integer = 1 to intPageCount

"intPageno" is incremented as we progress through the loop. Using this variable for an ExtractFilePages parameter obviously doesn't work when I need to extract two pages each time.

So that's my first issue, how to get 2 pages extracted in each iteration through the loop. I'm confident that if I'm able to successfully extract 2 pages and create 1 document from the 2 pages that I'll be able to use the ExtractFilePageText successfully to extract only the text on page 1 of each document.

Having explained my issue would you please be so kind as to tell me how I could go about extracting 2 pages from the 100 page document, extracting the next 2, then the next 2, and so on? Maybe provide an example?

Thank you, Ingo!

 



Posted By: AndrewC
Date Posted: 26 Nov 13 at 1:46AM
Carmine,

This will be the most efficient way to process the file.

Andrew.

            QP.LoadFromFile("100pages.pdf", "");
            int mainDocID = QP.SelectedDocument();
            int pageCount = QP.PageCount();            // Read once, while the main document is still selected.

            for (int i = 1; i <= pageCount; i += 2)
            {
                string range = i.ToString() + "-" + (i + 1).ToString();  // "1-2", "3-4" ....

                QP.NewDocument();                      // Has an empty page 1
                QP.CopyPageRanges(mainDocID, range);   // Append the 2 pages from the range.
                QP.DeletePages(1, 1);                  // Remove the first empty page.

                QP.SaveToFile("100_pages_" + range + ".pdf");
                QP.RemoveDocument(QP.SelectedDocument());
            }
            QP.RemoveDocument(mainDocID);



Posted By: carmined58
Date Posted: 26 Nov 13 at 8:39AM
Hi Andrew,
Thanks so much for the reply and information. I'm using VB and am not familiar enough with C#.
Regardless, your example struck a chord and I'm now able to extract the pages as necessary.
Thank you very much!
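
For anyone else working in VB, here is a rough sketch of how the two-page ranges could be built and passed to the ExtractFilePages call from earlier in the thread. BuildPairRanges and the module name are hypothetical helpers, not part of the library. The commented call reuses the variable names from the original question, with strOutName standing in as an assumed output file name.

```vb
Imports System
Imports System.Collections.Generic

Module PairRanges
    ' Builds "1-2", "3-4", ... for a document with pageCount pages.
    ' If the page count is odd, the last entry is a single page number.
    Function BuildPairRanges(pageCount As Integer) As List(Of String)
        Dim ranges As New List(Of String)
        For first As Integer = 1 To pageCount Step 2
            Dim last As Integer = Math.Min(first + 1, pageCount)
            If last = first Then
                ranges.Add(CStr(first))
            Else
                ranges.Add(CStr(first) & "-" & CStr(last))
            End If
        Next
        Return ranges
    End Function

    Sub Main()
        For Each range As String In BuildPairRanges(5)
            ' Each range would replace CStr(intPageno) in the earlier call, e.g.:
            ' FuncReturnResult = DPL.ExtractFilePages(strFileName, "", strPathName & "\" & strOutName, range)
            Console.WriteLine(range)   ' prints 1-2, 3-4, 5 on separate lines
        Next
    End Sub
End Module
```

Since page 1 of each two-page output file corresponds to an odd page of the source, reading only page 1 of each extracted file gives the text of the odd-numbered pages.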


