ExtractFilePages only odd number pages

carmined58 (Beginner, Joined: 19 Oct 13, Location: South Dakota) Posted: 25 Nov 13 at 7:20AM
	Hi,
	I'm using ExtractFilePages to extract document pages from a large PDF. The document is initially 25,000 pages, but I'm breaking it up into 100-page files.
	Each document to be extracted is two pages. I'm then reading the text from the extracted pages.
	I need to extract 2 pages and save them as one document. I then need to read the text from only page one.
	Does anyone know of a way I can use ExtractFilePageText to extract text from only odd-numbered pages?
	Thank you!

Ingo (Moderator Group, Joined: 29 Oct 05) Posted: 25 Nov 13 at 8:15AM
	With ExtractFilePages (with page ranges!) you can specify the ranges you want, and then you can use ExtractFilePageText. Another idea: determine the pages you want, put them into a string list, and then call ExtractFilePageText in a loop.
	ExtractFilePages: http://www.quickpdflibrary.com/help/quickpdf/ExtractFilePages.php
	ExtractFilePageText: http://www.quickpdflibrary.com/help/quickpdf/ExtractFilePageText.php
	Cheers, Ingo
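
A minimal sketch of the "list of pages, then loop" idea above, in C#. It only builds the list of odd page numbers and makes no Quick PDF Library calls. The class and method names (`OddPages`, `OddPageNumbers`) are hypothetical, and the comment only marks where a per-page text extraction call might go.

```csharp
using System;
using System.Collections.Generic;

static class OddPages
{
    // Returns the odd page numbers 1, 3, 5, ... that do not exceed pageCount.
    public static List<int> OddPageNumbers(int pageCount)
    {
        var pages = new List<int>();
        for (int page = 1; page <= pageCount; page += 2)
        {
            pages.Add(page);
        }
        return pages;
    }

    static void Main()
    {
        foreach (int page in OddPageNumbers(7))
        {
            // Assumption: in a real application, the call that reads the text
            // of a single page (for example ExtractFilePageText) would go here,
            // with 'page' as its page argument.
            Console.WriteLine(page);
        }
    }
}
```

If each extracted file holds exactly two pages, the same loop is not needed per file: only page 1 of each file would be read.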

carmined58 (Beginner, Joined: 19 Oct 13, Location: South Dakota) Posted: 25 Nov 13 at 9:07AM
	Hi Ingo,
	First, thank you for the quick response. Second, please forgive my ignorance, but I'm not quite following you. I'll try to explain in detail what I'm doing and what I need.
	I'm extracting pages from two separate PDFs. I've created one application that will extract pages and read text from both. Theoretically, at least. I have the first working fine, whereby each individual page is extracted as a standalone document. The second, however, is turning out to be more difficult.
	For the second type of document I'm extracting pages from a 100-page document. The first and second pages need to be extracted as one document, the third and fourth pages as one document, the fifth and sixth pages as one document, and so on.
	To extract the pages for the first document I use:
	    FuncReturnResult = DPL.ExtractFilePages (strFileName, "", strPathName + "\" + strPage, CStr(intPageno))
	I do an initial page count, then loop while the page number is <= the page count:
	    Do While intPageno <= intPageCount...For i As integer = 1 to intPageCount
	"intPageno" is incremented as we progress through the loop. Using this variable as an ExtractFilePages parameter obviously doesn't work when I need to extract two pages each time. So that's my first issue: how to get 2 pages extracted in each iteration of the loop.
	I'm confident that if I can extract 2 pages and create 1 document from them, I'll be able to use ExtractFilePageText to extract only the text on page 1 of each document.
	Having explained my issue, would you please be so kind as to tell me how I could go about extracting 2 pages from the 100-page document, then the next 2, then the next 2, and so on? Maybe provide an example?
	Thank you, Ingo!

AndrewC (Moderator Group, Joined: 08 Dec 10, Location: Geelong, Aust) Posted: 26 Nov 13 at 1:46AM
	Carmine,
	This will be the most efficient way to process the file.
	Andrew.

	    QP.LoadFromFile("100pages.pdf", "");
	    int mainDocID = QP.SelectedDocument();
	    int pageCount = QP.PageCount(); // Read the page count once, before other documents are created.

	    for (int i = 1; i <= pageCount; i += 2)
	    {
	        string range = i.ToString() + "-" + (i + 1).ToString(); // "1-2", "3-4", ...
	        QP.NewDocument();                    // Has an empty page 1.
	        QP.CopyPageRanges(mainDocID, range); // Append the 2 pages from the range.
	        QP.DeletePages(1, 1);                // Remove the first, empty page.
	        QP.SaveToFile("100_pages_" + range + ".pdf");
	        QP.RemoveDocument(QP.SelectedDocument());
	    }

	    QP.RemoveDocument(mainDocID);
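
The loop above assumes an even page count, so the last range never runs past the end of the file. A rough sketch of the same range arithmetic with a guard for an odd page count follows. The names `PairRanges` and `Build` are hypothetical, and it is an assumption (not something stated in this thread) that a leftover final page should become a one-page document. The helper only produces the range strings and makes no Quick PDF Library calls.

```csharp
using System;
using System.Collections.Generic;

static class PairRanges
{
    // Builds "1-2", "3-4", ... for the given page count.
    // If pageCount is odd, the final entry is a single page number.
    public static List<string> Build(int pageCount)
    {
        var ranges = new List<string>();
        for (int first = 1; first <= pageCount; first += 2)
        {
            // Clamp the end of the range to the last page of the source.
            int last = Math.Min(first + 1, pageCount);
            ranges.Add(first == last
                ? first.ToString()
                : first.ToString() + "-" + last.ToString());
        }
        return ranges;
    }

    static void Main()
    {
        // Prints: 1-2, 3-4, 5
        Console.WriteLine(string.Join(", ", Build(5)));
    }
}
```

Each string in the returned list could then be passed wherever the loop uses `range`, both as the page range and as part of the output file name.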

carmined58 (Beginner, Joined: 19 Oct 13, Location: South Dakota) Posted: 26 Nov 13 at 8:39AM
	Hi Andrew,
	Thanks so much for the reply and information. I'm using VB and am not familiar enough with C#. Regardless, your example struck a chord and I'm now able to extract the pages as necessary.
	Thank you very much!