2011-08-30 97 views
6

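iTextSharp: trimming pages from a PDF document

I have a PDF document with form fields that I fill in programmatically with C#. Depending on three conditions, I need to trim (delete) some pages from that document.

Is this possible?

Condition 1: I need to keep pages 1-4 but delete pages 5 and 6.

Condition 2: I need to keep pages 1-4, delete page 5, and keep page 6.

Condition 3: I need to keep pages 1-5 but delete page 6.

Answers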

5

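Rather than deleting pages from a document, what you actually do is create a new document and import only the pages you want to keep. Below is a complete WinForms application that does this (targeting iTextSharp 5.1.1.0). The last parameter of the function removePagesFromPdf is an array of the pages to keep.

The code below works on physical files, but it would be easy to convert to something stream-based so that you don't have to write to disk if you don't want to.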

using System; 
using System.ComponentModel; 
using System.IO; 
using System.Linq; 
using System.Windows.Forms; 
using iTextSharp.text.pdf; 
using iTextSharp.text; 


namespace Full_Profile1 
{ 
    public partial class Form1 : Form 
    { 
     public Form1() 
     { 
      InitializeComponent(); 
     } 

     private void Form1_Load(object sender, EventArgs e) 
     { 
      //The files that we are working with 
      string sourceFolder = Environment.GetFolderPath(Environment.SpecialFolder.Desktop); 
      string sourceFile = Path.Combine(sourceFolder, "Test.pdf"); 
      string destFile = Path.Combine(sourceFolder, "TestOutput.pdf"); 

      //Remove all pages except 1,2,3,4 and 6 
      removePagesFromPdf(sourceFile, destFile, 1, 2, 3, 4, 6); 
      this.Close(); 
     } 
     public void removePagesFromPdf(String sourceFile, String destinationFile, params int[] pagesToKeep) 
     { 
      //Used to pull individual pages from our source 
      PdfReader r = new PdfReader(sourceFile); 
      try 
      { 
       //Create our destination file 
       using (FileStream fs = new FileStream(destinationFile, FileMode.Create, FileAccess.Write, FileShare.None)) 
       { 
        using (Document doc = new Document()) 
        { 
         using (PdfWriter w = PdfWriter.GetInstance(doc, fs)) 
         { 
          //Open the destination for writing 
          doc.Open(); 
          //Loop through each page that we want to keep 
          foreach (int page in pagesToKeep) 
          { 
           //Match the destination page size to the source page (the default is A4) 
           doc.SetPageSize(r.GetPageSize(page)); 
           //Add a new blank page to destination document 
           doc.NewPage(); 
           //Extract the given page from our reader and add it directly to the destination PDF 
           w.DirectContent.AddTemplate(w.GetImportedPage(r, page), 0, 0); 
          } 
          //Close our document 
          doc.Close(); 
         } 
        } 
       } 
      } 
      finally 
      { 
       //Release the source file once the destination has been written 
       r.Close(); 
      } 
     } 
    } 
} 
2

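Here is the code I use to copy every page except the last one from an existing PDF. Everything happens in memory streams. The variable pdfByteArray is the byte[] of the original PDF, obtained with ms.ToArray(). pdfByteArray is then overwritten with the new PDF.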

PdfReader originalPDFReader = new PdfReader(pdfByteArray);

using (MemoryStream msCopy = new MemoryStream())
{
    using (Document docCopy = new Document())
    {
        using (PdfCopy copy = new PdfCopy(docCopy, msCopy))
        {
            docCopy.Open();
            // Copy every page except the last one
            for (int pageNum = 1; pageNum <= originalPDFReader.NumberOfPages - 1; pageNum++)
            {
                copy.AddPage(copy.GetImportedPage(originalPDFReader, pageNum));
            }
            docCopy.Close();
        }
    }

    // Replace the original bytes with the trimmed PDF
    pdfByteArray = msCopy.ToArray();
}

// Release the reader once the copy is complete
originalPDFReader.Close();
14

Use PdfReader.SelectPages() in combination with PdfStamper. The code below uses iTextSharp 5.5.1.

public void SelectPages(string inputPdf, string pageSelection, string outputPdf) 
{ 
    using (PdfReader reader = new PdfReader(inputPdf)) 
    { 
     reader.SelectPages(pageSelection); 

     using (PdfStamper stamper = new PdfStamper(reader, File.Create(outputPdf))) 
     { 
      stamper.Close(); 
     } 
    } 
} 

Then call this method with the appropriate page selection for each condition.

Condition 1:

SelectPages(inputPdf, "1-4", outputPdf); 

Condition 2 (either of the two selections works):

SelectPages(inputPdf, "1-4,6", outputPdf); 

SelectPages(inputPdf, "1-6,!5", outputPdf); 

Condition 3:

SelectPages(inputPdf, "1-5", outputPdf); 
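
If the filled-in form is already held in memory, the same approach can probably be used without writing to disk. The following is only a sketch under assumptions: the class PdfTrimSketch and the method TrimPdf are hypothetical names that do not exist in iTextSharp, and the code assumes iTextSharp 5.5.x, where PdfReader can be placed in a using statement as in the method above.

```csharp
using System.IO;
using iTextSharp.text.pdf;

public static class PdfTrimSketch
{
    // Hypothetical helper (not part of iTextSharp): returns a copy of the PDF
    // that contains only the pages named in pageSelection, e.g. "1-4,6".
    public static byte[] TrimPdf(byte[] sourcePdf, string pageSelection)
    {
        using (PdfReader reader = new PdfReader(sourcePdf))
        using (MemoryStream output = new MemoryStream())
        {
            // Restrict the reader to the requested pages
            reader.SelectPages(pageSelection);

            // The stamper writes the reader's remaining pages to the stream
            PdfStamper stamper = new PdfStamper(reader, output);
            stamper.Close();

            // ToArray still works after the stamper has closed the stream
            return output.ToArray();
        }
    }
}
```

For condition 2, a call such as PdfTrimSketch.TrimPdf(pdfBytes, "1-4,6") would return the trimmed document as a byte array, where pdfBytes stands for whatever byte[] holds the filled-in PDF.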

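Below is the comment from the iTextSharp source code that describes what makes up a page selection. It comes from the SequenceList class, which is used to process page selections: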

/** 
* This class expands a string into a list of numbers. The main use is to select a 
* range of pages. 
* <p> 
* The general systax is:<br> 
* [!][o][odd][e][even]start-end 
* <p> 
* You can have multiple ranges separated by commas ','. The '!' modifier removes the 
* range from what is already selected. The range changes are incremental, that is, 
* numbers are added or deleted as the range appears. The start or the end, but not both, can be ommited. 
*/
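
To make the incremental behaviour of the '!' modifier concrete, here is a small stand-alone sketch. It is not iTextSharp's SequenceList: PageSelectionSketch and Expand are hypothetical names, and the code only handles single numbers, closed ranges such as 1-4, and a leading '!' on a range. The odd/even modifiers and open-ended ranges described in the comment are deliberately left out.

```csharp
using System;
using System.Collections.Generic;

public static class PageSelectionSketch
{
    // Illustrative only: expands a simplified selection such as "1-4,6" or
    // "1-6,!5" into page numbers, processing the ranges from left to right.
    public static List<int> Expand(string selection, int pageCount)
    {
        var pages = new List<int>();
        foreach (string rawPart in selection.Split(','))
        {
            string part = rawPart.Trim();
            if (part.Length == 0) continue;

            // A leading '!' removes the range from what is already selected
            bool exclude = part[0] == '!';
            if (exclude) part = part.Substring(1);

            // "5" is treated as the range 5-5
            string[] bounds = part.Split('-');
            int start = int.Parse(bounds[0]);
            int end = bounds.Length > 1 ? int.Parse(bounds[1]) : start;

            for (int page = start; page <= end; page++)
            {
                // Ignore page numbers outside the document
                if (page < 1 || page > pageCount) continue;
                if (exclude) pages.RemoveAll(p => p == page);
                else pages.Add(page);
            }
        }
        return pages;
    }
}
```

With this sketch, Expand("1-6,!5", 6) and Expand("1-4,6", 6) both give pages 1, 2, 3, 4 and 6, which is why either selection string can be used for condition 2.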