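Add text to an existing PDF using Python

I need to add some extra text to an existing PDF using Python. What is the best way to go about this, and which extra modules will I need to install?

Note: ideally I would like to be able to run this on both Windows and Linux, but at a push a Linux-only solution will do.

Thanks in advance.
Richard.

Edit: pyPDF and ReportLab look good, but neither will allow me to edit an existing PDF. Are there any other options?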

2009-07-24 Frozenskys

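I know this is an older post, but I spent a long time trying to find a solution. I came across a decent one that uses only ReportLab and PyPDF, so I thought I would share it:

1. Read your PDF using PdfFileReader(); we'll call this input.
2. Create a new PDF containing the text you want to add using ReportLab, and save it as a string object.
3. Read the string object using PdfFileReader(); we'll call this text.
4. Create a new PDF object using PdfFileWriter(); we'll call this output.
5. Iterate through input and apply .mergePage(text.getPage(0)) to each page you want the text added to, then use output.addPage() to add the modified pages to a new document.

This works well for simple text additions. See PyPDF's sample for watermarking a document.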

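Here is some code to answer the question below:

import StringIO  # Python 2; on Python 3 use io.BytesIO instead
from pyPdf import PdfFileReader
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

packet = StringIO.StringIO()  # in-memory buffer that will hold the new PDF
can = canvas.Canvas(packet, pagesize=letter)
# ... draw on the canvas here, e.g. can.drawString(x, y, "some text") ...
can.save()
packet.seek(0)  # rewind so the reader starts at the beginning of the buffer
input_pdf = PdfFileReader(packet)

From here you can merge the pages of this input file with another document.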

2010-02-01 23:28:31 dwelch

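You may have better luck converting the PDF into an editable format, writing your changes, and then converting it back into a PDF. I don't know of a library that lets you edit PDFs directly, but there are plenty of converters between DOC and PDF.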

2009-07-24 21:03:21 aehlke

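The problem is that I only have the PDF as a source (from a third party), and PDF -> DOC -> PDF loses a lot in conversion. I also need this to run on Linux, so DOC may not be the best choice. – Frozenskys 2009-07-24 21:08:21

I believe Adobe keeps PDF editing capabilities very closed and proprietary so that they can sell licenses for the higher-end versions of Acrobat. Maybe you can find a way to automate Acrobat Pro to edit it, using some kind of macro interface. – aehlke 2009-07-24 21:14:45

If the parts you want to write to are form fields, there are XML interfaces for editing them; otherwise I can't find anything. – aehlke 2009-07-24 21:15:57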

Have you tried pyPdf?

Sorry, it has no ability to modify a page's content.

2009-07-24 21:13:14 Zoman

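Looks like it might work. Has anyone used it? What is the memory usage like? – Frozenskys 2009-07-24 21:17:37

It does have a function for adding text watermarks, which might work if the text is formatted correctly. – Frozenskys 2009-07-24 21:24:41

If you're on Windows, this might work:

PDF Creator Pilot

There is also a white paper on a PDF creation and editing framework in Python. It is a little dated, but it may give you some useful information: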

Using Python as PDF Editing and Processing Framework

2009-07-24 21:14:54 thedz

Here is a complete answer that I found elsewhere [Python 2.7]:

from pyPdf import PdfFileWriter, PdfFileReader 
import StringIO 
from reportlab.pdfgen import canvas 
from reportlab.lib.pagesizes import letter 

packet = StringIO.StringIO() 
# create a new PDF with Reportlab 
can = canvas.Canvas(packet, pagesize=letter) 
can.drawString(10, 100, "Hello world") 
can.save() 

# move to the beginning of the StringIO buffer
packet.seek(0) 
new_pdf = PdfFileReader(packet) 
# read your existing PDF 
existing_pdf = PdfFileReader(file("original.pdf", "rb")) 
output = PdfFileWriter() 
# add the "watermark" (which is the new pdf) on the existing page 
page = existing_pdf.getPage(0) 
page.mergePage(new_pdf.getPage(0)) 
output.addPage(page) 
# finally, write "output" to a real file 
outputStream = file("destination.pdf", "wb") 
output.write(outputStream) 
outputStream.close()

Here is an update for Python 3.x:

from PyPDF2 import PdfFileWriter, PdfFileReader 
import io 
from reportlab.pdfgen import canvas 
from reportlab.lib.pagesizes import letter 

packet = io.BytesIO() 
# create a new PDF with Reportlab 
can = canvas.Canvas(packet, pagesize=letter) 
can.drawString(10, 100, "Hello world") 
can.save() 

# move to the beginning of the BytesIO buffer
packet.seek(0) 
new_pdf = PdfFileReader(packet) 
# read your existing PDF 
existing_pdf = PdfFileReader(open("original.pdf", "rb")) 
output = PdfFileWriter() 
# add the "watermark" (which is the new pdf) on the existing page 
page = existing_pdf.getPage(0) 
page.mergePage(new_pdf.getPage(0)) 
output.addPage(page) 
# finally, write "output" to a real file 
outputStream = open("destination.pdf", "wb") 
output.write(outputStream) 
outputStream.close()
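
Note that PyPDF2 3.0 removed the PdfFileReader and PdfFileWriter names used above, and the project continues under the name pypdf. The following is a minimal sketch of the same overlay technique written against the pypdf API (PdfReader, PdfWriter, merge_page, add_page), extended to stamp every page rather than only the first. The function name add_text_to_pdf and its parameters are hypothetical, and the sketch assumes letter-sized pages, as the snippets above do.

```python
import io


def add_text_to_pdf(src_path, dst_path, text, x=10, y=100):
    """Overlay `text` at (x, y) on every page of src_path; return the page count.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Third-party imports are kept local so that this module can be imported
    # even where pypdf and reportlab are not installed.
    from pypdf import PdfReader, PdfWriter
    from reportlab.lib.pagesizes import letter
    from reportlab.pdfgen import canvas

    # Build a one-page PDF in memory that contains only the new text.
    packet = io.BytesIO()
    can = canvas.Canvas(packet, pagesize=letter)  # assumption: letter-sized pages
    can.drawString(x, y, text)
    can.save()
    packet.seek(0)  # rewind the BytesIO buffer before reading it back
    overlay_page = PdfReader(packet).pages[0]

    reader = PdfReader(src_path)
    writer = PdfWriter()
    for page in reader.pages:
        page.merge_page(overlay_page)  # draw the overlay on top of the page
        writer.add_page(page)

    with open(dst_path, "wb") as output_stream:
        writer.write(output_stream)
    return len(reader.pages)
```

Calling add_text_to_pdf("original.pdf", "destination.pdf", "Hello world") would mirror the snippets above, except that every page receives the text and the output file is closed by the with block.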

2013-07-09 00:16:42

cpdf will do the job from the command line. It isn't Python, though (as far as I know):

cpdf -add-text "Line of text" input.pdf -o output.pdf
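
Although cpdf is not a Python library, the command above can be driven from a Python script through the standard subprocess module. This is only a sketch: the function names build_cpdf_command and add_text_with_cpdf are hypothetical, it assumes a cpdf executable is available on PATH, and it uses no cpdf options beyond the ones shown in the command above.

```python
import shutil
import subprocess


def build_cpdf_command(text, input_path, output_path):
    """Return the argument list for the cpdf call shown above."""
    # Passing a list (not a shell string) avoids quoting problems in `text`.
    return ["cpdf", "-add-text", text, input_path, "-o", output_path]


def add_text_with_cpdf(text, input_path, output_path):
    """Run cpdf to add `text`; raises if cpdf is missing or exits non-zero."""
    if shutil.which("cpdf") is None:
        raise FileNotFoundError("cpdf executable not found on PATH")
    subprocess.run(build_cpdf_command(text, input_path, output_path), check=True)
```

Keeping the command construction separate from the call that runs it makes it easy to inspect or log the exact arguments before anything is executed.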

2014-03-05 11:51:36 user2243670

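pdfrw will let you read in pages from an existing PDF and draw them onto a ReportLab canvas (similar to drawing an image). There are examples of this in the examples/rl1 subdirectory of pdfrw on GitHub. Disclaimer: I am the pdfrw author.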

2015-07-11 04:47:00

Building on David Dehghan's answer above, the following works in Python 2.7.13:

from PyPDF2 import PdfFileWriter, PdfFileReader
import StringIO
from reportlab.pdfgen import canvas 
from reportlab.lib.pagesizes import letter 

packet = StringIO.StringIO() 
# create a new PDF with Reportlab 
can = canvas.Canvas(packet, pagesize=letter) 
can.drawString(290, 720, "Hello world") 
can.save() 

# move to the beginning of the StringIO buffer
packet.seek(0) 
new_pdf = PdfFileReader(packet) 
# read your existing PDF 
existing_pdf = PdfFileReader("original.pdf") 
output = PdfFileWriter() 
# add the "watermark" (which is the new pdf) on the existing page 
page = existing_pdf.getPage(0) 
page.mergePage(new_pdf.getPage(0)) 
output.addPage(page) 
# finally, write "output" to a real file 
outputStream = open("destination.pdf", "wb") 
output.write(outputStream) 
outputStream.close()
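
The snippets in this thread place the text at different coordinates: (10, 100) earlier and (290, 720) here. ReportLab's drawString takes coordinates in points (1/72 inch) measured from the bottom-left corner of the page, and the letter page size is 612 x 792 points, so y = 720 lands one inch below the top edge. The helper below is a small sketch for converting a position given in inches from the top-left corner into those coordinates; the name from_top_left and its parameters are hypothetical and not part of ReportLab.

```python
POINTS_PER_INCH = 72.0
LETTER = (612.0, 792.0)  # width and height of a letter page, in points


def from_top_left(x_inches, y_inches, page_size=LETTER):
    """Convert inches measured from the top-left corner into PDF points
    measured from the bottom-left corner (the convention drawString uses)."""
    width, height = page_size
    x = x_inches * POINTS_PER_INCH
    y = height - y_inches * POINTS_PER_INCH  # flip the vertical axis
    if not (0 <= x <= width and 0 <= y <= height):
        raise ValueError("position lies outside the page")
    return x, y


# One inch in from the left edge and one inch down from the top edge.
print(from_top_left(1, 1))  # (72.0, 720.0)
```

The returned pair can be passed straight to can.drawString(x, y, "Hello world") in any of the snippets above.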

2017-04-22 21:52:28