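Python loop count stops at 67381

I assume this must be a memory issue, but I'm not sure. The program loops through PDFs looking for corrupted files. When a file is corrupted, it writes that file's location to a txt file for me to review later. When I first ran it, I logged both the pass and the fail cases. After 67,381 log entries, it stopped. I then changed the logic so that it only logs errors, but I print a loop count to the console so I can tell how far along the program is. There are about 190,000 files to loop through, and every time it stops at 67381. It looks like the Python program is still running in the background, because memory and CPU keep fluctuating, but it's hard to be sure. I also don't know whether it is still writing errors to the log.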
Here is the code:
import PyPDF2, os
from time import gmtime, strftime

path = raw_input("Enter folder path of PDF files:")
t = open(r'c:\pdf_check\log.txt', 'w')
count = 1
for dirpath, dnames, fnames in os.walk(path):
    for file in fnames:
        print count
        count = count + 1
        if file.endswith(".pdf"):
            file = os.path.join(dirpath, file)
            try:
                PyPDF2.PdfFileReader(open(file, "rb"))
            except PyPDF2.utils.PdfReadError:
                curdate = strftime("%Y-%m-%d %H:%M:%S", gmtime())
                t.write(str(curdate) + " " + "-" + " " + file + " " + "-" + " " + "fail" + "\n")
            else:
                pass
                #curdate = strftime("%Y-%m-%d %H:%M:%S", gmtime())
                #t.write(str(curdate) + " " + "-" + " " + file + " " + "-" + " " + "pass" + "\n")
t.close()
Edit 1: new code, same problem:
import PyPDF2, os
from time import gmtime, strftime

path = raw_input("Enter folder path of PDF files:")
t = open(r'c:\pdf_check\log.txt', 'w')
count = 1
for dirpath, dnames, fnames in os.walk(path):
    for file in fnames:
        print count
        count = count + 1
        if file.endswith(".pdf"):
            file = os.path.join(dirpath, file)
            try:
                with open(file, 'rb') as f:
                    PyPDF2.PdfFileReader(f)
            except PyPDF2.utils.PdfReadError:
                curdate = strftime("%Y-%m-%d %H:%M:%S", gmtime())
                t.write(str(curdate) + " " + "-" + " " + file + " " + "-" + " " + "fail" + "\n")
                f.close()
            else:
                pass
                f.close()
                #curdate = strftime("%Y-%m-%d %H:%M:%S", gmtime())
                #t.write(str(curdate) + " " + "-" + " " + file + " " + "-" + " " + "pass" + "\n")
t.close()
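Since it is unclear whether the log is still being written once the counter stalls, one way to narrow this down might be to print each path before it is parsed and to flush the log after every entry. The following is only a sketch, written for Python 3 rather than the Python 2 used above; `scan` and its `check` parameter are hypothetical names, and `check` stands in for whatever call validates a single PDF.

```python
import os
import sys


def scan(root, check, log_path):
    """Walk root, call check(path) on every .pdf file, and log failures.

    check is a hypothetical callable that raises an exception for a bad file.
    Returns the number of files visited (PDF or not).
    """
    count = 0
    with open(log_path, "w") as log:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                count += 1
                if not name.endswith(".pdf"):
                    continue
                path = os.path.join(dirpath, name)
                # Print the path *before* parsing: if the loop stalls,
                # the last console line names the file being processed.
                print(count, path)
                sys.stdout.flush()
                try:
                    check(path)
                except Exception as exc:  # broad on purpose for this sketch
                    log.write("%s - fail - %s\n" % (path, exc))
                    # Flush so the entry reaches the file even if a later
                    # file never finishes parsing.
                    log.flush()
    return count
```

With the PyPDF2 version used in this question, `check` could be a small function that opens the file in a `with` block and passes it to `PyPDF2.PdfFileReader`. If the counter then stops at 67381 again, the last printed path would point at the file the parser is stuck on.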
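Edit 2: I'm now trying to run this from a different machine with more powerful hardware and a different version of Windows (10 Pro instead of Server 2008 R2), but I don't think that's the problem.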
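`PyPDF2.PdfFileReader(open(file, "rb"))` does not guarantee that the file gets closed. Use a context manager so the file handle is closed (it can't hurt). –
How does it stop? Silently? –
Yes, it just freezes. The Python process is still running in Task Manager and its CPU and memory usage keep changing, but nothing happens even after waiting a long time. – HMan06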
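To illustrate the context-manager comment above: a `with` block closes the file whether or not the parser raises, so no explicit `f.close()` is needed in either branch. This is a minimal sketch; `try_parse` and `parse` are hypothetical names, and `ValueError` is only a stand-in for `PyPDF2.utils.PdfReadError`.

```python
def try_parse(path, parse):
    """Return (ok, closed): whether parse succeeded and whether the file is closed."""
    handle = None
    try:
        with open(path, "rb") as f:
            handle = f
            parse(f)  # hypothetical parser; may raise ValueError
    except ValueError:
        # The with block has already closed the file by the time we get here.
        return False, handle.closed
    return True, handle.closed
```

In both the success and the failure case the second element should be `True`, which suggests the file handles themselves are not what accumulates in the Edit 1 version.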