XOR on large files in Python

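I want to apply an XOR operation to a large number of files, some of which are very large.
Basically I take a file and XOR it byte by byte (or at least that is what I think I am doing). When it hits a larger file (around 70 MB), I get an out-of-memory error and my script crashes.
My computer has 16 GB of RAM with more than 50% of it free, so I would not attribute this to my hardware.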

def xor3(source_file, target_file):
    b = bytearray(open(source_file, 'rb').read())
    for i in range(len(b)):
        b[i] ^= 0x71
    open(target_file, 'wb').write(b)

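I tried reading the file in chunks, but it seems I am too inexperienced for this, because the output is not the desired one. The first function does return what I want, of course :)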

def xor(data):
    b = bytearray(data)
    for i in range(len(b)):
        b[i] ^= 0x41
    return data


def xor4(source_file, target_file):
    with open(source_file, 'rb') as ifile:
        with open(target_file, 'w+b') as ofile:
            data = ifile.read(1024*1024)
            while data:
                ofile.write(xor(data))
                data = ifile.read(1024*1024)

What is the appropriate solution for this kind of operation? What am I doing wrong?


2016-11-10 Vlad M

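What are the contents of the file? –

In one of the functions you are XORing with '0x71', in the other with '0x41'. Is that what you intended? Obviously this changes the result... – Bakuriu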

@Bakuriu I was just trying it out on files with different keys. –

Use the seek function to get the file in chunks, lazily appending each one to the output file:

CHUNK_SIZE = 1000  # for example

with open(source_file, 'rb') as source:
    # binary append mode, so the bytearray can be written as-is
    with open(target_file, 'ab') as target:
        # note: this reads and processes a single chunk only
        data = bytearray(source.read(CHUNK_SIZE))
        source.seek(CHUNK_SIZE)

        for i in range(len(data)):
            data[i] ^= 0x71

        target.write(data)


2016-11-10 14:39:51

I tried this, and it seems to process only the first chunk of the file. If I manage to fix it, I will update here. –
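
For reference, a minimal sketch of how the chunked approach might be extended to cover the whole file: keep calling read() in a loop until it returns nothing. No seek call is needed, since read() already advances the file position. The name xor_file and its key and chunk_size parameters are assumptions made for illustration and are not part of the answer above.

```python
def xor_file(source_file, target_file, key=0x71, chunk_size=1024 * 1024):
    """XOR every byte of source_file with key and write the result to target_file.

    Hypothetical helper: only one chunk is held in memory at a time.
    """
    with open(source_file, 'rb') as source, open(target_file, 'wb') as target:
        while True:
            chunk = bytearray(source.read(chunk_size))
            if not chunk:  # an empty read means end of file
                break
            for i in range(len(chunk)):
                chunk[i] ^= key
            target.write(chunk)
```

The output file is opened with 'wb' rather than an append mode here, so running the function twice overwrites the target instead of growing it.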

Iterate over the large file.

from operator import xor
from functools import partial

def chunked(file, chunk_size):
    # iterator over successive chunks; stops when read() returns b''
    return iter(lambda: file.read(chunk_size), b'')

myoperation = partial(xor, 0x71)

with open(source_file, 'rb') as source, open(target_file, 'ab') as target:
    processed = (map(myoperation, bytearray(data)) for data in chunked(source, 65536))
    for data in processed:
        target.write(bytearray(data))


2016-11-10 14:48:00

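Note: you should use b'' rather than '' as the sentinel value. Also, the target file should be opened in binary mode too: 'ab' instead of 'a'. These two simple changes make the code run on both Python 2 and Python 3. – Bakuriu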

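Unless I am mistaken, in your second example you create a copy of data by calling bytearray and assigning it to b. You then modify b, but return data. The modifications made to b have no effect on data itself.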
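
A minimal sketch of what the fix might look like: return the modified copy b instead of the untouched data. The key parameter is an assumption added for illustration; the original function hard-codes 0x41.

```python
def xor(data, key=0x41):
    """Return a new bytearray holding each byte of data XORed with key."""
    b = bytearray(data)  # mutable copy; data itself is left untouched
    for i in range(len(b)):
        b[i] ^= key
    return b  # return the modified copy, not the original data
```

With this change, ofile.write(xor(data)) in xor4 would write the XORed bytes rather than the original ones.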


2016-11-10 14:58:05 data

That is very true! Thanks for stopping by. –

I am afraid this only works in Python 2, which shows once again that it is much better to work with byte streams:

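def xor(infile, outfile, val=0x71, chunk=1024):
    # Python 2 only: read() returns str here, so each character goes through ord()/chr()
    # binary modes avoid newline translation on platforms that perform it
    with open(infile, 'rb') as inf:
        with open(outfile, 'wb') as outf:
            c = inf.read(chunk)
            while c != '':
                s = "".join([chr(ord(cc) ^ val) for cc in c])
                outf.write(s)
                c = inf.read(chunk)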
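
For comparison, a rough Python 3 sketch of the same idea that stays with bytes throughout. It builds a 256-entry lookup table once and lets bytes.translate apply it to each chunk, which avoids a Python-level loop per byte. The name xor_translate is hypothetical, and this is one possible approach rather than the method used in the answer above.

```python
def xor_translate(infile, outfile, val=0x71, chunk=64 * 1024):
    """XOR each byte of infile with val, writing to outfile (Python 3 only)."""
    # Lookup table: position i holds i ^ val
    table = bytes(i ^ val for i in range(256))
    with open(infile, 'rb') as inf, open(outfile, 'wb') as outf:
        # iter() with a sentinel stops once read() returns b'' at end of file
        for c in iter(lambda: inf.read(chunk), b''):
            outf.write(c.translate(table))
```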


2016-11-10 15:49:37
