2011-06-08 48 views
2

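New to Python (programming) and data storage

I have a question about data storage. I have a program that creates a list of objects. What is the best way to store them in a file so that the program can reload them later? I tried using pickle, but I think I may be heading down the wrong alley, because I keep getting this error when I try to read the data back: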

Traceback (most recent call last):
  File "test.py", line 110, in <module>
    knowledge = pickle.load(open("data.txt"))
  File "/sw/lib/python3.1/pickle.py", line 1356, in load
    encoding=encoding, errors=errors).load()
  File "/sw/lib/python3.1/codecs.py", line 300, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte

Edited to add: here is a bit of the code I tried:

FILE = open("data.txt", "rb") 

knowledge = pickle.load(open("data.txt")) 

FILE = open("data.txt", 'wb') 

pickle.dump(knowledge, FILE) 
+1

Which Python version? How did you create the file? – delnan 2011-06-08 14:54:17

+0

How did you save them? – Nix 2011-06-08 14:54:41

+0

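Try pickle again. Read the documentation carefully! Post some code here and we'll help you find the problem :). You could also use JSON; there are several modules for it. – slezica 2011-06-08 14:55:39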

Answers

0

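If you just want to recreate some class objects later, the simplest solution would be to dump their attributes to a file, then read the file back and create the objects from its contents.

See: http://docs.python.org/tutorial/inputoutput.html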
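As a rough illustration of the attribute-dumping idea, here is a minimal sketch that writes each object's attributes to a JSON file and rebuilds the objects on load. The `Fact` class, its fields, and the file name are hypothetical stand-ins, and the approach assumes every attribute is a plain JSON-compatible value (strings, numbers, lists, dicts).

```python
import json
import os
import tempfile


class Fact:
    """Hypothetical stand-in for one of the program's objects."""

    def __init__(self, subject, value):
        self.subject = subject
        self.value = value


def save_facts(facts, path):
    # vars(obj) returns the instance's attribute dict, which json can store
    # as long as every attribute is a plain JSON-compatible value.
    with open(path, "w", encoding="utf-8") as f:
        json.dump([vars(fact) for fact in facts], f)


def load_facts(path):
    # Rebuild each object by passing the stored attributes to the constructor.
    with open(path, "r", encoding="utf-8") as f:
        return [Fact(**attrs) for attrs in json.load(f)]


path = os.path.join(tempfile.mkdtemp(), "facts.json")
save_facts([Fact("fred", 19.3), Fact("wilma", 2)], path)
restored = load_facts(path)
print([(fact.subject, fact.value) for fact in restored])
```

This only works while the constructor arguments match the attribute names, so nested or self-referencing structures quickly need extra code of their own.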

+0

No, that is not easy. It is a lot of extra typing and it violates DRY (and therefore carries the risk of getting out of sync). – delnan 2011-06-08 14:53:54

+0

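The data structure is fairly complex. The article you linked advises against doing it by hand and recommends pickle. Do you have any idea what might be causing my error? – CGPGrey 2011-06-08 15:02:11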

-1

You can use cPickle or pickle, it doesn't matter. Open the file in binary mode (rb), and then try setting the protocol to -1.

Try this:

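import cPickle  # Python 2 module; on Python 3, import pickle instead

# Write in binary mode; protocol -1 selects the highest available protocol.
my_file = open('wohoo.file', 'wb')

largeObject = Magic()  # insert your logic here
cPickle.dump(largeObject, my_file, -1)
my_file.close()

# Read back in binary mode as well.
other_file = open('wohoo.file', 'rb')
welcomeBack = cPickle.load(other_file)
other_file.close()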
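To make the protocol argument concrete, here is a small Python 3 sketch (the sample dictionary is an arbitrary assumption). It shows that a negative protocol selects `pickle.HIGHEST_PROTOCOL`, and that pickles written with protocol 2 or later begin with the byte 0x80, which is the same byte the traceback in the question complains about.

```python
import pickle

data = {1: 2, "fred": 19.3}  # arbitrary sample data

# A negative protocol number means "use pickle.HIGHEST_PROTOCOL".
newest = pickle.dumps(data, -1)

# Protocols 2 and later start with the PROTO opcode (0x80) followed by the
# protocol number, so the output is binary and is not valid UTF-8 text.
first_byte = newest[0]
protocol_used = newest[1]

# The protocol is recorded in the data, so loading needs no protocol argument.
restored = pickle.loads(newest)

print(hex(first_byte), protocol_used == pickle.HIGHEST_PROTOCOL, restored == data)
```

Whatever protocol is chosen, Python 3 produces bytes, so the file still has to be opened in binary mode for both writing and reading.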
+0

-1, probably wrong; read it again. It does find the file. It can even read it. It just fails to decode it into the encoding Python prefers. – delnan 2011-06-08 14:58:49

9

I think the problem is that the line

knowledge = pickle.load(open("data.txt")) 

does not open the file in binary mode. With Python 3.2:

>>> import pickle 
>>> 
>>> knowledge = {1:2, "fred": 19.3} 
>>> 
>>> with open("data.txt", 'wb') as FILE: 
...  pickle.dump(knowledge, FILE) 
... 
>>> knowledge2 = pickle.load(open("data.txt")) 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/codecs.py", line 300, in decode 
    (result, consumed) = self._buffer_decode(data, self.errors, final) 
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte 
>>> knowledge2 = pickle.load(open("data.txt","rb")) 
>>> knowledge2 
{1: 2, 'fred': 19.3} 
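Put together as a script, the fix might look like the sketch below. The helper names, the sample data, and the temporary file path are assumptions made for illustration; the essential part is that both `open` calls use binary mode and that `with` closes the file.

```python
import os
import pickle
import tempfile


def save_knowledge(knowledge, path):
    # "wb": pickle writes bytes, so the file must be opened in binary mode.
    with open(path, "wb") as f:
        pickle.dump(knowledge, f)


def load_knowledge(path):
    # "rb": text mode would try to decode the bytes as text and fail.
    with open(path, "rb") as f:
        return pickle.load(f)


# Nested built-in containers stand in for the program's list of objects.
original = [
    {"name": "fred", "score": 19.3, "tags": {"a", "b"}},
    {"name": "wilma", "score": 2, "history": [(1, 2), (3, 4)]},
]

path = os.path.join(tempfile.mkdtemp(), "data.pickle")
save_knowledge(original, path)
knowledge = load_knowledge(path)
print(knowledge == original)
```

Instances of classes defined at the top level of an importable module pickle the same way, provided the class can be imported again when the file is loaded.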
1

There is no need to rewrite shelve, Python's object persistence library. For example:

import shelve 

d = shelve.open(filename)  # open -- file may get suffix added by low-level
                           # library

d[key] = data              # store data at key (overwrites old data if
                           # using an existing key)
data = d[key]              # retrieve a COPY of data at key (raise KeyError
                           # if no such key)
del d[key]                 # delete data stored at key (raises KeyError
                           # if no such key)
flag = key in d            # true if the key exists (has_key is Python 2 only)
klist = list(d.keys())     # a list of all existing keys (slow!)

# as d was opened WITHOUT writeback=True, beware:
d['xx'] = [0, 1, 2]        # this works as expected, but...
d['xx'].append(3)          # *this doesn't!* -- d['xx'] is STILL [0, 1, 2]!

# having opened d without writeback=True, you need to code carefully: 
temp = d['xx']  # extracts the copy 
temp.append(5)  # mutates the copy 
d['xx'] = temp  # stores the copy right back, to persist it 

# or, d=shelve.open(filename,writeback=True) would let you just code 
# d['xx'].append(5) and have it work as expected, BUT it would also 
# consume more memory and make the d.close() operation slower. 

d.close()  # close it
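The writeback caveat described in the comments above can be checked with a short Python 3 sketch. The shelf path and the key are arbitrary assumptions, and using `shelve.open` as a context manager requires Python 3.4 or later.

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "knowledge_shelf")

# Without writeback, d["xx"] returns a fresh copy on every access, so
# mutating that copy in place is not saved.
with shelve.open(path) as d:
    d["xx"] = [0, 1, 2]
    d["xx"].append(3)
    without_writeback = d["xx"]

# With writeback=True, accessed entries are cached and written back when
# the shelf is synced or closed, so the in-place change persists.
with shelve.open(path, writeback=True) as d:
    d["xx"].append(3)

with shelve.open(path) as d:
    with_writeback = d["xx"]

print(without_writeback, with_writeback)
```

The cache is what makes `writeback=True` cost extra memory and a slower close, so the explicit "read, modify, store back" pattern is often the lighter choice for large shelves.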