2014-10-09

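Python 2.6 Cassandra 2.0.1 ReadTimeout

I need to read every row in a table (more than a million rows). I have read about paging (http://www.datastax.com/dev/blog/datastax-python-driver-2-0-released), but it did not help much. The code is fairly straightforward: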

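from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.policies import (ConstantReconnectionPolicy, RetryPolicy,
                                RoundRobinPolicy)
from cassandra.query import SimpleStatement

...
retry = RetryPolicy()
# Note: RetryPolicy.RETRY is a decision constant, not a retry count,
# so this assignment does not make the driver retry 10 times.
retry.RETRY = 10
cluster = Cluster(
    [ ... ],
    reconnection_policy=ConstantReconnectionPolicy(5.0, 100),
    auth_provider=auth_provider,
    load_balancing_policy=RoundRobinPolicy(),
    default_retry_policy=retry,
    port=9042)
session = cluster.connect("test")
session.default_timeout = 9999
session.default_fetch_size = 1000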

... 
... 

# The per-statement fetch_size (100) takes precedence over session.default_fetch_size.
uname_stmt = SimpleStatement(q, fetch_size=100)
uname_stmt.consistency_level = ConsistencyLevel.ONE

for row in session.execute(uname_stmt): 
    ... 

After roughly 5 minutes (sometimes 1 minute, sometimes 10), the final for loop raises this error:

Traceback (most recent call last):
  File "test.py", line 67, in <module>
    for row in session.execute(uname_stmt):
  File "/usr/lib/python2.6/site-packages/cassandra/cluster.py", line 2939, in next
    result = self.response_future.result(self.timeout)
  File "/usr/lib/python2.6/site-packages/cassandra/cluster.py", line 2771, in result
    raise self._final_exception
cassandra.ReadTimeout: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'received_responses': 0, 'data_retrieved': False, 'required_responses': 1, 'consistency': 1}

Any help would be great! Thanks!
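The traceback shows the exception surfacing from `next`, that is, while the loop is pulling a later page of results rather than when the query is first issued. The sketch below is a toy model of that situation, not the driver's API: `FakeReadTimeout`, `paged_rows`, `make_flaky_source`, and the `fetch_page(paging_state)` callable are all hypothetical names invented here so the code runs without a cluster. It shows one way a per-page retry could be structured; it would not remove the underlying cause of a server-side timeout.

```python
import time


class FakeReadTimeout(Exception):
    """Hypothetical stand-in for cassandra.ReadTimeout (no cluster needed)."""


def paged_rows(fetch_page, max_retries=3, backoff_seconds=0.0):
    """Yield rows page by page, retrying a page that times out.

    fetch_page(paging_state) is an assumed callable that returns
    (rows, next_paging_state); next_paging_state is None on the last page.
    """
    paging_state = None
    while True:
        attempt = 0
        while True:
            try:
                rows, next_state = fetch_page(paging_state)
                break
            except FakeReadTimeout:
                attempt += 1
                if attempt > max_retries:
                    raise  # give up and let the caller see the timeout
                time.sleep(backoff_seconds * attempt)  # linear backoff
        for row in rows:
            yield row
        if next_state is None:
            return
        paging_state = next_state


def make_flaky_source(data, fetch_size, fail_on_calls):
    """Build a fake fetch_page over a list; it raises on the given call numbers."""
    calls = {"n": 0}

    def fetch_page(paging_state):
        calls["n"] += 1
        if calls["n"] in fail_on_calls:
            raise FakeReadTimeout("simulated timeout on call %d" % calls["n"])
        start = paging_state or 0
        end = start + fetch_size
        next_state = end if end < len(data) else None
        return data[start:end], next_state

    return fetch_page, calls


# Ten rows in pages of four; the second request times out once and is retried.
source, calls = make_flaky_source(list(range(10)), fetch_size=4, fail_on_calls={2})
result = list(paged_rows(source))
print(result, calls["n"])
```

Because the retry restarts only the failed page, rows already yielded are not read again. In this toy model the paging state is just a list offset; a real paging token is opaque.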

Answers


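This might be because Cassandra is trying to compact all the SSTables.

That would explain why the read has to touch many SSTables and times out.

Cassandra uses compaction to manage the accumulation of SSTables on disk.

Running the compact command might help.

nodetool compact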
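To illustrate the reasoning in this answer, here is a deliberately simplified toy model, not how Cassandra is actually implemented. It assumes each SSTable is a plain dict mapping a key to a `(timestamp, value)` pair and that a read must consult every table; the names `read` and `compact` are made up for this sketch. Real SSTables also use structures such as bloom filters and indexes that this model ignores.

```python
def read(sstables, key):
    """Return (value, tables_checked): the newest value for key across all tables."""
    newest = None
    checked = 0
    for table in sstables:
        checked += 1
        cell = table.get(key)
        if cell is not None and (newest is None or cell[0] > newest[0]):
            newest = cell
    value = newest[1] if newest else None
    return value, checked


def compact(sstables):
    """Merge all tables into one, keeping the newest (timestamp, value) per key."""
    merged = {}
    for table in sstables:
        for key, cell in table.items():
            if key not in merged or cell[0] > merged[key][0]:
                merged[key] = cell
    return [merged]


# Three toy "SSTables"; the key "alice" was written twice at different times.
sstables = [
    {"alice": (1, "v1"), "bob": (1, "b1")},
    {"alice": (2, "v2")},
    {"carol": (3, "c1")},
]
before = read(sstables, "alice")
after = read(compact(sstables), "alice")
print(before, after)
```

In this model the same read returns the same value before and after compaction, but it checks three tables before and only one after, which is the effect the answer is hoping for.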


Thanks for the reply. I have already tried that and it did not help :( I also installed the latest Cassandra and hit the same problem. – Pavel 2014-11-03 15:55:51