2011-03-31 191 views
1

TFramedTransport error with PHPCassa + Cassandra

We are deleting a large number of records in Cassandra and get the following error.

Error performing remove on 10.130.279.40:9160: exception 'TTransportException' with message 'TSocket: timed out reading 4 bytes from 10.130.279.40:9160' in /home/zonefiles/php/thrift/transport/TSocket.php:268 
    Stack trace: 
    0 /home/zonefiles/php/thrift/transport/TTransport.php(87): TSocket->read(4) 
    1 /home/zonefiles/php/thrift/transport/TFramedTransport.php(135): TTransport->readAll(4) 
    2 /home/zonefiles/php/thrift/transport/TFramedTransport.php(102): TFramedTransport->readFrame() 
    3 [internal function]: TFramedTransport->read(8192) 
    4 /home/zonefiles/php/thrift/packages/cassandra/Cassandra.php(691): thrift_protocol_read_binary(Object(TBinaryProtocolAccelerated), 'cassandra_Cassa...', false) 
    5 /home/zonefiles/php/thrift/packages/cassandra/Cassandra.php(664): CassandraClient->recv_remove() 
    6 [internal function]: CassandraClient->remove('CUSTOMERSERVICE...', Object(cassandra_ColumnPath), 1301555573936295, 1) 
    7 /home/zonefiles/php/connection.php(230): call_user_func_array(Array, Array) 
    8 /home/zonefiles/php/columnfamily.php(582): ConnectionPool->call('remove', 'CUSTOMERSERVICE...', Object(cassandra_ColumnPath), 1301555573936295, 1) 
    9 /home/zonefiles/php/delete.php(34): ColumnFamily->remove('CUSTOMERSERVICE...') 
    10 {main} 
    Error connecting to 10.130.279.40:9160: exception 'TTransportException' with message 'TSocket: timed out reading 4 bytes from 10.130.279.40:9160' in /home/zonefiles/php/thrift/transport/TSocket.php:268 
    Stack trace: 
    0 /home/zonefiles/php/thrift/transport/TTransport.php(87): TSocket->read(4) 
    1 /home/zonefiles/php/thrift/transport/TFramedTransport.php(135): TTransport->readAll(4) 
    2 /home/zonefiles/php/thrift/transport/TFramedTransport.php(102): TFramedTransport->readFrame() 
    3 [internal function]: TFramedTransport->read(8192) 
    4 /home/zonefiles/php/thrift/packages/cassandra/Cassandra.php(1015): thrift_protocol_read_binary(Object(TBinaryProtocolAccelerated), 'cassandra_Cassa...', false) 
    5 /home/zonefiles/php/thrift/packages/cassandra/Cassandra.php(992): CassandraClient->recv_describe_version() 
    6 /home/zonefiles/php/connection.php(63): CassandraClient->describe_version() 
    7 /home/zonefiles/php/connection.php(163): ConnectionWrapper->__construct('CDTMain1', '10.130.279.40:9...', NULL, true, 5000, 5000) 
    8 /home/zonefiles/php/connection.php(254): ConnectionPool->make_conn() 
    9 /home/zonefiles/php/connection.php(241): ConnectionPool->handle_conn_failure(Object(ConnectionWrapper), 'remove', Object(TTransportException), 1) 
    10 /home/zonefiles/php/columnfamily.php(582): ConnectionPool->call('remove', 'CUSTOMERSERVICE...', Object(cassandra_ColumnPath), 1301555573936295, 1) 
    11 /home/zonefiles/php/delete.php(34): ColumnFamily->remove('CUSTOMERSERVICE...') 
    12 {main} 

Here is the PHP we use to produce the error:

<?php 
set_time_limit(2000); 
require 'connection.php'; 
require 'columnfamily.php'; 
$servers[0]['host'] = 'private ip'; 
$servers[0]['port'] = '9160'; 
$conn = new Connection('Server11', $servers); 
$urlFamily = new ColumnFamily($conn, 'Domain'); // ColumnFamily 

$start = microtime(true); 

$limit = 100000000; 

$rows = $urlFamily->get_range($key_start='', $key_finish='zzzzzzzzzzzzzzz',100000000); 

$num = 0; 
$delCount = 0; 

foreach($rows as $key => $columns) { 
    // Do stuff with $key or $columns 
     if (strpos($key, '.net') !== false) { 
       //echo 'deleting ' . $key . "\n"; 
       $urlFamily->remove($key); 
       $delCount++; 
     } 
     if ($num++ > 100000000) break; 
     //$num++; 
     if ($num % 100000 == 0) echo $num . "\n"; 
} 

$end = microtime(true); 

echo $num . " total\n"; 
echo $delCount . ' deleted in ' . ($end - $start) . " seconds\n"; 
echo $delCount/($end - $start) . " deleted per second\n"; 

?> 

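We are running PHP 5.3.5 and Thrift 0.5.0 on Fedora 14 (Laughlin). We also get this error when we insert a large number of records.

One theory is that this is caused by Cassandra not being able to process the commands fast enough. Do you agree or disagree? Have you seen this before?

If you suggest deleting a different way (e.g. truncate), how do we still prevent this problem from happening when we do other things with Cassandra?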

+0

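The purpose of the delete is to remove a list of domains ending in .net. Is there a way to query Cassandra to retrieve all URLs that have .net in them? I know in MySQL it would be something like SELECT * WHERE Domain LIKE '*.net' – dengeltrees 2011-03-31 08:55:10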

Answers

2

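Are those just log messages, or is an exception actually being raised? phpcassa calls error_log() every time it catches an exception like this, before retrying with a different connection. Basically, this means you should keep an eye on the logged stack traces, but you don't have to worry too much about them.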
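To make the log-then-retry behaviour described above concrete, here is a minimal sketch in plain PHP. It is not phpcassa's actual code: `call_with_retries` and the simulated failing operation are hypothetical names invented for illustration, and the real library retries on a different connection rather than simply calling the same function again.

```php
<?php
// Hypothetical helper, not part of phpcassa: run $operation, and on failure
// log the exception with error_log() and try again, up to $maxRetries retries.
function call_with_retries($operation, $maxRetries = 3)
{
    $attempt = 0;
    while (true) {
        try {
            return $operation($attempt);
        } catch (Exception $e) {
            // Logged rather than rethrown: this is why a stack trace can show
            // up in the log even though the call eventually succeeds.
            error_log('Attempt ' . $attempt . ' failed: ' . $e->getMessage());
            if ($attempt >= $maxRetries) {
                throw $e; // retries exhausted, surface the failure to the caller
            }
            $attempt++;
        }
    }
}

// Simulated remove() that times out on its first two attempts (assumption:
// stands in for a real Cassandra call, which is not available here).
$result = call_with_retries(function ($attempt) {
    if ($attempt < 2) {
        throw new RuntimeException('timed out reading 4 bytes');
    }
    return 'removed on attempt ' . $attempt;
});

echo $result . "\n";
```

In this sketch two failures are logged and the third attempt returns normally, so the caller never sees an exception. Only when every retry fails does the exception reach the calling script.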

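These are client-side socket timeouts, which means the call took longer than the default 5-second timeout. Why they happen in the first place depends largely on how Cassandra is behaving. Monitoring Cassandra is probably the best place to start.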
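For readers unfamiliar with what a client-side read timeout looks like, the following sketch reproduces one with PHP's stream functions alone. It assumes a Unix-like system (for `STREAM_PF_UNIX`) and uses a hypothetical 200 ms timeout in place of the 5000 ms seen in the stack trace. A local socket pair stands in for the Cassandra connection, and the 4-byte read mirrors the `readAll(4)` call in the trace, where TFramedTransport waits for a frame header.

```php
<?php
// Create a connected pair of local sockets. Nothing is ever written to $peer,
// so a read on $client can only end by timing out.
$pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
list($client, $peer) = $pair;

// Hypothetical 200 ms receive timeout, split into seconds and microseconds
// because stream_set_timeout() takes the two parts separately.
$recvTimeoutMs = 200;
stream_set_timeout(
    $client,
    (int) floor($recvTimeoutMs / 1000),
    ($recvTimeoutMs % 1000) * 1000
);

$start = microtime(true);
$data = fread($client, 4); // wait for a 4-byte frame header that never arrives
$elapsed = microtime(true) - $start;

// The stream metadata records whether the last read gave up on the timeout.
$meta = stream_get_meta_data($client);
$timedOut = $meta['timed_out'];

if ($timedOut) {
    echo 'timed out reading 4 bytes after ' . round($elapsed, 2) . " seconds\n";
}

fclose($client);
fclose($peer);
```

The point of the sketch is that the timeout is decided entirely on the client: the server may still be working on the request when the client stops waiting.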

+0

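We tried increasing the timeout and we still get these errors. What should I monitor in Cassandra? How do I perform the monitoring? What is your best guess about what Cassandra does after the message is displayed: retry the read/write, skip it, crash? – dengeltrees 2011-04-03 08:31:35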

+0

Could a setting for buffer size or key size, or something in the yaml file, be causing the error? – dengeltrees 2011-04-03 08:37:17

+0

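It is throwing an exception in TSocket.php; this is the line that throws it: throw new TTransportException('TSocket: timed out reading '.$len.' bytes from '.$this->host_.':'.$this->port_); The exception is caught by connection.php, which displays the error message. – dengeltrees 2011-04-03 10:11:28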

0

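According to my programmer, we actually solved this by raising the timeout values to something very high. We were trying to import a 5GB file, so I guess the db needed more than 5 seconds per read.

Here are the specific timeouts that were set (in milliseconds, so 60 seconds each):

$send_timeout = 60000
$recv_timeout = 60000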