2011-06-02

I am using Pig's CassandraStorage() to insert a large dataset into Cassandra. After running for 4 hours, it crashed with the following exception:

java.lang.NullPointerException 
     at org.apache.cassandra.dht.RandomPartitioner.getToken(RandomPartitioner.java:134) 
     at org.apache.cassandra.dht.RandomPartitioner.getToken(RandomPartitioner.java:36) 
     at org.apache.cassandra.client.RingCache.getRange(RingCache.java:129) 
     at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.write(ColumnFamilyRecordWriter.java:127) 
     at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.write(ColumnFamilyRecordWriter.java:62) 
     at org.apache.cassandra.hadoop.pig.CassandraStorage.putNext(Unknown Source) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97) 
     at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:498) 
     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:239) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53) 
     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621) 
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305) 
     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177) 

Any ideas why this happens?

OK, I found that this was caused by one entry in my dataset having a null key. – 2011-06-06 16:18:49
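
For anyone hitting the same trace: the NullPointerException comes from RandomPartitioner.getToken being handed a null key, so the fix is to drop such rows before they reach the writer. In a Pig script that would normally be a FILTER ... IS NOT NULL on the key field ahead of the STORE. The Java below is only a rough sketch of the same check. NullKeyGuard and its methods are hypothetical names, not part of Cassandra or Pig, and rejecting empty keys as well as null ones is my own assumption.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Cassandra or Pig.
final class NullKeyGuard {

    // A key is treated as usable only if it is non-null and has at least
    // one byte left to read. Rejecting empty keys is an assumption here.
    static boolean isUsableKey(ByteBuffer key) {
        return key != null && key.hasRemaining();
    }

    // Returns the keys that are safe to pass on to a writer,
    // preserving their original order.
    static List<ByteBuffer> usableKeys(List<ByteBuffer> keys) {
        List<ByteBuffer> kept = new ArrayList<>();
        for (ByteBuffer key : keys) {
            if (isUsableKey(key)) {
                kept.add(key);
            }
        }
        return kept;
    }
}
```

Counting how many rows get dropped this way is probably worth doing too, since a large number of null keys usually points at a problem earlier in the load.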

Answer

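Although it was not the cause of the problem in your case, it is worth noting that this error can also occur when you try to insert into a column family for which the specified partition key does not exist.

In that case, the exception is thrown the first time execution reaches the reducer class.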