2010-09-19 47 views
-1

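Hadoop + Writable interface + readFields throws an exception in the reducer

I have a simple map-reduce program whose map and reduce primitives look like this: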

map (K, V) = (Text, OutputAggregator)
reduce (Text, OutputAggregator) = (Text, Text)

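The important point is that my map function emits an object of type OutputAggregator, which is my own class implementing the Writable interface. However, my reduce fails with the exception below. More specifically, the readFields() method throws the exception. Any clue why? I am using Hadoop 0.18.3.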

10/09/19 04:04:59 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 
10/09/19 04:04:59 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 
10/09/19 04:04:59 INFO mapred.FileInputFormat: Total input paths to process : 1 
10/09/19 04:04:59 INFO mapred.FileInputFormat: Total input paths to process : 1 
10/09/19 04:04:59 INFO mapred.FileInputFormat: Total input paths to process : 1 
10/09/19 04:04:59 INFO mapred.FileInputFormat: Total input paths to process : 1 
10/09/19 04:04:59 INFO mapred.JobClient: Running job: job_local_0001 
10/09/19 04:04:59 INFO mapred.MapTask: numReduceTasks: 1 
10/09/19 04:04:59 INFO mapred.MapTask: io.sort.mb = 100 
10/09/19 04:04:59 INFO mapred.MapTask: data buffer = 79691776/99614720 
10/09/19 04:04:59 INFO mapred.MapTask: record buffer = 262144/327680 
Length = 10 
10 
10/09/19 04:04:59 INFO mapred.MapTask: Starting flush of map output 
10/09/19 04:04:59 INFO mapred.MapTask: bufstart = 0; bufend = 231; bufvoid = 99614720 
10/09/19 04:04:59 INFO mapred.MapTask: kvstart = 0; kvend = 10; length = 327680 
gl_books 
10/09/19 04:04:59 WARN mapred.LocalJobRunner: job_local_0001 
java.lang.NullPointerException 
at org.myorg.OutputAggregator.readFields(OutputAggregator.java:46) 
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67) 
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40) 
at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:751) 
at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:691) 
at org.apache.hadoop.mapred.Task$CombineValuesIterator.next(Task.java:770) 
at org.myorg.xxxParallelizer$Reduce.reduce(xxxParallelizer.java:117) 
at org.myorg.xxxParallelizer$Reduce.reduce(xxxParallelizer.java:1) 
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:904) 
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:785) 
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:698) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:228) 
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:157) 
java.io.IOException: Job failed! 
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1113) 
at org.myorg.xxxParallelizer.main(xxxParallelizer.java:145) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) 
at java.lang.reflect.Method.invoke(Unknown Source) 
at org.apache.hadoop.util.RunJar.main(RunJar.java:155) 
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54) 
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) 
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) 
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68) 
+1

Please post the code of OutputAggregator.readFields(). What is on line 46? – bajafresh4life 2010-09-20 01:18:52

Answers

2

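When posting a question about custom code, post the relevant code snippet. The contents of line 46 and a few lines before and after it would really help... :)

However, this may help: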

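A pitfall when writing your own Writable class is that Hadoop reuses the same instance of the class over and over again. Between calls to readFields you do not get a shiny new instance.

So at the start of the readFields method you must assume that the object you are in is filled with "garbage" that has to be cleared before you continue.

My advice is to implement a "clear()" method that completely wipes the current instance and resets it to the state it had right after it was created and the constructor finished. You then call that method as the very first thing in readFields, for both your key and your value classes.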
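The clear()-first pattern can be sketched roughly as follows. This is only an illustration under stated assumptions: the question does not show the fields of OutputAggregator, so `TermList`, its `terms` field and the `ReuseDemo` helper are hypothetical names, and the local `Writable` interface is a stand-in that declares the same two methods as org.apache.hadoop.io.Writable so that the sketch compiles without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Local stand-in with the same two methods as org.apache.hadoop.io.Writable.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical value class; the real OutputAggregator fields are not known.
class TermList implements Writable {
    final List<String> terms = new ArrayList<>();

    // Reset the instance to the state it had right after construction.
    void clear() {
        terms.clear();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(terms.size());
        for (String term : terms) {
            out.writeUTF(term);
        }
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        clear(); // the instance may still hold the previous record
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            terms.add(in.readUTF());
        }
    }
}

class ReuseDemo {
    static byte[] toBytes(Writable w) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            w.write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static void fromBytes(Writable w, byte[] data) {
        try {
            w.readFields(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deserializes two records into ONE reused instance, the way a framework
    // that recycles objects would, and returns what the instance holds afterwards.
    static List<String> readTwoRecordsIntoOneInstance() {
        TermList first = new TermList();
        first.terms.add("a");
        first.terms.add("b");
        TermList second = new TermList();
        second.terms.add("c");

        TermList reused = new TermList();
        fromBytes(reused, toBytes(first));
        fromBytes(reused, toBytes(second));
        return new ArrayList<>(reused.terms);
    }

    public static void main(String[] args) {
        System.out.println(readTwoRecordsIntoOneInstance());
    }
}
```

Without the clear() call at the top of readFields, the reused instance would still carry the entries of the first record when the second one is read.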

HTH

0

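In addition to Niels Basjes' answer: simply initialize the member variables inside the empty constructor (which you must provide, otherwise Hadoop cannot instantiate your object), for example: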

public OutputAggregator() { 
    this.member = new IntWritable(); 
    ... 
} 

This assumes this.member is of type IntWritable.
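To see why the empty constructor matters, here is a small sketch of what happens when an object is created reflectively through its no-argument constructor and readFields is called on it straight away. The names are assumptions made for illustration: `IntBox` is a local stand-in for IntWritable, and `CountHolder` and `ConstructorDemo` are hypothetical. In real Hadoop code the class would implement org.apache.hadoop.io.Writable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Local stand-in for org.apache.hadoop.io.IntWritable.
class IntBox {
    int value;

    void write(DataOutput out) throws IOException {
        out.writeInt(value);
    }

    void readFields(DataInput in) throws IOException {
        value = in.readInt();
    }
}

// Hypothetical value class that delegates serialization to its member.
class CountHolder {
    IntBox member;

    // No-argument constructor: every member is initialized here, because
    // readFields may be the first method called on a fresh instance.
    CountHolder() {
        this.member = new IntBox();
    }

    void write(DataOutput out) throws IOException {
        member.write(out);
    }

    void readFields(DataInput in) throws IOException {
        // Would throw NullPointerException if member had been left null.
        member.readFields(in);
    }
}

class ConstructorDemo {
    // Serializes a value, then rebuilds the object the way a framework might:
    // no-argument constructor via reflection, followed by readFields.
    static int roundTrip(int value) {
        try {
            CountHolder original = new CountHolder();
            original.member.value = value;
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));

            CountHolder copy = CountHolder.class.getDeclaredConstructor().newInstance();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray())));
            return copy.member.value;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

If the constructor left `member` unset, the delegating call in readFields would fail with a NullPointerException, which is the same kind of failure as the one in the stack trace of the question (whether it has the same cause cannot be told without the code around line 46).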