Error when using the output of one MapReduce job as the input of another
I have two Map/Reduce class pairs, MyMapper1/MyReducer1 and MyMapper2/MyReducer2, and I want to use the output of MyReducer1 as the input of MyMapper2 by setting the input path of job2 to the output path of job1.
The types are as follows:
public class MyMapper1 extends Mapper<LongWritable, Text, IntWritable, IntArrayWritable>
public class MyReducer1 extends Reducer<IntWritable, IntArrayWritable, IntWritable, IntArrayWritable>
public class MyMapper2 extends Mapper<IntWritable, IntArrayWritable, IntWritable, IntArrayWritable>
public class MyReducer2 extends Reducer<IntWritable, IntArrayWritable, IntWritable, IntWritable>
public class IntArrayWritable extends ArrayWritable {
    public IntArrayWritable() {
        super(IntWritable.class);
    }
}
The code that sets the input/output paths looks like this:
Path temppath = new Path("temp-dir-" + temp_time);
FileOutputFormat.setOutputPath(job1, temppath);
...........
FileInputFormat.addInputPath(job2, temppath);
The code that sets the input/output formats is as follows:
job1.setOutputFormatClass(TextOutputFormat.class);
..........
job2.setInputFormatClass(KeyValueTextInputFormat.class);
But when I run job2, I always get this exception:
11/04/16 12:34:09 WARN mapred.LocalJobRunner: job_local_0002
java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.IntWritable
at ligon.MyMapper2.map(MyMapper2.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:646)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
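I have tried changing the InputFormat and OutputFormat, but without success: similar (though not identical) exceptions occur in job2.
My complete code package is: http://dl.dropbox.com/u/7361939/HW2_Q1.zip
Thank you very much!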
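The ClassCastException above is consistent with how the two formats pair up: TextOutputFormat writes every key and value as a line of text, and KeyValueTextInputFormat hands both halves of each line to the mapper as Text, so a mapper declared with IntWritable keys fails on the first record. The sketch below is a plain-Java stand-in with no Hadoop classes (RoundTripSketch and its methods are hypothetical names). It contrasts a text hand-off, where the key comes back as a string, with a binary hand-off through DataOutputStream/DataInputStream, the same kind of stream interfaces that Writable.write and Writable.readFields work against, where the ints come back as ints.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Stand-in for the hand-off between two jobs. No Hadoop classes are used;
// every name in this class is hypothetical and chosen for illustration.
class RoundTripSketch {

    // Text hand-off: the record is written as "key<TAB>value" and read back
    // by splitting at the first tab, so both parts return as strings.
    static Object[] textRoundTrip(int key, int[] values) {
        String line = key + "\t" + Arrays.toString(values);
        String[] parts = line.split("\t", 2);
        return new Object[] { parts[0], parts[1] };
    }

    // Binary hand-off, write side: key, element count, then each element.
    static byte[] writeBinary(int key, int[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(key);
            out.writeInt(values.length);
            for (int v : values) {
                out.writeInt(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Binary hand-off, read side: returns the key followed by the elements.
    static int[] readBinary(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int key = in.readInt();
            int count = in.readInt();
            int[] record = new int[count + 1];
            record[0] = key;
            for (int i = 0; i < count; i++) {
                record[i + 1] = in.readInt();
            }
            return record;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Object[] viaText = textRoundTrip(7, new int[] {1, 2, 3});
        // The key is no longer an int after the text hand-off.
        System.out.println(viaText[0].getClass().getSimpleName());

        int[] viaBinary = readBinary(writeBinary(7, new int[] {1, 2, 3}));
        // The binary hand-off restores the key and every element as ints.
        System.out.println(Arrays.toString(viaBinary));
    }
}
```

In Hadoop itself, the binary hand-off is usually done by setting SequenceFileOutputFormat on job1 and SequenceFileInputFormat on job2, so job2 receives the same Writable types that job1 emitted. Whether that is enough for this particular job is an assumption, since the rest of the driver is not shown here.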
Thanks. The problem now is that the first reducer writes each ArrayWritable without any element values (only an object reference of the form ClassName@hash appears in the output). How can I get the second mapper to accept this and convert the string back into an object? b21 – 2011-04-17 14:31:07
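If the intermediate files have to stay as text, one possible approach is to give the value type a toString() that prints its elements and a matching parser for the reading side. The sketch below shows the idea on a plain-Java stand-in; IntArrayText, parse and toArray are hypothetical names, not Hadoop APIs. Carrying this over to IntArrayWritable (overriding toString() there and having the second mapper take Text and parse it) is an assumption about how the job could be restructured.

```java
import java.util.Arrays;

// Hypothetical stand-in for an array value type; this is not a Hadoop class.
class IntArrayText {
    private final int[] values;

    IntArrayText(int... values) {
        this.values = values.clone();
    }

    // Without this override, string conversion falls back to
    // Object.toString(), which prints ClassName@hash and no elements.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]);
        }
        return sb.toString();
    }

    // Inverse of toString(): rebuilds the array from text such as "1,2,3".
    static IntArrayText parse(String text) {
        if (text.isEmpty()) {
            return new IntArrayText();
        }
        String[] parts = text.split(",");
        int[] parsed = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            parsed[i] = Integer.parseInt(parts[i].trim());
        }
        return new IntArrayText(parsed);
    }

    int[] toArray() {
        return values.clone();
    }

    public static void main(String[] args) {
        IntArrayText original = new IntArrayText(4, 8, 15);
        String written = original.toString();          // what a text writer would emit
        IntArrayText restored = IntArrayText.parse(written);
        System.out.println(written);
        System.out.println(Arrays.equals(original.toArray(), restored.toArray()));
    }
}
```

The comma separator is an arbitrary choice for the sketch; any delimiter works as long as it cannot collide with the tab that separates key and value on each output line.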
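I have the same problem and get the same error. I want to use the output of one Hadoop job as the input of a second Hadoop job. The output of the first job has MapWritable as its value type. The fix for the second job seems to be job.setInputFormatClass(), but which argument should I pass? – Yeameen 2012-04-20 07:21:51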