2016-01-24

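Hadoop: MultipleInputs not working properly

I can't get MultipleInputs to read two separate files for processing. The output file always comes out empty. I have tried to learn from and debug against sample code found online, but I still can't get it to work.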

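import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CommonWords {

    // Tags every token of the first input with the identifier "a".
    public static class Mapper1 extends Mapper<Object, Text, Text, Text> {
        private Text word = new Text();
        private final static Text identifier = new Text("a");

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());

            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, identifier);
            }
        }
    }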

    // Named "reduce" to match the reduce.class reference in main().
    public static class reduce extends
            Reducer<Text, Text, Text, IntWritable> {

        private IntWritable commoncount = new IntWritable();

        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            int count1 = 0;
            int count2 = 0;

            for (Text val : values) {
                if (val.equals("a"))
                    count1++;
                else if (val.equals("b"))
                    count2++;
            }
            if (count1 != 0 && count2 != 0)
                context.write(key, new IntWritable(count1 <= count2 ? count1 : count2));
        }
    }

    public static void main(String[] args) throws IOException,
            InterruptedException, ClassNotFoundException {

        Configuration conf = new Configuration();

        Job job1 = new Job(conf, "Testing");
        job1.setJarByClass(CommonWords.class);

        job1.setMapOutputKeyClass(Text.class);
        job1.setMapOutputValueClass(Text.class);
        job1.setOutputKeyClass(Text.class);
        job1.setOutputValueClass(IntWritable.class);
        job1.setReducerClass(reduce.class);
        job1.setMapperClass(Mapper1.class);
        job1.setMapperClass(Mapper2.class);
        MultipleInputs.addInputPath(job1, new Path(args[0]), KeyValueTextInputFormat.class, Mapper1.class);
        MultipleInputs.addInputPath(job1, new Path(args[1]), KeyValueTextInputFormat.class, Mapper2.class);

        FileOutputFormat.setOutputPath(job1, new Path(args[2]));
        job1.waitForCompletion(true);
    }
}
Where is the "Mapper2" class? – Thanga

Hi, the Mapper2 class is exactly the same as Mapper1, except that the text is set to "b". – gatsby

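Answer

Remove these lines from your code and run it again. When you use MultipleInputs, you should not set a mapper class on the job, because each addInputPath call already binds a mapper to its input path: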

job1.setMapperClass(Mapper1.class); 
job1.setMapperClass(Mapper2.class);
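
If the output is still empty after removing those lines, it may help to check the tag-and-count logic outside Hadoop. The following is only a minimal plain-Java sketch of what the two mappers and the reducer are meant to compute, assuming whitespace-separated tokens and that Mapper2 tags its words with "b" as described in the comments. `CommonWordsModel`, `mapInto`, and `commonCounts` are hypothetical names used for illustration and are not part of Hadoop. Note that this sketch compares tags as `String`; in the posted reducer the values are `Text`, and `Text.equals` returns false when it is given a `String`, so the comparison there would likely need to be `val.toString().equals("a")`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical Hadoop-free model of the job; names are illustrative only.
public class CommonWordsModel {

    // Map phase: record (word, tag) for every whitespace-separated token.
    public static void mapInto(String text, String tag,
                               Map<String, List<String>> grouped) {
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            grouped.computeIfAbsent(itr.nextToken(), k -> new ArrayList<>()).add(tag);
        }
    }

    // Reduce phase: for each word seen under both tags, keep the smaller count.
    public static Map<String, Integer> commonCounts(String first, String second) {
        Map<String, List<String>> grouped = new TreeMap<>();
        mapInto(first, "a", grouped);   // plays the role of Mapper1
        mapInto(second, "b", grouped);  // plays the role of Mapper2

        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            int count1 = 0;
            int count2 = 0;
            for (String tag : e.getValue()) {
                if (tag.equals("a")) {
                    count1++;
                } else if (tag.equals("b")) {
                    count2++;
                }
            }
            if (count1 != 0 && count2 != 0) {
                result.put(e.getKey(), Math.min(count1, count2));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {x=2, y=1}
        System.out.println(commonCounts("x y x z", "x x x y w"));
    }
}
```

Running the same two small inputs through the real job and comparing against this model can show whether the problem lies in the driver setup or in the reducer's comparison.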