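Passing different parameters to each mapper

I have a job that uses multiple mappers and one reducer. The mappers are almost identical; they differ only in the value of a String they use to produce their results.

Currently I have several classes, one for each value of the String I mentioned. It feels like there should be a better way that doesn't require so much code duplication. Is there a way to pass these String values as parameters to the mappers?

My job looks like this: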

Input File A ----> Mapper A using
                   String "Foo"   ----+
                                      |---> Reducer
                   Mapper B using ----+
Input File B ----> String "Bar"

I want to turn it into something like this:

Input File A ----> GenericMapper parameterized
                   with String "Foo"           ----+
                                                   |---> Reducer
                   GenericMapper parameterized ----+
Input File B ----> with String "Bar"

Edit: Here are simplified versions of the two mapper classes I currently have. They accurately represent my actual situation.

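import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

class MapperA extends Mapper<Text, Text, Text, Text> {
    @Override
    public void map(Text key, Text value, Context context) throws IOException, InterruptedException {
        // Append the mapper-specific string to every value.
        context.write(key, new Text(value.toString() + "Foo"));
    }
}

class MapperB extends Mapper<Text, Text, Text, Text> {
    @Override
    public void map(Text key, Text value, Context context) throws IOException, InterruptedException {
        // Identical to MapperA except for the appended string.
        context.write(key, new Text(value.toString() + "Bar"));
    }
}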

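Edit: Which string each mapper should use depends only on where the data in that file comes from. There is no way to distinguish the files except by file name.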


2015-01-20 Paul Manta

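Some actual Mapper code would be nice, I think. Or at least your current Mapper structure. – maffelbaffel 2015-01-20 22:35:07

@maffelbaffel I added some code. – 2015-01-20 22:43:02

Apart from the appended string, what is the difference between mappers A and B? Are you using multiple inputs? How many files do you have? From the driver code, you can pass the "String" associated with each file name, and inside map() you can get the name of the file currently being processed and append the associated string. I believe I'm not getting the question completely. What am I missing? – 2015-01-21 03:05:42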

Assuming you use a file-based input format, you can get the current input file name in the mapper like this:

if (context.getInputSplit() instanceof FileSplit) { 
    FileSplit fileSplit = (FileSplit) context.getInputSplit(); 
    Path inputPath = fileSplit.getPath(); 
    String fileId = ... //parse inputPath into a file id 
    ... 
}

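You can parse inputPath however you like, for example using only the file name or only the partition ID, to generate a unique ID identifying the input file. For example:

/some/path/A -> A
/some/path/B -> B

In the driver, configure a property for each possible file "ID":

conf.set("my.property.A", "foo");
conf.set("my.property.B", "bar");

In the mapper, compute the file "ID" as described above and get the value (here conf is the job configuration, available in the mapper via context.getConfiguration()):

conf.get("my.property." + fileId);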
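
The lookup described in this answer can be sketched in plain Java. This is only an illustration under some assumptions: the file "ID" is taken to be the last path component, a java.util.Map stands in for the Hadoop Configuration, and the names SuffixLookup, fileIdOf and suffixFor are hypothetical helpers, not part of Hadoop.

```java
import java.util.HashMap;
import java.util.Map;

public class SuffixLookup {
    // Same key prefix as the "my.property.A" / "my.property.B" keys in the answer.
    static final String PREFIX = "my.property.";

    // Assumption: the file "ID" is the last path component, e.g. /some/path/A -> A.
    static String fileIdOf(String inputPath) {
        int slash = inputPath.lastIndexOf('/');
        return inputPath.substring(slash + 1);
    }

    // Looks up the string configured for the file the path points at.
    // The Map stands in for the job Configuration; returns null if no entry exists.
    static String suffixFor(Map<String, String> conf, String inputPath) {
        return conf.get(PREFIX + fileIdOf(inputPath));
    }

    public static void main(String[] args) {
        // Driver side: one property per possible file ID.
        Map<String, String> conf = new HashMap<>();
        conf.put(PREFIX + "A", "foo");
        conf.put(PREFIX + "B", "bar");

        // Mapper side: derive the ID from the input path and fetch the value.
        System.out.println(suffixFor(conf, "/some/path/A")); // prints "foo"
        System.out.println(suffixFor(conf, "/some/path/B")); // prints "bar"
    }
}
```

In an actual Mapper, the path would come from the FileSplit returned by context.getInputSplit() and the value from context.getConfiguration().get(...). Doing the lookup once in setup(Context) and storing the result in a field would probably be preferable to repeating it for every record in map().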


2015-01-21 04:47:05 yurgis

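I believe this is the same as what I commented :) – 2015-01-22 17:18:29

Maybe you could use an if statement in your mapper to choose the string. What determines whether one string or the other is used?

Or maybe use an abstract mapper class.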


2015-01-20 22:47:19 igonzalez

Maybe like this?

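import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

abstract class AbstractMapper extends Mapper<Text, Text, Text, Text> {
    protected String text;

    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        context.write(key, new Text(value.toString() + text));
    }
}

class MapperImpl1 extends AbstractMapper {
    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        text = "foo"; // set the suffix, then delegate to the shared logic
        super.map(key, value, context);
    }
}

class MapperImpl2 extends AbstractMapper {
    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        text = "bar"; // set the suffix, then delegate to the shared logic
        super.map(key, value, context);
    }
}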


2015-01-20 23:01:51 maffelbaffel

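If I'm not mistaken, this approach won't help reduce the code to a single mapper class, since it still requires a custom implementation for each mapper type. – 2015-01-21 07:48:09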
