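Can I share a HashMap with the same values across different mappers, the way a static variable would be shared?
I am running a job on a Hadoop cluster, and I am trying to share variable values among all the mappers, which run on different datanodes. How can I share a HashMap between mappers in Hadoop?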
INPUT ==> file paths, each mapped to a FileID
InputFormat => KeyValueTextInputFormat
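For reference, KeyValueTextInputFormat splits each input line at the first separator (a tab by default): the text before it becomes the key and the rest becomes the value. Below is a minimal plain-Java sketch of that split, with no Hadoop dependency. The class name `KeyValueSplitSketch`, the helper `split`, and the sample path are made up for illustration, and which field holds the file path depends on how the input lines are actually laid out.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

public class KeyValueSplitSketch {
    // Mimics the default behaviour: split at the FIRST tab only.
    // If the line has no tab, the whole line is the key and the value is empty.
    static Map.Entry<String, String> split(String line) {
        int pos = line.indexOf('\t');
        if (pos < 0) {
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, pos), line.substring(pos + 1));
    }

    public static void main(String[] args) {
        // Hypothetical input line of the form "<file path>\t<FileID>".
        Map.Entry<String, String> kv = split("/data/docs/a.txt\t17");
        System.out.println("key=" + kv.getKey() + " value=" + kv.getValue());
    }
}
```

With a line laid out like the hypothetical one above, the path arrives in the key and the ID in the value, so it is worth checking which of the two the mapper opens as a file.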
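import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class Demo {
    // Next ID to hand out. Static state lives in one JVM only.
    static int termID = 0;

    public static class DemoMapper extends Mapper<Object, Text, IntWritable, Text> {
        // Term -> ID. This is the map I want every mapper to see.
        static HashMap<String, Integer> termMapping = new HashMap<String, Integer>();

        @Override
        protected void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            // Term -> frequency in the current file (declaration assumed; it was missing from the snippet).
            HashMap<String, Integer> termMap = new HashMap<String, Integer>();
            // value is expected to hold the path of the file to read.
            try (BufferedReader reader = new BufferedReader(new FileReader(value.toString()))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    StringTokenizer tokenizer = new StringTokenizer(line, " ");
                    while (tokenizer.hasMoreTokens()) {
                        String currentTerm = tokenizer.nextToken();
                        if (!termMap.containsKey(currentTerm)) {
                            // First time this term appears in the file: give it an ID if it has none.
                            if (!termMapping.containsKey(currentTerm)) {
                                termMapping.put(currentTerm, termID++);
                            }
                            termMap.put(currentTerm, 1);
                        } else {
                            termMap.put(currentTerm, termMap.get(currentTerm) + 1);
                        }
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
    }
}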
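A note on why the static `termMapping` and `termID` above do not behave as shared state: static fields exist once per JVM, and map tasks on different nodes run in different JVMs, so each task fills its own copy. The plain-Java sketch below imitates that with two independent objects standing in for two task JVMs, then shows one possible way around it, which is to number the terms in a single place after the per-task sets are merged (roughly what a single reducer or a preliminary job could do). The names `StaticPerJvmSketch`, `TaskJvm`, and `assignGlobalIds` are hypothetical and not part of the original code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class StaticPerJvmSketch {
    // Stands in for one map task's JVM: it owns a private copy of the "static" state.
    static class TaskJvm {
        int termID = 0;
        final Map<String, Integer> termMapping = new HashMap<>();

        void map(List<String> terms) {
            for (String term : terms) {
                if (!termMapping.containsKey(term)) {
                    termMapping.put(term, termID++);
                }
            }
        }
    }

    // One possible fix (an assumption, not the original design):
    // merge the distinct terms from all tasks and number them in sorted order.
    static Map<String, Integer> assignGlobalIds(List<Map<String, Integer>> perTask) {
        TreeSet<String> allTerms = new TreeSet<>();
        for (Map<String, Integer> mapping : perTask) {
            allTerms.addAll(mapping.keySet());
        }
        Map<String, Integer> global = new HashMap<>();
        int next = 0;
        for (String term : allTerms) {
            global.put(term, next++);
        }
        return global;
    }

    public static void main(String[] args) {
        TaskJvm a = new TaskJvm();
        TaskJvm b = new TaskJvm();
        a.map(Arrays.asList("hadoop", "mapper"));
        b.map(Arrays.asList("mapper", "hadoop"));

        // The same term ends up with a different ID in each task.
        System.out.println(a.termMapping.get("hadoop") + " vs " + b.termMapping.get("hadoop")); // prints "0 vs 1"

        Map<String, Integer> global = assignGlobalIds(Arrays.asList(a.termMapping, b.termMapping));
        System.out.println("hadoop=" + global.get("hadoop") + " mapper=" + global.get("mapper"));
    }
}
```

Numbering in sorted order is only one choice; the point is that the IDs are assigned once, by one process, instead of independently inside each task.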
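I know you can broadcast a map across tasks in Spark. I have never tried it with MapReduce.
Thx, but I don't want to use Spark.
Okay, then show the MapReduce code where you tried to add the Map. What error did you get?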
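On the broadcast idea raised in the comments: the closest MapReduce counterpart is probably read-only side data. If the term-to-ID mapping can be produced before the job starts, it can be written to a file, shipped to every task (Hadoop's distributed cache, via `Job.addCacheFile`, is the usual route), and loaded once per task in `Mapper.setup`. This only covers data that is fixed before the job runs; mappers still cannot see each other's updates. The sketch below shows just the loading step in plain Java, so it runs without Hadoop. The class `TermMappingLoaderSketch`, the method `load`, and the `term<TAB>id` file layout are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class TermMappingLoaderSketch {
    // Loads "term<TAB>id" lines into a map. A mapper could call this once in setup()
    // on a file that was distributed to the task beforehand.
    static Map<String, Integer> load(Path file) throws IOException {
        Map<String, Integer> mapping = new HashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab < 0) {
                    continue; // skip lines that do not match the assumed layout
                }
                String term = line.substring(0, tab);
                int id = Integer.parseInt(line.substring(tab + 1).trim());
                mapping.put(term, id);
            }
        }
        return mapping;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the side file; in a real job it would be prepared before submission.
        Path file = Files.createTempFile("term-mapping", ".tsv");
        Files.write(file, Arrays.asList("hadoop\t0", "mapper\t1"), StandardCharsets.UTF_8);

        Map<String, Integer> mapping = load(file);
        System.out.println(mapping.get("mapper")); // prints 1

        Files.delete(file);
    }
}
```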