Waiting for shuffle memory pool to be free: Spark with Java
I want to join two lists (noHeaderIndexed and noFirstIndexed) by key like this:
final Broadcast<JavaPairRDD<Long, Tuple2<String, String>>> c = ctx.broadcast(noHeaderIndexed);
JavaPairRDD<Tuple2<Tuple2<String, String>, Long>, Tuple2<Tuple2<String, String>, Long>> rs =
        noFirstIndexed.mapToPair(new PairFunction<Tuple2<Long, Tuple2<String, String>>, Tuple2<Tuple2<String, String>, Long>, Tuple2<Tuple2<String, String>, Long>>() {
    @Override
    public Tuple2<Tuple2<Tuple2<String, String>, Long>, Tuple2<Tuple2<String, String>, Long>> call(Tuple2<Long, Tuple2<String, String>> longTuple2Tuple2) throws Exception {
        String s1 = "";
        if (c.value().lookup(longTuple2Tuple2._1).get(0)._1 != null)
            s1 = c.value().lookup(longTuple2Tuple2._1).get(0)._1;
        String s2 = "";
        if (c.value().lookup(longTuple2Tuple2._1).get(0)._2 != null)
            s2 = c.value().lookup(longTuple2Tuple2._1).get(0)._2;
        return new Tuple2<Tuple2<Tuple2<String, String>, Long>, Tuple2<Tuple2<String, String>, Long>>(
                new Tuple2<Tuple2<String, String>, Long>(
                        new Tuple2<String, String>(longTuple2Tuple2._2._1, longTuple2Tuple2._2._2),
                        longTuple2Tuple2._1),
                new Tuple2<Tuple2<String, String>, Long>(
                        new Tuple2<String, String>(s1, s2),
                        longTuple2Tuple2._1));
    }
});
//writeResult(rs, "rs.txt");
rs.coalesce(1, true).saveAsTextFile(path + "rs");
But when I try to run it, I get this message:
INFO ShuffleMemoryManager: Thread 61 waiting for at least 1/2N of shuffle memory pool to be free
and the job never finishes. Could you explain what is going on and how I can fix it?
Thanks in advance.
On this command
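Thanks for your answer, but I have the same problem with rs.coalesce(10, true).saveAsTextFile(path + "rs"); – 2014-11-06 09:11:42
Try rs.saveAsTextFile(path + "rs"); – user1989252 2014-11-06 16:11:25
I now get several files as the result, but the same problem still appears every time. – 2014-11-12 10:15:52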
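A possible cause, offered as a guess from the code in the question rather than a confirmed diagnosis: c is a broadcast of a JavaPairRDD, and lookup is then called on that RDD inside mapToPair. Spark does not support using one RDD inside a transformation of another RDD, so changing the coalesce or saveAsTextFile call would not be expected to help. Two common workarounds are noFirstIndexed.leftOuterJoin(noHeaderIndexed), or, assuming noHeaderIndexed is small enough to fit in memory, calling collectAsMap() on it and broadcasting the resulting Map instead of the RDD. The sketch below shows only the per-record lookup that such a map would replace, in plain Java without Spark. KeyedJoinSketch, toLookup and joinByKey are hypothetical names made up for illustration, and the output format is arbitrary.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch (no Spark): join two keyed lists through an in-memory map.
// All names here are hypothetical and only illustrate the lookup logic.
class KeyedJoinSketch {

    // Build the lookup table once. In Spark, this map would be the
    // collected (small) side of the join, not an RDD.
    static Map<Long, String[]> toLookup(List<Map.Entry<Long, String[]>> rows) {
        Map<Long, String[]> lookup = new HashMap<>();
        for (Map.Entry<Long, String[]> row : rows) {
            lookup.put(row.getKey(), row.getValue());
        }
        return lookup;
    }

    // For each left-hand row, fetch the matching pair by key.
    // A missing key or a null field falls back to the empty string,
    // mirroring the s1/s2 defaults in the question.
    static List<String> joinByKey(List<Map.Entry<Long, String[]>> left,
                                  Map<Long, String[]> lookup) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, String[]> row : left) {
            String[] match = lookup.get(row.getKey());
            String s1 = (match != null && match[0] != null) ? match[0] : "";
            String s2 = (match != null && match[1] != null) ? match[1] : "";
            String[] own = row.getValue();
            out.add(row.getKey() + ": (" + own[0] + "," + own[1] + ") <-> (" + s1 + "," + s2 + ")");
        }
        return out;
    }
}
```

The map is queried once per record instead of three times, and a key that is absent from the map yields empty strings rather than an exception. Whether empty strings are the right default for unmatched keys depends on the data, so treat that as an assumption as well.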