2017-09-25 69 views

How do I convert Object to String in Spark?

I am trying to follow the Spark example at https://www.javaworld.com/article/2972863/big-data/open-source-java-projects-apache-spark.html

But I get a compile error on this line: input.flatMap(s -> Arrays.asList(s.split(" ")));

Type mismatch: cannot convert from JavaRDD<Object> to JavaRDD<String>

Code:

public class WordCountTask 
{ 



    public static void wordCountJava8(String filename) 
    { 
     // Define a configuration to use to interact with Spark 
     SparkConf conf = new SparkConf().setMaster("local").setAppName("Work Count App"); 

     // Create a Java version of the Spark Context from the configuration 
     JavaSparkContext sc = new JavaSparkContext(conf); 

     // Load the input data, which is a text file read from the command line 
     JavaRDD<String> input = sc.textFile(filename); 

     // Java 8 with lambdas: split the input string into words 
     JavaRDD<String> words = input.flatMap(s -> Arrays.asList(s.split(" "))); 

     // Java 8 with lambdas: transform the collection of words into pairs (word and 1) and then count them 
     JavaPairRDD<String, Integer> counts = words.mapToPair(t -> new Tuple2(t, 1)).reduceByKey((x, y) -> (int)x + (int)y); 

     // Save the word count back out to a text file, causing evaluation. 
     counts.saveAsTextFile("output"); 
    } 

    public static void main(String[] args) 
    { 
     if(args.length == 0) 
     { 
      System.out.println("Usage: WordCount <file>"); 
      System.exit(0); 
     } 

     wordCountJava8(args[ 0 ]); 
    } 
} 

Comment: You could try removing Arrays.asList from the code: JavaRDD<String> words = input.flatMap(s -> s.split(" "));

Answers


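The function passed to flatMap must return an Iterator over the collection, not the collection itself. In Spark 2.x the interface is declared as:

FlatMapFunction<T, R>: 
     Iterator<R> call(T t) 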

Replace:

JavaRDD<String> words = input.flatMap(s -> Arrays.asList(s.split(" "))); 

with:

JavaRDD<String> words = input.flatMap(s -> Arrays.asList(s.split(" ")).iterator()); 
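To see why the extra .iterator() call matters, here is a minimal sketch that runs without Spark. The FlatMapFunction interface and the flatMap helper below are simplified, hypothetical stand-ins declared locally (the real Spark interface also declares throws Exception); they only mirror the shape of the Spark 2.x signature quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class FlatMapIteratorSketch {

    // Simplified stand-in for Spark 2.x's FlatMapFunction<T, R> (assumption:
    // declared here only so the sketch compiles without Spark on the classpath).
    @FunctionalInterface
    interface FlatMapFunction<T, R> {
        Iterator<R> call(T t);
    }

    // Hypothetical helper that mimics flatMap on a plain list: apply the
    // function to every element and concatenate whatever the iterators yield.
    static <T, R> List<R> flatMap(List<T> input, FlatMapFunction<T, R> f) {
        List<R> out = new ArrayList<>();
        for (T element : input) {
            Iterator<R> it = f.call(element);
            while (it.hasNext()) {
                out.add(it.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or", "not to be");

        // Does not compile: the lambda returns List<String>, not Iterator<String>.
        // List<String> bad = flatMap(lines, s -> Arrays.asList(s.split(" ")));

        // Compiles: .iterator() matches the declared return type.
        List<String> words = flatMap(lines, s -> Arrays.asList(s.split(" ")).iterator());
        System.out.println(words); // prints [to, be, or, not, to, be]
    }
}
```

The same reasoning applies to the real JavaRDD.flatMap call: once the lambda returns an Iterator<String>, the result can be assigned to JavaRDD<String>.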

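You need to return an Iterable (this form matches the Spark 1.x signature, where FlatMapFunction.call returns Iterable):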

JavaRDD<String> words = input.flatMap(
      new FlatMapFunction<String, String>() { public Iterable<String> call(String x) { 
       return Arrays.asList(x.split(" ")); 
      }}); 

Or, on Spark 2.x:

JavaRDD<String> words = input.flatMap(s -> Arrays.asList(s.split(" ")).iterator()); 

Comment: The first approach will not compile; the expected return type is Iterator.
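If the anonymous-class style is preferred, the first approach can be adapted by changing the return type to Iterator<String>. The sketch below shows that shape without depending on Spark; the nested FlatMapFunction interface is a simplified, hypothetical stand-in for the Spark 2.x one, and SPLIT_WORDS is a name invented for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;

public class AnonymousFlatMapSketch {

    // Simplified stand-in for Spark 2.x's FlatMapFunction<T, R> (assumption:
    // declared locally so the example compiles without Spark).
    interface FlatMapFunction<T, R> {
        Iterator<R> call(T t);
    }

    // Anonymous-class form of the word splitter, returning Iterator<String>
    // instead of Iterable<String>.
    static final FlatMapFunction<String, String> SPLIT_WORDS =
            new FlatMapFunction<String, String>() {
                @Override
                public Iterator<String> call(String x) {
                    return Arrays.asList(x.split(" ")).iterator();
                }
            };

    public static void main(String[] args) {
        // Walk the iterator and print one word per line.
        Iterator<String> it = SPLIT_WORDS.call("hello big data");
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

With the real Spark 2.x interface, an anonymous class of this shape could be passed to input.flatMap(...) in place of the lambda.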