2017-08-10

Another layman's question on Scala: "but class RDD is invariant in type T"

Two RDDs that look like the same type but are not. Specifically:

val rdd0 = sc.parallelize(List("a", "b", "c", "d", "e")) 
val rdd1 = rdd0.map(x => (x, 110 - x.toCharArray()(0).toByte)) 
val rdd2 = sc.parallelize(List(("c", 2), ("d, 2)", ("e", 2), ("f", 2)))) 
//Seemingly the same type but not, how practically to get them to be UNIONed? 
val rddunion = rdd1.union(rdd2).collect() 

This produces:

<console>:182: error: type mismatch; 
found : org.apache.spark.rdd.RDD[Product with Serializable] 
required: org.apache.spark.rdd.RDD[(String, Int)] 
Note: Product with Serializable >: (String, Int), but class RDD is invariant in type T. 
You may wish to define T as -T instead. (SLS 4.5) 
    val rddunion = rdd1.union(rdd2).collect() 
          ^

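How can a newcomer get this to work? I can now see why people are a little hesitant about Scala. I have read some of the documentation, but it is not entirely clear. How do I make this RDD union work?

Many thanks.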

Thanks, I really like markdown! – thebluephantom

Answer

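You put a quotation mark in the wrong place in ("d, 2)": the closing quote comes after 2) instead of right after d.

So instead of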

val rdd2 = sc.parallelize(List(("c", 2), ("d, 2)", ("e", 2), ("f", 2)))) 

the correct version is

val rdd2 = sc.parallelize(List(("c", 2), ("d", 2), ("e", 2), ("f", 2)))
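To see why the typo surfaces as a type error instead of a syntax error, here is a minimal sketch that uses a plain Scala List in place of an RDD, so it runs without Spark. The object name TupleTypoDemo and the value names broken and fixed are made up for illustration. The misplaced quote makes the last three items parse as one 3-tuple, so the list holds a Tuple2 and a Tuple3, and the compiler infers their common supertype (shown as Product with Serializable in the error message above) as the element type.

```scala
object TupleTypoDemo {
  // With the misplaced quote, ("d, 2)", ("e", 2), ("f", 2)) is a single Tuple3
  // of type (String, (String, Int), (String, Int)), so the list has only
  // two elements and no longer has element type (String, Int).
  val broken = List(("c", 2), ("d, 2)", ("e", 2), ("f", 2)))

  // Declaring the expected element type makes the compiler reject the typo
  // at the definition site rather than later at the union call.
  val fixed: List[(String, Int)] = List(("c", 2), ("d", 2), ("e", 2), ("f", 2))

  def main(args: Array[String]): Unit = {
    println(broken.size) // number of elements in the mistyped list
    println(fixed.size)  // number of elements in the corrected list
  }
}
```

The same habit should carry over to the Spark code: annotating the definition as val rdd2: org.apache.spark.rdd.RDD[(String, Int)] = ... would likely have pointed the compiler error at the line with the typo instead of at the union.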