
I am trying to create an RDD of case class objects, but Spark cannot find the RDD type when I create one. For example, this is from the Spark SQL documentation:

// sqlContext from the previous example is used in this example. 
// createSchemaRDD is used to implicitly convert an RDD to a SchemaRDD. 
import sqlContext.createSchemaRDD 

val people: RDD[Person] = ... // An RDD of case class objects, from the previous example. 

// The RDD is implicitly converted to a SchemaRDD by createSchemaRDD, allowing it to be stored using Parquet. 
people.saveAsParquetFile("people.parquet") 
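
For reference, the same Spark 1.x API can read the file back in as a SchemaRDD; this is a minimal sketch, and the parquetPeople variable and table name are only illustrative:

// Read the saved file back in; the result is a SchemaRDD. 
val parquetPeople = sqlContext.parquetFile("people.parquet") 

// Register it as a table so it can be queried with SQL. 
parquetPeople.registerAsTable("parquetPeople") 
val adults = sqlContext.sql("SELECT name FROM parquetPeople WHERE age >= 18") 
adults.collect().foreach(println) 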

Then I tried:

case class Person(name: String, age: Int) 

    // Create an RDD of Person objects and register it as a table. 
    val people: RDD[Person] = sc.textFile("/user/root/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt)) 
    people.registerAsTable("people") 

Having completed the earlier part of the example, I get the following error:

<console>:28: error: not found: type RDD 
     val people: RDD[Person] =sc.textFile("/user/root/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt)) 

Any ideas about what is going wrong here? Thanks in advance!

Answer


The problem here is the explicit RDD[Person] type annotation. It looks like RDD is not imported by default in spark-shell, which is why Scala complains that it cannot find the type RDD. Try running import org.apache.spark.rdd.RDD first.
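
Putting it together, here is a minimal sketch of the corrected spark-shell session (assuming the same Spark 1.x API as above, where createSchemaRDD and registerAsTable exist):

import org.apache.spark.rdd.RDD    // brings the RDD type into scope for the annotation 
import sqlContext.createSchemaRDD  // implicit RDD -> SchemaRDD conversion, as in the docs example 

case class Person(name: String, age: Int) 

// The explicit RDD[Person] annotation now resolves because RDD is imported. 
val people: RDD[Person] = sc.textFile("/user/root/people.txt") 
  .map(_.split(",")) 
  .map(p => Person(p(0), p(1).trim.toInt)) 
people.registerAsTable("people") 

Alternatively, dropping the explicit RDD[Person] annotation avoids the import altogether, since the inferred type does not need the RDD name to be in scope.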


Thank you, Josh. – user1189851 2014-10-29 16:49:46