How can I rename the columns of a count operation without converting the result to a DataFrame? In other words, how do I rename the new columns created by a GroupedDataset operation in Apache Spark?

case class LogRow(id: String, location: String, time: Long) 
case class KeyValue(key: (String, String), value: Long) 

val log = LogRow("1", "a", 1) :: LogRow("1", "a", 2) :: LogRow("1", "b", 3) ::
  LogRow("1", "a", 4) :: LogRow("1", "b", 5) :: LogRow("1", "b", 6) ::
  LogRow("1", "c", 7) :: LogRow("2", "a", 1) :: LogRow("2", "b", 2) ::
  LogRow("2", "b", 3) :: LogRow("2", "a", 4) :: LogRow("2", "a", 5) ::
  LogRow("2", "a", 6) :: LogRow("2", "c", 7) :: Nil
log.toDS().groupBy(l => { 
    (l.id, l.location) 
}).count().toDF().toDF("key", "value").as[KeyValue].show 

+-----+-----+
|  key|value|
+-----+-----+
|[1,a]|    3|
|[1,b]|    3|
|[1,c]|    1|
|[2,a]|    4|
|[2,b]|    2|
|[2,c]|    1|
+-----+-----+
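
For context on why the toDF round-trip is needed at all: count() on the GroupedDataset yields a Dataset of (key, count) tuples, and the auto-generated column names of that tuple Dataset do not match the key/value fields of KeyValue, so a bare as[KeyValue] cannot resolve them by name. A quick way to inspect the generated names (a minimal sketch, assuming the implicits that make toDS() available in the snippet above are in scope):

val counts = log.toDS().groupBy(l => (l.id, l.location)).count()
counts.printSchema()   // the column names printed here are not key/value, hence the rename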

What do you mean by changing the column? Do you mean renaming it?

Sorry, yes, I mean renaming.

Answer

Just map it directly to the type you need:

log.toDS.groupBy(l => {
    (l.id, l.location)
}).count.map { case (key, value) => KeyValue(key, value) }
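
For completeness, here is a minimal end-to-end sketch of the setup this answer assumes: a Spark 1.6-style SQLContext with its implicits imported (GroupedDataset is the 1.6 API), packaged as a standalone app. The object name, app name, and the trimmed-down log data are made up for illustration and are not part of the original answer.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object RenameCountColumns {
  case class LogRow(id: String, location: String, time: Long)
  case class KeyValue(key: (String, String), value: Long)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("rename-count").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._   // provides toDS() on local collections and the case-class encoders

    val log = LogRow("1", "a", 1) :: LogRow("1", "b", 2) :: LogRow("2", "a", 3) :: Nil

    // group by (id, location), count, and turn each (key, count) tuple into a KeyValue
    log.toDS()
      .groupBy(l => (l.id, l.location))
      .count()
      .map { case (key, value) => KeyValue(key, value) }
      .show()

    sc.stop()
  }
}

Because the target type is produced by map, the intermediate toDF().toDF("key", "value") rename from the question is not needed.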