How do I replace values in a Spark DataFrame after recalculating them?
I have the following schema in Spark:
root
|-- atom: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- dailydata: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- datatimezone: string (nullable = true)
| | | | |-- intervaltime: long (nullable = true)
| | | | |-- intervalvalue: long (nullable = true)
| | | | |-- utcacquisitiontime: string (nullable = true)
| | |-- usage: string (nullable = true)
|-- titlename: string (nullable = true)
I extracted utcacquisitiontime and datatimezone from the schema above as follows:
val result=q.selectExpr("explode(dailydata) as r").select("r.utcacquisitiontime","r.datatimezone")
+--------------------+------------+
| utcacquisitiontime|datatimezone|
+--------------------+------------+
|2017-03-27T22:00:00Z| +02:00|
|2017-03-27T22:15:00Z| +02:00|
|2017-03-27T22:30:00Z| +02:00|
|2017-03-27T22:45:00Z| +02:00|
|2017-03-27T23:00:00Z| +02:00|
|2017-03-27T23:15:00Z| +02:00|
|2017-03-27T23:30:00Z| +02:00|
|2017-03-27T23:45:00Z| +02:00|
|2017-03-28T00:00:00Z| +02:00|
|2017-03-28T00:15:00Z| +02:00|
|2017-03-28T00:30:00Z| +02:00|
|2017-03-28T00:45:00Z| +02:00|
|2017-03-28T01:00:00Z| +02:00|
|2017-03-28T01:15:00Z| +02:00|
|2017-03-28T01:30:00Z| +02:00|
|2017-03-28T01:45:00Z| +02:00|
|2017-03-28T02:00:00Z| +02:00|
|2017-03-28T02:15:00Z| +02:00|
|2017-03-28T02:30:00Z| +02:00|
|2017-03-28T02:45:00Z| +02:00|
+--------------------+------------+
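I need to compute localtime from these two columns and then replace both of them with the computed localtime. How should I compute localtime and do the replacement?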
You can use the 'withColumn' method on the DataFrame together with the [Spark functions](https://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/functions.html).
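A minimal sketch of that suggestion, applied to the exploded `result` DataFrame from the question. It assumes that utcacquisitiontime is always an ISO-8601 UTC string such as 2017-03-27T22:00:00Z and that datatimezone is always an offset such as +02:00. The names `toLocalTime` and `withLocal` are made up for illustration, and a UDF built on java.time is only one possible approach, not necessarily what the commenter had in mind.

```scala
import java.time.{OffsetDateTime, ZoneOffset}
import java.time.format.DateTimeFormatter
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical UDF: shift the UTC instant to the given offset and
// return the local date-time as a string. Both inputs are nullable
// in the schema, so nulls are passed through as null.
val toLocalTime = udf { (utc: String, tz: String) =>
  if (utc == null || tz == null) null
  else OffsetDateTime.parse(utc)
    .withOffsetSameInstant(ZoneOffset.of(tz))
    .format(DateTimeFormatter.ISO_LOCAL_DATE_TIME)
}

// Add the computed column, then drop the two source columns so that
// localtime effectively replaces them.
val withLocal = result
  .withColumn("localtime", toLocalTime(col("utcacquisitiontime"), col("datatimezone")))
  .drop("utcacquisitiontime", "datatimezone")

withLocal.show(false)
```

With the sample data, the first row (22:00 UTC on 2017-03-27 with offset +02:00) corresponds to midnight local time on 2017-03-28. Note that this only transforms the flattened `result`; it does not write the value back into the nested dailydata array.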