Spark DataFrame nested CASE WHEN statement
I need to implement the following SQL logic in a Spark DataFrame:
SELECT KEY,
CASE WHEN tc in ('a','b') THEN 'Y'
WHEN tc in ('a') AND amt > 0 THEN 'N'
ELSE NULL END REASON
FROM dataset1;
My input DataFrame is as follows:
val dataset1 = Seq((66, "a", "4"), (67, "a", "0"), (70, "b", "4"), (71, "d", "4")).toDF("KEY", "tc", "amt")
dataset1.show()
+---+---+---+
|KEY| tc|amt|
+---+---+---+
| 66|  a|  4|
| 67|  a|  0|
| 70|  b|  4|
| 71|  d|  4|
+---+---+---+
I have implemented the nested CASE WHEN statement as:
dataset1.withColumn("REASON", when(col("tc").isin("a", "b"), "Y")
.otherwise(when(col("tc").equalTo("a") && col("amt").gt(0), "N")
.otherwise(null))).show()
+---+---+---+------+
|KEY| tc|amt|REASON|
+---+---+---+------+
| 66|  a|  4|     Y|
| 67|  a|  0|     Y|
| 70|  b|  4|     Y|
| 71|  d|  4|  null|
+---+---+---+------+
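If the when statements are nested any deeper, expressing the above logic with "otherwise" becomes hard to read.
Is there a better way to implement nested CASE WHEN statements in Spark DataFrames?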
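For illustration, here is a minimal sketch of two flatter alternatives; it is one possible approach, not the only one. It relies on `Column.when` being chainable after `functions.when` (branches are tested in order, and rows that match no branch get null when `otherwise` is omitted), and on `functions.expr` accepting a SQL CASE expression. The object name, app name, and local master setting are assumptions made only so the example is self-contained. Note that, with the conditions as written in the question, the second branch can never fire, because `tc = 'a'` already matches the first branch.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, expr, when}

// Hypothetical wrapper object, used only to make the sketch runnable on its own.
object NestedCaseWhen {
  def main(args: Array[String]): Unit = {
    // Local session for illustration; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("nested-case-when")
      .getOrCreate()
    import spark.implicits._

    // Same sample data as in the question.
    val dataset1 = Seq((66, "a", "4"), (67, "a", "0"), (70, "b", "4"), (71, "d", "4"))
      .toDF("KEY", "tc", "amt")

    // Option 1: chain when(...) calls instead of nesting them inside otherwise(...).
    // Without a final otherwise(...), rows that match no branch get null.
    val chained = dataset1.withColumn("REASON",
      when(col("tc").isin("a", "b"), "Y")
        .when((col("tc") === "a") && (col("amt") > 0), "N"))

    // Option 2: keep the original SQL CASE expression and pass it through expr(...).
    val viaExpr = dataset1.withColumn("REASON",
      expr("CASE WHEN tc IN ('a','b') THEN 'Y' WHEN tc IN ('a') AND amt > 0 THEN 'N' ELSE NULL END"))

    chained.show()
    viaExpr.show()

    spark.stop()
  }
}
```

Chaining keeps every branch at the same indentation level, which tends to stay readable as branches are added. The `expr` form may be preferable when the logic already exists as SQL and should be kept verbatim.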