2017-08-15

Spark DataFrame decimal precision

I have a DataFrame:

val groupby = df.groupBy($"column1", $"Date")
  .agg(sum("amount").as("amount"))
  .orderBy($"column1", desc("Date"))

When applying a window function to add a new difference column:

val windowspec = Window.partitionBy("column1").orderBy(desc("Date"))

groupby.withColumn("difference", lead($"amount", 1, 0).over(windowspec)).show()


+--------+------------+-----------+--------------------------+
| Column | Date       | Amount    | Difference               |
+--------+------------+-----------+--------------------------+
| A      | 3/31/2017  | 12345.45  | 3456.540000000000000000  |
| A      | 2/28/2017  | 3456.54   | 34289.430000000000000000 |
| A      | 1/31/2017  | 34289.43  | 45673.987000000000000000 |
| A      | 12/31/2016 | 45673.987 | 0.00E+00                 |
+--------+------------+-----------+--------------------------+

I am getting decimals with trailing zeros. When I run printSchema(), the data type of the difference column is decimal(38,18). Can someone tell me how to change the data type to decimal(38,2), or how to remove the trailing zeros?
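Spark's DecimalType values are backed by java.math.BigDecimal, so the trailing zeros come from the fixed scale of 18, not from the data itself. A minimal plain-Java sketch (no Spark required) of the same effect, including stripTrailingZeros() as one way to drop the padding:

```java
import java.math.BigDecimal;

public class TrailingZeros {
    public static void main(String[] args) {
        // A value held at scale 18, as in decimal(38,18):
        // zeros are padded on the right to fill the scale
        BigDecimal wide = new BigDecimal("3456.54").setScale(18);
        System.out.println(wide.toPlainString()); // 3456.540000000000000000

        // stripTrailingZeros() removes the padding without changing the value
        System.out.println(wide.stripTrailingZeros().toPlainString()); // 3456.54
    }
}
```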


Please see https://spark.apache.org/docs/1.6.2/api/java/org/apache/spark/sql/functions.html#format_number(org.apache.spark.sql.Column,%20int) –
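format_number(column, d) rounds to d decimal places and formats with grouping separators, returning a string. Roughly the same formatting in plain Java (an illustration only, not Spark's implementation; note that once the value is a string it can no longer be summed or compared numerically):

```java
import java.util.Locale;

public class FormatNumberDemo {
    public static void main(String[] args) {
        // "%,.2f" rounds to 2 decimals and inserts thousands separators,
        // similar to Spark's format_number(col, 2)
        System.out.println(String.format(Locale.US, "%,.2f", 34289.43));  // 34,289.43
        System.out.println(String.format(Locale.US, "%,.2f", 45673.987)); // 45,673.99
    }
}
```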

Answer

You can cast the data to a specific decimal size as below:


lead($"amount", 1, 0).over(windowspec).cast(DataTypes.createDecimalType(32, 2))


This worked for me –
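Casting to a narrower scale rounds rather than truncates. Since Spark decimals wrap java.math.BigDecimal, the effect of .cast(DataTypes.createDecimalType(32, 2)) on a value like 45673.987 can be sketched in plain Java (assuming half-up rounding, which is what Spark's decimal precision change applies):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class CastToScale2 {
    public static void main(String[] args) {
        // A decimal(38,18) value as produced by the windowed lead over the sum
        BigDecimal amount = new BigDecimal("45673.987000000000000000");

        // Equivalent effect of casting to decimal(32,2):
        // the scale becomes 2 and the value is rounded half-up
        BigDecimal cast = amount.setScale(2, RoundingMode.HALF_UP);
        System.out.println(cast.toPlainString()); // 45673.99
    }
}
```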


In pure SQL, you can use a well-known technique:

SELECT ceil(100 * column_name_double)/100 AS cost ...
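Note that ceil(100 * x) / 100 keeps two decimals but always rounds upward, not to the nearest value. A quick plain-Java check of the arithmetic:

```java
public class CeilTwoDecimals {
    public static void main(String[] args) {
        // ceil(100 * x) / 100 truncates to two decimals, rounding up
        System.out.println(Math.ceil(100 * 45673.987) / 100); // 45673.99
        // Always rounds up: .451 becomes .46, not .45
        System.out.println(Math.ceil(100 * 12345.451) / 100); // 12345.46
    }
}
```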