0
我有一个DataFrame包含由VectorAssembler创建的特征向量,它也包含空值。我现在想用一个载体来代替空值:火花填充DataFrame与矢量为null
val nil = Vectors.dense(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,1.0, 1.0, 1.0, 1.0, 1.0,1.0, 1.0, 1.0, 1.0, 1.0)
df.na.fill(nil) // does not work.
什么是做到这一点的正确方法?
编辑: 我发现要归功于回答道:
val nil = Vectors.dense(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,1.0, 1.0, 1.0, 1.0, 1.0,1.0, 1.0, 1.0, 1.0, 1.0)
import sc.implicits._
var fill = Seq(Tuple1(nil)).toDF("replacement")
val dates = data.schema.fieldNames.filter(e => e.contains("1"))
data = data.crossJoin(broadcast(fill))
for(e <- dates){
data = data.withColumn(e, coalesce(data.col(e), $"replacement"))
}
data = data.drop("replacement")