Class org.apache.spark.sql.types.SQLUserDefinedType not found - continuing with a stub

I have a basic Spark MLlib program, shown below.
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.mllib.linalg.Vectors
object Sample extends App {
val conf = new SparkConf().setAppName("helloApp").setMaster("local")
val sc = new SparkContext(conf)
val data = sc.textFile("data/mllib/kmeans_data.txt")
val parsedData = data.map(s => Vectors.dense(s.split(' ').map(_.toDouble))).cache()
// Cluster the data into two classes using KMeans
val numClusters = 2
val numIterations = 20
val clusters = KMeans.train(parsedData, numClusters, numIterations)
// Export to PMML
println("PMML Model:\n" + clusters.toPMML)
}
I have manually added spark-core, spark-mllib, and spark-sql (all version 1.5.0) to the project classpath through IntelliJ's module settings. When I run the program I get the error below. Any idea what is wrong?
Error:scalac: error while loading Vector, Missing dependency 'bad symbolic reference. A signature in Vector.class refers to term types in package org.apache.spark.sql which is not available. It may be completely missing from the current classpath, or the version on the classpath might be incompatible with the version used when compiling Vector.class.', required by /home/fazlann/Downloads/spark-mllib_2.10-1.5.0.jar(org/apache/spark/mllib/linalg/Vector.class)
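The error says that classes from the org.apache.spark.sql package are not visible on the compile classpath, even though spark-mllib was compiled against them. A minimal build.sbt sketch, assuming sbt and Scala 2.10 (matching the spark-mllib_2.10 jar named in the error); declaring the dependencies this way lets the build tool resolve transitive dependencies that copying jars by hand can miss:

name := "helloApp"

scalaVersion := "2.10.5"

// All three Spark modules pinned to the same version; sbt also pulls in
// their transitive dependencies, unlike adding jars manually in the IDE.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "1.5.0",
  "org.apache.spark" %% "spark-mllib" % "1.5.0",
  "org.apache.spark" %% "spark-sql"   % "1.5.0"
)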
What do you mean by "manually added"? –
I used the module settings option to add the jars to the classpath – DesirePRG
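To verify whether the spark-sql jar actually made it onto the classpath, a quick reflective check can help; this ClasspathCheck object is a diagnostic sketch, not part of the original program:

object ClasspathCheck extends App {
  // Throws ClassNotFoundException if spark-sql is absent from the runtime
  // classpath, the same condition behind the "continuing with a stub" message.
  Class.forName("org.apache.spark.sql.types.SQLUserDefinedType")
  println("spark-sql is on the classpath")
}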