
I have a basic Spark MLlib program, shown below. When compiling it I get "class org.apache.spark.sql.types.SQLUserDefinedType not found - continuing with a stub".

import org.apache.spark.mllib.clustering.KMeans 

import org.apache.spark.SparkContext 
import org.apache.spark.SparkConf 
import org.apache.spark.mllib.linalg.Vectors 


class Sample { 
    val conf = new SparkConf().setAppName("helloApp").setMaster("local") 
    val sc = new SparkContext(conf) 
    val data = sc.textFile("data/mllib/kmeans_data.txt") 
    val parsedData = data.map(s => Vectors.dense(s.split(' ').map(_.toDouble))).cache() 

    // Cluster the data into two classes using KMeans 
    val numClusters = 2 
    val numIterations = 20 
    val clusters = KMeans.train(parsedData, numClusters, numIterations) 

    // Export to PMML 
    println("PMML Model:\n" + clusters.toPMML) 
} 

I have manually added spark-core, spark-mllib and spark-sql (all version 1.5.0) to the project classpath through IntelliJ.
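
For clarity, these are the coordinates and versions I mean. If the same three modules were declared in an sbt build file instead of being added by hand, it would look roughly like this (build.sbt and Scala 2.10 are assumptions here, not part of my actual setup):

name := "helloApp"

scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "1.5.0",
  "org.apache.spark" %% "spark-mllib" % "1.5.0",
  "org.apache.spark" %% "spark-sql"   % "1.5.0"
)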

I get the following error when running the program. Any idea what is wrong?

Error:scalac: error while loading Vector, Missing dependency 'bad symbolic reference. A signature in Vector.class refers to term types in package org.apache.spark.sql which is not available. It may be completely missing from the current classpath, or the version on the classpath might be incompatible with the version used when compiling Vector.class.', required by /home/fazlann/Downloads/spark-mllib_2.10-1.5.0.jar(org/apache/spark/mllib/linalg/Vector.class


What do you mean by "added manually"? –


I added the jars to the classpath using the Module Settings option – DesirePRG

Answer


DesirePRG. I ran into the same problem as you. The solution is to import an assembly jar that bundles Spark and Hadoop, such as spark-assembly-1.4.1-hadoop2.4.0.jar; with that on the classpath it works fine.
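
If you are building with sbt rather than wiring jars through IntelliJ, a minimal sketch of registering such an assembly jar as an unmanaged dependency would be the following (the lib/ path and jar name are assumptions; point it at wherever you downloaded the jar):

// build.sbt (sbt 0.13 syntax): register the downloaded assembly jar explicitly;
// jars dropped into the project's lib/ directory are also picked up automatically
unmanagedJars in Compile += Attributed.blank(
  baseDirectory.value / "lib" / "spark-assembly-1.4.1-hadoop2.4.0.jar"
)

Alternatively, declaring spark-core, spark-mllib and spark-sql as managed dependencies with matching versions achieves the same thing, because the transitive jars that provide the org.apache.spark.sql types are then pulled in automatically.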