Why does spark-xml fail with NoSuchMethodError with a Spark 2.0.0 dependency?

Hi, I'm new to Scala and IntelliJ, and I'm just trying to do the following in Scala:

import org.apache.spark 
import org.apache.spark.sql.SQLContext 
import com.databricks.spark.xml.XmlReader 


object SparkSample {
  def main(args: Array[String]): Unit = {
    val conf = new spark.SparkConf()
    conf.setAppName("Datasets Test")
    conf.setMaster("local[2]")
    val sc = new spark.SparkContext(conf)

    val sqlContext = new SQLContext(sc)
    val df = sqlContext.read
      .format("com.databricks.spark.xml")
      .option("rowTag", "shop")
      .load("shops.xml") /* NoSuchMethodError here */

    val selectedData = df.select("author", "_id")
    df.show
  }
}

Basically I am trying to convert XML into a Spark DataFrame, and I get a NoSuchMethodError at `.load("shops.xml")`. Below is the SBT:

version := "0.1" 

scalaVersion := "2.11.3" 
val sparkVersion = "2.0.0" 
val sparkXMLVersion = "0.3.3" 

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion exclude("jline", "2.12"),
  "org.apache.spark" %% "spark-sql"  % sparkVersion excludeAll(ExclusionRule(organization = "jline"), ExclusionRule("name", "2.12")),
  "com.databricks"   %% "spark-xml"  % sparkXMLVersion
)

Here is the stack trace:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.sql.types.DecimalType$.Unlimited()Lorg/apache/spark/sql/types/DecimalType; 
at com.databricks.spark.xml.util.InferSchema$.<init>(InferSchema.scala:50) 
at com.databricks.spark.xml.util.InferSchema$.<clinit>(InferSchema.scala) 
at com.databricks.spark.xml.XmlRelation$$anonfun$1.apply(XmlRelation.scala:46) 
at com.databricks.spark.xml.XmlRelation$$anonfun$1.apply(XmlRelation.scala:46) 
at scala.Option.getOrElse(Option.scala:120) 
at com.databricks.spark.xml.XmlRelation.<init>(XmlRelation.scala:45) 
at com.databricks.spark.xml.DefaultSource.createRelation(DefaultSource.scala:66) 
at com.databricks.spark.xml.DefaultSource.createRelation(DefaultSource.scala:44) 
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:315) 
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149) 
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:132) 

Can someone point out the error? It looks like a dependency issue. spark-core seems to work fine, but spark-sql does not. I had Scala 2.12 before but changed to 2.11 because spark-core would not resolve.

[error: object xml is not a member of package com.databricks.spark](https://stackoverflow.com/questions/46369452/error-object-xml-is-not-a-member-of-package-com-databricks-spark) – Pavel

Answer


TL;DR I think it is a Spark version mismatch issue. Use spark-xml 0.4.1.

Quoting spark-xml's Requirements (highlighting mine):

This library requires Spark 2.0+ for 0.4.x.

For version that works with Spark 1.x, please check for branch-0.3.

That tells me that spark-xml 0.3.3 works with Spark 1.x (not the Spark 2.0.0 you requested).
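
A minimal sketch of the fix this answer suggests, assuming you keep the question's Spark 2.0.0 artifacts and only bump spark-xml to the 0.4.x line (the jline exclusions from the question are omitted here for brevity):

version := "0.1"

scalaVersion := "2.11.8" // assumption: any Scala 2.11.x release binary-compatible with Spark 2.0.0 works
val sparkVersion = "2.0.0"
val sparkXMLVersion = "0.4.1" // the 0.4.x line is the one that requires Spark 2.0+

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql"  % sparkVersion,
  "com.databricks"   %% "spark-xml"  % sparkXMLVersion
)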
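
Separately, since Spark 2.0 introduced SparkSession as the unified entry point (superseding SQLContext), here is a hedged sketch of the same read using it; the rowTag and file name are copied from the question, and the original SQLContext code still compiles, so this is optional:

import org.apache.spark.sql.SparkSession

object SparkSample {
  def main(args: Array[String]): Unit = {
    // SparkSession replaces SparkConf/SparkContext/SQLContext boilerplate in Spark 2.x
    val spark = SparkSession.builder()
      .appName("Datasets Test")
      .master("local[2]")
      .getOrCreate()

    // With spark-xml 0.4.1 on the classpath this data source resolves without the NoSuchMethodError
    val df = spark.read
      .format("com.databricks.spark.xml")
      .option("rowTag", "shop")
      .load("shops.xml")

    df.select("author", "_id").show()

    spark.stop()
  }
}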