I am trying to launch a Spark job (Spark 1.4.0) on a cluster. Whether I run it from the command line or from Eclipse, I get an error about the withDummyCallSite function missing from the Spark Utils class. In the Maven dependencies I can see that spark-core_2.10-1.4.0.jar is loaded, which should contain this function. I am running Java 1.7, the same Java version the code was compiled with. I can see on the Spark Master monitor that the job has started, so it does not appear to be a firewall issue. Here is the error I see in the console (from both the command line and Eclipse):
ERROR 09:53:06,314 Logging.scala:75 -- Task 0 in stage 1.0 failed 4 times; aborting job
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
java.lang.NoSuchMethodError: org.apache.spark.util.Utils$.withDummyCallSite(Lorg/apache/spark/SparkContext;Lscala/Function0;)Ljava/lang/Object;
at org.apache.spark.sql.parquet.ParquetRelation2.buildScan(newParquet.scala:269)
at org.apache.spark.sql.sources.HadoopFsRelation.buildScan(interfaces.scala:530)
at org.apache.spark.sql.sources.DataSourceStrategy$$anonfun$8.apply(DataSourceStrategy.scala:98)
at org.apache.spark.sql.sources.DataSourceStrategy$$anonfun$8.apply(DataSourceStrategy.scala:98)
at org.apache.spark.sql.sources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:266)
at org.apache.spark.sql.sources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:265)
at org.apache.spark.sql.sources.DataSourceStrategy$.pruneFilterProjectRaw(DataSourceStrategy.scala:296)
at org.apache.spark.sql.sources.DataSourceStrategy$.pruneFilterProject(DataSourceStrategy.scala:261)
at org.apache.spark.sql.sources.DataSourceStrategy$.apply(DataSourceStrategy.scala:94)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:59)
at org.apache.spark.sql.catalyst.planning.QueryPlanner.planLater(QueryPlanner.scala:54)
at org.apache.spark.sql.execution.SparkStrategies$HashAggregation$.apply(SparkStrategies.scala:162)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:59)
at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:932)
at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:930)
at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:936)
at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:936)
at org.apache.spark.sql.DataFrame.collect(DataFrame.scala:1255)
at org.apache.spark.sql.DataFrame.count(DataFrame.scala:1269)
(log truncated for brevity)
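For context, the trace shows the failure happens inside ParquetRelation2.buildScan when a count() is evaluated. A minimal sketch of that kind of job (the class name, app name, and Parquet path below are placeholders, not the actual code):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class ParquetCountJob {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("ParquetCountJob"); // placeholder name
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // Reading Parquet and forcing an action goes through
        // ParquetRelation2.buildScan, which is what calls
        // Utils.withDummyCallSite in Spark 1.4.0 (top frame of the trace).
        DataFrame df = sqlContext.read().parquet("hdfs:///path/to/data.parquet"); // placeholder path
        System.out.println("count = " + df.count());

        sc.stop();
    }
}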
Thanks in advance for any pointers!
Thanks for your response. I re-checked the contents of spark-core_2.10-1.4.0.jar (downloaded via the dependency I specified in my pom.xml), and the withDummyCallSite function was indeed missing. I manually downloaded the jar from http://mvnrepository.com and replaced the old one, and the problem is now gone. It is the exact same version (2.10-1.4.0); I have no idea why the function was missing in the first place. – bbtus
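In case it helps anyone debugging a similar mismatch, here is a small diagnostic sketch (the class name and structure are made up for illustration, not from this thread) that prints which physical jar Utils$ was actually loaded from at runtime and whether any withDummyCallSite overload is visible on it:

import java.lang.reflect.Method;
import java.net.URL;

public class WhichSparkCore {
    public static void main(String[] args) throws Exception {
        // Load the Scala object's backing class without initializing it.
        Class<?> utils = Class.forName("org.apache.spark.util.Utils$", false,
                WhichSparkCore.class.getClassLoader());

        // The code source reveals which jar won on the classpath --
        // useful when two copies of spark-core are present.
        URL source = utils.getProtectionDomain().getCodeSource().getLocation();
        System.out.println("Utils$ loaded from: " + source);

        // An empty listing here corresponds to the NoSuchMethodError above.
        for (Method m : utils.getDeclaredMethods()) {
            if (m.getName().equals("withDummyCallSite")) {
                System.out.println("found: " + m);
            }
        }
    }
}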
No problem! Glad to hear that manually replacing the jar solved your issue. In fact, in cases like this you can simply delete the dependency's folder structure from your local cache/repository (e.g. C:/Users/{your username}/.m2/repository/org/apache/spark). After you delete the jar, Maven downloads a fresh copy from your remote repository, and you end up with a clean copy of the specified version in your local repository. – asg
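If you want to verify a downloaded jar before trusting it, here is a crude but self-contained sketch: it reads the Utils$.class entry out of the jar and searches its bytes for the method name (method names are stored as UTF-8 constants in the class file, so a byte-level substring search is a reasonable smoke test). The repository path below assumes the default Maven layout and is hypothetical:

import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;

public class ScanJarForMethod {
    public static void main(String[] args) throws Exception {
        // Hypothetical default Maven layout; adjust to your machine.
        String jarPath = System.getProperty("user.home")
                + "/.m2/repository/org/apache/spark/spark-core_2.10/1.4.0/spark-core_2.10-1.4.0.jar";
        try (JarFile jar = new JarFile(jarPath)) {
            ZipEntry entry = jar.getEntry("org/apache/spark/util/Utils$.class");
            if (entry == null) {
                System.out.println("Utils$.class is missing from the jar entirely");
                return;
            }
            // Slurp the class file bytes.
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (InputStream in = jar.getInputStream(entry)) {
                byte[] chunk = new byte[8192];
                for (int n; (n = in.read(chunk)) != -1; ) {
                    buf.write(chunk, 0, n);
                }
            }
            // ISO-8859-1 maps bytes 1:1 to chars, so contains() amounts to
            // a substring search over the raw bytes.
            String raw = new String(buf.toByteArray(), "ISO-8859-1");
            System.out.println("withDummyCallSite present: "
                    + raw.contains("withDummyCallSite"));
        }
    }
}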