
Spark NoSuchMethodError

I am running Spark in Docker with the following command:

docker run -it \
    -p 8088:8088 -p 8042:8042 -p 50070:50070 \
    -v "$(pwd)"/log4j.properties:/usr/local/spark/conf/log4j.properties \
    -v "$(pwd)":/app -h sandbox sequenceiq/spark:1.6.0 bash

Running spark-submit --version reports version 1.6.0.

My submit command is as follows:

spark-submit --class io.jobi.GithubDay \ 
    --master local[*] \ 
    --name "Daily Github Push Counter" \ 
    /app/min-spark_2.11-1.0.jar \ 
    "file:///app/data/github-archive/*.json" \ 
    "/app/data/ghEmployees.txt" \ 
    "file:///app/data/emp-gh-push-output" "json" 

build.sbt

name := """min-spark""" 

version := "1.0" 

scalaVersion := "2.11.7" 

lazy val sparkVersion = "1.6.0" 

libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % sparkVersion % "provided", 
    "org.apache.spark" %% "spark-sql" % sparkVersion % "provided" 
) 

// Change this to another test framework if you prefer 
libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.4" % "test" 

GithubDay.scala

package io.jobi

import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}

import scala.io.Source.fromFile

/**
  * Created by hammer on 7/15/16.
  */
object GithubDay {

  def main(args: Array[String]): Unit = {
    println("Application arguments: ")
    args.foreach(println)

    val conf = new SparkConf()
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    try {
      println("args(0): " + args(0))

      // Load the GitHub archive JSON and count pushes per user,
      // ordered by push count, descending.
      val ghLog = sqlContext.read.json(args(0))
      val pushes = ghLog.filter("type = 'PushEvent'")
      val grouped = pushes.groupBy("actor.login").count()
      val ordered = grouped.orderBy(grouped("count").desc)

      // Read the employee list, one login per line.
      val employees = Set() ++ (
        for {
          line <- fromFile(args(1)).getLines()
        } yield line.trim
      )

      val bcEmployees = sc.broadcast(employees)

      import sqlContext.implicits._
      println("register function")
      val isEmployee = sqlContext.udf.register("SetContainsUdf", (u: String) => bcEmployees.value.contains(u))
      println("registered udf")
      val filtered = ordered.filter(isEmployee($"login"))
      println("applied filter")

      filtered.write.format(args(3)).save(args(2))
    } finally {
      sc.stop()
    }
  }
}

I build with sbt clean package, but when I run it the output is:

Application arguments: 
file:///app/data/github-archive/*.json 
/app/data/ghEmployees.txt 
file:///app/data/emp-gh-push-output 
json 
args(0): file:///app/data/github-archive/*.json 
imported implicits 
defined isEmp 
register function 
Exception in thread "main" java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaUniverse$JavaMirror; 
    at io.jobi.GithubDay$.main(GithubDay.scala:53) 
    at io.jobi.GithubDay.main(GithubDay.scala) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731) 
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181) 
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206) 
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121) 
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 

From what I've read, a NoSuchMethodError is the result of incompatible versions, but I'm building against 1.6.0 and deploying to 1.6.0, so I don't understand what's going on.

Answer

Unless you compiled Spark yourself, the out-of-the-box 1.6.0 distribution is compiled against Scala 2.10.x. This is stated in the docs (it says 1.6.2, but it applies equally to 1.6.0):

Spark runs on Java 7+, Python 2.6+ and R 3.1+. For the Scala API, Spark 1.6.2 uses Scala 2.10. You will need to use a compatible Scala version (2.10.x).

You want:

scalaVersion := "2.10.6" 
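
For completeness, here is a sketch of the asker's build.sbt after that change. Since sbt's %% operator appends the Scala binary version to artifact names, the Spark dependencies then resolve to spark-core_2.10 and spark-sql_2.10, matching the prebuilt distribution (scalatest 2.2.4 is also published for 2.10, so it can stay as-is):

name := """min-spark"""

version := "1.0"

// Match the Scala version the prebuilt Spark 1.6.0 distribution uses.
scalaVersion := "2.10.6"

lazy val sparkVersion = "1.6.0"

libraryDependencies ++= Seq(
    // %% now resolves these to the _2.10 artifacts instead of _2.11.
    "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
    "org.apache.spark" %% "spark-sql" % sparkVersion % "provided"
)

libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.4" % "test"

Note that after rebuilding with sbt clean package, the artifact name changes to min-spark_2.10-1.0.jar, so the jar path in the spark-submit command needs updating to match.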

One hint is that the error occurs in a Scala class: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;).
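
If in doubt, you can verify the Scala version of the running distribution from spark-shell inside the container: scala.util.Properties reports the version the REPL itself runs on, which is the one the Spark assembly was built against. A minimal check (the exact patch number shown is an assumption; what matters is the 2.10.x prefix):

scala> scala.util.Properties.versionString
res0: String = version 2.10.5

The _2.11 suffix in the jar name min-spark_2.11-1.0.jar points the same way: the application was built for Scala 2.11, which is exactly the mismatch behind the NoSuchMethodError.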