2017-08-29

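Could not access type AnyRef in package scala

I am using Apache Toree (the version from GitHub). When I try to run a query against a PostgreSQL table, I get intermittent Scala compiler errors: when I run the same cell a second time, the errors disappear and the code runs fine.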

I am looking for advice on how to debug these errors. They look strange, and they show up in the notebook on standard output:

error: missing or invalid dependency detected while loading class file 'QualifiedTableName.class'. 
Could not access type AnyRef in package scala, 
because it (or its dependencies) are missing. Check your build definition for 
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.) 
A full rebuild may help if 'QualifiedTableName.class' was compiled against an incompatible version of scala. 
error: missing or invalid dependency detected while loading class file 'FunctionIdentifier.class'. 
Could not access type AnyRef in package scala, 
because it (or its dependencies) are missing. Check your build definition for 
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.) 
A full rebuild may help if 'FunctionIdentifier.class' was compiled against an incompatible version of scala. 
error: missing or invalid dependency detected while loading class file 'DefinedByConstructorParams.class'. 
... 

The code is simple. It loads a dataset from a Postgres table:

%AddDeps org.postgresql postgresql 42.1.4 --transitive 
val props = new java.util.Properties(); 
props.setProperty("driver","org.postgresql.Driver"); 
val df = spark.read.jdbc(url = "jdbc:postgresql://postgresql/database?user=user&password=password", 
       table = "table", predicates = Array("1=1"), connectionProperties = props) 
df.show() 

I have checked the obvious: both Toree and Apache Spark use Scala 2.11.8, and I built Apache Toree with APACHE_SPARK_VERSION=2.2.0, which matches the Spark release I downloaded.

For reference, this is the part of the Dockerfile I use to set up Toree and Spark:

RUN wget https://d3kbcqa49mib13.cloudfront.net/spark-2.2.0-bin-hadoop2.7.tgz && tar -zxf spark-2.2.0-bin-hadoop2.7.tgz && chmod -R og+rw /opt/spark-2.2.0-bin-hadoop2.7 && chown -R a1414.a1414 /opt/spark-2.2.0-bin-hadoop2.7 
RUN (curl https://bintray.com/sbt/rpm/rpm > /etc/yum.repos.d/bintray-sbt-rpm.repo) 
RUN yum -y install --nogpgcheck sbt 
RUN (unset http_proxy; unset https_proxy; yum -y install --nogpgcheck java-1.8.0-openjdk-devel.i686) 
RUN (git clone https://github.com/apache/incubator-toree && cd incubator-toree && make clean release APACHE_SPARK_VERSION=2.2.0 ; exit 0) 
RUN (. /opt/rh/rh-python35/enable; cd /opt/incubator-toree/dist/toree-pip ;python setup.py install) 
RUN (. /opt/rh/rh-python35/enable; jupyter toree install --spark_home=/opt/spark-2.2.0-bin-hadoop2.7 --interpreters=Scala) 
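
Possibly dependencies on different versions of the Scala lib – cchantep

A conflict between libraries is what I suspect too, but the strange thing is that it sometimes works. Is there a way to find out? I tried to build a dependency tree with sbt, and it does not show multiple versions of the Scala lib: https://gist.github.com/anonymous/1ea6f24a30ac77a2252884227b88d522 – kervel

Check the dependency tree – cchantep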

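Answer

As cchantep said in the comments, you are probably using a different version of Scala than the one Spark was built with.

The simplest solution is to check which version Spark uses and switch to that one, for example on a Mac: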

brew switch scala 2.11.8
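
To check which Scala version the running notebook kernel actually uses, something like the following could be run in a Toree cell. This is only a minimal sketch: the helper `jarOf` and the value names are hypothetical (they are not part of Toree or Spark), and it relies only on `scala.util.Properties` and standard JVM reflection calls.

```scala
// Scala version of the interpreter that is executing this cell, e.g. "2.11.8".
val kernelScalaVersion: String = scala.util.Properties.versionNumberString

// Hypothetical helper: returns the jar (or directory) a class was loaded from.
// getCodeSource and getLocation may return null, hence the Option wrappers.
def jarOf(cls: Class[_]): String =
  Option(cls.getProtectionDomain)
    .flatMap(pd => Option(pd.getCodeSource))
    .flatMap(cs => Option(cs.getLocation))
    .map(_.toString)
    .getOrElse("<unknown>")

// Where scala-library itself comes from (Option lives in scala-library).
val scalaLibraryJar: String = jarOf(classOf[scala.Option[_]])

println(s"Scala version: $kernelScalaVersion")
println(s"scala-library loaded from: $scalaLibraryJar")
```

In a notebook with an active Spark session, the same helper could presumably be applied to a Spark class, for example `jarOf(classOf[org.apache.spark.sql.SparkSession])`, and `spark.version` reports the Spark release. If the Scala version printed here differs from the one the Spark distribution ships with, that would support the version-mismatch explanation.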