Since I have to include log4j in my Spark code, I am asking for help in finding a way around this problem without removing the log4j import. log4j must be included, but it causes errors in the Apache Spark shell. How can I avoid the errors?

The simplified code is as follows:

:cp symjar/log4j-1.2.17.jar 
import org.apache.spark.rdd._ 

// configure the Hadoop S3 connector with the (placeholder) AWS credentials
val hadoopConf = sc.hadoopConfiguration
hadoopConf.set("fs.s3n.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
hadoopConf.set("fs.s3n.awsAccessKeyId", "AKEY")
hadoopConf.set("fs.s3n.awsSecretAccessKey", "SKEY")

val numOfProcessors = 2
val filePath = "s3n://SOMEFILE.csv"

// read the CSV from S3 into an RDD, then pass it through a no-op helper
var rdd = sc.textFile(filePath, numOfProcessors)
def doStuff(rdd: RDD[String]): RDD[String] = { rdd }
doStuff(rdd)

First, I get these errors:

error: error while loading StorageLevel, class file '/root/spark/lib/spark-assembly-1.3.0-hadoop1.0.4.jar(org/apache/spark/storage/StorageLevel.class)' has location not matching its contents: contains class StorageLevel 
error: error while loading Partitioner, class file '/root/spark/lib/spark-assembly-1.3.0-hadoop1.0.4.jar(org/apache/spark/Partitioner.class)' has location not matching its contents: contains class Partitioner 
error: error while loading BoundedDouble, class file '/root/spark/lib/spark-assembly-1.3.0-hadoop1.0.4.jar(org/apache/spark/partial/BoundedDouble.class)' has location not matching its contents: contains class BoundedDouble 
error: error while loading CompressionCodec, class file '/root/spark/lib/spark-assembly-1.3.0-hadoop1.0.4.jar(org/apache/hadoop/io/compress/CompressionCodec.class)' has location not matching its contents: contains class CompressionCodec 

Then I run this line again, and the error goes away:

var rdd = sc.textFile(filePath, numOfProcessors) 

However, the final result of the code is:

error: type mismatch; 
found : org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.RDD[String] 
required: org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.RDD[String] 
       doStuff(rdd) 
        ^

How can I avoid removing log4j from the imports and still not get the above errors? (This is important, because I use the log4j jar a lot and it conflicts with the Spark shell.)

Answers

The answer is not just to use the :cp command; you also need to add all of the included jars to SPARK_SUBMIT_CLASSPATH=".../the/path/to/a.jar" in .../spark/conf/spark-env.sh.
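
As a minimal sketch (the jar location below is a placeholder, not from the original post), the extra line in spark/conf/spark-env.sh could look like this:

export SPARK_SUBMIT_CLASSPATH="$SPARK_SUBMIT_CLASSPATH:/root/symjar/log4j-1.2.17.jar"   # placeholder path; point it at your actual jar

Restart the Spark shell after editing spark-env.sh so the new classpath is picked up; the jar is then on the driver classpath from the start instead of being bolted on with :cp.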


Another answer: if you use an IDE (for example the Scala IDE for Eclipse with Maven), exclude the jar from the Maven dependency. For example, I wanted to exclude commons-codec (and then add a different version as a JAR to the project), and made these changes in pom.xml:

...............
 <dependencies> 
  <dependency> 
   <groupId>org.apache.spark</groupId> 
   <artifactId>spark-core_2.11</artifactId> 
   <version>1.3.0</version> 
   <!-- keep Spark's transitive commons-codec off the classpath;
        an exclusion only takes groupId and artifactId --> 
   <exclusions> 
    <exclusion> 
     <groupId>commons-codec</groupId> 
     <artifactId>commons-codec</artifactId> 
    </exclusion> 
   </exclusions> 
  </dependency> 
 </dependencies> 
...............
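
The post then adds the replacement commons-codec as a plain JAR; an alternative sketch, if you would rather keep it under Maven, is to declare the desired version explicitly (1.10 here is only an example version, not taken from the original post):

 <dependency> 
  <groupId>commons-codec</groupId> 
  <artifactId>commons-codec</artifactId> 
  <!-- example version; use whichever release your own code needs --> 
  <version>1.10</version> 
 </dependency> 

Because this version is declared directly in the project, it is the one that ends up on the classpath rather than the copy Spark would otherwise pull in transitively.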