我有1个主站和6个使用hadoop 2.6.0和spark 1.6.2的预建版本的集群。我正在运行hadoop MR和spark作业,而没有在所有节点上安装openjdk 7时出现任何问题。但是,当我在所有节点上将openjdk 7升级到openjdk 8时,会引发提交和spark-shell导致的错误。运行的纱线与火花不与Java一起工作8
16/08/17 14:06:22 ERROR client.TransportClient: Failed to send RPC 4688442384427245199 to /xxx.xxx.xxx.xx:42955: java.nio.channels.ClosedChannelExce ption
java.nio.channels.ClosedChannelException
16/08/17 14:06:22 WARN netty.NettyRpcEndpointRef: Error sending message [message = RequestExecutors(0,0,Map())] in 1 attempts
org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$1.apply$m cV$sp(YarnSchedulerBackend.scala:271)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$1.apply(Y arnSchedulerBackend.scala:271)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$1.apply(Y arnSchedulerBackend.scala:271)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed to send RPC 4688442384427245199 to /xxx.xxx.xxx.xx:42955: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
at io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
... 1 more
Caused by: java.nio.channels.ClosedChannelException
16/08/17 14:06:22 ERROR spark.SparkContext: Error initializing SparkContext.
java.lang.IllegalStateException: Spark context stopped while waiting for backend
at org.apache.spark.scheduler.TaskSchedulerImpl.waitBackendReady(TaskSchedulerImpl.scala:581)
at org.apache.spark.scheduler.TaskSchedulerImpl.postStartHook(TaskSchedulerImpl.scala:162)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:549)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:240)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:236)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:211)
at java.lang.Thread.run(Thread.java:745)
Traceback (most recent call last):
File "/home/hd_spark/spark2/python/pyspark/shell.py", line 49, in <module>
spark = SparkSession.builder.getOrCreate()
File "/home/hd_spark/spark2/python/pyspark/sql/session.py", line 169, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "/home/hd_spark/spark2/python/pyspark/context.py", line 294, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "/home/hd_spark/spark2/python/pyspark/context.py", line 115, in __init__
conf, jsc, profiler_cls)
File "/home/hd_spark/spark2/python/pyspark/context.py", line 168, in _do_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)
File "/home/hd_spark/spark2/python/pyspark/context.py", line 233, in _initialize_context
return self._jvm.JavaSparkContext(jconf)
File "/home/hd_spark/spark2/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py", line 1183, in __call__
File "/home/hd_spark/spark2/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py", line 312, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.IllegalStateException: Spark context stopped while waiting for backend
at org.apache.spark.scheduler.TaskSchedulerImpl.waitBackendReady(TaskSchedulerImpl.scala:581)
at org.apache.spark.scheduler.TaskSchedulerImpl.postStartHook(TaskSchedulerImpl.scala:162)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:549)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:240)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:236)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:211)
at java.lang.Thread.run(Thread.java:745)
我已出口JAVA_HOME上的.bashrc并且已经设置了使用
sudo update-alternatives --config java
sudo update-alternatives --config javac
这些命令的openjdk 8为默认的java。此外,我已经尝试与甲骨文Java 8和相同的错误出现。从属节点上的容器日志具有相同的错误,如下所示。
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/tmp/hadoop-hd_spark/nm-local-dir/usercache/hd_spark/filecache/17/__spark_libs__8247267244939901627.zip/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/08/17 14:05:11 INFO executor.CoarseGrainedExecutorBackend: Started daemon with process name: [email protected]
16/08/17 14:05:11 INFO util.SignalUtils: Registered signal handler for TERM
16/08/17 14:05:11 INFO util.SignalUtils: Registered signal handler for HUP
16/08/17 14:05:11 INFO util.SignalUtils: Registered signal handler for INT
16/08/17 14:05:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/17 14:05:11 INFO spark.SecurityManager: Changing view acls to: hd_spark
16/08/17 14:05:11 INFO spark.SecurityManager: Changing modify acls to: hd_spark
16/08/17 14:05:11 INFO spark.SecurityManager: Changing view acls groups to:
16/08/17 14:05:11 INFO spark.SecurityManager: Changing modify acls groups to:
16/08/17 14:05:11 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hd_spark); groups with view permissions: Set(); users with modify permissions: Set(hd_spark); groups with modify permissions: Set()
16/08/17 14:05:12 INFO client.TransportClientFactory: Successfully created connection to /xxx.xxx.xxx.xx:37417 after 78 ms (0 ms spent in bootstraps)
16/08/17 14:05:12 INFO spark.SecurityManager: Changing view acls to: hd_spark
16/08/17 14:05:12 INFO spark.SecurityManager: Changing modify acls to: hd_spark
16/08/17 14:05:12 INFO spark.SecurityManager: Changing view acls groups to:
16/08/17 14:05:12 INFO spark.SecurityManager: Changing modify acls groups to:
16/08/17 14:05:12 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hd_spark); groups with view permissions: Set(); users with modify permissions: Set(hd_spark); groups with modify permissions: Set()
16/08/17 14:05:12 INFO client.TransportClientFactory: Successfully created connection to /xxx.xxx.xxx.xx:37417 after 1 ms (0 ms spent in bootstraps)
16/08/17 14:05:12 INFO storage.DiskBlockManager: Created local directory at /tmp/hadoop-hd_spark/nm-local-dir/usercache/hd_spark/appcache/application_1471352972661_0005/blockmgr-d9f23a56-1420-4cd4-abfd-ae9e128c688c
16/08/17 14:05:12 INFO memory.MemoryStore: MemoryStore started with capacity 366.3 MB
16/08/17 14:05:12 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: spark://[email protected]:37417
16/08/17 14:05:13 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
16/08/17 14:05:13 INFO storage.DiskBlockManager: Shutdown hook called
16/08/17 14:05:13 INFO util.ShutdownHookManager: Shutdown hook called
我试图与火花1.6.2预建版本,2.0火花预建版本,并且还通过建立它自己试着用火花2.0。
即使在升级到java 8之后,Hadoop作业仍能正常工作。当我切换回到java 7时,spark工作正常。
我的scala版本是2.11,OS是Ubuntu 14.04.4 LTS。
如果有人能给我一个解决这个问题的想法,这将是非常好的。
谢谢!
ps我在日志中将我的IP地址更改为xxx.xxx.xxx.xx。
貌似工作人员的下列属性克服这种尝试连接到驱动程序,但失败:'16/08/17 14:05:12 INFO executor.CoarseGrainedExecutorBackend:连接到驱动程序:spark://[email protected]:37417 16/08/17 14:05:13 ERROR executor.CoarseGrainedExecutorBackend:RECEIVED SIGNAL TERM'。司机日志说什么? –
我在哪里可以找到驱动程序日志?我在hadoop/logs/userlog目录中发现了工作节点日志,但在主节点中找不到与驱动程序相关的任何日志。在spark/logs目录中,只有历史服务器日志和主节点中的hadoop/logs/userlog为空。谢谢! – jmoa
http://spark.apache.org/docs/latest/running-on-yarn.html –