2014-11-21 191 views
1
2014-11-21 19:05:37,532 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://hadoop-master.nycloudlab.internal:8020/user/admin/.staging/job_1415362431963_0311/libjars/hbase-hadoop-compat.jar(->/yarn/nm/usercache/admin/filecache/1513/hbase-hadoop-compat.jar) transitioned from INIT to LOCALIZED 
2014-11-21 19:05:37,542 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Recovering application application_1415362431963_0302 
2014-11-21 19:05:37,554 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1415362431963_0302 transitioned from NEW to INITING 
2014-11-21 19:05:37,578 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl failed in state INITED; cause: java.lang.NullPointerException 
java.lang.NullPointerException 
     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverContainer(ContainerManagerImpl.java:289) 
     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:252) 
     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:235) 
     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) 
     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) 
     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:250) 
     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) 
     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:445) 
     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:492) 
2014-11-21 19:05:37,588 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Applications still running : [application_1415362431963_0302] 
2014-11-21 19:05:37,588 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService: org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService waiting for pending aggregation during exit 
2014-11-21 19:05:37,589 INFO org.apache.hadoop.service.AbstractService: Service NodeManager failed in state INITED; cause: java.lang.NullPointerException 
java.lang.NullPointerException 
     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverContainer(ContainerManagerImpl.java:289) 
     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:252) 
     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:235) 
     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) 
     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) 
     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:250) 
     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) 
     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:445) 
     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:492) 
2014-11-21 19:05:37,590 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NodeManager metrics system... 
2014-11-21 19:05:37,591 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system stopped. 
2014-11-21 19:05:37,591 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete. 
2014-11-21 19:05:37,591 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager 
java.lang.NullPointerException 
     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverContainer(ContainerManagerImpl.java:289) 
     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:252) 
     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:235) 
     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) 
     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) 
     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:250) 
     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) 
     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:445) 
     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:492) 
2014-11-21 19:05:37,593 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG: 
+1

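Found that this has been reported as https://issues.apache.org/jira/browse/YARN-2816. Deleting /tmp/hadoop-yarn/yarn-nm-recovery fixed the problem. From the JIRA issue: LevelDB never writes in place: it always appends to a log file, or merges existing files together to produce new files. An OS crash will therefore leave a partially written log record (or a few partially written log records). The LevelDB recovery code uses checksums to detect this and skips the incomplete records. – 2014-11-21 17:22:10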
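To illustrate the idea quoted from the JIRA issue, here is a minimal sketch of an append-only log whose records carry a checksum, so that a partially written tail record can be detected on recovery. This is not LevelDB's actual on-disk format or code: the record layout (`HEADER`) and the helpers `append_record` and `recover` are hypothetical, and this simplified version stops at the first bad record rather than scanning past it.

```python
import struct
import zlib
from typing import List

# Hypothetical record layout: 4-byte CRC32 of the payload, then 4-byte payload length.
HEADER = struct.Struct("<II")


def append_record(log: bytearray, payload: bytes) -> None:
    """Append one record; earlier bytes in the log are never modified."""
    log += HEADER.pack(zlib.crc32(payload), len(payload)) + payload


def recover(log: bytes) -> List[bytes]:
    """Return every intact record, dropping a truncated or corrupt tail."""
    records = []
    pos = 0
    while pos + HEADER.size <= len(log):
        crc, length = HEADER.unpack_from(log, pos)
        start = pos + HEADER.size
        payload = bytes(log[start:start + length])
        # A crash mid-write leaves a short payload or a checksum mismatch.
        if len(payload) < length or zlib.crc32(payload) != crc:
            break
        records.append(payload)
        pos = start + length
    return records
```

With two records appended and the last byte cut off to simulate a crash, `recover` returns only the first record.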

Answers

2

Fixed this by deleting /tmp/hadoop-yarn/yarn-nm-recovery. LevelDB never writes in place; it always appends to a log file.
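If you would rather keep the old state for later inspection than delete it outright, a small sketch like the following moves the recovery directory aside. The helper name `move_aside` and the `.bak-<timestamp>` suffix are made up for illustration; the path is the one from this answer and may differ on your cluster (it is normally controlled by `yarn.nodemanager.recovery.dir`). Presumably this should only be run while the NodeManager is stopped, and any container state held in the store will not be recovered afterwards.

```python
import shutil
import time
from pathlib import Path


def move_aside(recovery_dir: str) -> str:
    """Rename the recovery directory to a timestamped backup and return the new path."""
    src = Path(recovery_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"no such directory: {src}")
    # Hypothetical naming scheme: <name>.bak-<unix time>, next to the original.
    dest = src.with_name(f"{src.name}.bak-{int(time.time())}")
    shutil.move(str(src), str(dest))
    return str(dest)


if __name__ == "__main__":
    # Path taken from the answer above; adjust for your own configuration.
    print(move_aside("/tmp/hadoop-yarn/yarn-nm-recovery"))
```

After the NodeManager has started cleanly, the backup directory can be deleted.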
