2013-03-08

Java/Scala remote HDFS usage

I am trying to connect to a remote HDFS cluster. I've read some documentation and gotten started, but I haven't found the best solution. The situation: I have HDFS on xxx-something.com. I can connect to it over SSH and everything works.

What I'm trying to do, though, is fetch files from it to my local machine.

What I have done:

I created a core-site.xml in my conf folder (I'm building a Play application!). In it I changed the fs.default.name setting to hdfs://xxx-something.com:8020 (not sure about the port). Then I tried to run a simple test:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val conf = new Configuration() 
conf.addResource(new Path("conf/core-site.xml")) 
val fs = FileSystem.get(conf) 
val status = fs.listStatus(new Path("/data/")) 
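(As an aside, the core-site.xml mentioned above would look roughly like the sketch below; the host and port are placeholders, and the real port has to match whatever the namenode is actually listening on.)

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://xxx-something.com:8020</value>
  </property>
</configuration>
```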

And I get this error:

13:56:09.012 [specs2.DefaultExecutionStrategy1] WARN org.apache.hadoop.conf.Configuration - conf/core-site.xml:a attempt to override final parameter: fs.trash.interval; Ignoring. 
13:56:09.012 [specs2.DefaultExecutionStrategy1] WARN org.apache.hadoop.conf.Configuration - conf/core-site.xml:a attempt to override final parameter: hadoop.tmp.dir; Ignoring. 
13:56:09.013 [specs2.DefaultExecutionStrategy1] WARN org.apache.hadoop.conf.Configuration - conf/core-site.xml:a attempt to override final parameter: fs.checkpoint.dir; Ignoring. 
13:56:09.022 [specs2.DefaultExecutionStrategy1] DEBUG org.apache.hadoop.fs.FileSystem - Creating filesystem for hdfs://xxx-something.com:8021 
13:56:09.059 [specs2.DefaultExecutionStrategy1] DEBUG org.apache.hadoop.conf.Configuration - java.io.IOException: config() 
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:226) 
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:213) 
    at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:53) 
    at org.apache.hadoop.net.NetUtils.<clinit>(NetUtils.java:62) 

Thanks in advance!

Update: probably the wrong port. Now I've set it to 22; I still get the same error, but after three attempts it says:

14:01:01.877 [specs2.DefaultExecutionStrategy1] DEBUG org.apache.hadoop.ipc.Client - Connecting to xxx-something.com/someIp:22 
14:01:02.187 [specs2.DefaultExecutionStrategy1] DEBUG org.apache.hadoop.ipc.Client - IPC Client (47) connection to xxx-something.com/someIp:22 from britva sending #0 
14:01:02.188 [IPC Client (47) connection to xxx-something.com/someIp:22 from britva] DEBUG org.apache.hadoop.ipc.Client - IPC Client (47) connection to xxx-something.com/someIp:22 from britva: starting, having connections 1 
14:01:02.422 [IPC Client (47) connection to xxx-something.com/someIp:22 from britva] DEBUG org.apache.hadoop.ipc.Client - IPC Client (47) connection to xxx-something.com/someIp:22 from britva got value #1397966893 

And finally:

Call to xxx-something.com/someIp:22 failed on local exception: java.io.EOFException 
java.io.IOException: Call to xxx-something.com/someIp:22 failed on local exception: java.io.EOFException 
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1071) 
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225) 
    at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source) 
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396) 
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379) 
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:118) 
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:222) 
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:187) 
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89) 
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1328) 
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:65) 
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1346) 
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:244) 
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:122) 
    at HdfsSpec$$anonfun$1$$anonfun$apply$3.apply(HdfsSpec.scala:33) 
    at HdfsSpec$$anonfun$1$$anonfun$apply$3.apply(HdfsSpec.scala:17) 
    at testingSupport.specs2.MyNotifierRunner$$anon$2$$anon$1.executeBody(MyNotifierRunner.scala:16) 
    at testingSupport.specs2.MyNotifierRunner$$anon$2$$anon$1.execute(MyNotifierRunner.scala:16) 
Caused by: java.io.EOFException 
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:807) 
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:745) 

What does this mean?

Answer


You need to look at the fs.default.name property in $HADOOP_HOME/conf/core-site.xml on the server running the namenode (the HDFS master) to find the correct port. It might be 8020, or it might be something else. That is the port you should use. Also make sure there is no firewall between you and the server blocking connections on that port.
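(Note that port 22 from the update is the SSH port; the Hadoop IPC client speaking its own wire protocol to an SSH daemon would likely explain the EOFException.) Once you have a copy of the server's core-site.xml, you can pull the configured address out of it without Hadoop on the classpath. A minimal sketch using only the standard JDK XML APIs (the class name and sample config text here are hypothetical, just standing in for the real file):

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;
import java.net.URI;

public class FindNameNodePort {
    // Return the value of a named property from Hadoop-style XML config text,
    // or null if the property is not present.
    static String property(String xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String n = p.getElementsByTagName("name").item(0).getTextContent().trim();
            if (n.equals(name)) {
                return p.getElementsByTagName("value").item(0).getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Sample text standing in for $HADOOP_HOME/conf/core-site.xml on the server.
        String xml = "<configuration><property>"
                + "<name>fs.default.name</name>"
                + "<value>hdfs://xxx-something.com:8020</value>"
                + "</property></configuration>";
        String value = property(xml, "fs.default.name");
        int port = URI.create(value).getPort();  // the port the client must use
        System.out.println(value + " -> port " + port);
    }
}
```

With the real file you would read its contents from disk instead of the inline string; whatever port comes back is the one to put in the client's fs.default.name.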