Hadoop on Azure: can I use different Blob storage containers for I/O?

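I am currently working on a project to build a big data architecture in Azure. To understand how Azure works, I created a Data Factory and a Blob storage account, and set up a pipeline that runs a word count Hadoop job on an on-demand HDInsight cluster.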

This is the pipeline JSON file:

{ 
"name": "MRSamplePipeline5", 
    "properties": { 
     "description": "Sample Pipeline to Run the Word Count Program", 
     "activities": [ 
      { 
       "type": "HDInsightMapReduce", 
       "typeProperties": { 
        "className": "wordcount", 
        "jarFilePath": "executables/hadoop-example.jar", 
        "jarLinkedService": "AzureStorageLinkedService", 
        "arguments": [ 
         "/davinci.txt", 
         "/WordCountOutput1" 
        ] 
       }, 
       "outputs": [ 
        { 
         "name": "MROutput4" 
        } 
       ], 
       "policy": { 
        "timeout": "01:00:00", 
        "concurrency": 1, 
        "retry": 3 
       }, 
       "scheduler": { 
        "frequency": "Minute", 
        "interval": 15 
       }, 
       "name": "MRActivity", 
       "linkedServiceName": "HDInsightOnDemandLinkedService" 
      } 
     ], 
     "start": "2017-07-24T00:00:00Z", 
     "end": "2017-07-24T00:00:00Z", 
     "isPaused": false, 
     "hubName": "testazuredatafact_hub", 
     "pipelineMode": "OneTime", 
     "expirationTime": "3.00:00:00" 
    } 
}

It works, and the output is a file named "WordCountOutput1/part-r-00000".

My question is: how can I place the input file (davinci.txt) and the output (WordCountOutput1) in different containers (for example "exampledata") of my Blob storage?


2017-07-24 Markus Appel

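Hadoop file paths can be specified in full URI syntax, including a scheme and an authority, to point at different kinds of file systems (e.g. HDFS vs. Azure vs. S3) and, in this particular case, at different Azure Storage containers. The relevant scheme for Azure Storage access is "wasb". The authority contains the container and the account. For example, consider the following hadoop fs -ls commands.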

# WASB backed by container "test" in Azure Storage account "cnauroth"
hadoop fs -ls wasb://test@cnauroth.blob.core.windows.net/users/cnauroth

# WASB backed by container "qa" in Azure Storage account "cnauroth"
hadoop fs -ls wasb://qa@cnauroth.blob.core.windows.net/users/cnauroth

# WASB backed by container "production" in Azure Storage account "cnauroth-live"
hadoop fs -ls wasb://production@cnauroth-live.blob.core.windows.net/users/cnauroth

Each of these commands, executed from the same client host, lists a different Azure Storage account/container.

You can use the same URI syntax in the arguments passed to your job submission.
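
Applied to the pipeline in the question, that would probably mean replacing the two relative paths in "arguments" with full wasb URIs. The fragment below is only a sketch of the "typeProperties" object: "exampledata" is the container named in the question, while "outputdata" and "<your-storage-account>" are hypothetical placeholders to substitute with real names, and it assumes both containers live in a storage account the on-demand cluster already has credentials for.

```json
{
  "className": "wordcount",
  "jarFilePath": "executables/hadoop-example.jar",
  "jarLinkedService": "AzureStorageLinkedService",
  "arguments": [
    "wasb://exampledata@<your-storage-account>.blob.core.windows.net/davinci.txt",
    "wasb://outputdata@<your-storage-account>.blob.core.windows.net/WordCountOutput1"
  ]
}
```

The authority part before the "@" selects the container and the part after it selects the account, so input and output can point at different containers, or even different accounts, independently of each other.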


2017-07-25 18:47:54
