2016-05-30
1

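Azure Data Factory: copying recursively from a container

Hi, I'm using Azure Data Factory for a copy activity. I want to copy recursively across a container and its subfolders, which are laid out like this: MyFolder/Year/Month/Day/Hour/New_Generated_File.csv

The files I generate and import into that folder always have a different name.

The problem is that the activity seems to wait forever.

The pipeline is scheduled hourly.

I've attached the JSON for the dataset and the linked service.

Dataset: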

{ 
"name": "Txns_In_Blob", 
"properties": { 
    "structure": [ 
     { 
      "name": "Column0", 
      "type": "String" 
     }, 
     [....Other Columns....] 
    ], 
    "published": false, 
    "type": "AzureBlob", 
    "linkedServiceName": "LinkedService_To_Blob", 
    "typeProperties": { 
     "folderPath": "uploadtransactional/yearno={Year}/monthno={Month}/dayno={Day}/hourno={Hour}/{Custom}.csv", 
     "format": { 
      "type": "TextFormat", 
      "rowDelimiter": "\n", 
      "columnDelimiter": " " 
     } 
    }, 
    "availability": { 
     "frequency": "Hour", 
     "interval": 1 
    }, 
    "external": true, 
    "policy": {} 
} 

}

Linked service:

{ 
"name": "LinkedService_To_Blob", 
"properties": { 
    "description": "", 
    "hubName": "dataorchestrationsystem_hub", 
    "type": "AzureStorage", 
    "typeProperties": { 
     "connectionString": "DefaultEndpointsProtocol=https;AccountName=wizestorage;AccountKey=**********" 
    } 
} 

}

Answer

2

Don't put a file name in the dataset's folderPath property. Just remove the file name, and Data Factory will load all the files in that folder for you.

{ 
    "name": "Txns_In_Blob", 
    "properties": { 
    "structure": [ 
     { 
      "name": "Column0", 
      "type": "String" 
     }, 
     [....Other Columns....] 
    ], 
    "published": false, 
    "type": "AzureBlob", 
    "linkedServiceName": "LinkedService_To_Blob", 
    "typeProperties": { 
     "folderPath": "uploadtransactional/yearno={Year}/monthno={Month}/dayno={Day}/hourno={Hour}/", 
     "partitionedBy": [ 
      { "name": "Year", "value": { "type": "DateTime", "date": "SliceStart", "format": "yyyy" } }, 
      { "name": "Month", "value": { "type": "DateTime", "date": "SliceStart", "format": "%M" } }, 
      { "name": "Day", "value": { "type": "DateTime", "date": "SliceStart", "format": "%d" } }, 
      { "name": "Hour", "value": { "type": "DateTime", "date": "SliceStart", "format": "hh" } } 
     ], 
     "format": { 
      "type": "TextFormat", 
      "rowDelimiter": "\n", 
      "columnDelimiter": " " 
     } 
    }, 
    "availability": { 
     "frequency": "Hour", 
     "interval": 1 
    }, 
    "external": true, 
    "policy": {} 
    }
}

With the folderPath above, the pipeline fills in the placeholders at run time from the slice start time in UTC, giving a path such as uploadtransactional/yearno=2016/monthno=05/dayno=30/hourno=07/
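To make the substitution concrete, here is a minimal Python sketch of how the placeholders in folderPath could be filled from partitionedBy. It is not Data Factory code: the function resolve_folder_path and the FORMATTERS table are hypothetical, and the sketch assumes the format strings follow .NET custom date/time semantics. Under that assumption "%M" and "%d" are not zero-padded and "hh" is a 12-hour clock, so if the real folders look like monthno=05 or use 24-hour hours, "MM", "dd" and "HH" may be the formats to use. Check this against the actual blob names.

```python
import re
from datetime import datetime, timezone

# Assumption: format strings behave like .NET custom date/time formats.
# Only the handful of formats relevant to this thread are modelled.
FORMATTERS = {
    "yyyy": lambda t: "{:04d}".format(t.year),
    "%M": lambda t: str(t.month),                        # month, no padding
    "MM": lambda t: "{:02d}".format(t.month),            # month, zero-padded
    "%d": lambda t: str(t.day),                          # day, no padding
    "dd": lambda t: "{:02d}".format(t.day),              # day, zero-padded
    "hh": lambda t: "{:02d}".format(t.hour % 12 or 12),  # 12-hour clock
    "HH": lambda t: "{:02d}".format(t.hour),             # 24-hour clock
}


def resolve_folder_path(folder_path, partitioned_by, slice_start):
    """Replace {Name} placeholders using the partitionedBy definitions.

    Every entry is formatted from slice_start (the sketch ignores the
    "date" field). Placeholders without a definition are left as-is.
    """
    values = {
        part["name"]: FORMATTERS[part["value"]["format"]](slice_start)
        for part in partitioned_by
    }
    return re.sub(
        r"\{(\w+)\}",
        lambda m: values.get(m.group(1), m.group(0)),
        folder_path,
    )


FOLDER_PATH = (
    "uploadtransactional/yearno={Year}/monthno={Month}/"
    "dayno={Day}/hourno={Hour}/"
)
PARTITIONED_BY = [
    {"name": "Year", "value": {"type": "DateTime", "date": "SliceStart", "format": "yyyy"}},
    {"name": "Month", "value": {"type": "DateTime", "date": "SliceStart", "format": "%M"}},
    {"name": "Day", "value": {"type": "DateTime", "date": "SliceStart", "format": "%d"}},
    {"name": "Hour", "value": {"type": "DateTime", "date": "SliceStart", "format": "hh"}},
]

if __name__ == "__main__":
    start = datetime(2016, 5, 30, 7, 0, tzinfo=timezone.utc)
    print(resolve_folder_path(FOLDER_PATH, PARTITIONED_BY, start))
```

Passing an empty partitionedBy list returns the path with the literal {Year}, {Month}, {Day} and {Hour} still in it.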

+0

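Thanks for your answer. What you say is correct, and I have tried that too, but why is the pipeline still waiting forever? I checked, and the "external" parameter is set to false. I get this error: Blob https://wizestorage.blob.core.windows.net/upl/yearno={Year}/monthno={Month}/dayno={Day}/hourno={Hour}/ does not exist. It seems the placeholders are not being resolved to the actual path.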

+0

Did you define the logic that assigns values to '{Hour}', '{Year}' and so on? – Sandesh
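A quick way to answer that question for a dataset definition is to compare the placeholders used in folderPath with the names declared under partitionedBy. The helper below is a hypothetical sketch that works on the dataset JSON loaded as a Python dict; it is not part of any Data Factory SDK.

```python
import re


def undefined_placeholders(dataset):
    """Return folderPath placeholders that have no partitionedBy entry."""
    type_props = dataset["properties"]["typeProperties"]
    used = re.findall(r"\{(\w+)\}", type_props.get("folderPath", ""))
    defined = {part["name"] for part in type_props.get("partitionedBy", [])}
    return [name for name in used if name not in defined]


# The dataset from the question: placeholders but no partitionedBy section.
question_dataset = {
    "name": "Txns_In_Blob",
    "properties": {
        "typeProperties": {
            "folderPath": "uploadtransactional/yearno={Year}/monthno={Month}"
                          "/dayno={Day}/hourno={Hour}/{Custom}.csv",
        },
    },
}

if __name__ == "__main__":
    print(undefined_placeholders(question_dataset))
```

For the dataset in the question every placeholder is reported, which matches the error message showing the literal {Year}, {Month}, {Day} and {Hour} in the blob URL.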

+0

If the dataset is marked as external, the pipeline executes immediately. If external is set to false, it keeps waiting until some other pipeline generates the dataset. – Sandesh
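The rule in this comment can be turned into a small check: a dataset that is not external and is not the output of any activity has nothing to produce it, so a pipeline reading it would keep waiting. The sketch below is a hypothetical helper over definitions loaded as Python dicts. It assumes pipelines list their activities under properties.activities with an outputs array of dataset names. The names CopyPipeline and Txns_Out are made up for illustration.

```python
def datasets_that_will_wait(datasets, pipelines):
    """Names of datasets that are neither external nor produced by an activity."""
    produced = {
        output["name"]
        for pipeline in pipelines
        for activity in pipeline["properties"]["activities"]
        for output in activity.get("outputs", [])
    }
    return [
        dataset["name"]
        for dataset in datasets
        if not dataset["properties"].get("external", False)
        and dataset["name"] not in produced
    ]


# Hypothetical setup: the input blob dataset with external set to false.
datasets = [
    {"name": "Txns_In_Blob", "properties": {"external": False}},
    {"name": "Txns_Out", "properties": {"external": False}},
]
pipelines = [
    {
        "name": "CopyPipeline",
        "properties": {
            "activities": [
                {
                    "inputs": [{"name": "Txns_In_Blob"}],
                    "outputs": [{"name": "Txns_Out"}],
                }
            ]
        },
    }
]

if __name__ == "__main__":
    print(datasets_that_will_wait(datasets, pipelines))
```

With external set to false on Txns_In_Blob the check reports it, because no activity writes to it. Setting external to true, as in the dataset JSON above, clears the report.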