
I am getting the following error while running a U-SQL activity in an Azure Data Factory (ADF) pipeline:

Error:

{"errorId":"E_CSC_USER_SYNTAXERROR","severity":"Error","component":"CSC", 
    "source":"USER","message":"syntax error. 
    Final statement did not end with a semicolon","details":"at token 'txt', line 3\r\nnear the ###:\r\n**************\r\nDECLARE @in string = \"/demo/SearchLog.txt\";\nDECLARE @out string = \"/scripts/Result.txt\";\nSearchLogProcessing.txt ### \n", 
    "description":"Invalid syntax found in the script.", 
    "resolution":"Correct the script syntax, using expected token(s) as a guide.","helpLink":"","filePath":"","lineNumber":3, 
    "startOffset":109,"endOffset":112}]. 

Here are the output dataset, the pipeline, and the U-SQL script that I am trying to execute in the pipeline.

OutputDataset:

{ 
"name": "OutputDataLakeTable", 
"properties": { 
    "published": false, 
    "type": "AzureDataLakeStore", 
    "linkedServiceName": "LinkedServiceDestination", 
    "typeProperties": { 
     "folderPath": "scripts/" 
    }, 
    "availability": { 
     "frequency": "Hour", 
     "interval": 1 
    } 
    }
}

Pipeline:

{ 
    "name": "ComputeEventsByRegionPipeline", 
    "properties": { 
     "description": "This is a pipeline to compute events for en-gb locale and date less than 2012/02/19.", 
     "activities": [ 
      { 
       "type": "DataLakeAnalyticsU-SQL", 
       "typeProperties": { 
        "script": "SearchLogProcessing.txt", 
        "scriptPath": "scripts\\", 
        "degreeOfParallelism": 3, 
        "priority": 100, 
        "parameters": { 
         "in": "/demo/SearchLog.txt", 
         "out": "/scripts/Result.txt" 
        } 
       }, 
       "inputs": [ 
        { 
         "name": "InputDataLakeTable" 
        } 
       ], 
       "outputs": [ 
        { 
         "name": "OutputDataLakeTable" 
        } 
       ], 
       "policy": { 
        "timeout": "06:00:00", 
        "concurrency": 1, 
        "executionPriorityOrder": "NewestFirst", 
        "retry": 1 
       }, 
       "scheduler": { 
        "frequency": "Minute", 
        "interval": 15 
       }, 
       "name": "CopybyU-SQL", 
       "linkedServiceName": "AzureDataLakeAnalyticsLinkedService" 
      } 
     ], 
     "start": "2017-01-03T12:01:05.53Z", 
     "end": "2017-01-03T13:01:05.53Z", 
     "isPaused": false, 
     "hubName": "denojaidbfactory_hub", 
     "pipelineMode": "Scheduled" 
    } 
} 

Here is the U-SQL script that I am trying to execute with the "DataLakeAnalyticsU-SQL" activity type.

@searchlog =
    EXTRACT UserId      int,
            Start       DateTime,
            Region      string,
            Query       string,
            Duration    int?,
            Urls        string,
            ClickedUrls string
    FROM @in
    USING Extractors.Text(delimiter:'|');

@rs1 =
    SELECT Start, Region, Duration
    FROM @searchlog
    WHERE Region == "kota";

OUTPUT @rs1
    TO @out
    USING Outputters.Text(delimiter:'|');

Please suggest how I can resolve this issue.

Answers


Your script is missing the scriptLinkedService attribute. You also (currently) need to place the U-SQL script in Azure Blob Storage for it to run successfully, so you will also need an AzureStorage linked service, for example:

{ 
    "name": "StorageLinkedService", 
    "properties": { 
     "description": "", 
     "type": "AzureStorage", 
     "typeProperties": { 
      "connectionString": "DefaultEndpointsProtocol=https;AccountName=myAzureBlobStorageAccount;AccountKey=**********" 
     } 
    } 
} 

Create that linked service, replacing the Blob Storage account name myAzureBlobStorageAccount with your own, then put the U-SQL script (SearchLogProcessing.txt) in a container there and try again. In my example pipeline below, my Blob Storage has a container called adlascripts and the script is in it:

Make sure the scriptPath is complete, as Alexandre mentioned. The start of the pipeline:

{ 
    "name": "ComputeEventsByRegionPipeline", 
    "properties": { 
     "description": "This is a pipeline to compute events for en-gb locale and date less than 2012/02/19.", 
     "activities": [ 
      { 
       "type": "DataLakeAnalyticsU-SQL", 
       "typeProperties": { 
        "scriptPath": "adlascripts\\SearchLogProcessing.txt", 
        "scriptLinkedService": "StorageLinkedService", 
        "degreeOfParallelism": 3, 
        "priority": 100, 
        "parameters": { 
         "in": "/input/SearchLog.tsv", 
         "out": "/output/Result.tsv" 
        } 
       }, 
... 

The input and output .tsv files can live in the Data Lake and use the AzureDataLakeStoreLinkedService linked service.
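For completeness, an input dataset for the pipeline above might look roughly like the sketch below. The AzureDataLakeStoreLinkedService name comes from this answer and InputDataLakeTable from the question's pipeline; the folder and file names are placeholders matching the parameters above, so adjust them to your own account:

{
    "name": "InputDataLakeTable",
    "properties": {
        "type": "AzureDataLakeStore",
        "linkedServiceName": "AzureDataLakeStoreLinkedService",
        "typeProperties": {
            "folderPath": "input/",
            "fileName": "SearchLog.tsv"
        },
        "external": true,
        "availability": {
            "frequency": "Hour",
            "interval": 1
        }
    }
}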

I can see you are trying to follow the demo at https://docs.microsoft.com/en-us/azure/data-factory/data-factory-usql-activity#script-definition. It is not the most intuitive demo and it seems to have a few issues, like: where is the definition of StorageLinkedService? Where is SearchLogProcessing.txt? OK, I found it by googling, but there should be a link on the page. I got it working, but it felt a bit like Harry Potter and the Half-Blood Prince.


Thanks wBob. It worked fine for me. However, we can only use Azure Storage for the script linked service, not Azure Data Lake Store. – Jai


Yes, when I tried to store the U-SQL script in ADLS I got a very specific (and very helpful) error: "scriptLinkedService 'AzureDataStoreLinkedService' is not supported. Currently scriptLinkedService only accepts an Azure Storage linked service. Please use an Azure Storage linked service and put your script in blob instead." – wBob


Great investigation, Harry Bob :) –


Remove the script attribute from your U-SQL activity definition and provide the full path to your script (including the file name) in the scriptPath attribute.

Reference: https://docs.microsoft.com/en-us/azure/data-factory/data-factory-usql-activity
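In other words, the activity's typeProperties block would look roughly like this (a sketch reusing the container and paths from the question, with no script property):

"typeProperties": {
    "scriptPath": "scripts/SearchLogProcessing.txt",
    "degreeOfParallelism": 3,
    "priority": 100,
    "parameters": {
        "in": "/demo/SearchLog.txt",
        "out": "/scripts/Result.txt"
    }
},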


If I don't use the script attribute, it gives me an error saying the U-SQL script could not be found and to use either script or scriptLinkedService. So I have to use the script attribute there. – Jai


I have used "scripts\\SearchLogProcessing.txt", and the file is present in ADLS. Still, I get the error that the U-SQL script could not be found. – Jai


Use 'scripts/SearchLogProcessing.txt' (forward slashes) –


I had a similar issue where Azure Data Factory did not recognize my script files. One way to avoid the whole problem, without having to paste in a lot of code, is to register a stored procedure. You can do it like this:

DROP PROCEDURE IF EXISTS master.dbo.sp_test;
CREATE PROCEDURE master.dbo.sp_test(@in string, @out string)
AS
BEGIN

// Extract the pipe-delimited search log from the input path.
@searchlog =
    EXTRACT UserId      int,
            Start       DateTime,
            Region      string,
            Query       string,
            Duration    int?,
            Urls        string,
            ClickedUrls string
    FROM @in
    USING Extractors.Text(delimiter:'|');

// Keep only the rows for the "kota" region.
@rs1 =
    SELECT Start, Region, Duration
    FROM @searchlog
    WHERE Region == "kota";

// Write the result, pipe-delimited, to the output path.
OUTPUT @rs1
    TO @out
    USING Outputters.Text(delimiter:'|');

END;

After running this, you can call the procedure from your JSON pipeline definition:

"script": "master.dbo.sp_test()" 

Whenever you update the U-SQL script, simply re-run the procedure definition. Then there is no need to copy the script file to Blob Storage.
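The corresponding activity typeProperties could then look roughly like this (a sketch assuming the parameterized procedure above, with the parameter values taken from the question; ADF turns the parameters block into DECLARE statements for @in and @out):

"typeProperties": {
    "script": "master.dbo.sp_test(@in, @out);",
    "degreeOfParallelism": 3,
    "priority": 100,
    "parameters": {
        "in": "/demo/SearchLog.txt",
        "out": "/scripts/Result.txt"
    }
},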
