2016-08-17 119 views

U-SQL: reading data from a multi-dimensional JSON array

I have a JSON file in a data lake that looks like this:

{ 
"id":"398507", 
"contenttype":"POST", 
"posttype":"post", 
"uri":"http://twitter.com/etc", 
"title":null, 
"profile":{ 
    "@class":"PublisherV2_0", 
    "name":"Company", 
    "id":"2163171", 
    "profileIcon":"https://pbs.twimg.com/image", 
    "profileLocation":{ 
     "@class":"DocumentLocation", 
     "locality":"Toronto", 
     "adminDistrict":"ON", 
     "countryRegion":"Canada", 
     "coordinates":{ 
     "latitude":43.7217, 
     "longitude":-31.432}, 
     "quadKey":"000000000000000"}, 
     "displayName":"Name", 
     "externalId":"00000000000"}, 
    "source":{ 
     "name":"blogs", 
     "id":"18", 
     "param":"Twitter"}, 
    "content":{ 
     "text":"Description of post"}, 
     "language":{ 
      "name":"English", 
      "code":"en"}, 
     "abstracttext":"More Text and links", 
     "score":{} 
    } 
} 

To bring the data into my application, I convert the JSON to a string using this code:

DECLARE @input string = @"/MSEStream/{*}.json"; 

REFERENCE ASSEMBLY [Newtonsoft.Json]; 
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats]; 


@allposts = 
EXTRACT 
    jsonString string 
FROM @input 
USING Extractors.Text(delimiter:'\b', quoting:true); 

@extractedrows = SELECT Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(jsonString) AS er FROM @allposts; 


@result = 
SELECT er["id"] AS postID, 
     er["contenttype"] AS contentType, 
     er["posttype"] AS postType, 
     er["uri"] AS uri, 
     er["title"] AS Title, 
     er["acquisitiondate"] AS acquisitionDate, 
     er["modificationdate"] AS modificationDate, 
     er["publicationdate"] AS publicationDate, 
     er["profile"] AS profile 
FROM @extractedrows; 

OUTPUT @result 
TO "/ProcessedQueries/all_posts.csv" 
USING Outputters.Csv(); 

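This outputs the JSON to a .csv file that is readable, and when I download the file all the data displays correctly. My problem comes when I need to get at the data inside profile. Because the JSON is now a string, I can't seem to extract any of that data and put it into a variable I can use. Is there a way to do this, or do I need to look at other options for reading the data?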

Answers


You can use JsonTuple on the profile string to further extract the specific properties you need. A U-SQL code sample for processing nested JSON is available at https://github.com/Azure/usql/blob/master/Examples/JsonSample/JsonSample/NestedJsonParsing.usql

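You can use JsonTuple on the profile column to further extract specific nodes.

For example, use JsonTuple to get all child nodes of the profile node, then extract specific values the same way you did in your own code.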

@childnodesofprofile = 
SELECT 
    Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(profile) AS childnodes_map 
FROM @result; 

@values = 
SELECT 
    childnodes_map["name"] AS name, 
    childnodes_map["id"] AS id 
FROM @childnodesofprofile;
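To make the two-pass idea concrete, here is a small Python analogy (not U-SQL). The helper `json_tuple` is hypothetical and only approximates the behavior described above: it maps each top-level key to a string and leaves nested objects as JSON text, which is why a second call on the profile value is needed. How the real JsonTuple treats nulls and non-string scalars is an assumption here.

```python
import json

def json_tuple(json_string):
    """Hypothetical stand-in for JsonTuple(json): top-level key -> string value."""
    parsed = json.loads(json_string)
    result = {}
    for key, value in parsed.items():
        if isinstance(value, (dict, list)):
            # Nested containers stay serialized, so they need another pass.
            result[key] = json.dumps(value)
        elif value is None:
            # Assumption: nulls are kept as None rather than the text "null".
            result[key] = None
        else:
            result[key] = str(value)
    return result

# Trimmed-down version of the post from the question.
post = (
    '{"id": "398507", "title": null, '
    '"profile": {"name": "Company", "id": "2163171", '
    '"profileLocation": {"locality": "Toronto"}}}'
)

er = json_tuple(post)                        # first pass: top-level fields
childnodes_map = json_tuple(er["profile"])   # second pass: fields inside profile

print(er["id"])                # 398507
print(childnodes_map["name"])  # Company
print(childnodes_map["id"])    # 2163171
```

After the first pass, `er["profile"]` is still a string; only the second pass exposes `name` and `id` as separate values.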

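Alternatively, if you are only interested in specific values, you can pass parameters to the JsonTuple function to get just the nodes you want. The code below retrieves the locality node from the recursively nested nodes, as described by the "$..value" construct passed to JsonTuple.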

@locality = 
SELECT Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(profile, "$..locality").Values AS locality 
FROM @result; 
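As a rough illustration of what a recursive path such as "$..locality" selects, the following Python sketch (again an analogy, not U-SQL) walks a parsed JSON document and collects every value stored under a given key at any depth. The function name `find_recursive` is made up for this example and is not part of any library.

```python
import json

def find_recursive(node, key):
    """Collect every value stored under `key` at any depth, like the path "$..key"."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            # Keep descending: the key may also appear further down.
            found.extend(find_recursive(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_recursive(item, key))
    return found

# Trimmed-down profile object from the question.
profile = json.loads(
    '{"name": "Company", "profileLocation": {"locality": "Toronto", '
    '"adminDistrict": "ON", "coordinates": {"latitude": 43.7217}}}'
)

localities = find_recursive(profile, "locality")
print(localities)  # ['Toronto']
```

The result is a list because a recursive path can match more than one node, which is also why the U-SQL query above reads `.Values` from the returned map.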

Other supported constructs:

JsonTuple(json, "id", "name")         // field names
JsonTuple(json, "$.address.zip")      // nested fields
JsonTuple(json, "$..address")         // recursive children
JsonTuple(json, "$[?(@.id > 1)].id")  // path expression
JsonTuple(json)                       // all children

Hope this helps.