2016-03-31

ElasticSearch: accessing outer document fields from within a nested aggregation query

I have the following mapping:

{ 
    "dynamic": "strict", 
    "properties": { 
     "id": { 
      "type": "string" 
     }, 
     "title": { 
      "type": "string" 
     }, 
     "things": { 
      "type": "nested", 
      "properties": { 
       "id": { 
        "type": "long" 
       }, 
       "something": { 
        "type": "long" 
       } 
      } 
     } 
    } 
} 

I insert documents as follows (Python script):

from elasticsearch import Elasticsearch

es = Elasticsearch()   # assumption: a node reachable at the client's default address
index_name = "test"    # index and type names as they appear in the query output
doc_type = "article"

body = {"id": 1, "title": "one", "things": [{"id": 1000, "something": 33}, {"id": 1001, "something": 34}]}
es.create(index_name, doc_type=doc_type, body=body, id=1)

body = {"id": 2, "title": "two", "things": [{"id": 1000, "something": 43}, {"id": 1001, "something": 44}]}
es.create(index_name, doc_type=doc_type, body=body, id=2)

body = {"id": 3, "title": "three", "things": [{"id": 1000, "something": 53}, {"id": 1001, "something": 54}]}
es.create(index_name, doc_type=doc_type, body=body, id=3)

I run the following aggregation query:

{
    "query": {
        "match_all": {}
    },
    "aggs": {
        "things": {
            "aggs": {
                "num_articles": {
                    "terms": {
                        "field": "things.id",
                        "size": 0
                    },
                    "aggs": {
                        "articles": {
                            "top_hits": {
                                "size": 50
                            }
                        }
                    }
                }
            },
            "nested": {
                "path": "things"
            }
        }
    },
    "size": 0
}

(So I want to count how many times each "thing" occurs and, for each thing, list the articles in which it appears.)

The query produces:

"key": 1000, 
"doc_count": 3, 
"articles": { 
    "hits": { 
     "total": 3, 
     "max_score": 1, 
     "hits": [{ 
      "_index": "test", 
      "_type": "article", 
      "_id": "2", 
      "_nested": { 
       "field": "things", 
       "offset": 0 
      }, 
      "_score": 1, 
      "_source": { 
       "id": 1000, 
       "something": 43 
      } 
     }, { 
      "_index": "test", 
      "_type": "article", 
      "_id": "1", 
      "_nested": { 
       "field": "things", 
       "offset": 0 
      }, 
      "_score": 1, 
      "_source": { 
       "id": 1000, 
       "something": 33 
      } 

...... (and so on)

What I would like is for each hit to also list the fields from the "outer" (top-level) document, which in this case are id and title.

Is this actually possible, and if so, how?
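
To make the desired result concrete, here is a minimal plain-Python sketch of the grouping I am after, run over the three sample documents above rather than against Elasticsearch. The function name `articles_per_thing` is only illustrative; it is not part of any library.

```python
from collections import defaultdict

# The three sample documents from the question.
docs = [
    {"id": 1, "title": "one",
     "things": [{"id": 1000, "something": 33}, {"id": 1001, "something": 34}]},
    {"id": 2, "title": "two",
     "things": [{"id": 1000, "something": 43}, {"id": 1001, "something": 44}]},
    {"id": 3, "title": "three",
     "things": [{"id": 1000, "something": 53}, {"id": 1001, "something": 54}]},
]


def articles_per_thing(documents):
    """Group outer-document fields (id, title) by the nested things.id value."""
    grouped = defaultdict(list)
    for doc in documents:
        for thing in doc["things"]:
            grouped[thing["id"]].append(
                {"id": doc["id"], "title": doc["title"], "something": thing["something"]}
            )
    return dict(grouped)


result = articles_per_thing(docs)
for thing_id, articles in sorted(result.items()):
    # One line per thing: its id, how many articles contain it, and those articles.
    print(thing_id, len(articles), articles)
```

Each entry carries both the nested value ("something") and the outer id and title, which is the pairing the aggregation output above is missing.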

Answer

I'm not sure whether this is what you're looking for, but let's give it a try:

{
    "query": {
        "match_all": {}
    },
    "aggs": {
        "nested_things": {
            "nested": {
                "path": "things"
            },
            "aggs": {
                "num_articles": {
                    "terms": {
                        "field": "things.id",
                        "size": 0
                    },
                    "aggs": {
                        "articles": {
                            "top_hits": {
                                "size": 50
                            }
                        },
                        "reverse_things": {
                            "reverse_nested": {},
                            "aggs": {
                                "title": {
                                    "terms": {
                                        "field": "title",
                                        "size": 0
                                    }
                                },
                                "id": {
                                    "terms": {
                                        "field": "id",
                                        "size": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

This produces something like the following:

  "buckets": [ 
       { 
        "key": 1000, 
        "doc_count": 3, 
        "reverse_things": { 
        "doc_count": 3, 
        "id": { 
         "buckets": [ 
          { 
           "key": "1", 
           "doc_count": 1 
          }, 
          { 
           "key": "2", 
           "doc_count": 1 
          }, 
          { 
           "key": "3", 
           "doc_count": 1 
          } 
         ] 
        }, 
        "title": { 
         ... 
        } 
        }, 
        "articles": { 
        "hits": { 
         "total": 3, 
         "max_score": 1, 
         "hits": [ 
          { 
           "_index": "test", 
           "_type": "article", 
           "_id": "AVPOgQQjgDGxUAMojyuY", 
           "_nested": { 
           "field": "things", 
           "offset": 0 
           }, 
           "_score": 1, 
           "_source": { 
           "id": 1000, 
           "something": 53 
           } 
          }, 
          ... 

Very close...

The problem is that the "reverse_things" part lists the ids and the titles, but not in the same order. The keys for id are 1, 2, 3: "id": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "1", "doc_count": 1 }, { "key": "2", "doc_count": 1 }, { "key": "3", "doc_count": 1 } ] },

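But the keys for title are one, three, two: "title": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "one", "doc_count": 1 }, { "key": "three", "doc_count": 1 }, { "key": "two", "doc_count": 1 } ] } If the ordering could be forced to match the original articles, this would work. Thanks @kristian-ferkić, by the way...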
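
One possible way around the ordering mismatch, offered as a sketch rather than a confirmed fix: instead of two sibling terms aggregations under "reverse_things", nest the title aggregation inside the id aggregation, so every id bucket carries its own title bucket(s) and the two never have to be matched by position. The helpers `build_query` and `pair_ids_with_titles` are hypothetical names, and `sample_response` is hand-written in the shape of the output shown above, not real cluster output.

```python
def build_query():
    """Request body in which the title terms agg is a child of the id terms agg."""
    return {
        "size": 0,
        "query": {"match_all": {}},
        "aggs": {
            "nested_things": {
                "nested": {"path": "things"},
                "aggs": {
                    "num_articles": {
                        # "size": 0 follows the thread; newer Elasticsearch
                        # versions require a positive size here.
                        "terms": {"field": "things.id", "size": 0},
                        "aggs": {
                            "reverse_things": {
                                "reverse_nested": {},
                                "aggs": {
                                    "id": {
                                        "terms": {"field": "id", "size": 0},
                                        "aggs": {
                                            "title": {
                                                "terms": {"field": "title", "size": 0}
                                            }
                                        },
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


def pair_ids_with_titles(response):
    """Return {thing_id: [(article_id, [title terms]), ...]} from a search response."""
    paired = {}
    buckets = response["aggregations"]["nested_things"]["num_articles"]["buckets"]
    for thing in buckets:
        articles = []
        for id_bucket in thing["reverse_things"]["id"]["buckets"]:
            titles = [t["key"] for t in id_bucket["title"]["buckets"]]
            articles.append((id_bucket["key"], titles))
        paired[thing["key"]] = articles
    return paired


# Hand-written sample in the shape of the response above (assumption, not real output).
sample_response = {
    "aggregations": {
        "nested_things": {
            "doc_count": 3,
            "num_articles": {
                "buckets": [
                    {
                        "key": 1000,
                        "doc_count": 3,
                        "reverse_things": {
                            "doc_count": 3,
                            "id": {
                                "buckets": [
                                    {"key": "1", "doc_count": 1,
                                     "title": {"buckets": [{"key": "one", "doc_count": 1}]}},
                                    {"key": "2", "doc_count": 1,
                                     "title": {"buckets": [{"key": "two", "doc_count": 1}]}},
                                    {"key": "3", "doc_count": 1,
                                     "title": {"buckets": [{"key": "three", "doc_count": 1}]}},
                                ]
                            },
                        },
                    }
                ]
            },
        }
    }
}

print(pair_ids_with_titles(sample_response))
```

Against a real cluster the body would presumably be sent with something like es.search(index=index_name, body=build_query()). Note that "title" is mapped as a plain "string" here, so a multi-word title would likely come back as several separate terms; a not_analyzed field would be needed to get whole titles.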