2016-06-21

ElasticSearch 1.x - Aggregation on object conditions

I want to aggregate data that contains inner objects. For example:

{ 
    "_index": "product_index-en", 
    "_type": "elasticproductmodel", 
    "_id": "000001111", 
    "_score": 6.3316255, 
    "_source": { 
     "productId": "11111111111", 
     "productIdOnlyLetterAndDigit": "11111111111", 
     "productIdOnlyDigit": "11111111111", 
     "productNumber": "11111111111", 
     "name": "Glow Plug", 
     "nameOnlyLetterAndDigit": "glowplug", 
     "productImageLarge": "11111111111.jpg", 
     "itemGroupId": "11111", 
     "relatedProductIds": [], 
     "dataAreaCountries": [ 
      "fra", 
      "pol", 
      "uk", 
      "sie", 
      "sve", 
      "atl", 
      "ita", 
      "hol", 
      "dk" 
     ], 
     "oemItems": [ 
      { 
       "manufactorName": "BERU", 
       "manufacType": "0" 
      }, 
      { 
       "manufactorName": "LUCAS", 
       "manufacType": "0" 
      } 
     ] 
    } 
} 

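I need to aggregate on the oemItems.manufactorName values, but only where oemItems.manufacType is "0". I have tried several examples, such as the accepted answer here (Elastic Search Aggregate into buckets on conditions), but I can't seem to wrap my head around it.

I tried the query below, hoping it would first bucket on manufacType and then, within each type, bucket on manufactorName. It appears to show the correct hit counts. However, the manufactorName buckets are empty: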

GET /product_index-en/_search
{
    "size": 0,
    "aggs": {
        "baked_goods": {
            "nested": {
                "path": "oemItems"
            },
            "aggs": {
                "test1": {
                    "terms": {
                        "field": "oemItems.manufacType",
                        "size": 500
                    },
                    "aggs": {
                        "test2": {
                            "terms": {
                                "field": "oemItems.manufactorName",
                                "size": 500
                            }
                        }
                    }
                }
            }
        }
    }
}

And the result:

{
    "took": 27,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 471214,
        "max_score": 0,
        "hits": []
    },
    "aggregations": {
        "baked_goods": {
            "doc_count": 677246,
            "test1": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {
                        "key": "0",
                        "doc_count": 436557,
                        "test2": {
                            "doc_count_error_upper_bound": 0,
                            "sum_other_doc_count": 0,
                            "buckets": []
                        }
                    },
                    {
                        "key": "1",
                        "doc_count": 240689,
                        "test2": {
                            "doc_count_error_upper_bound": 0,
                            "sum_other_doc_count": 0,
                            "buckets": []
                        }
                    }
                ]
            }
        }
    }
}

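I also tried adding a nested term filter, using the query below, so that only products whose oemItems have manufacType 1 are returned. However, it returns every product in which some oemItem has manufacType 1, which means the oemItems of those products can still contain manufacType 1 or 0. I don't see how to run an aggregation on this response that returns only the oemItems.manufactorName values where oemItems.manufacType is 0.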

GET /product_index-en/_search
{
    "query": { "match_all": {} },
    "filter": {
        "nested": {
            "path": "oemItems",
            "filter": {
                "bool": {
                    "must": [
                        {
                            "term": { "oemItems.manufacType": "1" }
                        }
                    ]
                }
            }
        }
    }
}
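
First of all, you need to make sure 'oemItems' is of type 'nested' in your mapping. Is that the case? – Val

@Val No, it is not a nested type. I will change that and see if it helps. –

@Val I set it to nested and added an example to my post. –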
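
For reference, the nested mapping discussed in these comments might look roughly like the sketch below. It is only an illustration under stated assumptions: the index name product_index-en-v2 is hypothetical, and mapping the two sub-fields as not_analyzed strings is a guess (the thread does not show the real mapping). As far as I know, an existing object field cannot be switched to nested in place, so the usual route is to create a new index with the desired mapping and reindex the documents into it.

```
# Hypothetical index name; the sub-field settings are assumptions, not taken from the thread.
PUT /product_index-en-v2
{
    "mappings": {
        "elasticproductmodel": {
            "properties": {
                "oemItems": {
                    "type": "nested",
                    "properties": {
                        "manufactorName": { "type": "string", "index": "not_analyzed" },
                        "manufacType": { "type": "string", "index": "not_analyzed" }
                    }
                }
            }
        }
    }
}
```

With a nested mapping, each oemItems entry is indexed as its own hidden document, so a manufacType and a manufactorName from the same entry stay paired instead of being flattened into two unrelated value lists.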

Answers

Good start so far. Try it like this:

POST /product_index-en/_search
{
    "size": 0,
    "query": {
        "nested": {
            "path": "oemItems",
            "query": {
                "term": {
                    "oemItems.manufacType": "0"
                }
            }
        }
    },
    "aggs": {
        "baked_goods": {
            "nested": {
                "path": "oemItems"
            },
            "aggs": {
                "test1": {
                    "terms": {
                        "field": "oemItems.manufactorName",
                        "size": 500
                    }
                }
            }
        }
    }
}
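
Note that the nested query above only selects products that have at least one matching oemItems entry. The nested aggregation then still counts every oemItems entry of those products, whatever its manufacType. A possible refinement, which is not part of the original answer, is to put a filter aggregation between the nested aggregation and the terms aggregation. This is a minimal sketch, assuming oemItems is mapped as nested and manufacType is indexed as the exact string "0"; the aggregation names only_type_0 and names are made up for illustration.

```
# Sketch only: the aggregation names "only_type_0" and "names" are hypothetical.
POST /product_index-en/_search
{
    "size": 0,
    "aggs": {
        "baked_goods": {
            "nested": {
                "path": "oemItems"
            },
            "aggs": {
                "only_type_0": {
                    "filter": {
                        "term": { "oemItems.manufacType": "0" }
                    },
                    "aggs": {
                        "names": {
                            "terms": {
                                "field": "oemItems.manufactorName",
                                "size": 500
                            }
                        }
                    }
                }
            }
        }
    }
}
```

Because the filter runs inside the nested context, it is evaluated against each oemItems entry separately, so only entries with manufacType "0" should reach the terms aggregation on manufactorName.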
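
The problem is that oemItems may contain objects with manufacType 1, manufacType 0, or several of each. So the matches returned by the query include products whose oemItems have manufacType 1 in addition to 0, and when I aggregate over those results I end up with both manufacType 1 and 0. I think I need to add a filter to the aggregation so that it only returns the oemItems with manufacType 0? –

Give it a try, it should work, because nested fields are separate documents under the hood. – Val

Thanks for your help, Val. I did, but the buckets in test1 are empty. –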
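
One way to investigate empty buckets like these is to look at the mapping that is actually in effect for the type. This is only a diagnostic sketch; the thread does not say what the cause turned out to be.

```
# Diagnostic sketch: fetch the current mapping for the type used in this thread.
GET /product_index-en/_mapping/elasticproductmodel
```

In the response, it would be worth checking that oemItems is listed with "type": "nested" and that oemItems.manufactorName is an indexed field, since a terms aggregation can only bucket on values that were actually indexed for that field. Documents indexed before a mapping change generally need to be reindexed before they reflect it.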