Elasticsearch query across multiple documents

I have data in Elasticsearch in the following format, where each entry is a separate document:

{'pid': 1, 'nm': 'tom'}, {'pid': 1, 'nm': 'dick'}, {'pid': 1, 'nm': 'harry'}, {'pid': 2, 'nm': 'tom'}, {'pid': 2, 'nm': 'harry'}, {'pid': 3, 'nm': 'dick'}, {'pid': 3, 'nm': 'harry'}, {'pid': 4, 'nm': 'harry'}

{ 
     "took": 137, 
     "timed_out": false, 
     "_shards": { 
      "total": 5, 
      "successful": 5, 
      "failed": 0 
     }, 
     "hits": { 
      "total": 8, 
      "max_score": null, 
      "hits": [ 
      { 
       "_index": "query_test", 
       "_type": "user", 
       "_id": "AVj9KS86AaDUbQTYUmwY", 
       "_score": null, 
       "_source": { 
        "pid": 1, 
        "nm": "Harry" 
       } 
      }, 
      { 
       "_index": "query_test", 
       "_type": "user", 
       "_id": "AVj9KJ9BAaDUbQTYUmwW", 
       "_score": null, 
       "_source": { 
        "pid": 1, 
        "nm": "Tom" 
       } 
      }, 
      { 
       "_index": "query_test", 
       "_type": "user", 
       "_id": "AVj9KRlbAaDUbQTYUmwX", 
       "_score": null, 
       "_source": { 
        "pid": 1, 
        "nm": "Dick" 
       } 
      }, 
      { 
       "_index": "query_test", 
       "_type": "user", 
       "_id": "AVj9KYnKAaDUbQTYUmwa", 
       "_score": null, 
       "_source": { 
        "pid": 2, 
        "nm": "Harry" 
       } 
      }, 
      { 
       "_index": "query_test", 
       "_type": "user", 
       "_id": "AVj9KXL5AaDUbQTYUmwZ", 
       "_score": null, 
       "_source": { 
        "pid": 2, 
        "nm": "Tom" 
       } 
      }, 
      { 
       "_index": "query_test", 
       "_type": "user", 
       "_id": "AVj9KbcpAaDUbQTYUmwb", 
       "_score": null, 
       "_source": { 
        "pid": 3, 
        "nm": "Dick" 
       } 
      }, 
      { 
       "_index": "query_test", 
       "_type": "user", 
       "_id": "AVj9Kdy5AaDUbQTYUmwc", 
       "_score": null, 
       "_source": { 
        "pid": 3, 
        "nm": "Harry" 
       } 
      }, 
      { 
       "_index": "query_test", 
       "_type": "user", 
       "_id": "AVj9KetLAaDUbQTYUmwd", 
       "_score": null, 
       "_source": { 
        "pid": 4, 
        "nm": "Harry" 
       } 
      } 
      ] 
     } 
    }

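I need to find the pids that have 'harry' and do not have 'tom', which in the example above are 3 and 4. This essentially means looking at the documents that share the same pid and keeping the pid only if none of those documents has the nm value 'tom' and at least one of them has the value 'harry'.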
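To make the requirement concrete, here is a minimal plain-Python sketch of the grouping logic, run over the sample documents above. It is not an Elasticsearch query, and the function name `pids_with_and_without` is made up for illustration.

```python
from collections import defaultdict

# Sample documents copied from the question.
DOCS = [
    {"pid": 1, "nm": "tom"}, {"pid": 1, "nm": "dick"}, {"pid": 1, "nm": "harry"},
    {"pid": 2, "nm": "tom"}, {"pid": 2, "nm": "harry"},
    {"pid": 3, "nm": "dick"}, {"pid": 3, "nm": "harry"},
    {"pid": 4, "nm": "harry"},
]


def pids_with_and_without(docs, required, forbidden):
    """Return the sorted pids whose group contains `required` but not `forbidden`."""
    names_by_pid = defaultdict(set)
    for doc in docs:
        # Group the names by pid; lower-case so "Harry" and "harry" compare equal.
        names_by_pid[doc["pid"]].add(doc["nm"].lower())
    return sorted(
        pid
        for pid, names in names_by_pid.items()
        if required in names and forbidden not in names
    )


print(pids_with_and_without(DOCS, "harry", "tom"))  # [3, 4]
```

The condition is evaluated per group of documents sharing a pid, not per document, which is what makes it awkward to express as an ordinary filter.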

How can I query for this?

Edit: I am using Elasticsearch version 5.


2016-12-14 harbinger

If you are sending a POST request, the body could look something like the following, where you can use a bool query:

POST _search
{
  "query": {
    "bool": {
      "must": {
        "term": { "nm": "harry" }
      },
      "must_not": {
        "term": { "nm": "tom" }
      }
    }
  }
}


2016-12-14 10:36:49 Kulasangar

Isn't it dangerous to use a term query on an analyzed field? If no mapping is provided, nm will be analyzed in this case. – Artholl

@Artholl What if you use 'not_analyzed', so that the field is not analyzed in the above scenario? – Kulasangar
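As a rough sketch of what that could look like on Elasticsearch 5.x, where the `keyword` field type is the counterpart of a `not_analyzed` string: the body below is an assumed mapping for the `query_test` index and `user` type seen in the search response, written as a Python dict. The `long` type for `pid` and the helper `term_query` are assumptions for illustration, and nothing here has been run against a cluster.

```python
import json

# Assumed index-creation body for Elasticsearch 5.x (index "query_test", type "user").
# "keyword" stores the value verbatim, so no analyzer lower-cases or tokenizes it.
MAPPING_BODY = {
    "mappings": {
        "user": {
            "properties": {
                "pid": {"type": "long"},    # assumption: pid is a whole number
                "nm": {"type": "keyword"},  # exact-value field, not analyzed
            }
        }
    }
}


def term_query(field, value):
    """Build a term query body; term queries compare against the indexed value as-is."""
    return {"query": {"term": {field: value}}}


print(json.dumps(MAPPING_BODY, indent=2))
# With a keyword field the case must match what was indexed, e.g. "Harry".
print(json.dumps(term_query("nm", "Harry")))
```

With an analyzed field the indexed token would typically be lower-cased, so the term `harry` matches; with a `keyword` field the term has to equal the stored value exactly.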

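@Kulasangar Won't this apply the match/filter conditions to one and the same document? Here, for example, three documents share the same pid, namely 1, but have three different values of 'nm'. – harbinger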
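The concern can be illustrated with a small plain-Python sketch that mimics how a bool query tests each document on its own. `bool_filter` is a hypothetical helper, not an Elasticsearch API, and the documents are the sample data from the question.

```python
# Sample documents copied from the question.
DOCS = [
    {"pid": 1, "nm": "tom"}, {"pid": 1, "nm": "dick"}, {"pid": 1, "nm": "harry"},
    {"pid": 2, "nm": "tom"}, {"pid": 2, "nm": "harry"},
    {"pid": 3, "nm": "dick"}, {"pid": 3, "nm": "harry"},
    {"pid": 4, "nm": "harry"},
]


def bool_filter(docs, must, must_not):
    """Mimic a per-document bool query: every document is checked in isolation."""
    return [d for d in docs if d["nm"] == must and d["nm"] != must_not]


hits = bool_filter(DOCS, "harry", "tom")
# Every 'harry' document trivially is "not tom", so no pid is excluded.
print(sorted({d["pid"] for d in hits}))  # [1, 2, 3, 4]
```

Pids 1 and 2 are still returned because the `must_not` clause never sees the sibling documents that carry 'tom'.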

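I am relatively new to Elasticsearch, so I may be wrong, but I have never seen a problem like this. Simple filters cannot be used here, because they are applied to individual documents (not to aggregations), which is not what you want. What I see is that you want a "Group by" query with a "Having" clause (in SQL terms). But a group-by query normally uses some aggregation (such as the average, maximum, or minimum of a field) in its "Having" clause. Basically, you use a reducer to post-process the aggregation results. For queries like that, the Bucket Selector Aggregation can be used. Read this
But your case is different. You do not want to apply the Having clause to a metric aggregation; you want to check whether a certain value exists in a field (or column) of the "group by" data. In SQL terms, you want to run a "where" query inside the "group by". That is something I have never seen. You can also read this
However, at the application level you can do this easily by splitting the query. First find the unique pids where nm = harry using a terms aggregation. Then fetch the documents for those pids with the additional condition nm != tom.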
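The application-level split can be sketched as follows. This is a minimal sketch under assumptions: the aggregation name `pids`, the bucket `size` of 1000, and all function names are made up, and the request bodies have not been run against a cluster. It also varies the second step slightly: instead of fetching documents with nm != tom, it asks which of the candidate pids do have a 'tom' document and subtracts them, since excluding 'tom' documents alone would still leave the other documents of pids 1 and 2.

```python
def harry_pids_body():
    """Step 1: unique pids among documents whose nm matches 'harry'."""
    return {
        "size": 0,
        "query": {"match": {"nm": "harry"}},
        "aggs": {"pids": {"terms": {"field": "pid", "size": 1000}}},
    }


def tom_pids_body(candidate_pids):
    """Step 2: which of the candidate pids also have a 'tom' document."""
    return {
        "size": 0,
        "query": {
            "bool": {
                "filter": [
                    {"terms": {"pid": list(candidate_pids)}},
                    {"match": {"nm": "tom"}},
                ]
            }
        },
        "aggs": {"pids": {"terms": {"field": "pid", "size": 1000}}},
    }


def bucket_keys(response):
    """Extract the pid keys from a terms-aggregation response."""
    return [b["key"] for b in response["aggregations"]["pids"]["buckets"]]


def harry_without_tom(harry_response, tom_response):
    """Combine both responses: pids seen with 'harry' minus pids seen with 'tom'."""
    return sorted(set(bucket_keys(harry_response)) - set(bucket_keys(tom_response)))
```

In use, the body from `harry_pids_body()` would be sent to `_search` first, its bucket keys passed to `tom_pids_body(...)`, and the two responses handed to `harry_without_tom`.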

P.S. I am very new to ES. If anyone wants to contradict me and show how to do this in a single query, I would be glad. I would learn from it too.
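For what it is worth, one way a single query might look is a terms aggregation on pid with two filter sub-aggregations, whose document counts are then compared by a bucket selector. This is an untested sketch, assuming Elasticsearch 5.x with Painless scripts; the aggregation names, the variable names in `buckets_path`, and the bucket `size` of 1000 are all made up for illustration.

```python
import json


def single_query_body():
    """Build a request body that keeps only pid buckets with 'harry' and without 'tom'."""
    return {
        "size": 0,  # only the aggregation buckets are of interest, not the hits
        "aggs": {
            "by_pid": {
                "terms": {"field": "pid", "size": 1000},
                "aggs": {
                    # Count the documents per pid that match each name.
                    "harry": {"filter": {"match": {"nm": "harry"}}},
                    "tom": {"filter": {"match": {"nm": "tom"}}},
                    # Drop every bucket that has no 'harry' or has any 'tom'.
                    "harry_without_tom": {
                        "bucket_selector": {
                            "buckets_path": {
                                "harryCount": "harry>_count",
                                "tomCount": "tom>_count",
                            },
                            "script": "params.harryCount > 0 && params.tomCount == 0",
                        }
                    },
                },
            }
        },
    }


print(json.dumps(single_query_body(), indent=2))
```

If this behaves as intended, the remaining `by_pid` bucket keys would be the wanted pids, which for the sample data should be 3 and 4.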


2016-12-14 18:44:19
