Querying logs with Elasticsearch

I have the logs below, which I want to find with an Elasticsearch "must match" query:

2014-07-02 20:52:39 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been received, {"uuid"="abc123"} 
2014-07-02 20:52:39 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been transferred, {"uuid"="abc123"} 
2014-07-02 20:52:39 INFO home.byebyeworld: LOGGER/LOG:ID1234 has successfully been processed, {"uuid"="abc123"} 
2014-07-02 20:52:39 INFO home.byebyeworld: LOGGER/LOG:ID1234 has exited, {"uuid"="abc123"} 
2014-07-02 20:53:00 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been received, {"uuid"="def123"} 
2014-07-02 20:53:00 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been transferred, {"uuid"="def123"} 
2014-07-02 20:53:00 INFO home.byebyeworld: LOGGER/LOG:ID1234 has successfully been processed, {"uuid"="def123"} 
2014-07-02 20:53:00 INFO home.byebyeworld: LOGGER/LOG:ID1234 has exited, {"uuid"="def123"}

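Since each of the lines above is stored as a single "message", I'm having a hard time querying them with a POST REST call. I tried a "must" with "match" to get only line 1 of my logs, but the result is inconsistent: sometimes it returns multiple hits instead of just one: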

{ 
    "query" : { 
     "constant_score" : { 
     "filter" : { 
      "bool" : { 
       "must" : [ 
       { "match_phrase_prefix" : {"message" : "home.helloworld:"}}, 
       { "match_phrase_prefix" : {"message" : "LOGGER/LOG:ID1234"}}, 
       { "match" : {"message" : "received, {\"uuid\"=\"abc123\"}"}} 
       ] 
      } 
     } 
     } 
    } 
}

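Is there something wrong with my Elasticsearch query above? I thought "must" equals AND, "match" is more of a CONTAINS, and "match_phrase_prefix" is STARTSWITH? Can someone show me how to correctly query an index full of logs like the ones above, with different uuid values, and get back only a single hit? At first I thought the query above was working, because it initially returned only 1 hit, but then it returned 2, and then more. That looks inconsistent to me. Thanks in advance!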

Source

2017-03-09 R.C

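The problem is in the third clause of your bool query. Let me give you a couple of queries that will work for you, and I'll explain why they do the job.

First query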

curl -XGET http://localhost:9200/my_logs/_search -d ' 
{ 
    "query" : { 
     "constant_score" : { 
     "filter" : { 
      "bool" : { 
       "must" : [ 
       { "match_phrase_prefix" : {"message" : "home.helloworld:"}}, 
       { "match_phrase_prefix" : {"message" : "LOGGER/LOG:ID1234"}}, 
       { "match" : { 
        "message" : { 
         "query": "received, {\"uuid\"=\"abc123\"", 
         "operator": "and" 
        } 
        } 
       } 
       ] 
      } 
     } 
     } 
    } 
}'

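Explanation

Let's make sure we're on the same page about indexing. By default, the indexer passes your data through the standard analysis chain, i.e. it splits on whitespace, strips special characters, lowercases, and so on. So in the index we only have tokens together with their positions.

The match query, being a full-text query, takes your query text "received, {\"uuid\"=\"abc123\"" and passes it through analysis as well. By default this analysis just splits the text on whitespace, strips special characters, lowercases, and so on. The result of this analysis looks roughly like this (simplified): received, uuid, abc123.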
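To make the analysis step concrete, here is a minimal Python sketch. It does not call Elasticsearch; `simple_analyze` is a hypothetical helper that only approximates an analyzer by lowercasing and keeping runs of ASCII letters and digits. The real standard analyzer follows Unicode word-boundary rules and may split some inputs differently, so treat this purely as an illustration for the query text above.

```python
import re

def simple_analyze(text):
    """Crude stand-in for an analyzer (an assumption, not the real one):
    lowercase the text and keep runs of ASCII letters/digits as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

# The text of the third clause, with the JSON escaping removed.
query_text = 'received, {"uuid"="abc123"'
tokens = simple_analyze(query_text)
print(tokens)  # ['received', 'uuid', 'abc123']
```

The punctuation, braces and quotes disappear, which is why the braces and quotes in the clause do not help narrow the match.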

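What the match query does next: it combines these tokens against the message field using the default operator (which is or). So, written as a logical expression, the last clause (the match query) looks like this: message:received OR message:uuid OR message:abc123.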
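The difference between the default or and "operator": "and" can be simulated without a cluster. The sketch below is only a model of the third clause under stated assumptions: `simple_analyze` and `matches` are hypothetical helpers, the tokenizer is a crude approximation, and the three shortened log lines are taken from the question.

```python
import re

def simple_analyze(text):
    """Crude analyzer approximation: lowercase, keep letter/digit runs."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches(doc, query, operator="or"):
    """Model of a match clause: 'or' needs any query token in the document,
    'and' needs all of them. Token positions are ignored."""
    doc_tokens = set(simple_analyze(doc))
    query_tokens = simple_analyze(query)
    combine = all if operator == "and" else any
    return combine(tok in doc_tokens for tok in query_tokens)

logs = [
    'home.helloworld: LOGGER/LOG:ID1234 has successfully been received, {"uuid"="abc123"}',
    'home.helloworld: LOGGER/LOG:ID1234 has successfully been transferred, {"uuid"="abc123"}',
    'home.helloworld: LOGGER/LOG:ID1234 has successfully been received, {"uuid"="def123"}',
]
query = 'received, {"uuid"="abc123"'

or_hits = [i for i, line in enumerate(logs) if matches(line, query, "or")]
and_hits = [i for i, line in enumerate(logs) if matches(line, query, "and")]
print(or_hits)   # [0, 1, 2]
print(and_hits)  # [0]
```

With or, a single shared token such as uuid is enough for a line to match; with and, only the line that contains received, uuid and abc123 survives.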

That's why the first 4 log entries match. I was able to reproduce it.

Second query

curl -XGET http://localhost:9200/my_logs/_search -d ' 
{ 
    "query" : { 
     "constant_score" : { 
     "filter" : { 
      "bool" : { 
       "must" : [ 
       { "match_phrase_prefix" : {"message" : "home.helloworld:"}}, 
       { "match_phrase_prefix" : {"message" : "LOGGER/LOG:ID1234"}}, 
       { "match_phrase_prefix" : {"message" : "received, {\"uuid\"=\"abc123\""}} 
       ] 
      } 
     } 
     } 
    } 
}'

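Explanation

This one is a bit simpler. Remember: our indexing process leaves tokens and their positions in the index.

What exactly the Match Phrase Prefix query does: it takes the input query (in our example "received, {\"uuid\"=\"abc123\""), analyzes the query text in exactly the same way, and then tries to find the tokens received, uuid, abc123 at adjacent positions in the index, in exactly that order: received -> uuid -> abc123 (almost).

The exception is the last token, abc123 in our case. To be precise, it creates a wildcard for the last token, i.e. received -> uuid -> abc123*.

To be a perfectionist, I'd add that received -> uuid -> abc123 (i.e. without the trailing wildcard) is what the plain Match Phrase query does. It also takes the positions in the index into account, i.e. it tries to match a "phrase" rather than separate tokens at arbitrary positions.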
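The position-based behaviour described above can be sketched in a few lines of Python. This is a simplified model, not how Elasticsearch is implemented: `simple_analyze` and `phrase_match` are hypothetical helpers, the tokenizer is a crude approximation, and the trailing wildcard is modelled with a plain `startswith` check.

```python
import re

def simple_analyze(text):
    """Crude analyzer approximation: lowercase, keep letter/digit runs."""
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_match(doc, query, prefix_last=False):
    """True if the query tokens occur at adjacent positions, in order.
    With prefix_last=True the last query token only needs to be a prefix
    of the document token at that position (phrase-prefix behaviour)."""
    doc_tokens = simple_analyze(doc)
    query_tokens = simple_analyze(query)
    n = len(query_tokens)
    if n == 0:
        return False
    for start in range(len(doc_tokens) - n + 1):
        window = doc_tokens[start:start + n]
        if window[:-1] != query_tokens[:-1]:
            continue  # the leading tokens must match exactly
        last_doc, last_query = window[-1], query_tokens[-1]
        if last_doc == last_query:
            return True
        if prefix_last and last_doc.startswith(last_query):
            return True
    return False

line = ('2014-07-02 20:52:39 INFO home.helloworld: LOGGER/LOG:ID1234 '
        'has successfully been received, {"uuid"="abc123"}')

full_phrase = phrase_match(line, 'received, {"uuid"="abc123"')
partial_exact = phrase_match(line, 'received, {"uuid"="abc')
partial_prefix = phrase_match(line, 'received, {"uuid"="abc', prefix_last=True)
wrong_order = phrase_match(line, 'abc123 received', prefix_last=True)
print(full_phrase, partial_exact, partial_prefix, wrong_order)
# True False True False
```

The incomplete last token abc only matches when the prefix behaviour is enabled, and swapping the token order never matches, because the positions no longer line up.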

Source

2017-03-10 20:42:38
