2017-07-17

Elasticsearch multi_match query does not work as expected with synonyms and the cross_fields type

I have the following configuration:

{ 
    "my_index": { 
     "mappings": { 
      "my_mapping": { 
       "properties": { 
        "@timestamp": { 
         "type": "date" 
        }, 
        "@version": { 
         "type": "text", 
         "fields": { 
          "keyword": { 
           "type": "keyword", 
           "ignore_above": 256 
          } 
         } 
        }, 
        "field1": { 
         "type": "text", 
         "fields": { 
          "keyword": { 
           "type": "keyword", 
           "ignore_above": 256 
          } 
         } 
        }, 
        "field2": { 
         "type": "text", 
         "fields": { 
          "keyword": { 
           "type": "keyword", 
           "ignore_above": 256 
          } 
         } 
        } 
       } 
      } 
     }, 
     "settings": { 
      "index": { 
       "analysis": { 
        "filter": { 
         "my_synonym_filter": { 
          "type": "synonym", 
          "synonyms": [ 
           "matthew,matt,matty", 
           "thomas,tom,thom,tommy" 
          ] 
         } 
        }, 
        "analyzer": { 
         "my_synonyms": { 
          "filter": [ 
           "lowercase", 
           "my_synonym_filter" 
          ], 
          "tokenizer": "standard" 
         } 
        } 
       } 
      } 
     } 
    } 
} 

And the following query:

{ 
    "query":{ 
     "bool":{ 
      "should":[ 
       { 
        "multi_match":{ 
        "fields":[ 
         "field1^8", 
         "field2^2" 
        ], 
        "query":"Matt And Tom Oldfield", 
        "type":"cross_fields", 
        "analyzer": "my_synonyms" 
        } 
       } 
      ] 
     } 
    } 
} 

However, when I run it, the synonyms are not expanded for every field in the query. If I look at the query explanation, I get the following:

(Synonym(field1:matt field1:matthew field1:matty) blended(terms:[field1:and^8.0, field2:and^2.0]) Synonym(field1:thom field1:thomas field1:tom field1:tommy) blended(terms:[field1:oldfield^8.0, field2:oldfield^2.0]))

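So if I have "Tom Oldfield" in field1 and "Matt Oldfield" in field2, the query does not match that document because, as you can see, the synonyms are expanded only for the first field (field1) and not for the other field.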

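If I remove the analyzer from the query, it does match the document with "Tom Oldfield" in field1 and "Matt Oldfield" in field2, and the query explanation is as follows: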

(blended(terms:[field1:matt^8.0, field2:matt^2.0]) blended(terms:[field1:and^8.0, field2:and^2.0]) blended(terms:[field1:tom^8.0, field2:tom^2.0]) blended(terms:[field1:oldfield^8.0, field2:oldfield^2.0]))

Is there any way to get the synonyms expanded for every field?

There is a problem in your configuration example: field1 is duplicated – Ivan

Sorry, I just fixed it.

Answers

I could not reproduce your problem in my environment on Elasticsearch 5.5.0. Here is my MCVE setup:

{ 
    "settings": { 
    "index": { 
     "analysis": { 
     "filter": { 
      "my_synonym_filter": { 
      "type": "synonym", 
      "synonyms": [ 
       "matthew,matt,matty", 
       "thomas,tom,thom,tommy" 
      ] 
      } 
     }, 
     "analyzer": { 
      "my_synonyms": { 
      "filter": [ 
       "lowercase", 
       "my_synonym_filter" 
      ], 
      "tokenizer": "standard" 
      } 
     } 
     } 
    } 
    }, 
    "mappings": { 
    "my_mapping": { 
     "properties": { 
     "field1": { 
      "type": "text", 
      "fields": { 
      "keyword": { 
       "type": "keyword", 
       "ignore_above": 256 
      } 
      } 
     }, 
     "field2": { 
      "type": "text", 
      "fields": { 
      "keyword": { 
       "type": "keyword", 
       "ignore_above": 256 
      } 
      } 
     } 
     } 
    } 
    } 
} 

The following document is indexed:

{ "field1": "Tom Oldfield", "field2": "Matt Oldfield"} 

For the query provided above, ES creates the following Lucene query:

((field1:matt)^8.0 | (field1:matthew)^8.0 | (field1:matty)^8.0 | (field2:matt)^2.0 | (field2:matthew)^2.0 | (field2:matty)^2.0) 
((field1:and)^8.0 | (field2:and)^2.0) 
((field1:tom)^8.0 | (field1:thomas)^8.0 | (field1:thom)^8.0 | (field1:tommy)^8.0 | (field2:tom)^2.0 | (field2:thomas)^2.0 | (field2:thom)^2.0 | (field2:tommy)^2.0) 
((field1:oldfield)^8.0 | (field2:oldfield)^2.0)) 

Here the synonyms are expanded for every field.
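To compare how two clusters rewrite the same query, the explanation can be requested from the `_validate/query` endpoint with `explain=true`. The following is a minimal sketch, not the exact procedure used for the answer above: it assumes an unsecured node at `http://localhost:9200` and the index name `my_index`, and the helper names `build_explain_request` and `fetch_explanations` are made up for illustration.

```python
import json
import urllib.request

# Assumption: a local, unsecured Elasticsearch node; change this for your cluster.
ES_URL = "http://localhost:9200"

# The multi_match query from the question, unchanged.
QUERY = {
    "query": {
        "bool": {
            "should": [
                {
                    "multi_match": {
                        "fields": ["field1^8", "field2^2"],
                        "query": "Matt And Tom Oldfield",
                        "type": "cross_fields",
                        "analyzer": "my_synonyms",
                    }
                }
            ]
        }
    }
}


def build_explain_request(base_url, index, body):
    """Build (but do not send) a _validate/query?explain=true request."""
    url = f"{base_url}/{index}/_validate/query?explain=true"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_explanations(request):
    """Send the request and return the explanation string of each entry."""
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return [entry.get("explanation") for entry in payload.get("explanations", [])]
```

Calling `fetch_explanations(build_explain_request(ES_URL, "my_index", QUERY))` against each environment and comparing the returned strings should show whether the synonyms are expanded for both fields or only for field1.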

You are right. If I try it with ES on my laptop, it works, but if I try it on the AWS Elasticsearch Service, it generates what I posted earlier. Do you have any idea why this happens?

@SofiaBraun can you provide the ES version? – Ivan

I am using ES 5.1
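Since the behaviour differs between a 5.5.0 node and a 5.1 cluster, it may help to confirm the version on each side. The root endpoint (`GET /`) of an Elasticsearch node returns a JSON document with a `version.number` field. The sketch below only parses such a response; the helper name `parse_es_version` and the trimmed sample payload are made up for illustration.

```python
import json


def parse_es_version(root_response_text):
    """Return the version from the JSON body of GET / as a tuple of ints."""
    info = json.loads(root_response_text)
    number = info["version"]["number"]
    # Drop any pre-release suffix such as "-beta1" before splitting.
    core = number.split("-")[0]
    return tuple(int(part) for part in core.split(".")[:3])


# Hypothetical, trimmed sample of a root response; real responses carry more fields.
SAMPLE_ROOT_RESPONSE = '{"version": {"number": "5.1.1"}}'

hosted_version = parse_es_version(SAMPLE_ROOT_RESPONSE)
local_version = (5, 5, 0)  # the version used in the answer above
print(hosted_version < local_version)
```

Tuples compare element by element, so the two versions can be compared directly once parsed.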
