2016-11-28

Autocomplete on email addresses with Elasticsearch does not work

I have the following field mapping:

"my_field": { 
    "properties": { 
     "address": { 
      "type": "string", 
      "analyzer": "email", 
      "search_analyzer": "whitespace" 
     } 
    } 
} 

My email analyzer is defined like this:

{ 
    "analysis": { 
     "filter": { 
      "email_filter": { 
       "type": "edge_ngram", 
       "min_gram": "3", 
       "max_gram": "255" 
      } 
     }, 
     "analyzer": { 
      "email": { 
       "type": "custom", 
       "filter": [ 
        "lowercase", 
        "email_filter", 
        "unique" 
       ], 
       "tokenizer": "uax_url_email" 
      } 
     } 
    } 
} 

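Suppose I index an email ID such as [email protected]

Searching for partial terms such as tes or test.xy does not work. However, searching for test.xyz or [email protected] works fine. I ran the address through my email analyzer, and the tokens come out as expected.

For example, requesting http://localhost:9200/my_index/_analyze?analyzer=email&[email protected]

returns: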

{ 
    "tokens": [{ 
     "token": "tes", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "test", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "test.", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "test.x", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "test.xy", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "test.xyz", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "[email protected]", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "[email protected]", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "[email protected]", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "[email protected]", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "[email protected]", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "[email protected]", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "[email protected]", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "[email protected]", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "[email protected]", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "[email protected]", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "[email protected]", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }, { 
     "token": "[email protected]", 
     "start_offset": 0, 
     "end_offset": 20, 
     "type": "word", 
     "position": 0 
    }] 
} 
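The token list above is what an edge n-gram filter with min_gram 3 produces when the tokenizer keeps the whole address as one token. The following is a minimal Python sketch of that behavior, not Elasticsearch's actual implementation. The address `Test.Xyz@example.com` is a hypothetical stand-in (the real one is redacted in the question), and `edge_ngrams` is an illustrative helper name.

```python
from typing import List


def edge_ngrams(token: str, min_gram: int = 3, max_gram: int = 255) -> List[str]:
    """Return the prefixes of `token` that are min_gram..max_gram characters long."""
    token = token.lower()  # mirrors the "lowercase" filter, which runs first
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


# Hypothetical stand-in address; the real one is redacted in the question.
indexed = edge_ngrams("Test.Xyz@example.com")

print(indexed[:6])  # ['tes', 'test', 'test.', 'test.x', 'test.xy', 'test.xyz']
print(len(indexed))  # 18

# With "search_analyzer": "whitespace" the query term is not n-grammed,
# so a search on this field is a plain lookup among the indexed prefixes.
print("tes" in indexed)  # True
```

Under these assumptions, any prefix of at least three characters is present in the index for the address field, which is why the analyzer itself is not the cause of the missing hits.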

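So I know that the tokenization works. But at search time, partial strings are not matched.

For example, a request to http://localhost:9200/my_index/my_field/_search?q=test returns no hits.

Details of my index: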

{ 
    "my_index": { 
     "aliases": { 
      "alias_default": {} 
     }, 
     "mappings": { 
      "my_field": { 
       "properties": { 
        "address": { 
         "type": "string", 
         "analyzer": "email", 
         "search_analyzer": "whitespace" 
        }, 
        "boost": { 
         "type": "long" 
        }, 
        "createdat": { 
         "type": "date", 
         "format": "strict_date_optional_time||epoch_millis" 
        }, 
        "instanceid": { 
         "type": "long" 
        }, 
        "isdeleted": { 
         "type": "integer" 
        }, 
        "object": { 
         "type": "string" 
        }, 
        "objecthash": { 
         "type": "string" 
        }, 
        "objectid": { 
         "type": "string" 
        }, 
        "parent": { 
         "type": "short" 
        }, 
        "parentid": { 
         "type": "integer" 
        }, 
        "updatedat": { 
         "type": "date", 
         "format": "strict_date_optional_time||epoch_millis" 
        } 
       } 
      } 
     }, 
     "settings": { 
      "index": { 
       "creation_date": "1480342980403", 
       "number_of_replicas": "1", 
       "max_result_window": "100000", 
       "uuid": "OUuiTma8CA2VNtw9Og", 
       "analysis": { 
        "filter": { 
         "email_filter": { 
          "type": "edge_ngram", 
          "min_gram": "3", 
          "max_gram": "255" 
         }, 
         "autocomplete_filter": { 
          "type": "edge_ngram", 
          "min_gram": "3", 
          "max_gram": "20" 
         } 
        }, 
        "analyzer": { 
         "autocomplete": { 
          "type": "custom", 
          "filter": [ 
           "lowercase", 
           "autocomplete_filter" 
          ], 
          "tokenizer": "standard" 
         }, 
         "email": { 
          "type": "custom", 
          "filter": [ 
           "lowercase", 
           "email_filter", 
           "unique" 
          ], 
          "tokenizer": "uax_url_email" 
         } 
        } 
       }, 
       "number_of_shards": "5", 
       "version": { 
        "created": "2010099" 
       } 
      } 
     }, 
     "warmers": {} 
    } 
} 
+0

You have "search_analyzer": "whitespace" set for searching. Remove it and recreate the mapping. – Backtrack

+0

@Backtrack I believe it is correct. See http://stackoverflow.com/a/15932838/1465701. Unless I am missing something here, I think this should be the right behavior. – nerandell

+2

There is a typo in your mapping: 'analyser' should read 'analyzer' – Val

Answers

1

OK, everything looks correct except your query.

You simply need to specify the address field in your query, like this, and it will work:

http://localhost:9200/my_index/my_field/_search?q=address:test 

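If you don't specify the address field, the query runs against the _all field, whose analyzer is standard by default. The _all field is not indexed with your email analyzer, so it contains no edge n-grams for a partial term such as tes to match, which is why you are not finding anything.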
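To make the fix concrete, here is a small Python sketch that builds the field-scoped request in two forms: the URI search from this answer and a roughly equivalent request body using a match query. It only constructs the strings and does not contact a cluster. The helper names `uri_search` and `match_body` are illustrative, and the index and type names are taken from the question.

```python
import json
from urllib.parse import urlencode

# Endpoint taken from the question; adjust host, index and type as needed.
BASE_URL = "http://localhost:9200/my_index/my_field/_search"


def uri_search(term: str, field: str = "address") -> str:
    """Build a URI search that targets `field` instead of the default _all field."""
    return BASE_URL + "?" + urlencode({"q": f"{field}:{term}"})


def match_body(term: str, field: str = "address") -> str:
    """Build a request body with a match query on `field`."""
    return json.dumps({"query": {"match": {field: term}}})


print(uri_search("test"))
print(match_body("test"))
```

Note that `urlencode` percent-encodes the colon, so the query string reads `q=address%3Atest`; this is the same query as `q=address:test` once decoded by the server.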

+0

It works. I am accepting this answer. – nerandell

+0

Awesome, glad I could help! – Val