Using autocomplete with Elasticsearch on email addresses is not working

I have the following field mapping defined:
"my_field": {
  "properties": {
    "address": {
      "type": "string",
      "analyzer": "email",
      "search_analyzer": "whitespace"
    }
  }
}
My email analyzer looks like this:
{
  "analysis": {
    "filter": {
      "email_filter": {
        "type": "edge_ngram",
        "min_gram": "3",
        "max_gram": "255"
      }
    },
    "analyzer": {
      "email": {
        "type": "custom",
        "filter": [
          "lowercase",
          "email_filter",
          "unique"
        ],
        "tokenizer": "uax_url_email"
      }
    }
  }
}
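To make the intent of this analyzer concrete, here is a minimal Python sketch of what the filter chain is expected to produce for a single address. It is only an approximation under stated assumptions: the function names are hypothetical, the sample address is a placeholder, and the `uax_url_email` tokenizer is assumed to keep a well-formed address as one token.

```python
def edge_ngrams(token, min_gram=3, max_gram=255):
    """Return the prefixes of `token` whose length lies in [min_gram, max_gram]."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def email_analyzer(text):
    """Rough stand-in for the `email` analyzer: lowercase, edge n-grams, unique."""
    # Assumption: the tokenizer emits the whole address as a single token,
    # so the full input string is treated as one token here.
    seen = []
    for gram in edge_ngrams(text.lower()):
        if gram not in seen:  # mimic the `unique` filter
            seen.append(gram)
    return seen


# Hypothetical placeholder address, not the one from the question.
tokens = email_analyzer("Jane.Doe@example.org")
print(tokens[:3], len(tokens))
```

With `min_gram` set to 3, every prefix of the address from three characters up to the full string should end up as an indexed term.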
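When I try to search for an email ID such as [email protected], searching for terms like tes, test.xy, etc. does not work. However, if I search for test.xyz or [email protected], it works fine. I tried analyzing the tokens with my email filter, and it works as expected.
For example, hitting http://localhost:9200/my_index/_analyze?analyzer=email&[email protected]
I get: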
{
  "tokens": [
    {"token": "tes", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "test", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "test.", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "test.x", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "test.xy", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "test.xyz", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "[email protected]", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "[email protected]", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "[email protected]", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "[email protected]", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "[email protected]", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "[email protected]", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "[email protected]", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "[email protected]", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "[email protected]", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "[email protected]", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "[email protected]", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0},
    {"token": "[email protected]", "start_offset": 0, "end_offset": 20, "type": "word", "position": 0}
  ]
}
So I know that the tokenization works. But when searching, partial strings are not matched.
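The matching I expect can be sketched in Python as follows. This is only an illustration under stated assumptions, not Elasticsearch code: the helper names and the sample address are hypothetical, the indexed terms are modeled as all prefixes of at least three characters, and the `whitespace` search analyzer is modeled as a plain split on whitespace with no lowercasing.

```python
def whitespace_analyzer(query):
    """Stand-in for the `whitespace` analyzer: split on whitespace only, no lowercasing."""
    return query.split()


def matches(indexed_terms, query):
    """True if any search-time term equals an indexed term exactly."""
    return any(term in indexed_terms for term in whitespace_analyzer(query))


# Hypothetical placeholder address; prefixes of length >= 3 mirror min_gram = 3.
address = "jane.doe@example.org"
indexed_terms = {address[:n] for n in range(3, len(address) + 1)}

for query in ["jan", "jane.do", "ja", "JAN"]:
    print(query, matches(indexed_terms, query))
```

Under this model a three-character prefix should match, a two-character one should not, and an uppercase query misses because the search side does not lowercase. The model only applies when the query is actually run against the `address` field.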
For example, requesting http://localhost:9200/my_index/my_field/_search?q=test returns no hits. My index details:
{
  "my_index": {
    "aliases": {
      "alias_default": {}
    },
    "mappings": {
      "my_field": {
        "properties": {
          "address": {
            "type": "string",
            "analyzer": "email",
            "search_analyzer": "whitespace"
          },
          "boost": {
            "type": "long"
          },
          "createdat": {
            "type": "date",
            "format": "strict_date_optional_time||epoch_millis"
          },
          "instanceid": {
            "type": "long"
          },
          "isdeleted": {
            "type": "integer"
          },
          "object": {
            "type": "string"
          },
          "objecthash": {
            "type": "string"
          },
          "objectid": {
            "type": "string"
          },
          "parent": {
            "type": "short"
          },
          "parentid": {
            "type": "integer"
          },
          "updatedat": {
            "type": "date",
            "format": "strict_date_optional_time||epoch_millis"
          }
        }
      }
    },
    "settings": {
      "index": {
        "creation_date": "1480342980403",
        "number_of_replicas": "1",
        "max_result_window": "100000",
        "uuid": "OUuiTma8CA2VNtw9Og",
        "analysis": {
          "filter": {
            "email_filter": {
              "type": "edge_ngram",
              "min_gram": "3",
              "max_gram": "255"
            },
            "autocomplete_filter": {
              "type": "edge_ngram",
              "min_gram": "3",
              "max_gram": "20"
            }
          },
          "analyzer": {
            "autocomplete": {
              "type": "custom",
              "filter": [
                "lowercase",
                "autocomplete_filter"
              ],
              "tokenizer": "standard"
            },
            "email": {
              "type": "custom",
              "filter": [
                "lowercase",
                "email_filter",
                "unique"
              ],
              "tokenizer": "uax_url_email"
            }
          }
        },
        "number_of_shards": "5",
        "version": {
          "created": "2010099"
        }
      }
    },
    "warmers": {}
  }
}
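You have the "search_analyzer": "whitespace" analyzer set for search. Remove it and redo the mapping. – Backtrack
@Backtrack I believe this is correct. Check http://stackoverflow.com/a/15932838/1465701. Unless I am missing something here, I think this should be the correct behavior. – nerandell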
Your mapping has a typo: 'analyser' should read 'analyzer'. – Val