Elasticsearch: using synonyms with a shingle filter

I have the following documents:
south africa
north africa
I want to retrieve my "south africa" document with each of the following queries:
- (a)
- southafrica (b)
- safrica (c)
I have defined the following filters and analyzers:
PUT test_index
{
  "settings": {
    "analysis": {
      "filter": {
        "synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "south,s",
            "north,n"
          ]
        },
        "shingle_filter": {
          "type": "shingle",
          "min_shingle_size": 2,
          "max_shingle_size": 3,
          "token_separator": ""
        }
      },
      "analyzer": {
        "my_shingle": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["shingle_filter"]
        },
        "my_shingle_synonym": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["shingle_filter", "synonym_filter"]
        },
        "my_synonym_shingle": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["synonym_filter", "shingle_filter"]
        }
      }
    }
  },
  "mappings": {}
}
1) With my_shingle, south africa will be indexed as: south, southafrica, africa
2) With my_shingle_synonym, south africa will be indexed as: south, s, southafrica, africa
3) With my_synonym_shingle, south africa will be indexed as: south, souths, southsafrica, s, safrica, africa
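To make the effect of filter order easier to follow, here is a minimal Python sketch that reproduces the three token lists above. It is a simplified model, not Lucene's implementation: it assumes a whitespace split in place of the standard tokenizer, treats the token stream as a flat list (positions are ignored), and assumes the shingle filter also emits unigrams. The names `synonym_filter`, `shingle_filter` and `analyze` are hypothetical helpers that only mirror the names used in the settings.

```python
from typing import Callable, Dict, List

# Simplified stand-in for the "south,s" and "north,n" synonym rules.
SYNONYMS: Dict[str, List[str]] = {
    "south": ["s"], "s": ["south"],
    "north": ["n"], "n": ["north"],
}


def synonym_filter(tokens: List[str]) -> List[str]:
    """Emit each token followed by its synonyms (positions are ignored)."""
    out: List[str] = []
    for tok in tokens:
        out.append(tok)
        out.extend(SYNONYMS.get(tok, []))
    return out


def shingle_filter(tokens: List[str], min_size: int = 2, max_size: int = 3,
                   separator: str = "") -> List[str]:
    """Emit each unigram, then the shingles of min_size..max_size starting at it."""
    out: List[str] = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        for size in range(min_size, max_size + 1):
            if i + size <= len(tokens):
                out.append(separator.join(tokens[i:i + size]))
    return out


def analyze(text: str, filters: List[Callable[[List[str]], List[str]]]) -> List[str]:
    """Split on whitespace, then apply the filters in the given order."""
    tokens = text.split()
    for f in filters:
        tokens = f(tokens)
    return tokens


print(analyze("south africa", [shingle_filter]))                  # my_shingle
print(analyze("south africa", [shingle_filter, synonym_filter]))  # my_shingle_synonym
print(analyze("south africa", [synonym_filter, shingle_filter]))  # my_synonym_shingle
```

In this model, the third chain produces souths and southsafrica because the shingle step sees south, s, africa as three consecutive tokens rather than as two alternatives at the same position.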
Therefore, with:
(1) I will only find b
(2) I will find a, b
(3) I will find a, c
I want south africa to be indexed as: south, s, southafrica, safrica, africa
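To pin down the desired behaviour precisely, the following Python sketch builds that token list by treating synonyms as alternatives at the same position: it enumerates every synonym path through the text and shingles each path separately. This only illustrates the target output and is not an Elasticsearch configuration. It assumes single-word synonyms, a whitespace split in place of the standard tokenizer, and hypothetical helper names (`synonym_paths`, `path_aware_shingles`).

```python
from itertools import product
from typing import Dict, List

# Simplified stand-in for the "south,s" and "north,n" synonym rules.
SYNONYMS: Dict[str, List[str]] = {"south": ["s"], "north": ["n"]}


def synonym_paths(tokens: List[str]) -> List[List[str]]:
    """Return every variant of the token list with synonyms substituted in place."""
    alternatives = [[tok] + SYNONYMS.get(tok, []) for tok in tokens]
    return [list(path) for path in product(*alternatives)]


def path_aware_shingles(text: str, min_size: int = 2, max_size: int = 3,
                        separator: str = "") -> List[str]:
    """Collect unigrams and shingles from each synonym path, without duplicates."""
    tokens = text.split()
    paths = synonym_paths(tokens)
    sizes = [1] + list(range(min_size, max_size + 1))
    out: List[str] = []
    for start in range(len(tokens)):
        for size in sizes:
            if start + size > len(tokens):
                continue
            for path in paths:
                term = separator.join(path[start:start + size])
                if term not in out:  # keep the first occurrence only
                    out.append(term)
    return out


print(path_aware_shingles("south africa"))
```

Because each path is shingled on its own, south and s never end up in the same shingle, so terms such as souths and southsafrica do not appear.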