2014-02-28 135 views
4

Unexpected (case-insensitive) string sorting in Elasticsearch

I have a list of console platforms that I am sorting in Elasticsearch.

Here is the mapping for the "name" field:

{ 
    "name": { 
     "type": "multi_field", 
     "fields": { 
      "name": { 
       "type": "string", 
       "index": "analyzed" 
      }, 
      "sort_name": { 
       "type": "string", 
       "index": "not_analyzed" 
      } 
     } 
    } 
} 

When I run the following query:

{
    "query": {
        "match_all": {}
    },
    "sort": [
        {
            "name.sort_name": { "order": "asc" }
        }
    ],
    "fields": ["name"]
}

I get these results:

{ 
    "took": 1, 
    "timed_out": false, 
    "_shards": { 
     "total": 3, 
     "successful": 3, 
     "failed": 0 
    }, 
    "hits": { 
     "total": 17, 
     "max_score": null, 
     "hits": [ 
      { 
       "_index": "platforms", 
       "_type": "platform", 
       "_id": "1393602489", 
       "_score": null, 
       "fields": { 
        "name": "GameCube" 
       }, 
       "sort": [ 
        "GameCube" 
       ] 
      }, 
      { 
       "_index": "platforms", 
       "_type": "platform", 
       "_id": "1393602490", 
       "_score": null, 
       "fields": { 
        "name": "Gameboy Advance" 
       }, 
       "sort": [ 
        "Gameboy Advance" 
       ] 
      },
      {
       "_index": "platforms",
       "_type": "platform",
       "_id": "1393602498",
       "_score": null,
       "fields": {
        "name": "Nintendo 3DS"
       },
       "sort": [
        "Nintendo 3DS"
       ]
      },

      ... removed for brevity ...

      {
       "_index": "platforms",
       "_type": "platform",
       "_id": "1393602493",
       "_score": null,
       "fields": {
        "name": "Xbox 360"
       },
       "sort": [
        "Xbox 360"
       ]
      },
      {
       "_index": "platforms",
       "_type": "platform",
       "_id": "1393602502",
       "_score": null,
       "fields": {
        "name": "Xbox One"
       },
       "sort": [
        "Xbox One"
       ]
      },
      {
       "_index": "platforms",
       "_type": "platform",
       "_id": "1393602497",
       "_score": null,
       "fields": {
        "name": "iPhone/iPod"
       },
       "sort": [
        "iPhone/iPod"
       ]
      }
     ]
    }
}

Everything is as expected, except that the iPhone/iPod result comes last (rather than after Gameboy Advance). Why does the / in the name affect the sort order?

Thanks

Answers

15

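OK, so I found out that the cause has nothing to do with the /. ES sorts uppercase letters before lowercase ones, because a not_analyzed field is ordered by its raw term values, which is why iPhone/iPod ends up last. To fix this, I added 'analyzer': 'sortable' to the sort_name field of the multi-field mapping, where sortable is this custom analyzer: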

{ 
    "analysis": { 
     "analyzer": { 
      "sortable": { 
       "tokenizer": "keyword", 
       "filter": [ 
        "lowercase" 
       ] 
      } 
     } 
    } 
} 
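
To see why an analyzer like this changes the order, here is a minimal Python sketch. It is only an approximation: `sortable_key` is a hypothetical helper that mimics a keyword tokenizer followed by a lowercase filter (the whole string kept as one token, then lowercased), and Python's default string comparison stands in for the term ordering of a not_analyzed field. The platform names are the ones visible in the truncated results from the question.

```python
# Platform names visible in the (truncated) results from the question.
names = [
    "Xbox One",
    "iPhone/iPod",
    "GameCube",
    "Nintendo 3DS",
    "Gameboy Advance",
    "Xbox 360",
]


def sortable_key(name: str) -> str:
    """Hypothetical stand-in for the "sortable" analyzer.

    The keyword tokenizer keeps the whole value as a single token and the
    lowercase filter lowercases it; str.lower() approximates that for these
    ASCII names.
    """
    return name.lower()


# Raw comparison: every uppercase ASCII letter sorts before any lowercase one,
# so "GameCube" precedes "Gameboy Advance" and "iPhone/iPod" comes last.
raw_order = sorted(names)

# Comparison on the lowercased key: case no longer affects the order.
folded_order = sorted(names, key=sortable_key)

print(raw_order)
print(folded_order)
```

The raw order reproduces the result in the question, while the folded order places iPhone/iPod directly after the two Game* entries.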

Then I added the custom analyzer to the settings used at index creation.
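
Putting the two pieces together, the request body for creating the index might look like the following sketch, written as a Python dict and serialized to JSON. Several things here are assumptions rather than details confirmed by the answer: `build_index_body` is a hypothetical helper, the `platform` type name is taken from the query results, the legacy `multi_field` syntax is kept from the question, and `sort_name` is switched to `"index": "analyzed"` on the assumption that an analyzer is only applied to analyzed fields.

```python
import json


def build_index_body() -> dict:
    """Hypothetical helper: combine the analyzer settings with the mapping."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    # Custom analyzer from the answer: one token, lowercased.
                    "sortable": {
                        "tokenizer": "keyword",
                        "filter": ["lowercase"],
                    }
                }
            }
        },
        "mappings": {
            "platform": {  # type name assumed from "_type" in the results
                "properties": {
                    "name": {
                        "type": "multi_field",
                        "fields": {
                            "name": {"type": "string", "index": "analyzed"},
                            "sort_name": {
                                "type": "string",
                                # Assumption: analyzed, so the analyzer runs.
                                "index": "analyzed",
                                "analyzer": "sortable",
                            },
                        },
                    }
                }
            }
        },
    }


body = build_index_body()
print(json.dumps(body, indent=2))
```

The query from the question can stay unchanged, since it already sorts on name.sort_name.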

+0

Is this really the simplest way to get case-insensitive sorting? –

+0

This works, but it is slow: results take 3 seconds when sorting ascending and 15 seconds when sorting descending! – danday74
