I am migrating Elasticsearch production data from version 1.4.3 to 5.5, and I am using reindex for this. When I try to reindex the old ES index into the new ES index, indexing fails with the exception: Failed Reason: mapper [THROUGHPUT_ROWS_PER_SEC] cannot be changed from type [long] to [float]. Failed Type: illegal_argument_exception
Elasticsearch data not matching the mapping
Mapping for the task_history index in ES 1.4.3:
{
  "task_history": {
    "mappings": {
      "task_run_hist": {
        "_all": {
          "enabled": false
        },
        "_routing": {
          "required": true,
          "path": "org_id"
        },
        "properties": {
          "RUN_TIME_IN_MINS": {
            "type": "double"
          },
          "THROUGHPUT_ROWS_PER_SEC": {
            "type": "long"
          },
          "account_name": {
            "type": "string",
            "index": "not_analyzed",
            "store": true
          }
        }
      }
    }
  }
}
Mapping for the task_history index in ES 5.5 (this mapping was created as part of reindexing):
{
  "task_history": {
    "mappings": {
      "task_run_hist": {
        "_all": {
          "enabled": false
        },
        "_routing": {
          "required": true
        },
        "properties": {
          "RUN_TIME_IN_MINS": {
            "type": "float"
          },
          "THROUGHPUT_ROWS_PER_SEC": {
            "type": "long"
          },
          "account_name": {
            "type": "keyword",
            "store": true
          }
        }
      }
    }
  }
}
Sample data:
{
  "_index": "task_history",
  "_type": "task_run_hist",
  "_id": "1421955143",
  "_score": 1,
  "_source": {
    "RUN_TIME_IN_MINS": 0.47,
    "THROUGHPUT_ROWS_PER_SEC": 46,
    "org_id": "xxxxxx",
    "account_name": "Soma Acc1"
  }
},
{
  "_index": "task_history",
  "_type": "task_run_hist",
  "_id": "1421943738",
  "_score": 1,
  "_source": {
    "RUN_TIME_IN_MINS": 1.02,
    "THROUGHPUT_ROWS_PER_SEC": 65.28,
    "org_id": "yyyyyy",
    "account_name": "Choma Acc1"
  }
}
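2 questions:
- How is Elasticsearch 1.4.3 storing floating-point numbers when THROUGHPUT_ROWS_PER_SEC is mapped as type long?
- If this is a data problem in the old ES, how can I remove all the floating-point values before starting the reindex process?
For the second option, I wanted to list the affected documents with the query below, so that I could verify them once and then delete every document that has a floating-point value. However, the query below still lists documents whose THROUGHPUT_ROWS_PER_SEC is not a floating-point number.
Note: Groovy scripting is enabled.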
GET task_history/task_run_hist/_search?size=100
{
  "filter": {
    "script": {
      "script": "doc['THROUGHPUT_ROWS_PER_SEC'].value % 1 == 0"
    }
  }
}
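A likely reason the query above matches everything: doc['THROUGHPUT_ROWS_PER_SEC'] reads the indexed value, and because the field is mapped as long, a value such as 65.28 is coerced to a whole number when it is indexed. The original 65.28 survives only in _source, which is why search hits still show it. The check `% 1 == 0` is therefore true for every document (and it selects whole numbers, the opposite of what is wanted). A minimal sketch that reads _source instead, assuming ES 1.4.3 with Groovy scripting enabled and that a null guard is wanted for documents lacking the field:

```
# Sketch only: lists documents whose original THROUGHPUT_ROWS_PER_SEC has a fractional part
GET task_history/task_run_hist/_search?size=100
{
  "filter": {
    "script": {
      "script": "_source.THROUGHPUT_ROWS_PER_SEC != null && _source.THROUGHPUT_ROWS_PER_SEC % 1 != 0"
    }
  }
}
```

Reading _source in a script loads and parses each document, so this is expected to be much slower than a doc-values script on an index of this size.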
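Update: when I try the solution provided by Val with the reindex script below, I get a runtime error, listed below. Any clue about what is going wrong here? I added the extra statement that casts RUN_TIME_IN_MINS to float because the original script failed on the RUN_TIME_IN_MINS field with the error "mapper [RUN_TIME_IN_MINS] cannot be changed from type [long] to [float]".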
POST _reindex?wait_for_completion=false
{
  "source": {
    "remote": {
      "host": "http://esip:15000"
    },
    "index": "task_history"
  },
  "dest": {
    "index": "task_history"
  },
  "script": {
    "inline": "if (ctx._source.THROUGHPUT_ROWS_PER_SEC % 1 != 0) { ctx.op = 'noop' } ctx._source.RUN_TIME_IN_MINS = (float) ctx._source.RUN_TIME_IN_MINS;",
    "lang": "painless"
  }
}
Runtime error:
{
  "completed": true,
  "task": {
    "node": "wZOzypYlSayIRlhp9y3lVA",
    "id": 645528,
    "type": "transport",
    "action": "indices:data/write/reindex",
    "status": {
      "total": 18249521,
      "updated": 4691,
      "created": 181721,
      "deleted": 0,
      "batches": 37,
      "version_conflicts": 0,
      "noops": 67076,
      "retries": {
        "bulk": 0,
        "search": 0
      },
      "throttled_millis": 0,
      "requests_per_second": -1,
      "throttled_until_millis": 0
    },
    "description": """
reindex from [host=esip port=15000 query={
"match_all" : {
"boost" : 1.0
}
}][task_history] updated with Script{type=inline, lang='painless', idOrCode='if (ctx._source.THROUGHPUT_ROWS_PER_SEC % 1 != 0) { ctx.op = 'noop' } ctx._source.RUN_TIME_IN_MINS = (float) ctx._source.RUN_TIME_IN_MINS;', options={}, params={}} to [task_history]
""",
    "start_time_in_millis": 1502336063507,
    "running_time_in_nanos": 93094657751,
    "cancellable": true
  },
  "error": {
    "type": "script_exception",
    "reason": "runtime error",
    "script_stack": [],
    "script": "if (ctx._source.THROUGHPUT_ROWS_PER_SEC % 1 != 0) { ctx.op = 'noop' } ctx._source.RUN_TIME_IN_MINS = (float) ctx._source.RUN_TIME_IN_MINS;",
    "lang": "painless",
    "caused_by": {
      "type": "null_pointer_exception",
      "reason": null
    }
  }
}
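The null_pointer_exception would be consistent with the script reaching a document in which THROUGHPUT_ROWS_PER_SEC or RUN_TIME_IN_MINS is missing or null, since both `% 1` and the `(float)` cast fail on null. That is an assumption: the response does not say which document triggered it. A null-guarded variant of the same script might look like this, with the local names t and r introduced here purely for illustration:

```
# Sketch only: same reindex request, with each field checked for null before it is used
POST _reindex?wait_for_completion=false
{
  "source": {
    "remote": {
      "host": "http://esip:15000"
    },
    "index": "task_history"
  },
  "dest": {
    "index": "task_history"
  },
  "script": {
    "inline": "def t = ctx._source.THROUGHPUT_ROWS_PER_SEC; if (t != null && t % 1 != 0) { ctx.op = 'noop'; } def r = ctx._source.RUN_TIME_IN_MINS; if (r != null) { ctx._source.RUN_TIME_IN_MINS = (float) r; }",
    "lang": "painless"
  }
}
```

As in the original script, documents with a fractional THROUGHPUT_ROWS_PER_SEC are skipped rather than copied, so they would be absent from the new index.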
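It is quite possible that the first document you created in ES 1.x had a long value (see '"THROUGHPUT_ROWS_PER_SEC": 46') and the mapping was created on that basis. All subsequent values (floating-point or not) were then coerced to long. You need to create the mapping in ES 5 before launching the reindex process. – Val
@Val: In that case, documents with floating-point numbers will throw an exception and stop the reindex process, and the mapping is correct. It has to be of type 'long'. – abi1964
You obviously need to set 'double' in your ES 5.x mapping to accommodate your different values – Val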
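Val's suggestion could be sketched as the request below, run against the ES 5.5 cluster before the reindex is started. It assumes the task_history index does not exist there yet; an index auto-created by an earlier reindex attempt would have to be deleted first, because the type of an existing field cannot be changed in place. Only the fields shown in the question are listed, and mapping RUN_TIME_IN_MINS as double (mirroring the 1.4.3 mapping) is a choice made here, not something stated in the thread.

```
# Sketch only: create the destination index with explicit numeric types before reindexing
PUT task_history
{
  "mappings": {
    "task_run_hist": {
      "_all": {
        "enabled": false
      },
      "_routing": {
        "required": true
      },
      "properties": {
        "RUN_TIME_IN_MINS": {
          "type": "double"
        },
        "THROUGHPUT_ROWS_PER_SEC": {
          "type": "double"
        },
        "account_name": {
          "type": "keyword",
          "store": true
        }
      }
    }
  }
}
```

With both numeric fields mapped as double, whole and fractional values can be indexed alike, so the reindex should no longer need a script to skip or cast them.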