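Elasticsearch Python client: indexing JSON

I've run into a problem using the Elasticsearch Python client. I have a file named test.json that contains (valid!) JSON, and I now want to index that JSON in Elasticsearch. I followed this little tutorial to check whether I can connect to my local Elasticsearch instance, and it worked, so I believe the problem is not my connection to Elasticsearch.
When I run my little piece of code here: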
from elasticsearch import Elasticsearch
import json
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
with open('test.json') as json_data:
    es.index(index='testdata', doc_type='generated', id=1, body=json.load(json_data))
I get this exception (mapper_parsing_exception?) on my command line:
Traceback (most recent call last):
  File "app.py", line 13, in <module>
    es.index(index='testdata', doc_type='generated', id=1, body=json.load(json_data))
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 73, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 300, in index
    _make_path(index, doc_type, id), params=params, body=body)
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 318, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 128, in perform_request
    self._raise_error(response.status, raw_data)
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 124, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: TransportError(400, u'mapper_parsing_exception', u'failed to parse')
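Can you point me in the right direction as to what the problem might be?
Oh, and yes: I printed json.load(json_data) and it worked perfectly, which means loading the JSON from the file is not the problem.
Thanks for your help! Greez
Update: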
with open('test.json') as json_data:
    #d = json.load(json_data)
    print(json_data)
    es.index(index='testdata', doc_type='generated', id=1, body=json_data)
This code doesn't work either; I can't even print the JSON on the command line. The error is now:
<open file 'test.json', mode 'r' at 0x7f8329340c00>
Traceback (most recent call last):
  File "app.py", line 14, in <module>
    es.index(index='testdata', doc_type='generated', id=1, body=json_data)
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 73, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 300, in index
    _make_path(index, doc_type, id), params=params, body=body)
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 284, in perform_request
    body = self.serializer.dumps(body)
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/serializer.py", line 50, in dumps
    raise SerializationError(data, e)
elasticsearch.exceptions.SerializationError: (<closed file 'test.json', mode 'r' at 0x7f8329340c00>, TypeError("Unable to serialize <open file 'test.json', mode 'r' at 0x7f8329340c00> (type: <type 'file'>)",))
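The SerializationError above is consistent with the file object itself being passed as body: the client has to serialize the body to JSON, and a file handle is not JSON-serializable, whereas the parsed content is. A minimal sketch using only the standard json module, with no Elasticsearch involved; the io.StringIO object and its shortened content are stand-ins for the opened test.json, and the helper name is_json_serializable is hypothetical:

```python
import io
import json

# Stand-in for open('test.json'); the content is a shortened, hypothetical sample.
json_data = io.StringIO(u'[{"name": "Liliana Odom", "age": 31}]')


def is_json_serializable(obj):
    """Return True if the standard json module can serialize obj."""
    try:
        json.dumps(obj)
        return True
    except TypeError:
        return False


# The file-like object itself cannot be serialized ...
file_ok = is_json_serializable(json_data)

# ... but the data parsed from it can.
parsed = json.load(json_data)
parsed_ok = is_json_serializable(parsed)
```

Here file_ok ends up False and parsed_ok ends up True, which suggests the body should be the parsed data (or a JSON string), not the open file.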
This is the content of the test.json file (just some randomly generated JSON):
[
  {
    "_id": "58ee19e75ffc814d4dff17da",
    "index": 0,
    "guid": "45476739-80b3-49de-8f00-9923f84f56ce",
    "isActive": true,
    "balance": "$2,882.08",
    "picture": "http://placehold.it/32x32",
    "age": 31,
    "eyeColor": "blue",
    "name": "Liliana Odom",
    "gender": "female",
    "company": "PLASTO",
    "email": "[email protected]",
    "phone": "+1 (983) 474-3785",
    "address": "121 Sedgwick Place, Farmington, Marshall Islands, 2593",
    "about": "Adipisicing veniam ex nulla irure minim incididunt et irure est nostrud ex ut. Occaecat eu proident eu pariatur deserunt aliquip. Commodo ullamco incididunt consequat quis commodo irure elit quis. Aute et reprehenderit ad ipsum magna cupidatat magna minim sunt labore mollit occaecat. Dolore sint veniam deserunt excepteur.",
    "registered": "2015-05-07T05:40:28 -02:00",
    "latitude": -46.141522,
    "longitude": -157.943368,
    "tags": [
      "labore",
      "quis"
    ],
    "friends": [
      {
        "id": 0,
        "name": "Earline Bass"
      }
    ],
    "greeting": "Hello, Liliana Odom! You have 5 unread messages.",
    "favoriteFruit": "apple"
  }
]
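Note that the top level of this file is a JSON array, not a JSON object. The index call stores one document per request, and a document is expected to be a single JSON object, so sending the whole array as body may be what triggers the first mapper_parsing_exception. A small sketch, under that assumption, that normalizes parsed JSON into a list of documents; the helper name as_documents and the shortened sample are hypothetical:

```python
import json


def as_documents(parsed):
    """Return a list of dict documents from parsed JSON (an object or an array of objects)."""
    if isinstance(parsed, dict):
        return [parsed]
    if isinstance(parsed, list) and all(isinstance(item, dict) for item in parsed):
        return parsed
    raise ValueError("expected a JSON object or an array of JSON objects")


# Shortened stand-in for the content of test.json.
sample = json.loads('[{"index": 0, "name": "Liliana Odom"}]')
docs = as_documents(sample)
```

Each element of docs can then be indexed on its own, one call per document.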
Update 2:
I'm trying this now:
id = 1
with open('test.json') as json_data:
    data = json.load(json_data)
for dat in data:
    print(json.dumps(dat))
    es.index(index='testdata', doc_type='generated', id=id, body=json.dumps(dat))
    id += 1
print(json.dumps(dat)) works, but I now get an illegal_argument_exception:
Traceback (most recent call last):
  File "app.py", line 15, in <module>
    es.index(index='testdata', doc_type='generated', id=id, body=json.dumps(dat))
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 73, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 300, in index
    _make_path(index, doc_type, id), params=params, body=body)
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 318, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 128, in perform_request
    self._raise_error(response.status, raw_data)
  File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 124, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: TransportError(400, u'illegal_argument_exception', u'[Bloodstorm][127.0.0.1:9300][indices:data/write/index[p]]')
Update 3: Here is the ES log. It looks like the _id field is defined twice for the index.
[2017-04-12 17:43:07,847][DEBUG][action.index ] [Bloodstorm] failed to execute [index {[testdata][generated][AVti1SY7fn4azWzi8gyQ], source[{"guid": "45476739-80b3-49de-8f00-9923f84f56ce", "index": 0, "favoriteFruit": "apple", "latitude": -46.141522, "company": "PLASTO", "email": "[email protected]", "picture": "http://placehold.it/32x32", "tags": ["labore", "quis"], "registered": "2015-05-07T05:40:28 -02:00", "eyeColor": "blue", "phone": "+1 (983) 474-3785", "address": "121 Sedgwick Place, Farmington, Marshall Islands, 2593", "friends": [{"id": 0, "name": "Earline Bass"}], "isActive": true, "about": "Adipisicing veniam ex nulla irure minim incididunt et irure est nostrud ex ut. Occaecat eu proident eu pariatur deserunt aliquip. Commodo ullamco incididunt consequat quis commodo irure elit quis. Aute et reprehenderit ad ipsum magna cupidatat magna minim sunt labore mollit occaecat. Dolore sint veniam deserunt excepteur.", "balance": "$2,882.08", "name": "Liliana Odom", "gender": "female", "age": 31, "greeting": "Hello, Liliana Odom! You have 5 unread messages.", "longitude": -157.943368, "_id": "58ee19e75ffc814d4dff17da"}]}] on [[testdata][3]]
java.lang.IllegalArgumentException: Field [_id] is defined twice in [generated]
    at org.elasticsearch.index.mapper.MapperService.checkFieldUniqueness(MapperService.java:496)
    at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:376)
    at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:320)
    at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.applyRequest(MetaDataMappingService.java:306)
    at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.execute(MetaDataMappingService.java:230)
    at org.elasticsearch.cluster.service.InternalClusterService.runTasksForExecutor(InternalClusterService.java:480)
    at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:784)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
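The log suggests the clash comes from the "_id" key inside the document source: _id is also the name of Elasticsearch's own document-ID metadata field. One possible workaround, sketched under that assumption, is to take "_id" out of the body and pass its value as the id argument instead. The helper name split_id and the shortened sample are hypothetical; the es.index call in the trailing comment mirrors the one in the question and is not executed here.

```python
import json


def split_id(doc, id_key="_id"):
    """Return (doc_id, body), where body is a copy of doc without the id_key entry."""
    body = dict(doc)                 # shallow copy, so the caller's dict is left untouched
    doc_id = body.pop(id_key, None)  # None if the document carries no such key
    return doc_id, body


# Shortened stand-in for the content of test.json.
sample = json.loads(
    '[{"_id": "58ee19e75ffc814d4dff17da", "index": 0, "name": "Liliana Odom"}]'
)
prepared = [split_id(doc) for doc in sample]

# Assumed usage with the client from the question (not run here):
# for doc_id, body in prepared:
#     es.index(index='testdata', doc_type='generated', id=doc_id, body=body)
```

Renaming the key (for example to a hypothetical "source_id") instead of dropping it would be an alternative if the value should stay searchable as a regular field.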
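It seems that 'with open('test.json') as json_data: #d = json.load(json_data) print(json_data) es.index(index='testdata', doc_type='generated', id=1, body=json_data)' gives me this new error: 'elasticsearch.exceptions.SerializationError: ((type: ))'. It seems backticks don't work for marking inline code. – PouletFreak
You should update your question with that error so it's clearer. Could you also share the contents of your 'test.json' file? – Val
Sorry, I'm fairly new here ;-), I've updated my question. – PouletFreak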