2016-07-25 80 views

Elasticsearch: delay between storing a document and searching for it immediately

I am using Python with the elasticsearch-dsl driver.

My script is below.

import logging
import time
from elasticsearch_dsl import DocType, String
from elasticsearch import exceptions as es_exceptions
from elasticsearch_dsl.connections import connections

# Logger used by the error handlers below
LOGGER = logging.getLogger(__name__)

ELASTICSEARCH_INDEX = 'test' 

class StudentDoc(DocType):
    student_id = String(required=True)
    tags = String(null_value=[])

    class Meta:
        index = ELASTICSEARCH_INDEX

    def save(self, **kwargs):
        '''
        Override to set metadata id
        '''
        self.meta.id = self.student_id
        return super(StudentDoc, self).save(**kwargs)

# Define a default Elasticsearch client 
connections.create_connection(hosts=['localhost:9200']) 

# create the mappings in elasticsearch 
StudentDoc.init() 

student_doc_obj = \ 
    StudentDoc(
     student_id=str(1), 
     tags=['test']) 

try: 
    student_doc_obj.save() 
except es_exceptions.SerializationError as ex: 
    # log serialization errors raised by elasticsearch while saving
    LOGGER.error('Error while creating elasticsearch data') 
    LOGGER.exception(ex) 
else: 
    print "*"*80 
    print "Student Created:", student_doc_obj 
    print "*"*80 


search_docs = \ 
    StudentDoc \ 
    .search().query('ids', 
        values=["1"]) 
try: 
    student_docs = search_docs.execute() 
except es_exceptions.NotFoundError as ex: 
    LOGGER.error('Unable to get data from elasticsearch') 
    LOGGER.exception(ex) 
else: 
    print "$"*80 
    print student_docs 
    print "$"*80 

time.sleep(2) 

search_docs = \ 
    StudentDoc \ 
    .search().query('ids', 
        values=["1"]) 
try: 
    student_docs = search_docs.execute() 
except es_exceptions.NotFoundError as ex: 
    LOGGER.error('Unable to get data from elasticsearch') 
    LOGGER.exception(ex) 
else: 
    print "$"*80 
    print student_docs 
    print "$"*80 

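In this script I create a StudentDoc and then try to access the same document right after creating it. The search for that record returns an empty response.

Output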

******************************************************************************** 
Student Created: {'student_id': '1', 'tags': ['test']} 
******************************************************************************** 
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ 
<Response: []> 
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ 
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ 
<Response: [{u'student_id': u'1', u'tags': [u'test']}]> 
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ 

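The save command executes and stores the data, so why doesn't the search return that data? After a 2 second sleep, it returns the data. :(

I tried the same thing with curl commands and got the same output.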

echo "Create Data" 
curl http://localhost:9200/test/student_doc/2 -X PUT -d '{"student_id": "2", "tags": ["test"]}' -H 'Content-type: application/json' 

echo 
echo "Search ID" 
curl http://localhost:9200/test/student_doc/_search -X POST -d '{"query": {"ids": {"values": ["2"]}}}' -H 'Content-type: application/json' 
echo 

Is there any delay when storing data in Elasticsearch?

Answer


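Yes. Once you index a new document, it is not available to search until the index is refreshed, which by default happens about once per second. You have a few options though, the main ones being:

A. You can refresh the test index using the underlying connection, after saving student_doc_obj and before searching for it: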

connections.get_connection().indices.refresh(index=ELASTICSEARCH_INDEX)

B. You can get the document instead of searching for it, since a get is completely realtime and does not need to wait for the refresh:

student_docs = StudentDoc.get("1") 
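
Options A and B could be wrapped in two small helpers, roughly as sketched here. The helper names save_then_refresh and fetch_by_id are hypothetical (they are not part of elasticsearch-dsl); the sketch only assumes the objects passed in expose save(), indices.refresh(index=...) and a get() classmethod, as the snippets above do.

```python
def save_then_refresh(doc, client, index):
    """Option A: save the document, then force an index refresh.

    `doc` is assumed to expose save(), and `client` is assumed to be the
    low-level client, e.g. connections.get_connection().
    """
    doc.save()
    # After this call returns, a search on `index` can see the document.
    client.indices.refresh(index=index)


def fetch_by_id(doc_class, doc_id):
    """Option B: realtime get by id, which does not wait for a refresh."""
    return doc_class.get(doc_id)


# Possible usage with the names from the question (needs a running cluster):
#   save_then_refresh(student_doc_obj, connections.get_connection(),
#                     ELASTICSEARCH_INDEX)
#   student_doc = fetch_by_id(StudentDoc, "1")
```

Forcing a refresh on every write costs indexing throughput, so option B is usually the lighter choice when the id is already known. The comment at the end of this thread reports that passing refresh=True to save() also worked.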

Similarly, using curl, you can simply add the refresh query string parameter to your PUT call:

echo "Create Data" 
curl 'http://localhost:9200/test/student_doc/2?refresh=true' -X PUT -d '{"student_id": "2", "tags": ["test"]}' -H 'Content-type: application/json' 

Or you can simply get the document by id:

echo "GET ID" 
curl -XGET http://localhost:9200/test/student_doc/2 
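
To see why the first search in the question comes back empty while a get would succeed, here is a toy in-memory model of the behaviour described above. It is only an illustration: ToyIndex and its methods are invented for this sketch and do not reflect how Elasticsearch is implemented internally.

```python
class ToyIndex(object):
    """Toy model of near-real-time search; NOT Elasticsearch's implementation."""

    def __init__(self):
        self._pending = {}     # indexed, but not yet visible to search
        self._searchable = {}  # visible to search after a refresh

    def index(self, doc_id, source, refresh=False):
        """Store a document; optionally refresh right away (like ?refresh=true)."""
        self._pending[doc_id] = source
        if refresh:
            self.refresh()

    def refresh(self):
        """Make every pending document visible to search."""
        self._searchable.update(self._pending)
        self._pending.clear()

    def get(self, doc_id):
        """Realtime lookup by id: sees pending documents too."""
        if doc_id in self._pending:
            return self._pending[doc_id]
        return self._searchable[doc_id]

    def search_ids(self, values):
        """Search by ids: only sees documents that have been refreshed."""
        return [self._searchable[v] for v in values if v in self._searchable]


toy = ToyIndex()
toy.index("1", {"student_id": "1", "tags": ["test"]})
first_search = toy.search_ids(["1"])   # nothing yet: no refresh has happened
realtime_get = toy.get("1")            # found: get does not wait for a refresh
toy.refresh()                          # stands in for the periodic refresh
second_search = toy.search_ids(["1"])  # the document is now visible
```

In this model, the 2 second sleep in the question simply gives the periodic refresh time to run before the second search.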

Thanks @val, it works. I changed my code to 'save(refresh=True)' and it refreshes the index. – Nilesh


Glad it worked! – Val