Django: how to improve Haystack search speed

I want to create a search engine in my Django environment for a simple data structure:

| id       | company name      |
|:---------|------------------:|
| 12345678 | company A's name  |
| 12345687 | peoples pizza a/s |
| 87654321 | sub's for pugs    |

There will be a large number of companies, and I only want to search by name. When a name is found, its ID is returned to my Django app.
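To make the requirement concrete, here is a minimal sketch of the lookup I am after (name fragment in, IDs out), written against an in-memory SQLite table instead of the real Django model. The table name `company`, the helper `find_ids` and the `limit` of 8 are only illustrative assumptions. This is a naive substring scan used as a baseline, not what Haystack or Elasticsearch do internally.

```python
import sqlite3

# Hypothetical in-memory table mirroring the two columns shown above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO company (id, name) VALUES (?, ?)",
    [
        (12345678, "company A's name"),
        (12345687, "peoples pizza a/s"),
        (87654321, "sub's for pugs"),
    ],
)


def find_ids(connection, fragment, limit=8):
    """Return the IDs of companies whose name contains `fragment`."""
    # Escape LIKE wildcards so that user input is matched literally.
    escaped = (
        fragment.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = connection.execute(
        "SELECT id FROM company WHERE name LIKE ? ESCAPE '\\' ORDER BY id LIMIT ?",
        ("%" + escaped + "%", limit),
    )
    return [row[0] for row in rows]
```

For example, `find_ids(conn, "pizza")` gives back the ID of "peoples pizza a/s". A leading-wildcard `LIKE` like this has to scan every row, which is the reason for looking at a search index in the first place.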

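I tried Haystack, Whoosh and the like, but I keep getting very slow search results with every setup as my test data set grows from 500 to 800,000 entries. A search sometimes takes almost an hour.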

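I am on the Heroku PaaS, so I thought I would try an integrated paid service (Searchly's Elasticsearch implementation). That helped, but once I reached about 80,000 companies it started to get very slow again.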

Installed apps

INSTALLED_APPS = [ 
    'django.contrib.admin', 
    'django.contrib.auth', 
    'django.contrib.contenttypes', 
    'django.contrib.sessions', 
    'django.contrib.sites', 

    # Added. 
    'haystack', 

    # Then your usual apps... 
]

More settings.py

import os

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2

es = urlparse(os.environ.get('SEARCHBOX_URL') or 'http://127.0.0.1:9200/')

port = es.port or 80

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': es.scheme + '://' + es.hostname + ':' + str(port),
        'INDEX_NAME': 'documents',
    },
}

if es.username:
    HAYSTACK_CONNECTIONS['default']['KWARGS'] = {"http_auth": es.username + ':' + es.password}
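To show what the settings above end up with, here is a small sketch that applies the same URL parsing to a made-up address. The URL `https://user:secret@search.example.com:9243/` and the helper name `connection_from_url` are assumptions for illustration only; the real value comes from the `SEARCHBOX_URL` environment variable.

```python
from urllib.parse import urlparse  # on Python 2: from urlparse import urlparse

# Hypothetical URL in the shape a hosted Elasticsearch add-on might provide.
example_url = "https://user:secret@search.example.com:9243/"


def connection_from_url(url):
    """Build a Haystack-style connection dict the same way settings.py does."""
    es = urlparse(url)
    port = es.port or 80  # same fallback as in settings.py
    conn = {
        "ENGINE": "haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine",
        "URL": es.scheme + "://" + es.hostname + ":" + str(port),
        "INDEX_NAME": "documents",
    }
    # Credentials embedded in the URL are passed on separately as http_auth.
    if es.username:
        conn["KWARGS"] = {"http_auth": es.username + ":" + es.password}
    return conn
```

Note that the fallback port is 80 even for an `https` URL without an explicit port, so a hosted URL that omits the port may need a different default.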

search_indexes.py

from haystack import indexes 

from hello.models import Article 


class ArticleIndex(indexes.SearchIndex, indexes.Indexable):
    '''
    Defines the model for the search engine.
    '''
    text = indexes.CharField(document=True, use_template=True)
    pub_date = indexes.DateTimeField(model_attr='pub_date')
    # pub_date line was commented out previously
    content_auto = indexes.EdgeNgramField(model_attr='title')

    def get_model(self):
        return Article

    def index_queryset(self, using=None):
        """Used when the entire index for model is updated."""
        return self.get_model().objects.all()
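As I understand it, `EdgeNgramField` stores the leading fragments of each word so that a partially typed name can match. Here is a rough pure-Python sketch of that idea. The `min_gram=2` and `max_gram=15` values and the whitespace tokenization are assumptions; the real analysis happens inside the search backend and may differ.

```python
def edge_ngrams(token, min_gram=2, max_gram=15):
    """Return the leading substrings of `token` between min_gram and max_gram long."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(title, min_gram=2, max_gram=15):
    """Lower-case a title, split on whitespace, and collect edge n-grams per word."""
    terms = set()
    for word in title.lower().split():
        terms.update(edge_ngrams(word, min_gram, max_gram))
    return terms
```

With these assumptions, `edge_ngrams("pizza")` yields `pi`, `piz`, `pizz` and `pizza`, so typing "piz" can match, while a fragment from the middle of a word such as "izza" cannot. Every word contributes several terms, so the index grows faster than the number of company names.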

article_text.txt

{{ object.title }} 
{{ object.user.get_full_name }} 
{{ object.body }}

urls.py

url(r'^search/$', views.search_titles, name='search'),

views.py

from django.shortcuts import render_to_response
from haystack.query import SearchQuerySet


def search_titles(request):
    txt = request.POST.get('search_text', '')
    # Only query the index once at least four characters have been typed.
    if txt and len(txt) >= 4:
        articles = SearchQuerySet().autocomplete(content_auto=txt)
    # If the POST request is empty, return nothing;
    # this prevents an internal server error with jQuery.
    else:
        articles = []
    return render_to_response('scripts/ajax_search.html',
                              {'articles': articles})

search.html

{% if articles.count > 0 %} 
    <!-- Simply prints the links to the CVR numbers, at most 15 of them -->
    {% for article in articles|slice:":15" %}
        <li><a href="{{ article.object.get_absolute_url }}">{{ article.object.title }}</a></li>
    {% endfor %}

{% else %} 

    <li>Try again, or try CVR + &#x23ce;</li> 

{% endif %}

index.html (where I call the search engine)

{% csrf_token %} 
<input type="text" id="search" name="search" /> 

<!-- All company names end up in this <ul> -->
<ul id="search-results"></ul>

Source

2017-04-21 MadsVJ

I changed the search method in my views.py to:

txt = request.POST.get('search_text', '') 
articles = [] 
suggestedSearchTerm = "" 
if txt and len(txt) >= 4: 
    sqs = SearchQuerySet() 
    sqs.query.set_limits(low=0, high=8) 
    sqs = sqs.filter(content=txt) 
    articles = sqs.query.get_results() 
    suggestedSearchTerm = SearchQuerySet().spelling_suggestion(txt) 
    if suggestedSearchTerm == txt:
        suggestedSearchTerm = ''
    else:
        suggestedSearchTerm = suggestedSearchTerm.lower()
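The two small decisions in that snippet (when to search at all, and when a spelling suggestion is worth showing) can be pulled out into plain functions. This is only a sketch: the names `should_search` and `usable_suggestion` are made up, and the `None` check assumes the backend may return no suggestion at all, in which case calling `.lower()` on it would fail.

```python
def should_search(text, min_length=4):
    """Only hit the search backend once enough characters have been typed."""
    return bool(text) and len(text) >= min_length


def usable_suggestion(query, suggestion):
    """Return a lower-cased suggestion, or '' when there is nothing new to suggest."""
    # Assumption: the backend may hand back None (or '') when it has no suggestion.
    if not suggestion or suggestion == query:
        return ""
    return suggestion.lower()
```

In the view this would be used as `suggestedSearchTerm = usable_suggestion(txt, SearchQuerySet().spelling_suggestion(txt))`, guarded by `should_search(txt)`.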

Source

2017-04-26 07:58:18 MadsVJ
