How do I get Sphinx to index in real time?

I have a Rails 3.2.6 application, and I am using Sphinx 0.9.9 with Thinking Sphinx 2.0.12.

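I need Sphinx to update its index in real time. For example, when a user creates a new post, it should show up in search results immediately. Likewise, if they delete a post, it should stop showing up from the moment it is deleted.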

I followed the documentation on delta indexing.

Based on the advice there, I have a cron job that runs bundle exec rake ts:index RAILS_ENV=production every twenty minutes...

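Turning on delta indexing does not remove the need to run a full re-index regularly; otherwise the delta index itself will grow to be as large as the core index, which removes the advantage of keeping them separate. It also slows down requests to the server that change model records.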
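To make the trade-off in that passage concrete, here is a minimal toy model in plain Ruby. It is not Thinking Sphinx code: the ToyIndex class and all of its method names are hypothetical, and it only assumes what the passage and the configuration in this question suggest, namely that saved records are flagged with delta = 1, that the delta index holds every flagged record, and that a full re-index resets the flags.

```ruby
# Toy model of a core + delta index pair.
# ToyIndex and its methods are hypothetical names used for illustration only.
class ToyIndex
  attr_reader :core, :delta

  def initialize
    @records = {} # id => { text:, delta: }, stands in for the database table
    @core    = {} # id => text, rebuilt only by a full re-index
    @delta   = {} # id => text, rebuilt on every save
  end

  # Saving flags the record (delta = true) and rebuilds the small delta index.
  def save(id, text)
    @records[id] = { text: text, delta: true }
    reindex_delta
  end

  # The delta index holds every record flagged since the last full re-index.
  def reindex_delta
    flagged = @records.select { |_, record| record[:delta] }
    @delta = flagged.transform_values { |record| record[:text] }
  end

  # A full re-index clears the flags, rebuilds core, and leaves delta empty.
  def full_reindex
    @records.each_value { |record| record[:delta] = false }
    @core = @records.transform_values { |record| record[:text] }
    reindex_delta
  end

  # Searching consults both indexes; a delta copy replaces a stale core copy.
  def search(term)
    @core.merge(@delta).select { |_, text| text.include?(term) }.keys.sort
  end
end

index = ToyIndex.new
index.save(1, "first post")
index.save(2, "second post")
puts index.delta.size # grows with every save until a full re-index runs
index.full_reindex
puts index.delta.size
```

In this sketch the delta index grows with every save and only shrinks when full_reindex runs, which is the reason the passage gives for keeping the periodic full re-index alongside delta indexing.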

New entries only show up after the job has run.

Here is my define_index...

define_index do 

    indexes(title) 
    indexes(entry) 

    has user_id 
    has created_at 
    has updated_at 

    set_property :delta => true 

end 

Here is my production.sphinx.conf...

indexer 
{ 
} 

searchd 
{ 
    listen = 127.0.0.1:9312 
    log = /opt/deployed_rails_apps/my_app/releases/20120713022228/log/searchd.log 
    query_log = /opt/deployed_rails_apps/my_app/releases/20120713022228/log/searchd.query.log 
    pid_file = /opt/deployed_rails_apps/my_app/releases/20120713022228/log/searchd.production.pid 
} 

source entry_core_0 
{ 
    type = mysql 
    sql_host = localhost 
    sql_user = abc 
    sql_pass = abc 
    sql_db = my_app_production 
    sql_query_pre = UPDATE `entries` SET `delta` = 0 WHERE `delta` = 1 
    sql_query_pre = SET NAMES utf8 
    sql_query_pre = SET TIME_ZONE = '+0:00' 
    sql_query = SELECT SQL_NO_CACHE `entries`.`id` * CAST(1 AS SIGNED) + 0 AS `id` , `entries`.`title` AS `title`, `entries`.`entry` AS `entry`, `entries`.`id` AS `sphinx_internal_id`, 0 AS `sphinx_deleted`, 3940594292 AS `class_crc`, `entries`.`user_id` AS `user_id`, UNIX_TIMESTAMP(`entries`.`created_at`) AS `created_at`, UNIX_TIMESTAMP(`entries`.`updated_at`) AS `updated_at` FROM `entries` WHERE (`entries`.`id` >= $start AND `entries`.`id` <= $end AND `entries`.`delta` = 0) GROUP BY `entries`.`id` ORDER BY NULL 
    sql_query_range = SELECT IFNULL(MIN(`id`), 1), IFNULL(MAX(`id`), 1) FROM `entries` WHERE `entries`.`delta` = 0 
    sql_attr_uint = sphinx_internal_id 
    sql_attr_uint = sphinx_deleted 
    sql_attr_uint = class_crc 
    sql_attr_uint = user_id 
    sql_attr_timestamp = created_at 
    sql_attr_timestamp = updated_at 
    sql_query_info = SELECT * FROM `entries` WHERE `id` = (($id - 0)/1) 
} 

index entry_core 
{ 
    source = entry_core_0 
    path = /opt/deployed_rails_apps/my_app/releases/20120713022228/db/sphinx/production/entry_core 
    charset_type = utf-8 
} 

source entry_delta_0 : entry_core_0 
{ 
    type = mysql 
    sql_user = abc 
    sql_pass = abc 
    sql_db = my_app_production 
    sql_query_pre = 
    sql_query_pre = SET NAMES utf8 
    sql_query_pre = SET TIME_ZONE = '+0:00' 
    sql_query = SELECT SQL_NO_CACHE `entries`.`id` * CAST(1 AS SIGNED) + 0 AS `id` , `entries`.`title` AS `title`, `entries`.`entry` AS `entry`, `entries`.`id` AS `sphinx_internal_id`, 0 AS `sphinx_deleted`, 3940594292 AS `class_crc`, `entries`.`user_id` AS `user_id`, UNIX_TIMESTAMP(`entries`.`created_at`) AS `created_at`, UNIX_TIMESTAMP(`entries`.`updated_at`) AS `updated_at` FROM `entries` WHERE (`entries`.`id` >= $start AND `entries`.`id` <= $end AND `entries`.`delta` = 1) GROUP BY `entries`.`id` ORDER BY NULL 
    sql_query_range = SELECT IFNULL(MIN(`id`), 1), IFNULL(MAX(`id`), 1) FROM `entries` WHERE `entries`.`delta` = 1 
    sql_attr_uint = sphinx_internal_id 
    sql_attr_uint = sphinx_deleted 
    sql_attr_uint = class_crc 
    sql_attr_uint = user_id 
    sql_attr_timestamp = created_at 
    sql_attr_timestamp = updated_at 
    sql_query_info = SELECT * FROM `entries` WHERE `id` = (($id - 0)/1) 
} 

index entry_delta : entry_core 
{ 
    source = entry_delta_0 
    path = /opt/deployed_rails_apps/my_app/releases/20120713022228/db/sphinx/production/entry_delta 
} 

index entry 
{ 
    type = distributed 
    local = entry_delta 
    local = entry_core 
} 

Any ideas what I might be doing wrong?

Not sure what's wrong, but delta indexes are designed for exactly this need. Could you include the define_index from your model? – MrTheWalrus 2012-07-24 19:52:08

OK, thanks. I added it. – Ethan 2012-07-24 19:57:53

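Thanks. I'm afraid I still can't see anything wrong - everything looks comparable to my own app, where Sphinx seems to work fine. Do your delta indexes work in non-production environments? Do you see the re-indexing in the logs? (Whenever you save an Entry, if delta indexing is working, you should see some output from Sphinx.) – MrTheWalrus 2012-07-24 20:06:12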

Answers