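Find and rank multiple phrase matches in Lucene-indexed documents
Given a set of documents containing text, I want to search for a phrase, return all of the matches, and rank them. I know how to get Lucene/Solr to tell me which documents match and to highlight the matches within a document, but how do I get a ranking that includes multiple matches from the same document? For example, given these documents: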
First document. It has a single line of text.
Second document. This text line is quite short.
This is another line containing more text and is a bit longer.
If I search for "text line", I would like it to find three matches, ranked as follows:
2nd document -> ...This "text line" is quite short.
1st document -> ...It has a single "line of text".
2nd document -> ...another "line containing more text" and is...
Is this possible? If so, how?
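I originally had a more complex question that includes this one, here: http://stackoverflow.com/questions/8883390/obtain-metadata-associated-with-matched-content-in-solr-lucene – 2012-01-17 13:40:02
Why do you want document 2 twice in the results? Maybe you should index each line as its own document... – naresh 2012-01-18 09:44:02
That's what I'm saying: if you want matches per line, index each line as a document. – milan 2012-01-18 10:24:19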
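The suggestion in the comments (treat each line as its own searchable unit) can be sketched in plain Python without Lucene. This is only a toy model under stated assumptions: the names `smallest_window` and `rank_line_matches`, the document ids, and the score `1 / (1 + slop)` are all hypothetical, and the score is a simple stand-in for proximity ranking, not Lucene's actual scoring formula. In Lucene/Solr itself the comparable approach would be to index each line as a separate document and search with a proximity (sloppy) phrase query.

```python
import re
from itertools import product

# Hypothetical ids for the two example documents from the question.
DOCUMENTS = {
    "doc1": "First document. It has a single line of text.",
    "doc2": "Second document. This text line is quite short.\n"
            "This is another line containing more text and is a bit longer.",
}

def smallest_window(tokens, terms):
    """Return (start, end) of the shortest token span containing every term, or None."""
    positions = [[i for i, tok in enumerate(tokens) if tok == term] for term in terms]
    if any(not p for p in positions):
        return None  # at least one query term is missing from this line
    best = None
    for combo in product(*positions):
        start, end = min(combo), max(combo)
        if best is None or end - start < best[1] - best[0]:
            best = (start, end)
    return best

def rank_line_matches(documents, query):
    """Score every line separately, so one document can appear more than once."""
    # De-duplicate query terms while keeping their order.
    terms = list(dict.fromkeys(w.lower() for w in re.findall(r"\w+", query)))
    if not terms:
        return []
    hits = []
    for doc_id, text in documents.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            words = re.findall(r"\w+", line)
            window = smallest_window([w.lower() for w in words], terms)
            if window is None:
                continue
            start, end = window
            slop = (end - start + 1) - len(terms)  # extra words inside the match
            snippet = " ".join(words[start:end + 1])
            hits.append((1.0 / (1 + slop), doc_id, line_no, snippet))
    # Tighter matches (less slop) rank first; ties keep their input order.
    return sorted(hits, key=lambda hit: -hit[0])

hits = rank_line_matches(DOCUMENTS, "text line")
for score, doc_id, line_no, snippet in hits:
    print(f'{doc_id} line {line_no}: "{snippet}" (score {score:.2f})')
```

With the example documents this returns three hits in the order asked for in the question: the exact phrase in the second document, then "line of text" in the first document, then "line containing more text" in the second document again. Scoring per line rather than per document is what allows the same document to appear twice.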