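Thanks for the good answers above. I just set this up in Solr 4.2.1, so I'm adding the details here. (Before Solr 4, you could only change the similarity globally, for all fields.)
Suppose we want Solr not to use inverse document frequency (idf) for a particular field. For that we write our own custom similarity, as mentioned above: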
package com.mycompany.similarity;

import org.apache.lucene.search.similarities.DefaultSimilarity;

public class NoIDFSimilarity extends DefaultSimilarity
{
    // Return a constant so that document frequency no longer affects the score.
    @Override
    public float idf(long docFreq, long numDocs)
    {
        return 1.0f;
    }

    @Override
    public String toString()
    {
        return "NoIDFSimilarity";
    }
}
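To make the effect of the override concrete, here is a small sketch that runs without Lucene on the classpath. It assumes the idf formula used by DefaultSimilarity in Lucene 4.x, 1 + ln(numDocs / (docFreq + 1)); the class name IdfSketch and the sample numbers are made up for illustration.

```java
// Illustrative sketch only; not part of Solr or Lucene.
class IdfSketch {
    // Assumed to mirror DefaultSimilarity.idf in Lucene 4.x:
    // 1 + ln(numDocs / (docFreq + 1)), so rarer terms get a larger weight.
    public static float defaultIdf(long docFreq, long numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // What the NoIDFSimilarity override returns: a constant for every term.
    public static float noIdf(long docFreq, long numDocs) {
        return 1.0f;
    }

    public static void main(String[] args) {
        long numDocs = 1000;              // hypothetical index size
        long[] docFreqs = {1, 10, 500};   // hypothetical document frequencies
        for (long docFreq : docFreqs) {
            System.out.printf("docFreq=%d default=%.3f noIdf=%.3f%n",
                    docFreq, defaultIdf(docFreq, numDocs), noIdf(docFreq, numDocs));
        }
    }
}
```

With the default formula a rare tag outweighs a common one; with the constant, every matching tag contributes equally, which is usually the point of switching idf off for an id-like field.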
Then define a new field type in schema.xml as follows:
<fieldType name="int_no_idf"
class="solr.TrieIntField"
precisionStep="0"
positionIncrementGap="0"
omitNorms="true">
<similarity class="com.mycompany.similarity.NoIDFSimilarity"/>
</fieldType>
and use it for a field like this:
<field name="tag_id_no_idf"
type="int_no_idf"
indexed="true"
stored="false"
multiValued="true" />
If you do only this much, you will get the following exception:
SEVERE: Unable to create core: SimilarList
org.apache.solr.common.SolrException: FieldType 'int_no_idf' is configured with a similarity, but the global similarity does not support it: class org.apache.solr.search.similarities.DefaultSimilarityFactory
at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:466)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:122)
at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1018)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1051)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:634)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Apr 25, 2013 5:02:08 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: Unable to create core: SimilarList
at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1672)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1057)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:634)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.solr.common.SolrException: FieldType 'int_no_idf' is configured with a similarity, but the global similarity does not support it: class org.apache.solr.search.similarities.DefaultSimilarityFactory
at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:466)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:122)
at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1018)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1051)
... 10 more
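Googling the error leads you to a link describing the fix, so just add this line to schema.xml as the global similarity. It is applied to the rest of the fields, that is, those whose field type does not declare a similarity of its own: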
<similarity class="solr.SchemaSimilarityFactory"/>
(From that link: keep in mind, though, that coord and queryNorm (= 1.0f) are not implemented yet, so you will get scores that differ from plain TF-IDF.)
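The idea behind a schema-aware global similarity can be sketched as a lookup by field name with a default fallback. This is not Solr's actual code; the Sim interface, the PerFieldSketch class, and the "title" field are hypothetical names used only to show the dispatch, and the fallback formula is assumed to match DefaultSimilarity in Lucene 4.x.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the real SchemaSimilarityFactory.
class PerFieldSketch {
    // Hypothetical stand-in for a similarity, reduced to the idf hook.
    public interface Sim {
        float idf(long docFreq, long numDocs);
    }

    // Fallback for fields without their own similarity (assumed default formula).
    public static final Sim DEFAULT =
            (docFreq, numDocs) -> (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);

    // Counterpart of NoIDFSimilarity: a constant idf.
    public static final Sim NO_IDF = (docFreq, numDocs) -> 1.0f;

    private final Map<String, Sim> byField = new HashMap<>();

    // Register the similarity configured on a field's type.
    public void register(String fieldName, Sim sim) {
        byField.put(fieldName, sim);
    }

    // Pick the field's own similarity, or the default if none was configured.
    public Sim get(String fieldName) {
        return byField.getOrDefault(fieldName, DEFAULT);
    }

    public static void main(String[] args) {
        PerFieldSketch schema = new PerFieldSketch();
        schema.register("tag_id_no_idf", NO_IDF);
        System.out.println(schema.get("tag_id_no_idf").idf(10, 1000));
        System.out.println(schema.get("title").idf(10, 1000)); // "title" is a hypothetical field
    }
}
```

This is why the global entry is required: without a global similarity that knows how to delegate, nothing ever consults the similarity declared on the field type, and Solr refuses to load the schema.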
This is good. Just to add: versions before 4.0 only allowed a global similarity; 4.0+ allows per-field similarity. (See https://wiki.apache.org/solr/SchemaXml#Similarity) – arun 2013-04-22 21:54:56