2010-09-14

Deleting documents by term from Lucene

The following code does not delete documents by term as expected:

     RAMDirectory idx = new RAMDirectory(); 
     IndexWriter writer = new IndexWriter(idx, 
            new SnowballAnalyzer(Version.LUCENE_30, "English"), 
            IndexWriter.MaxFieldLength.LIMITED); 
     Document doc = new Document(); 
     doc.add(new Field("title", "mydoc", Field.Store.YES, Field.Index.NO)); 
     doc.add(new Field("content", "some content, deleteme", Field.Store.YES, Field.Index.ANALYZED)); 
     writer.addDocument(doc); 
     Document doc2 = new Document(); 
     doc2.add(new Field("title", "mydoc2", Field.Store.YES, Field.Index.NO)); 
     doc2.add(new Field("content", "other content, don't deleteme", Field.Store.YES, Field.Index.ANALYZED)); 
     writer.addDocument(doc2); 
     writer.optimize(); 
     writer.close(); 

     /* 
     IndexReader reader = IndexReader.open(idx, false); 
     int docs_up_for_deletion = reader.docFreq(new Term("title")); 
     int before = reader.numDocs(); 
     int docs_deleted = reader.deleteDocuments(new Term("title", "mydoc")); 
     reader.close(); 
     */ 

     IndexWriter writer2 = new IndexWriter(idx, 
            new SnowballAnalyzer(Version.LUCENE_30, "English"), 
            IndexWriter.MaxFieldLength.LIMITED); 
     int before = writer2.numDocs(); 
     writer2.deleteDocuments(new Term("title", "mydoc")); 
     writer2.commit(); 
     writer2.optimize(); 
     int after = writer2.numDocs(); 
     writer2.close(); 
     int docs_deleted = before - after; 

I have tried both the IndexReader and the IndexWriter classes; neither one deletes the document.

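I also tried adding another IndexReader search after the code above, in case the counts are only updated once writer2 has been closed (as mentioned in this FAQ), but that did not help. Calling writer.deleteAll() works; deleting by term does not.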

I found an old reference saying that documents can only be deleted by fields of type Field.Keyword, but that is no longer a valid field type in Lucene 3.x.

Answer

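Your title field is not indexed, so there is no term for deleteDocuments to match. Change

 new Field("title", "mydoc", Field.Store.YES, Field.Index.NO) 

to

 new Field("title", "mydoc", Field.Store.YES, Field.Index.ANALYZED) 

or

 new Field("title", "mydoc", Field.Store.YES, Field.Index.NOT_ANALYZED) 

depending on whether you want the field to be analyzed. Note that deleteDocuments(Term) does not run the analyzer on the term, so with an analyzed field the term must equal the token the analyzer actually produced.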
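To make the reason concrete, here is a small toy model in plain Java. It is not Lucene code and does not reflect Lucene's internals; ToyIndex, IndexMode, deleteByTerm and deleteTitle are hypothetical names invented for this sketch, and the "analyzer" is assumed to be nothing more than lowercasing plus splitting on non-word characters. It only illustrates the idea that a delete-by-term looks the term up in the index postings, so a field that is stored but not indexed can never match.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy model only: NOT Lucene's implementation; every name here is hypothetical. */
class ToyIndex {
    enum IndexMode { NO, ANALYZED, NOT_ANALYZED }

    // "field:term" -> ids of documents containing that term (the postings).
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // Stored values are kept for retrieval only; deletes never look at them.
    private final Map<Integer, Map<String, String>> stored = new HashMap<>();
    private final Set<Integer> live = new HashSet<>();
    private int nextId = 0;

    int newDocument() {
        int id = nextId++;
        live.add(id);
        stored.put(id, new HashMap<>());
        return id;
    }

    void addField(int docId, String field, String value, IndexMode mode) {
        stored.get(docId).put(field, value);
        if (mode == IndexMode.NO) {
            return; // stored only: no postings are written for this field
        }
        List<String> terms = new ArrayList<>();
        if (mode == IndexMode.NOT_ANALYZED) {
            terms.add(value); // the whole value becomes a single term, verbatim
        } else {
            // Assumed stand-in for an analyzer: lowercase and split on non-word chars.
            for (String tok : value.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!tok.isEmpty()) {
                    terms.add(tok);
                }
            }
        }
        for (String t : terms) {
            postings.computeIfAbsent(field + ":" + t, k -> new HashSet<>()).add(docId);
        }
    }

    /** Deletes every live document whose postings contain the exact term; returns the count. */
    int deleteByTerm(String field, String text) {
        Set<Integer> hits = postings.getOrDefault(field + ":" + text, Collections.emptySet());
        int deleted = 0;
        for (int id : hits) {
            if (live.remove(id)) {
                deleted++;
            }
        }
        return deleted;
    }

    int numDocs() {
        return live.size();
    }

    /** Builds a two-document index like the one in the question and deletes by title term. */
    static int deleteTitle(IndexMode titleMode, String firstTitle, String termText) {
        ToyIndex idx = new ToyIndex();
        int d1 = idx.newDocument();
        idx.addField(d1, "title", firstTitle, titleMode);
        idx.addField(d1, "content", "some content, deleteme", IndexMode.ANALYZED);
        int d2 = idx.newDocument();
        idx.addField(d2, "title", "mydoc2", titleMode);
        idx.addField(d2, "content", "other content, don't deleteme", IndexMode.ANALYZED);
        int before = idx.numDocs();
        idx.deleteByTerm("title", termText);
        return before - idx.numDocs();
    }

    public static void main(String[] args) {
        System.out.println(deleteTitle(IndexMode.NO, "mydoc", "mydoc"));           // 0
        System.out.println(deleteTitle(IndexMode.NOT_ANALYZED, "mydoc", "mydoc")); // 1
        System.out.println(deleteTitle(IndexMode.ANALYZED, "MyDoc", "MyDoc"));     // 0
        System.out.println(deleteTitle(IndexMode.ANALYZED, "MyDoc", "mydoc"));     // 1
    }
}
```

In this model the stored-only title deletes nothing, which mirrors the behaviour in the question. The third call shows why NOT_ANALYZED is often the more predictable choice for an identifier-like field: once a value has been analyzed, the term passed to the delete has to match the analyzed token rather than the original text.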