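How can I get all the terms and docId lists from ElasticSearch

How can I get all the terms and their docId lists from ES? For example, the inverted index data looks like the following list of terms and documents: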

word1: doc1,doc5,doc6... 
word2: doc3,doc9,doc12... 
word3: doc5,doc100...

I just want to get all the terms and their corresponding document lists. Is there any API that can do this? Thanks!

2016-04-26 Jack

I would try using this tool: https://simpsora.wordpress.com/2014/05/06/using-luke-with-elasticsearch/

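To retrieve this, you should understand a little about how Lucene works. In Lucene, an index is structured as Fields -> Terms -> PostingLists (represented as PostingsEnums), which you already seem to know.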
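
To make that layering concrete, here is a toy model of it in plain Java, using nested sorted maps. This is only a sketch of the idea and not how Lucene stores anything internally; the class ToyInvertedIndex and its add and dump methods are made up for illustration, and tokenization is a naive whitespace split.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ToyInvertedIndex {
    // field name -> (term -> list of doc ids), mirroring Fields -> Terms -> postings
    private final Map<String, Map<String, List<Integer>>> fields = new TreeMap<>();

    /** Indexes whitespace-separated text for one document under one field. */
    public void add(int docId, String field, String text) {
        Map<String, List<Integer>> terms =
                fields.computeIfAbsent(field, f -> new TreeMap<>());
        for (String token : text.toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            List<Integer> postings =
                    terms.computeIfAbsent(token, t -> new ArrayList<>());
            // Assumes documents are added in increasing docId order, so checking
            // the last entry is enough to avoid listing a document twice.
            if (postings.isEmpty() || postings.get(postings.size() - 1) != docId) {
                postings.add(docId);
            }
        }
    }

    /** Returns one "term: doc,doc,..." line per term of the field, in term order. */
    public List<String> dump(String field) {
        List<String> lines = new ArrayList<>();
        Map<String, List<Integer>> terms = fields.get(field);
        if (terms == null) {
            return lines;
        }
        for (Map.Entry<String, List<Integer>> entry : terms.entrySet()) {
            StringBuilder line = new StringBuilder(entry.getKey()).append(": ");
            List<Integer> postings = entry.getValue();
            for (int i = 0; i < postings.size(); i++) {
                if (i > 0) {
                    line.append(',');
                }
                line.append(postings.get(i));
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add(1, "body", "apple banana");
        index.add(2, "body", "banana cherry");
        index.add(3, "body", "apple cherry apple");
        for (String line : index.dump("body")) {
            System.out.println(line);
        }
    }
}
```

The outer map plays the role of Fields, each inner map the role of a field's Terms, and each list the role of a posting list, so walking all three levels yields the "word: doc,doc,..." listing from the question.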

To retrieve these values, you can use this as a template for a Lucene tool (assuming you have access to the underlying reader, an AtomicReader):

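import org.apache.lucene.index.Fields;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.PostingsEnum;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.BytesRef;

// NOTE: these signatures vary between Lucene versions; the null arguments
// below follow the older iterator(reuse) and postings(liveDocs, reuse) forms

// get every one of the fields in the reader
Fields fields = MultiFields.getFields(reader);
for (String field : fields) {
    // get the Terms for the field
    TermsEnum terms = fields.terms(field).iterator(null);

    // a term is represented by a BytesRef in lucene
    // and we will iterate across all of them using
    // the TermsEnum syntax (read lucene docs for this)
    BytesRef t;
    while ((t = terms.next()) != null) {
        // get the PostingsEnum (note that this is called
        // DocsEnum in Lucene 4.X) which represents an enumeration
        // of all of the documents for the Term t
        PostingsEnum docs = terms.postings(null, null);
        // utf8ToString() gives the readable term; BytesRef.toString() prints raw bytes
        StringBuilder line = new StringBuilder(t.utf8ToString()).append(": ");
        while (docs.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
            line.append(docs.docID());
            line.append(", ");
        }
        System.out.println(line);
    }
}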

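I haven't actually had a chance to run this exact code (I have a similar tool that I wrote for comparing my own specific Lucene indexes), but hopefully this gives you a general understanding of how Lucene is structured so that you can write your own tool.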

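The tricky part will be getting an explicit AtomicReader from your index, but I'm sure there are other StackOverflow answers to help you with that! (As a small hint, you probably want to look at opening your index with DirectoryReader#open(File f)#leaves(); note that open actually takes a Directory, for example one obtained from FSDirectory.open.)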

2016-04-27 00:45:25 Almog
