2014-01-09 25 views

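I am using Lucene 4.6 to search for a phrase in a PDF. I wrote the code below, but it raises errors on the "Analyzer" and "QueryPhrase" lines. Please help me with this. Sample code for searching PDF text using Lucene 4.6 and PDFBox.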

     try {
      Analyzer analyzer = new Analyzer(Version.LUCENE_44);

      // Store the index in memory:
      Directory directory = new RAMDirectory();
      // To store an index on disk, use this instead:
      //Directory directory = FSDirectory.open(new File("/tmp/testindex"));
      IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_44, analyzer);
      IndexWriter iwriter = new IndexWriter(directory, config);
      Document doc = new Document();
      String text = "This is the text to be indexed.";
      doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
      iwriter.addDocument(doc);
      iwriter.close();

      // Now search the index
      DirectoryReader ireader = DirectoryReader.open(directory);
      IndexSearcher isearcher = new IndexSearcher(ireader);
      // Parse a simple query that searches for "text":
      QueryParser parser = new QueryParser(Version.LUCENE_44, "fieldname", analyzer);
      Query query = parser.parse("text");
      ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
      // Check whether anything matched:
      if (hits.length > 0) {
       System.out.println("Searched text existed in the PDF.");
      }
      ireader.close();
      directory.close();
     }
     catch (Exception e) {
      System.out.println("Exception: " + e.getMessage());
     }
}

Answers


You cannot instantiate the abstract class Analyzer. Instead, you can write something like this:

Analyzer analyzer = new EnglishAnalyzer(Version.LUCENE_44); 
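To illustrate why the original line fails to compile, here is a minimal sketch that needs only the JDK. BaseAnalyzer and WhitespaceAnalyzerSketch are hypothetical stand-ins invented for this example, not Lucene classes; they play the roles that Analyzer and EnglishAnalyzer play above.

```java
public class AbstractInstantiationSketch {

    // Hypothetical stand-in for an abstract base class such as Lucene's Analyzer.
    abstract static class BaseAnalyzer {
        abstract String[] tokenize(String text);
    }

    // Hypothetical concrete subclass; it takes the place EnglishAnalyzer has in the answer.
    static class WhitespaceAnalyzerSketch extends BaseAnalyzer {
        @Override
        String[] tokenize(String text) {
            // Split on runs of whitespace after trimming the ends.
            return text.trim().split("\\s+");
        }
    }

    static int countTokens(String text) {
        // The next line would be rejected by the compiler because BaseAnalyzer is abstract:
        // BaseAnalyzer analyzer = new BaseAnalyzer();

        // Instantiating a concrete subclass compiles, and the variable can
        // still be declared with the abstract type.
        BaseAnalyzer analyzer = new WhitespaceAnalyzerSketch();
        return analyzer.tokenize(text).length;
    }

    public static void main(String[] args) {
        System.out.println(countTokens("This is the text to be indexed."));
    }
}
```

The same pattern applies to the question's code: keep the variable typed as Analyzer, but construct one of the concrete analyzers shipped in lucene-analyzers-common.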

You may also need to add these libraries to the CLASSPATH: groupId org.apache.lucene, artifactId lucene-analyzers-common, version 4.6.0; and groupId org.apache.lucene, artifactId lucene-queryparser, version 4.6.0. – maksim07


I tried the statement above, but I am still facing the problem; it does not solve the issue. – Lucene1


Are you getting a runtime error or a compile error? Do you have these libraries on the classpath: lucene-analyzers-common, lucene-queryparser? – maksim07
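One way to answer the classpath question at runtime is a small check like the sketch below. It uses only the JDK and tries to load one class from each jar. The class name ClasspathCheckSketch and the helper isOnClasspath are made up for this example; the three class names checked are the usual Lucene 4.x locations of Analyzer, EnglishAnalyzer and the classic QueryParser, which is an assumption worth confirming against the jars you actually have.

```java
public class ClasspathCheckSketch {

    // Returns true if the named class can be located by this class's loader.
    // The class is looked up without being initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "org.apache.lucene.analysis.Analyzer",               // expected in lucene-core
            "org.apache.lucene.analysis.en.EnglishAnalyzer",     // expected in lucene-analyzers-common
            "org.apache.lucene.queryparser.classic.QueryParser"  // expected in lucene-queryparser
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

If a class is reported as MISSING, the corresponding jar is probably not on the classpath. At compile time that typically shows up as a "cannot find symbol" error, and at runtime as a NoClassDefFoundError or ClassNotFoundException.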