How do I extract the Speaker annotation from a conversation?

When I use Stanford CoreNLP, I get the results in an XML output file. In it, I found an element named Speaker, for example:

 <word>Mike</word> 
     <lemma>Mike</lemma> 
     <CharacterOffsetBegin>0</CharacterOffsetBegin> 
     <CharacterOffsetEnd>4</CharacterOffsetEnd> 
     <POS>NNP</POS> 
     <NER>PERSON</NER> 

     *<Speaker>PER0</Speaker>* 

     <TrueCase>INIT_UPPER</TrueCase> 
     <TrueCaseText>Mike</TrueCaseText> 
     <sentiment>Neutral</sentiment>

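So, how can I work with the Speaker result in Java code? And how can I improve that result? For example, in a conversation I would like to get Mike instead of PER0.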

Thanks.

Source

2017-03-07 Walid Ghariani

Use a DOM XML parser:

How to read an XML File with the Java DOM Parser

Source

2017-03-07 15:29:59

Yes, but I also need to improve the generated result. I think there is a speaker annotator that I should be able to manipulate.

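Is this XML fragment nested deep in the DOM tree? And is it repeated for multiple speakers? You could search for the elements that contain Speaker as a child element, then return the corresponding word element (Mike).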
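The lookup suggested in the comment above could be sketched as follows. This is only a minimal illustration using the JDK's DOM parser, not a definitive implementation: it assumes that each group of `<word>`, `<Speaker>`, etc. is wrapped in a `<token>` element, which the fragment in the question does not show, and the class name `SpeakerXmlReader` and method `readSpeakers` are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: lists "word<TAB>speaker" for every token that has both.
final class SpeakerXmlReader {

    static List<String> readSpeakers(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // Assumption: every token is a <token> element with <word> and <Speaker> children.
            NodeList tokens = doc.getElementsByTagName("token");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < tokens.getLength(); i++) {
                Element token = (Element) tokens.item(i);
                String word = childText(token, "word");
                String speaker = childText(token, "Speaker");
                if (word != null && speaker != null) {
                    result.add(word + "\t" + speaker);
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns the trimmed text of the first descendant with the given tag, or null.
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Illustrative input only, modelled on the fragment in the question.
        String xml = "<tokens>"
            + "<token><word>Mike</word><NER>PERSON</NER><Speaker>PER0</Speaker></token>"
            + "<token><word>said</word><NER>O</NER><Speaker>PER0</Speaker></token>"
            + "</tokens>";
        for (String entry : readSpeakers(xml)) {
            System.out.println(entry);
        }
    }
}
```

To read the real output file instead of a string, `builder.parse(new java.io.File("xmlOutput.xml"))` could be used in place of the `InputSource` call. Note that this only reads the labels CoreNLP produced; it does not by itself turn PER0 into a name.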

First of all, thank you @Thomas for your answer.
I will try to be clearer.
In this code,

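    // Imports needed by this snippet
    import java.io.PrintWriter;
    import java.util.Properties;
    import edu.stanford.nlp.pipeline.Annotation;
    import edu.stanford.nlp.pipeline.StanfordCoreNLP;

    PrintWriter xmlOut = new PrintWriter("xmlOutput.xml");
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, truecase, ner, parse, quote, mention, dcoref, sentiment");
    props.setProperty("truecase.overwriteText", "true");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    // The text is split into two concatenated literals; a Java string literal cannot span lines
    Annotation annotation = new Annotation("Mike said : \"I vote for Hillary.\"\n"
            + "peter said : \"I vote for Donald.\"");
    pipeline.annotate(annotation);
    pipeline.xmlPrint(annotation, xmlOut);
    xmlOut.close();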

xmlOutput.xml contains the analysis of the two sentences:

First Sentence

<Mike said>, <:>, <“> and <”> are treated as the narrator's speech (PER0).
<I vote for Hillary.> is treated as the speech of person 1.

Second Sentence

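<peter said>, <:>, <“> and <”> are treated as the narrator's speech (PER0).
<I vote for Donald.> is treated as peter's speech. => The only difference here is that I wrote peter in lowercase; when I write it with a capital letter, the Speaker result becomes 4.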
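Once the (word, speaker) pairs are available, the per-token labels described above can be regrouped into one line per speaker turn, which makes this kind of output easier to inspect. The following is a rough sketch in plain Java, independent of CoreNLP: the class `SpeakerTurns`, the method `groupBySpeaker`, and the sample labels in `main` are illustrative assumptions, not actual pipeline output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: merges consecutive tokens that share a speaker label.
final class SpeakerTurns {

    /** Each input row is {word, speakerLabel}; returns "label: words" per turn. */
    static List<String> groupBySpeaker(String[][] tokens) {
        List<String> turns = new ArrayList<>();
        String current = null;
        StringBuilder words = new StringBuilder();
        for (String[] token : tokens) {
            String word = token[0];
            String speaker = token[1];
            if (!speaker.equals(current)) {
                // The speaker changed: close the previous turn, if any.
                if (current != null) {
                    turns.add(current + ": " + words);
                }
                current = speaker;
                words.setLength(0);
            }
            if (words.length() > 0) {
                words.append(' ');
            }
            words.append(word);
        }
        if (current != null) {
            turns.add(current + ": " + words);
        }
        return turns;
    }

    public static void main(String[] args) {
        // Illustrative labels only; real labels come from the Speaker elements.
        String[][] tokens = {
            {"Mike", "PER0"}, {"said", "PER0"}, {":", "PER0"},
            {"I", "PER1"}, {"vote", "PER1"}, {"for", "PER1"}, {"Hillary", "PER1"}
        };
        for (String turn : groupBySpeaker(tokens)) {
            System.out.println(turn);
        }
    }
}
```

Grouping by consecutive label rather than by label overall keeps the order of the conversation, so a narrator segment that appears between two quotes stays in place.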

While searching the Stanford CoreNLP Javadoc, I found classes that deal with speakers, such as:
- CoreAnnotations.SpeakerAnnotation
- CoreNLPProtos.SpeakerInfo
- CoreNLPProtos.SpeakerInfo.Builder
- CoreNLPProtos.SpeakerInfoOrBuilder
- SpeakerInfo
- SpeakerInfo
- SpeakerMatch

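So first, I want my XML output to contain better results, and second, I want to know how to use these classes to extract the speakers and their speech without using a DOM XML parser.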

Source

2017-03-08 10:01:28
