I am building a 15k-line training data file named en-ner-person.train, following the online manual (http://opennlp.apache.org/documentation/1.5.2-incubating/manual/opennlp.html), for OpenNLP Name Finder training.
My question is: should my training file contain the entire report? Or should I include only the lines that contain a tagged name, such as <START:person> John Smith <END>?
So, for example, do I use the full text of this report in my training data:
<START:person> Pierre Vinken <END> , 61 years old , will join the board as a nonexecutive director Nov. 29 .
A nonexecutive director has many similar responsibilities as an executive director.
However, there are no voting rights with this position.
Mr . <START:person> Vinken <END> is chairman of Elsevier N.V. , the Dutch publishing group .
Or do I include only these two lines in my training file:
<START:person> Pierre Vinken <END> , 61 years old , will join the board as a nonexecutive director Nov. 29 .
Mr . <START:person> Vinken <END> is chairman of Elsevier N.V. , the Dutch publishing group .
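
To make the difference between the two options concrete, here is a rough Python sketch of how the second file would be derived from the first. It is not part of OpenNLP: the helper names `annotated_spans` and `keep_only_annotated` are made up for illustration, and the regular expression assumes the whitespace-separated `<START:type> ... <END>` layout shown in the samples above.

```python
import re

# Matches one tagged span in the training format shown above,
# e.g. "<START:person> Pierre Vinken <END>".
SPAN = re.compile(r"<START:(\w+)>\s+(.*?)\s+<END>")


def annotated_spans(line):
    """Return (type, text) pairs for every tagged name in one line."""
    return SPAN.findall(line)


def keep_only_annotated(lines):
    """Second option: drop every sentence that has no tagged name."""
    return [line for line in lines if annotated_spans(line)]


# First option: the full report, one sentence per line.
full_report = [
    "<START:person> Pierre Vinken <END> , 61 years old , will join the board as a nonexecutive director Nov. 29 .",
    "A nonexecutive director has many similar responsibilities as an executive director.",
    "However, there are no voting rights with this position.",
    "Mr . <START:person> Vinken <END> is chairman of Elsevier N.V. , the Dutch publishing group .",
]

# Second option: only the sentences that contain a tagged name.
names_only = keep_only_annotated(full_report)

for line in names_only:
    print(annotated_spans(line), "|", line)
```

The only difference between the two files is whether the sentences without a person name (the second and third lines of the report) are present in the training data.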