2013-03-19 203 views
3

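TreeTagger installed successfully but cannot open the .par file

Does anyone know how to fix this file-reading error in TreeTagger, a commonly used natural language processing tool for POS tagging, lemmatization, and chunking sentences?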

~/treetagger$ echo 'Hello world!' | cmd/tree-tagger-english
     reading parameters ... 

ERROR: Can't open for reading: /home/alvas/treetagger/lib/english.par 
aborted. 

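I did not run into any of the possible installation problems listed in the hints at http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/installation-hints.txt. I followed the instructions on the web page (http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/#Linux) and the installer reported success: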

~$ mkdir treetagger
~$ cd treetagger
~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/tree-tagger-linux-3.2.tar.gz
~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/tagger-scripts.tar.gz
~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/install-tagger.sh
~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/dutch-par-linux-3.2-utf8.bin.gz
~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/german-par-linux-3.2-utf8.bin.gz
~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/italian-par-linux-3.2-utf8.bin.gz
~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/spanish-par-linux-3.2-utf8.bin.gz
~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/french-par-linux-3.2-utf8.bin.gz

~/treetagger$ sh install-tagger.sh

Linux version of TreeTagger installed. 
Tagging scripts installed. 
German parameter file (Linux, UTF8) installed. 
German chunker parameter file (Linux) installed. 
French parameter file (Linux, UTF8) installed. 
French chunker parameter file (Linux, UTF8) installed. 
Italian parameter file (Linux, UTF8) installed. 
Spanish parameter file (Linux, UTF8) installed. 
Dutch parameter file (Linux, UTF8) installed. 
Path variables modified in tagging scripts. 

You might want to add /home/alvas/treetagger/cmd and /home/alvas/treetagger/bin to the PATH variable so that you do not need to specify the full path to run the tagging scripts. 

But when I try to test the software, I get these errors:

~/treetagger$ echo 'Hello world!' | cmd/tree-tagger-english
    reading parameters ... 

ERROR: Can't open for reading: /home/alvas/treetagger/lib/english.par 
aborted. 
~/treetagger$ echo 'Das ist ein Test.' | cmd/tagger-chunker-german

ERROR: Can't open for reading: /home/alvas/treetagger/lib/german-chunker.par 
aborted. 

ERROR: Can't open for reading: /home/alvas/treetagger/lib/german.par 
aborted. 
    reading parameters ... 

ERROR: Can't open for reading: /home/alvas/treetagger/lib/german.par 
aborted. 
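
Since every error names a file under lib/, one quick diagnostic is to check which of those parameter files actually exist and are readable. The following is only a minimal sketch: `check_par_files` and the `TREETAGGER_HOME` variable are hypothetical names introduced here, and the install location is assumed from the paths in the error messages.

```shell
#!/bin/sh
# Hypothetical helper (not part of TreeTagger): report which of the named
# parameter files are present and readable in a given lib directory.
# Returns 0 only if every file was found.
check_par_files() {
    lib_dir=$1
    shift
    missing=0
    for name in "$@"; do
        if [ -r "$lib_dir/$name" ]; then
            echo "ok      $lib_dir/$name"
        else
            echo "missing $lib_dir/$name"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Assumed install location, taken from the error messages; adjust as needed.
TREETAGGER_HOME=${TREETAGGER_HOME:-$HOME/treetagger}

# File names are the ones the tagging scripts complained about.
check_par_files "$TREETAGGER_HOME/lib" \
    english.par german.par german-chunker.par || true
```

Any line starting with "missing" points at a parameter file that was never downloaded or was installed under a different name.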

Answers

4

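I think there are two problems. First, the script names should end in "-utf8", for example cmd/tagger-chunker-german-utf8, because you downloaded the UTF-8 data. Second, tagging and chunking each need their own data file. See the "Parameter files for PC" and "Chunker parameter files for PC" sections on the homepage: download the files from both sections, then run install-tagger.sh again.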
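
The first point can be sketched as a small wrapper that prefers the "-utf8" variant of a tagging script when one is installed. This is an illustration only: `pick_tagger_script` and `TREETAGGER_HOME` are hypothetical names, and the snippet just prints the command it would run rather than invoking the tagger.

```shell
#!/bin/sh
# Hypothetical helper (not part of TreeTagger): print the path of the
# "-utf8" variant of a tagging script if it exists in cmd_dir, otherwise
# fall back to the plain script name.
pick_tagger_script() {
    cmd_dir=$1
    base=$2
    if [ -f "$cmd_dir/${base}-utf8" ]; then
        echo "$cmd_dir/${base}-utf8"
    else
        echo "$cmd_dir/$base"
    fi
}

# Assumed install location, taken from the question; adjust as needed.
TREETAGGER_HOME=${TREETAGGER_HOME:-$HOME/treetagger}

script=$(pick_tagger_script "$TREETAGGER_HOME/cmd" tagger-chunker-german)
echo "would run: echo 'Das ist ein Test.' | $script"
```

The chosen script still fails if the parameter files it refers to are missing from lib/, so the second point (downloading both tagger and chunker files) applies either way.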

0

You wrote cmd/tree-tagger-english, but I think the correct path (the one that contains the parameter files) is:

lib/tree-tagger-english