2016-02-19 63 views
3

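MaltParser gives an AssertionError when used with NLTK

I am using MaltParser with Python NLTK. I have successfully downloaded the training data and updated to the latest NLTK. When I call MaltParser, it gives me an assertion error. Below is the Python code, followed by the traceback.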

mp = MaltParser("C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1","C:/Users/mustufain/Desktop/Python Files/maltparser-1.7.2",additional_java_args=['-Xmx512m']) 

Traceback (most recent call last): 
    File "<pyshell#10>", line 1, in <module> 
    mp = MaltParser("C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1","C:/Users/mustufain/Desktop/Python Files/maltparser-1.7.2",additional_java_args=['-Xmx512m']) 
    File "C:\Python34\lib\site-packages\nltk\parse\malt.py", line 131, in __init__ 
    self.malt_jars = find_maltparser(parser_dirname) 
    File "C:\Python34\lib\site-packages\nltk\parse\malt.py", line 72, in find_maltparser 
    assert malt_dependencies.issubset(_jars) 
AssertionError 
>>> 
+0

Have you set it up as described in https://github.com/nltk/nltk/wiki/Installing-Third-Party-Software#malt-parser? – alvas

+0

Do you have ['log4j.jar', 'libsvm.jar', 'liblinear-1.8.jar'] in 'C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1'? – alvas

+0

What is the output when you type 'dir C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1/' at the command prompt? – alvas

Answers

1

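If all the downloads and environment variables are set up correctly, the most likely cause is how file and directory paths are split in nltk/parse/malt.py, at https://github.com/nltk/nltk/blob/develop/nltk/parse/malt.py#L69, which separates the directory from the file name in a way that only works on Linux: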

def find_maltparser(parser_dirname): 
    """ 
    A module to find MaltParser .jar file and its dependencies. 
    """ 
    if os.path.exists(parser_dirname): # If a full path is given. 
        _malt_dir = parser_dirname 
    else: # Try to find path to maltparser directory in environment variables. 
        _malt_dir = find_dir(parser_dirname, env_vars=('MALT_PARSER',)) 
    # Checks that the found directory contains all the necessary .jar 
    malt_dependencies = ['','',''] 
    _malt_jars = set(find_jars_within_path(_malt_dir)) 
    _jars = set(jar.rpartition('/')[2] for jar in _malt_jars) 
    malt_dependencies = set(['log4j.jar', 'libsvm.jar', 'liblinear-1.8.jar']) 

    assert malt_dependencies.issubset(_jars) 
    assert any(filter(lambda i: i.startswith('maltparser-') and i.endswith('.jar'), _jars)) 
    return list(_malt_jars) 

This bug has been fixed in https://github.com/nltk/nltk/pull/1292

The fix changes this line:

_jars = set(jar.rpartition('/')[2] for jar in _malt_jars) 

to:

_jars = set(os.path.split(jar)[1] for jar in _malt_jars) 

This should solve your problem =)
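To see why the one-line change matters on Windows, here is a minimal sketch comparing the two ways of extracting the file name. The jar path is a made-up example, and `ntpath` (the Windows flavour of `os.path`, importable on any OS) stands in for `os.path` so the snippet behaves the same everywhere.

```python
import ntpath  # Windows implementation of os.path; used so this runs on any OS

# Hypothetical jar path, shaped like what a directory walk returns on Windows.
win_jar = r"C:\Users\me\maltparser-1.8.1\lib\log4j.jar"

required = {'log4j.jar'}


def name_by_rpartition(jar):
    # Old behaviour: split on '/' only. A backslash path contains no '/',
    # so the "file name" ends up being the entire path.
    return jar.rpartition('/')[2]


def name_by_split(jar):
    # Fixed behaviour: let the path module handle the platform separator.
    return ntpath.split(jar)[1]


old_names = {name_by_rpartition(win_jar)}
new_names = {name_by_split(win_jar)}

print(required.issubset(old_names))  # the check that raised AssertionError
print(required.issubset(new_names))
```

With the old line, the set of names holds full paths, so the `issubset` check cannot succeed for backslash-separated paths.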

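For an answer that is not about the code itself but about how to set the environment variables, or where to download and save the MaltParser directory and files, see https://github.com/nltk/nltk/issues/1294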
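If it is unclear which of the two assertions is failing, a small diagnostic along these lines may help. It is only a sketch that mirrors the checks in `find_maltparser` shown above; the helper name `check_malt_dir` is hypothetical and is not part of NLTK.

```python
import os

# The three dependency jars that find_maltparser asserts on.
REQUIRED_JARS = {'log4j.jar', 'libsvm.jar', 'liblinear-1.8.jar'}


def check_malt_dir(malt_dir):
    """Return (missing_dependency_jars, has_maltparser_jar) for a directory.

    Hypothetical helper: walks malt_dir recursively and collects .jar names.
    """
    found = set()
    for _root, _dirs, files in os.walk(malt_dir):
        for name in files:
            if name.endswith('.jar'):
                found.add(name)
    missing = REQUIRED_JARS - found
    has_main_jar = any(n.startswith('maltparser-') for n in found)
    return missing, has_main_jar
```

Usage would be something like `print(check_malt_dir('C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1'))`. An empty set together with `True` means both assertions should pass once the file names are extracted correctly.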

+0

It did not work. I changed the line in malt.py and restarted, and it still gives me an assertion error when I load MaltParser. – Mustufain

+0

After changing the line in malt.py, it gives me this new assertion error: assert any(filter(lambda i: i.startswith('maltparser-') and i.endswith('.jar'), _jars)) AssertionError – Mustufain

+0

It throws the exception on this line: malt_dependencies = set(['log4j.jar', 'libsvm.jar', 'liblinear-1.8.jar']) – Mustufain

2

TL;DR (in Python 3!!):

import urllib.request 
import zipfile 
urllib.request.urlretrieve('http://www.maltparser.org/mco/english_parser/engmalt.poly-1.7.mco', 'C:\\Users\\mustufain\\Desktop\\engmalt.poly-1.7.mco') 
urllib.request.urlretrieve('http://maltparser.org/dist/maltparser-1.8.1.zip', 'C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1.zip') 
zfile = zipfile.ZipFile('C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1.zip') 
zfile.extractall('C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1\\') 

Then:

from nltk.parse import malt 
mp = malt.MaltParser('C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1\\', "C:\\Users\\mustufain\\Desktop\\engmalt.poly-1.7.mco") 
mp.parse_one('I shot an elephant in my pajamas .'.split()).tree() 
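
Before constructing the parser, it can be worth confirming that Java is reachable and that the second argument points at a model file rather than a directory. The sketch below is an assumption-laden pre-flight check, not anything NLTK provides: `preflight` is a hypothetical helper, and the `.mco` suffix test simply follows the model file name used above.

```python
import os
import shutil


def preflight(model_path, java_cmd='java'):
    """Return a list of human-readable problems (empty if none were found).

    Hypothetical helper; it only inspects the local environment.
    """
    problems = []
    # MaltParser is a Java program, so a java executable must be on PATH.
    if shutil.which(java_cmd) is None:
        problems.append('java not found on PATH')
    # The model argument should be an existing file such as engmalt.poly-1.7.mco.
    if not os.path.isfile(model_path):
        problems.append('model file not found: ' + model_path)
    elif not model_path.endswith('.mco'):
        problems.append('model file does not end in .mco: ' + model_path)
    return problems
```

For example, `preflight('C:\\Users\\mustufain\\Desktop\\engmalt.poly-1.7.mco')` should return an empty list when both Java and the model are in place.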
+0

Thanks @L3viathan for the edit! There is a more extensive answer at https://github.com/nltk/nltk/issues/1294. – alvas

+0

Maybe you should link that in your answer. Greetings to Saarbrücken from me! – L3viathan

+0

@L3viathan No problem ;P – alvas