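NLTK FCFG: maximum recursion depth exceeded
I have installed nltk for Python 2.7 on OS X (Lion 10.7.5). The basic context-free grammars from the earlier chapters work fine, but when I try to load even a basic example of a feature-based context-free grammar, such as: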
from __future__ import print_function
import nltk
from nltk import grammar, parse
g = """
% start DP
DP[AGR=?a] -> D[AGR=?a] N[AGR=?a]
D[AGR=[NUM='sg', PERS=3]] -> 'this' | 'that'
D[AGR=[NUM='pl', PERS=3]] -> 'these' | 'those'
D[AGR=[NUM='pl', PERS=1]] -> 'we'
D[AGR=[PERS=2]] -> 'you'
N[AGR=[NUM='sg', GND='m']] -> 'boy'
N[AGR=[NUM='pl', GND='m']] -> 'boys'
N[AGR=[NUM='sg', GND='f']] -> 'girl'
N[AGR=[NUM='pl', GND='f']] -> 'girls'
N[AGR=[NUM='sg']] -> 'student'
N[AGR=[NUM='pl']] -> 'students'
"""
grammar = grammar.FeatureGrammar.fromstring(g)
tokens = 'these girls'.split()
parser = parse.FeatureEarleyChartParser(grammar)
trees = parser.parse(tokens)
for tree in trees: print(tree)
(Source: http://www.nltk.org/howto/featgram.html)
...I get the following error:
File "test_fcfg.py", line 18, in <module>
grammar = grammar.FeatureGrammar.fromstring(g)
File "/Library/Python/2.7/site-packages/nltk/grammar.py", line 796, in fromstring
encoding=encoding)
File "/Library/Python/2.7/site-packages/nltk/grammar.py", line 1270, in read_grammar
productions += _read_production(line, nonterm_parser, probabilistic)
File "/Library/Python/2.7/site-packages/nltk/grammar.py", line 1220, in _read_production
return [Production(lhs, rhs) for rhs in rhsides]
File "/Library/Python/2.7/site-packages/nltk/grammar.py", line 270, in __init__
self._hash = hash((self._lhs, self._rhs))
File "/Library/Python/2.7/site-packages/nltk/grammar.py", line 203, in __hash__
self.freeze()
File "/Library/Python/2.7/site-packages/nltk/featstruct.py", line 373, in freeze
self._freeze(set())
File "/Library/Python/2.7/site-packages/nltk/featstruct.py", line 395, in _freeze
for (fname, fval) in sorted(self._items()):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/functools.py", line 56, in <lambda>
'__lt__': [('__gt__', lambda self, other: other < self),
...
...
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/functools.py", line 56, in <lambda>
'__lt__': [('__gt__', lambda self, other: other < self),
RuntimeError: maximum recursion depth exceeded while calling a Python object
(The ellipses ... stand for many repetitions of the lines immediately before and after them.)
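To make the repeated frame concrete, here is a minimal sketch of how a `>` that is defined as `other < self` can call itself without end. The class `Item` and the helper `compare_recurses` are hypothetical names made up for illustration; nothing here is taken from NLTK, and I have not established that this is what actually happens inside featstruct.py.

```python
class Item(object):
    """Hypothetical class for illustration only; not NLTK code."""

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        if not isinstance(other, Item):
            # Decline the comparison so Python tries the other operand.
            return NotImplemented
        return self.value < other.value

    # Same shape as the lambda in the traceback: '>' is derived from '<'
    # by swapping the operands.
    __gt__ = lambda self, other: other < self


def compare_recurses(a, b):
    """Return True if evaluating `a > b` exhausts the recursion limit."""
    try:
        a > b
    except RuntimeError:  # RecursionError is a RuntimeError subclass on Python 3.5+
        return True
    return False


# Two Items compare normally; an Item against a str bounces between
# the reflected '>' and the swapped '<' until the limit is hit.
print(compare_recurses(Item(1), Item(2)))
print(compare_recurses(Item(1), "x"))
```

In the mixed-type case the str declines the `<`, Python falls back to the Item's reflected `>`, and that `>` asks for `other < self` again, so the same frame repeats just as in the traceback.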
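Googling has not been much use; in fact, it turns up very little on nltk errors in general, which surprises me.
My reading of the error message is that, for some reason, grammar.FeatureGrammar.fromstring(g) gets caught in what looks like infinite recursion. Raising the recursion limit with the sys module does not help at all; I just wait a little longer before seeing the same error message.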
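For reference, this is roughly how I raised the limit, as a minimal sketch. The function `recurse_forever` is a hypothetical stand-in for a call chain that never bottoms out (it is not the NLTK code), and the limits are arbitrary values chosen for the example.

```python
import sys


def recurse_forever(depth=0):
    # Hypothetical stand-in for a call chain with no base case.
    return recurse_forever(depth + 1)


def fails_at_limit(limit):
    """Return True if the unbounded recursion still fails at this limit."""
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        recurse_forever()
    except RuntimeError:  # RecursionError is a RuntimeError subclass on Python 3.5+
        return True
    finally:
        # Always restore the interpreter's previous limit.
        sys.setrecursionlimit(old_limit)
    return False


for limit in (2000, 4000):
    print(limit, fails_at_limit(limit))
```

If the recursion really has no base case, a higher limit only postpones the same error, which matches what I see.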
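I have noticed that, in other nltk examples, modules seem to have moved; for example, the text "Natural Language Processing with Python" frequently uses commands of the form 'lp = nltk.LogicParser()', but that class seems to have moved to nltk.sem.logic.LogicParser(). However, this does not appear to be the cause of the current problem.
Is there a well-known or obvious cause for this error message documented in nltk? And, possibly, a fix?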