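I'm getting an error I don't understand when trying to run some Python code. I'm learning to use the Natural Language Toolkit by working through the excellent NLTK textbook. When I run the code below (Figure 2.1 from the book, adapted to my own data), I get the error shown after it.
The code I ran: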
import os, re, csv, string, operator
import nltk
from nltk.corpus import PlaintextCorpusReader
dir = '/Dropbox/hearings'
corpus_root = dir
text = PlaintextCorpusReader(corpus_root, ".*")
cfd = nltk.ConditionalFreqDist(
    (target, fileid[:3])
    for fileid in text.fileids()
    for w in text.words(fileid)
    for target in ['budget','appropriat']
    if w.lower().startswith(target))
cfd.plot()
The error I receive (full traceback):
In [6]: ---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-6-abc9ff8cb2f1> in <module>()
----> 1 execfile(r'/Dropbox/hearings/hearings_ingest.py') # PYTHON-MODE
/Dropbox/hearings/hearings_ingest.py in <module>()
14 cfd = nltk.ConditionalFreqDist(
15 (target, fileid[:3])
---> 16 for fileid in text.fileids()
17 for w in text.words(fileid)
18 for target in ['budget','appropriat']
/Users/ian/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/nltk/probability.pyc in __init__(self, cond_samples)
1727 defaultdict.__init__(self, FreqDist)
1728 if cond_samples:
-> 1729 for (cond, sample) in cond_samples:
1730 self[cond].inc(sample)
1731
/Dropbox/hearings/hearings_ingest.py in <genexpr>((fileid,))
15 (target, fileid[:3])
16 for fileid in text.fileids()
---> 17 for w in text.words(fileid)
18 for target in ['budget','appropriat']
19 if w.lower().startswith(target))
/Users/ian/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/nltk/corpus/reader/util.pyc in iterate_from(self, start_tok)
341
342 # If we reach this point, then we should know our length.
--> 343 assert self._len is not None
344
345 # Use concat for these, so we can use a ConcatenatedCorpusView
AssertionError:
In [7]:
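I included the next IPython prompt line to show that this is the complete error. (In reading other questions, I noticed that "AssertionError:" is often followed by more information; in my error it is blank.)
I'd appreciate any help understanding the error in my code. Thanks!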
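Thank you very much! That did the trick. I'm working with about 13,000 files, and I had wrongly assumed that all of them had a non-zero file size. I suppose I should have thought of that once I saw that the error was raised by the assertion that `_len` is not None.
Right. Although the 'AssertionError' carried no message, the traceback was useful. – unutbu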
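As the comments above report, the assertion was triggered by zero-byte files in the corpus directory. One way to avoid them is to collect only the non-empty files before building the reader. The following is a minimal sketch under that assumption; `nonempty_fileids` is a hypothetical helper written for illustration, not part of NLTK, and it uses only the standard library.

```python
import os


def nonempty_fileids(corpus_root):
    """Return paths, relative to corpus_root, of all files larger than 0 bytes.

    Hypothetical helper (not an NLTK function). Paths use forward slashes
    so they can serve as corpus file IDs on any platform.
    """
    fileids = []
    for dirpath, _dirnames, filenames in os.walk(corpus_root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Skip empty files, which the comments identify as the cause
            # of the AssertionError.
            if os.path.getsize(full_path) > 0:
                rel_path = os.path.relpath(full_path, corpus_root)
                fileids.append(rel_path.replace(os.sep, "/"))
    return sorted(fileids)
```

The resulting list could then be passed in place of the ".*" pattern, for example `PlaintextCorpusReader(corpus_root, nonempty_fileids(corpus_root))`, since the reader accepts an explicit list of file IDs as well as a regular expression.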