2017-05-09 87 views

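How do I get TextBlob to work for all users on Ubuntu?

I'm trying to get TextBlob up and running on a Unix server so that some teammates can use it. When I run a script that uses TextBlob as root, it seems to work fine, but when I try it from a new account I created, I get the following error: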

********************************************************************** 
    Resource u'tokenizers/punkt/english.pickle' not found. Please 
    use the NLTK Downloader to obtain the resource: >>> 
    nltk.download() 
    Searched in: 
    - '/home/USERNAME/nltk_data' 
    - '/usr/share/nltk_data' 
    - '/usr/local/share/nltk_data' 
    - '/usr/lib/nltk_data' 
    - '/usr/local/lib/nltk_data' 
    - u'' 
********************************************************************** 
Traceback (most recent call last): 
    File "sampleClassifier.py", line 25, in <module> 
    cl = NaiveBayesClassifier(train) 
    File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 192, in __init__ 
    self.train_features = [(self.extract_features(d), c) for d, c in self.train_set] 
    File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 169, in extract_features 
    return self.feature_extractor(text, self.train_set) 
    File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 81, in basic_extractor 
    word_features = _get_words_from_dataset(train_set) 
    File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 63, in _get_words_from_dataset 
    return set(all_words) 
    File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 62, in <genexpr> 
    all_words = chain.from_iterable(tokenize(words) for words, _ in dataset) 
    File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 59, in tokenize 
    return word_tokenize(words, include_punc=False) 
    File "/usr/local/lib/python2.7/dist-packages/textblob/tokenizers.py", line 72, in word_tokenize 
    for sentence in sent_tokenize(text)) 
    File "/usr/local/lib/python2.7/dist-packages/textblob/base.py", line 64, in itokenize 
    return (t for t in self.tokenize(text, *args, **kwargs)) 
    File "/usr/local/lib/python2.7/dist-packages/textblob/decorators.py", line 38, in decorated 
    raise MissingCorpusError() 
textblob.exceptions.MissingCorpusError: 
Looks like you are missing some required data for this feature. 

To download the necessary data, simply run 

    python -m textblob.download_corpora 

or use the NLTK downloader to download the missing data: http://nltk.org/data.html 
If this doesn't fix the problem, file an issue at https://github.com/sloria/TextBlob/issues. 

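The machine we're using is very small, so I can't afford to overload it by downloading the corpora several times for different users. Does anyone know how I might solve this? I have already installed the data for root, but I don't know where the packages are located or how to find them.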


Did you install it in a custom location? By default it goes to '/usr/share/nltk_data', and your code is searching in that same folder. – Rubbal


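I ran 'pip install textblob' and it came back saying "Requirement already satisfied", so apparently the server already has it? I don't know where it is, though. – unicornication32232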

Answer


Following the instructions in the docs should work. Try setting the NLTK_DATA environment variable and see whether it works for the new user.
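As a rough illustration of that suggestion, here is a minimal sketch that looks for an existing copy of the punkt tokenizer data and points NLTK_DATA at it before TextBlob is imported. The candidate list is copied from the "Searched in" section of the error above, plus '/root/nltk_data', which is only a guess at where root's download might have ended up. The helper names find_nltk_data and configure_nltk_data are made up for this example and are not part of NLTK or TextBlob. Note that the chosen directory must be readable by the other accounts, which is usually not true of root's home directory, so the data may first need to be moved or copied once to a shared location.

```python
import os

# Directories from the "Searched in" list of the error message, plus root's
# home directory as an assumption about where root's download may have gone.
CANDIDATE_DIRS = [
    "/usr/share/nltk_data",
    "/usr/local/share/nltk_data",
    "/usr/lib/nltk_data",
    "/usr/local/lib/nltk_data",
    "/root/nltk_data",  # assumption, not taken from the original post
]


def find_nltk_data(candidates, resource="tokenizers/punkt"):
    """Return the first candidate directory containing the resource, or None."""
    for directory in candidates:
        if os.path.isdir(os.path.join(directory, resource)):
            return directory
    return None


def configure_nltk_data(candidates=CANDIDATE_DIRS):
    """Set NLTK_DATA to a directory that already holds the data.

    Must run before nltk or textblob is imported, because NLTK reads the
    NLTK_DATA environment variable when it is first imported.
    """
    shared = find_nltk_data(candidates)
    if shared is not None:
        os.environ["NLTK_DATA"] = shared
    return shared


# Intended use at the very top of a script (hypothetical):
#   configure_nltk_data()
#   from textblob.classifiers import NaiveBayesClassifier
```

Setting the variable once for everyone, for example by exporting NLTK_DATA from a system-wide shell profile, would avoid touching each script; the Python version above is just one way to try the idea out for a single user.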


That did the trick, thanks a lot for pointing me in the right direction! – unicornication32232