What is NLTK POS tagger asking me to download?

I have just started using a part-of-speech tagger and I am running into a lot of problems. What is the NLTK POS tagger asking me to download?

I started POS tagging with the following:

import nltk 
text=nltk.word_tokenize("We are going out.Just you and me.")

When I try to print the POS tags for 'text', the following happens:

print nltk.pos_tag(text) 
Traceback (most recent call last): 
File "<stdin>", line 1, in <module> 
File "F:\Python26\lib\site-packages\nltk\tag\__init__.py", line 63, in pos_tag 
tagger = nltk.data.load(_POS_TAGGER) 
File "F:\Python26\lib\site-packages\nltk\data.py", line 594, in load 
resource_val = pickle.load(_open(resource_url)) 
File "F:\Python26\lib\site-packages\nltk\data.py", line 673, in _open 
return find(path).open() 
File "F:\Python26\lib\site-packages\nltk\data.py", line 455, in find 
    raise LookupError(resource_not_found)
LookupError: 
Resource 'taggers/maxent_treebank_pos_tagger/english.pickle' not 
found. Please use the NLTK Downloader to obtain the resource: 

>>> nltk.download(). 

Searched in: 
    - 'C:\\Documents and Settings\\Administrator/nltk_data' 
    - 'C:\\nltk_data' 
    - 'D:\\nltk_data' 
    - 'E:\\nltk_data' 
    - 'F:\\Python26\\nltk_data' 
    - 'F:\\Python26\\lib\\nltk_data' 
    - 'C:\\Documents and Settings\\Administrator\\Application Data\\nltk_data'

I used nltk.download(), but it did not work.

2011-12-21 Pearl

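Why did you put all of the text in bold? That really isn't necessary. Also, please post a minimal but complete example that demonstrates your error. – 2011-12-21 13:22:34

There, I cleaned it up for you. Please use this as an example of how to format future questions. – 2011-12-21 13:26:40

Thanks... the problem is solved now... – Pearl 2011-12-21 19:06:17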

When you type nltk.download() in Python, the NLTK Downloader interface is displayed automatically.
Click on Models and select maxent_treebank_pos_. It is installed automatically.

import nltk 
text=nltk.word_tokenize("We are going out.Just you and me.") 
print nltk.pos_tag(text) 
[('We', 'PRP'), ('are', 'VBP'), ('going', 'VBG'), ('out.Just', 'JJ'), 
('you', 'PRP'), ('and', 'CC'), ('me', 'PRP'), ('.', '.')]

2011-12-22 04:43:48 Pearl

+16

Also, you can download it directly from your code if you specify the package name: nltk.download('maxent_treebank_pos_tagger'). See this post: http://stackoverflow.com/a/5208563/62921 – ForceMagic 2013-03-27 21:23:54
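As a hedged illustration of the comment above: the package name to pass to nltk.download() can be read off the LookupError text itself. The sketch below handles only the old-style message shown in the question ("Resource 'taggers/.../english.pickle' not found"); the helper name missing_package is hypothetical and not part of NLTK, and other NLTK versions may word the message differently.

```python
import re

# Matches the old-style message shown in the question, e.g.
#   Resource 'taggers/maxent_treebank_pos_tagger/english.pickle' not found.
# Group 1 is the category ("taggers"), group 2 is the package id.
_RESOURCE = re.compile(r"Resource\s+'([^'/]+)/([^'/]+)")


def missing_package(error_text):
    """Return the package id named in the LookupError text, or None.

    Hypothetical helper for illustration; it is not an NLTK function.
    """
    match = _RESOURCE.search(error_text)
    return match.group(2) if match else None


if __name__ == "__main__":
    message = (
        "Resource 'taggers/maxent_treebank_pos_tagger/english.pickle' not\n"
        "found. Please use the NLTK Downloader to obtain the resource:"
    )
    print(missing_package(message))  # maxent_treebank_pos_tagger
```

The returned string is what would be passed to nltk.download(...), as the comment describes.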

import nltk 
text = "Obama delivers his first speech." 

sent = nltk.sent_tokenize(text) 


loftags = [] 
for s in sent: 
    d = nltk.word_tokenize(s) 

    print nltk.pos_tag(d)

Result:

akshayy@ubuntu:~/SUMM$ python nn1.py
[('Obama', 'NNP'), ('delivers', 'NNS'), ('his', 'PRP$'), ('first', 'JJ'), ('speech', 'NN'), ('.', '.')]

(I just asked another question in which this code is used)

2013-04-04 19:18:27 akshayb

It is worth noting that this parse is incorrect: the POS tagger has tagged "delivers" as a plural noun... – simon 2015-07-16 22:15:21

nltk.download()

Click on Models and select maxent_treebank_pos_. It is installed automatically.

import nltk 
text=nltk.word_tokenize("We are going out.Just you and me.") 
print nltk.pos_tag(text) 
[('We', 'PRP'), ('are', 'VBP'), ('going', 'VBG'), ('out.Just', 'JJ'), 
('you', 'PRP'), ('and', 'CC'), ('me', 'PRP'), ('.', '.')]

2013-08-13 13:40:55

From the shell/terminal, you can use:

python -m nltk.downloader maxent_treebank_pos_tagger

(you may need sudo on Linux)

It will install maxent_treebank_pos_tagger (i.e. the standard Treebank POS tagger) for NLTK and solve your problem.
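A minimal sketch of the same shell invocation built from Python, assuming the goal is to run the downloader with the interpreter that has NLTK installed. The function name downloader_command and its target_dir parameter are hypothetical; the -d option is the downloader's flag for choosing the download directory, and pointing it at a directory your user can write to is one way to avoid needing sudo.

```python
import sys


def downloader_command(package, target_dir=None):
    """Build the argv list for `python -m nltk.downloader <package>`.

    Hypothetical helper: it only builds the command and does not run it.
    """
    # sys.executable keeps the command tied to the current interpreter.
    command = [sys.executable, "-m", "nltk.downloader"]
    if target_dir is not None:
        # -d selects the directory the data is written to.
        command += ["-d", target_dir]
    command.append(package)
    return command


if __name__ == "__main__":
    # Show everything after the interpreter path, which varies by machine.
    print(downloader_command("maxent_treebank_pos_tagger")[1:])
    # ['-m', 'nltk.downloader', 'maxent_treebank_pos_tagger']
```

The resulting list can be handed to subprocess.check_call(...) to actually perform the download.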

2015-09-16 05:47:41

For NLTK versions higher than v3.2, please use:

>>> import nltk 
>>> nltk.__version__ 
'3.2.1' 
>>> nltk.download('averaged_perceptron_tagger') 
[nltk_data] Downloading package averaged_perceptron_tagger to 
[nltk_data]  /home/alvas/nltk_data... 
[nltk_data] Package averaged_perceptron_tagger is already up-to-date! 
True

For NLTK versions that use the old MaxEnt model, i.e. v3.1 and below, please use:

>>> import nltk 
>>> nltk.download('maxent_treebank_pos_tagger') 
[nltk_data] Downloading package maxent_treebank_pos_tagger to 
[nltk_data]  /home/alvas/nltk_data... 
[nltk_data] Package maxent_treebank_pos_tagger is already up-to-date! 
True

For more details on the change to the default pos_tag, see https://github.com/nltk/nltk/pull/1143
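The rule in this answer can be sketched as a small lookup on the version string. This is an illustration only: the function name tagger_package is hypothetical, and the cutoff at 3.2 is an assumption, since the answer says "higher than v3.2" and "v3.1 and below" without stating exactly where the switch happened. Other NLTK releases may expect different resource names, so the LookupError message remains the authoritative source.

```python
import re


def tagger_package(version):
    """Pick the tagger package for an NLTK version string such as '3.2.1'.

    Hypothetical helper; the (3, 2) cutoff is an assumption based on
    the answer above, not a documented NLTK rule.
    """
    # Keep only the first two numeric components (major, minor).
    numbers = [int(n) for n in re.findall(r"\d+", version)[:2]]
    major_minor = tuple(numbers + [0] * (2 - len(numbers)))
    if major_minor >= (3, 2):  # assumed cutoff
        return "averaged_perceptron_tagger"
    return "maxent_treebank_pos_tagger"


if __name__ == "__main__":
    print(tagger_package("3.2.1"))  # averaged_perceptron_tagger
    print(tagger_package("3.0.5"))  # maxent_treebank_pos_tagger
```

With NLTK installed, the result could be used as nltk.download(tagger_package(nltk.__version__)).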

2016-06-06 07:01:47 alvas
