2016-04-14 176 views

I am trying to use WordNet as a thesaurus: I have a list of words, and for each word I need to collect its synonyms. I tried this:

from nltk.corpus import wordnet as wn 
for i,j in enumerate(wn.synsets('dog')): 
    print (j.lemma_names) 

This code gives the following output:

<bound method Synset.lemma_names of Synset('dog.n.01')> 
<bound method Synset.lemma_names of Synset('frump.n.01')> 
<bound method Synset.lemma_names of Synset('dog.n.03')> 
<bound method Synset.lemma_names of Synset('cad.n.01')> 
<bound method Synset.lemma_names of Synset('frank.n.02')> 
<bound method Synset.lemma_names of Synset('pawl.n.01')> 
<bound method Synset.lemma_names of Synset('andiron.n.01')> 
<bound method Synset.lemma_names of Synset('chase.v.01')> 

But I want to collect only the synonyms in a list, so the output would look like this:

['frump', 'cad', 'frank', 'pawl', 'andiron', 'chase']


What happens if you change the last line from 'print(j.lemma_names)' to 'print(j.lemma_names())'? – davedwards

Answer


As your output shows, lemma_names is a method, not an attribute, so it has to be called with parentheses. The code below works as you expect:

from nltk.corpus import wordnet as wn 
result = [st.lemma_names()[0] for st in wn.synsets('dog')] 
print(result) 

The output is:

[u'dog', u'frump', u'dog', u'cad', u'frank', u'pawl', u'andiron', u'chase'] 
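
The difference between referencing the method and calling it can be reproduced without NLTK. The sketch below uses a hypothetical stand-in class, `FakeSynset`, which is not part of NLTK and only mimics the one method discussed here:

```python
class FakeSynset:
    """Hypothetical stand-in for a synset; not the real NLTK class."""

    def __init__(self, names):
        self._names = list(names)

    def lemma_names(self):
        # A method, as in the question's output: it must be called to get the names.
        return list(self._names)


synset = FakeSynset(["frump", "dog"])

# Without parentheses you get the bound method object itself.
uncalled = synset.lemma_names
print(uncalled)

# With parentheses the method runs and returns the list of names.
called = synset.lemma_names()
print(called)
```

The first print shows a `<bound method ...>` representation, which is what the question's loop was printing; the second shows the list of names.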

Note that the items in the list are Unicode strings. That is why you see the leading u in the output.
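
For the original goal of a thesaurus over a whole word list, one possible approach is sketched below. Unlike the one-liner above, which keeps only the first lemma of each synset, it gathers every lemma name and drops duplicates and the query word itself. The helper names `collect_synonyms` and `build_thesaurus` are made up for this illustration, and the lookup function is passed in as a parameter so the logic does not depend on how the synsets are obtained:

```python
def collect_synonyms(word, synsets_for):
    """Return every distinct lemma name found in the synsets of `word`.

    `synsets_for` is any callable mapping a word to synset-like objects
    that expose a lemma_names() method, e.g. nltk.corpus.wordnet.synsets.
    """
    synonyms = []
    for synset in synsets_for(word):
        for name in synset.lemma_names():
            # Skip the query word itself and anything already collected.
            if name != word and name not in synonyms:
                synonyms.append(name)
    return synonyms


def build_thesaurus(words, synsets_for):
    """Map each word in `words` to its list of synonyms."""
    return {word: collect_synonyms(word, synsets_for) for word in words}


# Usage with NLTK (assumes nltk and its WordNet corpus are installed):
#     from nltk.corpus import wordnet as wn
#     print(build_thesaurus(['dog', 'cat'], wn.synsets))
```

Keeping the synonyms in a list preserves the order in which WordNet returns them; if order does not matter, a set would make the duplicate check cheaper.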