Get the best synonym of a word using WordNet

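I have written code that gets synonyms from WordNet, and it returns the complete list of synonyms for each word. Now I want my code to pick the appropriate synonym from that list based on the sentence.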

For example, given the sentence "I am his older brother", I have to find the best synonym for each word based on that sentence.

Let's take "older". WordNet gives the following list of synonyms for "older":

['older', 'one-time', 'former', 'sr.', 'onetime', 'erstwhile', 'honest-to-god', 'old', 'aged', 'previous', 'sure-enough', 'elder', 'senior', 'elderly', 'sometime', 'honest-to-goodness', 'past', ...]

Based on this sentence, the best synonym in the list is 'elder', so that is the one that should be selected.

How can I do that?

Code to get the synonyms:

from nltk.tokenize import word_tokenize
from nltk.tag import pos_tag
from nltk.corpus import wordnet as wn

def tag(sentence):
    # Tokenize the sentence and attach a Penn Treebank POS tag to each token.
    words = word_tokenize(sentence)
    words = pos_tag(words)
    return words

def paraphraseable(tag):
    # Only nouns, base-form verbs and adjectives are candidates for replacement.
    return tag.startswith('NN') or tag == 'VB' or tag.startswith('JJ')

def pos(tag):
    # Map a Penn Treebank tag to a WordNet part of speech.
    if tag.startswith('NN'):
        return wn.NOUN
    elif tag.startswith('V'):
        return wn.VERB
    # Any other tag (e.g. 'JJ') yields None, so wn.synsets()
    # searches all parts of speech for that word.
    return None

def synonyms(word, tag):
    # Collect the lemma names of every synset of the word.
    lemma_lists = [ss.lemmas() for ss in wn.synsets(word, pos(tag))]
    lemmas = [lemma.name() for lemma in sum(lemma_lists, [])]
    return set(lemmas)

def synonymIfExists(sentence):
    for (word, t) in tag(sentence):
        if paraphraseable(t):
            syns = synonyms(word, t)
            if syns:
                if len(syns) > 1:
                    yield [word, list(syns)]
                    continue
        # Not paraphraseable, or no synonym other than the word itself.
        yield [word, []]

def paraphrase(sentence):
    return [x for x in synonymIfExists(sentence)]

get = paraphrase("I am his older brother")
print("paraphrase", get)

Source

2017-05-25 anashamidkh

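Why is "elder" the best? (That is, what is the criterion for "best", or what algorithm would you use to decide it?) (By the way, I think "big brother" is the best synonym for "older brother", but it isn't even in your list!)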

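Synsets list synonyms without regard to how frequently they occur in natural language or in a specific context. To cover these two missing aspects, I would lean towards a bidirectional prediction model and check which words from the synset occur next to the left context of the word to be replaced. Likewise, you can explore the right context and/or longer contexts.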
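As a rough illustration of the context idea, here is a minimal sketch that scores each candidate by how often it appears next to the left and right neighbours of the word being replaced. The helper names (`bigram_counts`, `pick_by_context`) and the tiny corpus are made up for this example; in practice the counts would come from a large corpus, and this plain bigram count is only one simple way to approximate a context model.

```python
from collections import Counter

def bigram_counts(tokens):
    """Count adjacent word pairs (left, right) in a list of tokens."""
    return Counter(zip(tokens, tokens[1:]))

def pick_by_context(candidates, left, right, counts):
    """Return the candidate seen most often between `left` and `right`.

    The score is count(left, candidate) + count(candidate, right).
    Returns None when no candidate was ever seen in either position.
    """
    def score(cand):
        return counts[(left, cand)] + counts[(cand, right)]

    best = max(candidates, key=score)
    return best if score(best) > 0 else None

# Toy corpus invented purely for illustration (an assumption, not real data).
toy_corpus = (
    "he is my elder brother . she met his elder brother . "
    "his former teacher left . an aged man sat down . "
    "his elder sister called ."
).split()

counts = bigram_counts(toy_corpus)
candidates = ["former", "aged", "elder", "senior"]

# Context of "older" in "I am his older brother": left = "his", right = "brother".
best = pick_by_context(candidates, left="his", right="brother", counts=counts)
print(best)  # elder
```

With these toy counts, "elder" wins because both "his elder" and "elder brother" occur in the corpus, while "former" only matches on the left side.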

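Another, simpler approach is to impose a frequency ordering on the WordNet synonyms, based on word frequencies in a sufficiently large corpus. The assumption is that how often a word occurs in the corpus is a reasonable hint of how appropriate it is as a synonym.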
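The frequency idea could be sketched as follows. The function name `rank_by_frequency` and the toy corpus are assumptions made for this example only; a real setup would count words in a large corpus instead.

```python
from collections import Counter

def rank_by_frequency(candidates, corpus_tokens):
    """Sort candidates from most to least frequent in the corpus.

    Candidates with equal counts keep their original relative order.
    """
    freq = Counter(tok.lower() for tok in corpus_tokens)
    return sorted(candidates, key=lambda w: freq[w.lower()], reverse=True)

# Toy corpus invented purely for illustration (an assumption, not real data).
toy_corpus = (
    "the old house and the old road were near the elder tree "
    "while a senior officer and an old friend watched"
).split()

candidates = ["erstwhile", "senior", "old", "elder"]
ranked = rank_by_frequency(candidates, toy_corpus)
print(ranked)  # ['old', 'senior', 'elder', 'erstwhile']
```

Note that this ordering ignores the sentence entirely: the most frequent candidate is ranked first whatever the context, which is the trade-off for the simplicity of the approach.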

Source

2017-05-25 11:00:11 sophros
