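Using WordNet to find synonyms, definitions and example sentences

I need to read an input text file that contains a single word, then use WordNet to find the lemma names, definition and examples for each synset of that word. I have gone through the books "Python Text Processing with NLTK 2.0 Cookbook" and "Natural Language Processing using NLTK" to help me with this. I understand how to do it interactively in the terminal, but I can't get the same result from a script written in a text editor.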
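For example, if the input text contains the word "flabbergasted", the output needs to look like this:
flabbergasted
(verb) flabbergast, boggle, bowl over - overcome with amazement; "This boggles the mind!"
(adjective) dumbfounded, dumfounded, flabbergasted, stupefied, thunderstruck, dumbstruck, dumbstricken - as if struck dumb with astonishment and surprise; "a circle of policement stood dumbfounded by her denial of having seen the accident"; "the flabbergasted aldermen were speechless"; "was thunderstruck by the news of his promotion"
The synsets, definitions and example sentences are taken directly from WordNet.
I have the following code: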
from __future__ import division
import nltk
from nltk.corpus import wordnet as wn

# Written for Python 2 and NLTK 2.x. In NLTK 3.x, lemmas, name, definition
# and examples are methods and must be called, e.g. s.lemmas(), l.name().

tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
fp = open("inpsyn.txt")
data = fp.read()
fp.close()

# split the input text into sentences
print '\n-----\n'.join(tokenizer.tokenize(data))

# split the text into word tokens
tokens = nltk.wordpunct_tokenize(data)
text = nltk.Text(tokens)
words = [w.lower() for w in text]
print words  # print the tokens

for a in words:
    print a
    syns = wn.synsets(a)
    print "synsets:", syns
    for s in syns:
        # lemma names first, then the definition and examples of the synset
        for l in s.lemmas:
            print l.name
        print s.definition
        print s.examples
I get the following output:
flabbergasted
['flabbergasted']
flabbergasted
synsets: [Synset('flabbergast.v.01'), Synset('dumbfounded.s.01')]
flabbergast
boggle
bowl_over
overcome with amazement
['This boggles the mind!']
dumbfounded
dumfounded
flabbergasted
stupefied
thunderstruck
dumbstruck
dumbstricken
as if struck dumb with astonishment and surprise
['a circle of policement stood dumbfounded by her denial of having seen the accident', 'the flabbergasted aldermen were speechless', 'was thunderstruck by the news of his promotion']
Is there a way to retrieve the part of speech along with the group of lemma names?
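One possible way to get this is sketched below. It is only a sketch, and it assumes Python 3 with NLTK 3.x, where `pos()`, `lemma_names()`, `definition()` and `examples()` are methods on a synset (the code above uses the older attribute style). The names `POS_LABELS`, `format_synset` and `describe` are made up for illustration and are not part of NLTK. The part of speech comes back as a one-letter tag, so the sketch maps it to a readable label and folds the satellite-adjective tag `s` into "adjective".

```python
# Map WordNet's one-letter part-of-speech tags to readable labels.
POS_LABELS = {
    "n": "noun",
    "v": "verb",
    "a": "adjective",
    "s": "adjective",  # satellite adjective, e.g. dumbfounded.s.01
    "r": "adverb",
}


def format_synset(synset):
    """Return one line: (pos) lemma, lemma - definition; "example"; ..."""
    # Fall back to the raw tag if it is not in the table.
    label = POS_LABELS.get(synset.pos(), synset.pos())
    # WordNet joins multi-word lemmas with underscores (bowl_over).
    names = ", ".join(name.replace("_", " ") for name in synset.lemma_names())
    parts = [synset.definition()]
    parts.extend('"{}"'.format(example) for example in synset.examples())
    return "({}) {} - {}".format(label, names, "; ".join(parts))


def describe(word):
    """Look the word up in WordNet and format every synset found."""
    # Imported here so format_synset can be reused without NLTK installed.
    # Assumes the WordNet corpus is available: nltk.download("wordnet")
    from nltk.corpus import wordnet as wn

    lines = [word]
    lines.extend(format_synset(s) for s in wn.synsets(word))
    return "\n".join(lines)
```

With the WordNet corpus downloaded, `print(describe("flabbergasted"))` should print the word followed by one line per synset, in the layout asked for above. The same `describe` call could be applied to each token read from `inpsyn.txt`.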
If you log back in, you should accept Andrey's answer, especially since he not only answered but also responded to your comments to help you. – 2012-11-25 21:02:49