How do I build a list of all the words that follow each word in a file?

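I'm trying to build a random sentence generator using a Markov chain, but I've run into a problem building the list of words that follow each word in a file. The code I've been trying to use is: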

word_list = [spot+1 for spot in words if spot == word]

I have also tried variations such as:

word_list = [words[spot+1] for spot in words if spot == word]

But every time, I get the error:

TypeError: Can't convert 'int' object to str implicitly

How can I correctly add the words that follow a given word to a list? I feel like there's an obvious solution that I'm just not seeing.

Source

2016-11-11 nalydttirrem

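Is 'spot' a string? If so, what are you trying to accomplish by adding 1 to it? – n1c9

spot is a string, and I'm adding 1 to it to get the word that follows it in the list. – nalydttirrem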

You're only telling it to add 1 to the string, not to its index in the list. So if the word appears more than once, you'd have to write 'word_list = [words[word_list.index(spot) + 1] for spot in words if spot == word]' – n1c9
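
To make the cause of the error concrete, here is a minimal sketch, assuming the same sample `words` list used in the answer that follows. Iterating over a list yields the strings themselves, so `spot + 1` is a str-plus-int operation. The index-based variant with `enumerate()` is only one possible fix, and the `followers` and `error_message` names are illustrative:

```python
words = ['the', 'enemy', 'of', 'my', 'enemy', 'is', 'my', 'friend']
word = 'my'

# Iterating over a list yields its elements (strings), not positions,
# so `spot + 1` tries to add an int to a str and raises TypeError.
try:
    [spot + 1 for spot in words if spot == word]
except TypeError as exc:
    error_message = str(exc)

# enumerate() yields (index, element) pairs, so the index can be used
# to look up the following word. The bounds check skips a match in the
# last position, which has no successor.
followers = [
    words[i + 1]
    for i, spot in enumerate(words)
    if spot == word and i + 1 < len(words)
]
print(followers)  # ['enemy', 'friend']
```

The exact wording of the TypeError message depends on the Python version.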

The key is to iterate over pairs of words, rather than over individual words:

words = ['the', 'enemy', 'of', 'my', 'enemy', 'is', 'my', 'friend'] 
word = 'my' 

[next_word for this_word, next_word in zip(words, words[1:]) if this_word == word]

Result:

['enemy', 'friend']

This approach relies on Python's zip() function and on slicing.

words[1:] is a copy of words with the first element left out:

>>> words[1:] 
['enemy', 'of', 'my', 'enemy', 'is', 'my', 'friend']

...so when you zip the original words with it, you get a list of pairs:

>>> list(zip(words, words[1:])) 
[('the', 'enemy'), 
('enemy', 'of'), 
('of', 'my'), 
('my', 'enemy'), 
('enemy', 'is'), 
('is', 'my'), 
('my', 'friend')]
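
One detail worth spelling out: zip() stops at the shorter of its inputs, which is why the last word never needs special handling. A small sketch using the same sample list (the `pairs` name is just for illustration):

```python
words = ['the', 'enemy', 'of', 'my', 'enemy', 'is', 'my', 'friend']

pairs = list(zip(words, words[1:]))

# words has 8 items and words[1:] has 7, so zip() yields 7 pairs.
print(len(words), len(words[1:]), len(pairs))  # 8 7 7

# The final word never appears as the first element of a pair,
# so it contributes no follower and no IndexError can occur.
print(pairs[-1])  # ('my', 'friend')
```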

Once you have that, your list comprehension only needs to return the second word of each pair whenever the first word is the one you are looking for:

word = 'enemy' 

[next_word for this_word, next_word in zip(words, words[1:]) if this_word == word]

Result:

['of', 'is']
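
Since the question asks about every word rather than a single one, the same pairing idea can be run once to fill a dictionary. This is only a sketch under a few assumptions: `build_followers` is a hypothetical helper name, `collections.defaultdict` is one convenient choice among several, and the text is given as a string here instead of being read from the asker's file:

```python
from collections import defaultdict


def build_followers(words):
    """Map each word to the list of words that directly follow it."""
    followers = defaultdict(list)
    # Walk over adjacent pairs once and group successors by first word.
    for this_word, next_word in zip(words, words[1:]):
        followers[this_word].append(next_word)
    return dict(followers)


# Stand-in for the file contents; splitting on whitespace is an assumption.
text = "the enemy of my enemy is my friend"
words = text.split()

followers = build_followers(words)
print(followers['my'])     # ['enemy', 'friend']
print(followers['enemy'])  # ['of', 'is']
```

Repeated successors are kept in each list on purpose, so picking with random.choice() from a word's list would favour the words that follow it more often, which is the behaviour a Markov chain generator usually wants. A word that only occurs at the very end of the text gets no entry.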

Source

2016-11-13 12:44:08
