Getting the list index of Python fuzzywuzzy matches

I am using Python's fuzzywuzzy to find matches in a list of sentences:

from fuzzywuzzy import fuzz, process

def getMatches(needle):
    return process.extract(needle, bookSentences, scorer=fuzz.token_sort_ratio, limit=3)

I want to print each match together with the sentences around it:

for match in matches:
    matchIndex = bookSentences.index(match)
    sentenceIndices = range(matchIndex-2,matchIndex+2)
    for index in sentenceIndices:
        print bookSentences[index],
    print '\n\n'

Unfortunately, the script cannot find the match in the original list:

ValueError: (u'Thus, in addition to the twin purposes mentioned above, this book is written for at least two groups: 1.', 59) is not in list

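Is there a better way to find the index of a match in the original list? Can fuzzywuzzy somehow give it to me? There does not seem to be anything about it in the readme.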

How can I get the index in the original list of the matches returned by fuzzywuzzy?

Source

2015-12-02 Nathan Arthur

I feel a bit dumb. fuzzywuzzy returns tuples that include the score, not just the match. The solution:

for match in matches:
    # each match is a (sentence, score) tuple, so look up the sentence only
    matchIndex = bookSentences.index(match[0])
    sentenceIndices = range(matchIndex-2,matchIndex+2)
    for index in sentenceIndices:
        print bookSentences[index],
    print '\n\n'
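
Two things can still trip up the lookup above: `list.index` returns only the first occurrence when a sentence appears more than once, and `matchIndex-2` can go negative near the start of the list, which makes Python index from the end. A minimal sketch of one way around both, written for Python 3, is to score the sentences yourself and carry the index along. The names `simple_ratio`, `get_matches_with_index`, and `context` are hypothetical, and `simple_ratio` is only a standard-library stand-in so the sketch runs without fuzzywuzzy installed; a two-argument scorer such as `fuzz.token_sort_ratio` could presumably be passed instead. Unlike `process.extract`, this sketch does no preprocessing of the strings.

```python
import heapq
from difflib import SequenceMatcher


def simple_ratio(a, b):
    """Stand-in scorer (0-100); an assumption, not fuzzywuzzy's own scoring."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def get_matches_with_index(needle, sentences, scorer=simple_ratio, limit=3):
    """Return up to `limit` (index, sentence, score) triples, best score first."""
    scored = ((i, s, scorer(needle, s)) for i, s in enumerate(sentences))
    return heapq.nlargest(limit, scored, key=lambda triple: triple[2])


def context(sentences, index, before=2, after=2):
    """Return the sentences around `index`, clamped to the list bounds."""
    start = max(index - before, 0)
    return sentences[start:index + after + 1]


# Made-up sample data for illustration only.
book_sentences = [
    "This book has twin purposes.",
    "It is written for at least two groups.",
    "The first group is students.",
    "The second group is practitioners.",
    "Both groups should read chapter one.",
]

for index, sentence, score in get_matches_with_index(
        "written for two groups", book_sentences, limit=1):
    print(index, score)
    print(" ".join(context(book_sentences, index)))
```

Because the index travels with each result, no second search through the list is needed, and duplicate sentences keep their own positions.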

Source

2015-12-02 17:40:19

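This only works for the `process.extract` method, and only because the returned matches are guaranteed to be in the list. I am using `fuzzywuzzy` to search for a substring in a long piece of text with `fuzz.partial_ratio`, which returns only a score. I think I will have to look into SequenceMatcher for my purposes.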

[This post](http://stackoverflow.com/a/31433394/721305) seems like a good option.
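
For the substring case raised in the comments, here is a rough sketch of how the standard library's `difflib.SequenceMatcher` could report a position as well as a score. It is a brute-force sliding window, not what `fuzz.partial_ratio` does internally and not necessarily the approach taken in the linked post; the function name `find_fuzzy_substring` and the sample text are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def find_fuzzy_substring(needle, text):
    """Return (start, end, score) for the window of len(needle) characters
    in `text` that is most similar to `needle`; score is between 0.0 and 1.0."""
    width = len(needle)
    if width == 0 or width > len(text):
        # Nothing to slide over: compare against the whole text instead.
        return 0, len(text), SequenceMatcher(None, needle, text).ratio()

    matcher = SequenceMatcher(None, autojunk=False)
    matcher.set_seq2(needle)  # the second sequence is analysed once and cached
    best = (0, width, -1.0)
    for start in range(len(text) - width + 1):
        matcher.set_seq1(text[start:start + width])
        score = matcher.ratio()
        if score > best[2]:  # strict comparison keeps the earliest best window
            best = (start, start + width, score)
    return best


text = "The quick brown fox jumps over the lazy dog"
start, end, score = find_fuzzy_substring("brwn fox jmps", text)
print(start, end, round(score, 2), repr(text[start:end]))
```

The window is fixed at the needle's length, so insertions or deletions shift the reported span by a few characters, and the run time grows with the text length times the cost of each comparison. For long texts a faster pre-filter would likely be needed.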
