Match array elements with a string in any order


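I'm new to Python and am trying to find out whether a tweet contains any of the lookup elements.

For example, if the lookup word is cat, it should match cats, and cute kittens should also match with its words in any order. From what I understand so far, I can't find a solution. Any guidance is appreciated.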

import re 
lookup_table = ['cats', 'cute kittens', 'dog litter park'] 
tweets = ['that is a cute cat', 
      'kittens are cute', 
      'that is a cute kitten', 
      'that is a dog litter park', 
      'no wonder that dog park is bad'] 
for tweet in tweets: 
    lookup_found = None 
    print(re.findall(r"(?=(" + '|'.join(lookup_table) + r"))", tweet.lower()))

Output:

['cat'] 
[] 
[] 
['dog litter park'] 
[]

Expected output:

that is a cute cat > cats 
kittens are cute > cute kittens 
that is a cute kitten > cute kittens 
that is a dog litter park > dog litter park 
no wonder that dog park is bad > dog litter park

Source

2017-04-18 user6083088

Use the singular form. –

You should also tell us the output you actually need. –

@KarolyHorvath I'm not sure what you mean – user6083088

For lookup entries that consist of a single word, you can use

if word in tweet

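For lookup entries like "cute kitten", where you expect the words in any order, just split the entry and look for each word in the tweet string.

Here is what I tried. It isn't efficient, but it works. Try running it.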

lookup_table = ['cat', 'cute kitten', 'dog litter park'] 
tweets = ['that is a cute cat', 
      'kittens are cute', 
      'that is a cute kitten', 
      'that is a dog litter park', 
      'no wonder that dog park is bad'] 

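for word in lookup_table:
    # Split multi-word lookup entries into their individual words.
    if " " in word:
        temp = word.split(sep=" ")
    else:
        temp = [word]
    for tweet in tweets:
        # Print the tweet as soon as any one of the words occurs in it.
        for x in temp:
            if x in tweet:
                print(tweet)
                break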
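The loop above prints a tweet as soon as any single word of a lookup entry occurs in it, so "that is a cute cat" is also reported for "cute kitten". As a rough sketch of a stricter variant (not the answerer's code), the version below requires every word of the entry to occur somewhere in the tweet, in any order. The helper name `matching_phrases` is made up for this illustration, and the check is a plain substring test, so "kitten" still matches "kittens".

```python
lookup_table = ['cat', 'cute kitten', 'dog litter park']
tweets = ['that is a cute cat',
          'kittens are cute',
          'that is a cute kitten',
          'that is a dog litter park',
          'no wonder that dog park is bad']


def matching_phrases(tweet, phrases):
    """Return the phrases whose words all occur in the tweet, in any order.

    Hypothetical helper for illustration; uses substring checks only.
    """
    text = tweet.lower()
    # split() with no argument also handles single-word phrases.
    return [p for p in phrases if all(part in text for part in p.split())]


results = {tweet: matching_phrases(tweet, lookup_table) for tweet in tweets}
for tweet, found in results.items():
    print(tweet, '>', found)
```

Under this all-words rule the last tweet matches nothing, because "litter" is missing. That differs from the expected output in the question, which pairs it with "dog litter park".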

Source

2017-04-18 10:16:38

Here is how I would do it. I think lookup_table doesn't have to be so strict, and we can avoid plurals:

import re 
lookup_table = ['cat', 'cute kitten', 'dog litter park'] 
tweets = ['that is a cute cat', 
     'kittens are cute', 
     'that is a cute kitten', 
     'that is a dog litter park', 
     'no wonder that dog park is bad'] 
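for data in lookup_table:
    words = data.split(" ")
    for word in words:
        # Search all tweets at once, joined by commas; each match is the
        # run of word characters and whitespace surrounding the word.
        result = re.findall(r'[\w\s]*' + word + r'[\w\s]*', ','.join(tweets))
        if len(result) > 0:
            print(result)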
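Because the snippet above joins every tweet into one string, its output does not say which lookup entry belongs to which tweet. A possible per-tweet variation, offered only as a sketch, is shown below. The helper `words_found` and the dictionary `per_tweet` are names invented here, and the `\b` prefix is an added assumption so that "cat" only matches at the start of a word.

```python
import re

lookup_table = ['cat', 'cute kitten', 'dog litter park']
tweets = ['that is a cute cat',
          'kittens are cute',
          'that is a cute kitten',
          'that is a dog litter park',
          'no wonder that dog park is bad']


def words_found(tweet, phrase):
    """Return the words of `phrase` that start some word in `tweet`."""
    found = []
    for word in phrase.split():
        # \b anchors the match at a word start; re.escape guards against
        # regex metacharacters in the lookup word.
        if re.search(r'\b' + re.escape(word), tweet, flags=re.IGNORECASE):
            found.append(word)
    return found


# Map each tweet to {lookup entry: words of that entry found in the tweet}.
per_tweet = {
    tweet: {phrase: words_found(tweet, phrase) for phrase in lookup_table}
    for tweet in tweets
}
for tweet, hits in per_tweet.items():
    print(tweet, '>', {p: w for p, w in hits.items() if w})
```

Since the pattern has no trailing boundary, "kitten" still matches "kittens", which keeps the singular lookup entries usable against plural words in the tweets.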

Source

2017-04-18 10:27:36 gr8tech

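Problem 1:

Singular/plural: Just to get things rolling, I would use inflect, a Python package, to deal with singular and plural forms, e.g. ...

Problem 2:

Split and join: I wrote a small script to demonstrate how you could use it. It isn't robustly tested, but it should get you moving.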

import inflect 
p = inflect.engine() 
lookup_table = ['cats', 'cute kittens', 'dog litter park'] 
tweets = ['that is a cute cat', 
      'kittens are cute', 
      'that is a cute kitten', 
      'that is a dog litter park', 
      'no wonder that dog park is bad'] 

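for tweet in tweets:
    matched = []
    for lt in lookup_table:
        # p.compare() returns a truthy string when the two words are equal or
        # are singular/plural forms of each other, and False otherwise.
        match_result = [lt for mt in lt.split() for word in tweet.split() if p.compare(word, mt)]
        if any(match_result):
            matched.append(" ".join(match_result))
    print(tweet, '>>', matched)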
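The script above can report several lookup entries for one tweet, while the expected output in the question names exactly one per tweet. One way to get there, sketched here under assumptions that do not come from the thread, is to score each entry by the share of its words found in the tweet and keep the highest score. The `singular` helper is a hypothetical, deliberately naive stand-in for inflect (it only strips a trailing "s"), and ties go to the entry listed first.

```python
lookup_table = ['cats', 'cute kittens', 'dog litter park']
tweets = ['that is a cute cat',
          'kittens are cute',
          'that is a cute kitten',
          'that is a dog litter park',
          'no wonder that dog park is bad']


def singular(word):
    """Hypothetical, naive stand-in for a real inflection library."""
    return word[:-1] if word.endswith('s') and len(word) > 3 else word


def best_match(tweet, phrases):
    """Return the phrase with the largest share of its words in the tweet.

    Returns None when no word of any phrase occurs in the tweet.
    """
    tweet_words = {singular(w) for w in tweet.lower().split()}
    best, best_score = None, 0.0
    for phrase in phrases:
        parts = [singular(w) for w in phrase.lower().split()]
        # Fraction of the phrase's words present in the tweet (0.0 to 1.0).
        score = sum(part in tweet_words for part in parts) / len(parts)
        if score > best_score:
            best, best_score = phrase, score
    return best


matches = {tweet: best_match(tweet, lookup_table) for tweet in tweets}
for tweet, phrase in matches.items():
    print(tweet, '>', phrase)
```

With the sample data this pairs each tweet with the entry listed in the question's expected output, including the partial match of "no wonder that dog park is bad" to "dog litter park". Real tweets would need proper tokenization and a better singularization step.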

Source

2017-04-18 11:19:06
