2017-05-24
a = {'1330': ('John', 'Gold', '1330'), "0001":('Matt', 'Wade', '0001'), '2112': ('Bob', 'Smith', '2112')} 
com = {'6':['John Gold, getting no points', 'Matt played in this game? Didn\'t notice him','Love this shot!']} 
comments_table = [] 

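The function I am trying to implement looks up people's names from a (a dict) in the strings found in com (a dict) and uses a regular expression to replace each name with that person's unique code. Replacing the name with the code works, but appending the new string (with the code instead of the name) to the result list is where I go wrong. The for loop output is duplicated.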

import re
from fuzzywuzzy import fuzz  # assumption: fuzz.ratio comes from the fuzzywuzzy package

def replace_first_name():
    # Authors are iterated in the OUTER loop, so every comment is visited once per author.
    for k, v in a.items():
        for z, y in com.items():
            for item in y:
                firstname = a[k][0]
                lastname = a[k][1]
                full_name = firstname + ' ' + lastname
                if firstname in item:
                    if full_name in item:
                        t = re.compile(re.escape(full_name), re.IGNORECASE)
                        comment = t.sub(a[k][2], item)
                        print('1')
                        comments_table.append({
                            'post_id': z, 'comment': comment
                        })
                        continue
                    else:
                        t = re.compile(re.escape(firstname), re.IGNORECASE)
                        comment = t.sub(a[k][2], item)
                        print('2')
                        comments_table.append({
                            'post_id': z, 'comment': comment
                        })
                else:
                    print('3')
                    if fuzz.ratio(item, item) > 90:
                        comments_table.append({
                            'post_id': z, 'comment': item
                        })
                    else:
                        pass

The problem is with the output, which is shown below:

[{'comment': '1330, getting no points', 'post_id': '6'}, {'comment': "Matt played in this game? Didn't notice him", 'post_id': '6'}, {'comment': 'Love this shot!', 'post_id': '6'}, {'comment': 'John Gold, getting no points', 'post_id': '6'}, {'comment': "Matt played in this game? Didn't notice him", 'post_id': '6'}, {'comment': 'Love this shot!', 'post_id': '6'}, {'comment': 'John Gold, getting no points', 'post_id': '6'}, {'comment': "0001 played in this game? Didn't notice him", 'post_id': '6'}, {'comment': 'Love this shot!', 'post_id': '6'}] 

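I don't want comments whose names have already been replaced with a number to make their way into the final list again in their original form. So I would like my expected output to look like this: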

[{'comment': '1330, getting no points', 'post_id': '6'}, {'comment': "0001 played in this game? Didn't notice him", 'post_id': '6'}, {'comment': 'Love this shot!', 'post_id': '6'}]

I have looked into using an iterator by turning y into an iter_list, but I didn't get anywhere. Any help would be appreciated. Thanks!

Answer

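I'm not sure why you are doing a regex replacement, since you are already checking whether the first name/full name is `in` the comment. I'm also not sure what the fuzz.ratio(item, item) check in case 3 is supposed to do, but here is how you could do a simple/naive replacement: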

#!/usr/bin/python

def replace_names(authors, com):
    res = []
    # Comments are in the outer loop, so each comment is appended exactly once.
    for post_id, comments in com.items():
        for comment in comments:
            for author_id, author in authors.items():
                first_name, last_name = author[0], author[1]
                full_name = first_name + ' ' + last_name
                # Prefer the full name; fall back to the first name only.
                if full_name in comment:
                    comment = comment.replace(full_name, author_id)
                    break
                elif first_name in comment:
                    comment = comment.replace(first_name, author_id)
                    break
            res.append({'post_id': post_id, 'comment': comment})
    return res

a = {'1330': ('John', 'Gold', '1330'), "0001":('Matt', 'Wade', '0001'), '2112': ('Bob', 'Smith', '2112')}
com = {'6':['John Gold, getting no points', 'Matt played in this game? Didn\'t notice him','Love this shot!']}
for comment in replace_names(a, com):
    print(comment)  # parenthesized form works on both Python 2 and Python 3

This produces the following output (dictionary key order may differ depending on the Python version):

{'comment': '1330, getting no points', 'post_id': '6'} 
{'comment': "0001 played in this game? Didn't notice him", 'post_id': '6'} 
{'comment': 'Love this shot!', 'post_id': '6'} 

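It's a bit tricky to understand what you intended with the original code, but (one of) the reasons you get duplicates is that you process the authors in the outer loop, which means every comment is processed once for each author. By swapping the loops, you ensure that each comment is processed only once.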
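To make the effect of the loop order concrete, here is a minimal sketch that only counts output rows. The function names authors_outer and comments_outer are made up for this illustration, and the data is a trimmed copy of the dictionaries from the question.

```python
# Trimmed copies of the question's data (author id -> (first name, last name)).
authors = {'1330': ('John', 'Gold'), '0001': ('Matt', 'Wade'), '2112': ('Bob', 'Smith')}
comments = ['John Gold, getting no points',
            "Matt played in this game? Didn't notice him",
            'Love this shot!']

def authors_outer(authors, comments):
    """Authors in the outer loop: one row per (author, comment) pair."""
    rows = []
    for author_id, (first, last) in authors.items():
        for comment in comments:
            rows.append(comment.replace(first + ' ' + last, author_id))
    return rows

def comments_outer(authors, comments):
    """Comments in the outer loop: one row per comment."""
    rows = []
    for comment in comments:
        for author_id, (first, last) in authors.items():
            comment = comment.replace(first + ' ' + last, author_id)
        rows.append(comment)
    return rows

print(len(authors_outer(authors, comments)))   # 9 rows: 3 authors x 3 comments
print(len(comments_outer(authors, comments)))  # 3 rows: one per comment
```

With three authors and three comments, the first variant yields nine rows, which matches the nine entries in the output shown in the question.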

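You may also have intended to use break where you have continue, but I'm not entirely sure I understand how your original code is supposed to work.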
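The difference between the two statements can be seen in a small sketch. The helper names below are hypothetical and unrelated to the original code. A continue placed as the last statement of a loop body changes nothing, because the loop would move on to the next item anyway, whereas break leaves the loop entirely.

```python
def names_checked_with_break(names, text):
    """Return the names examined before the loop stopped."""
    checked = []
    for name in names:
        checked.append(name)
        if name in text:
            break  # leave the loop as soon as one name matches
    return checked

def names_checked_with_continue(names, text):
    """Same loop, but with continue instead of break."""
    checked = []
    for name in names:
        checked.append(name)
        if name in text:
            continue  # no effect here: nothing follows in the loop body
    return checked

names = ['John', 'Matt', 'Bob']
text = 'John Gold, getting no points'
print(names_checked_with_break(names, text))     # ['John']
print(names_checked_with_continue(names, text))  # ['John', 'Matt', 'Bob']
```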

The use of global variables is also a bit confusing.
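If the case-insensitive matching from the question (re.IGNORECASE) is actually wanted, a regex-based variant might look like the sketch below. This is an assumption about the intent, not part of the answer above. The function name replace_names_ci is made up, the word-boundary anchors are an added choice, and unlike the naive version it replaces every known name in a comment rather than stopping at the first matching author. It takes both dictionaries as parameters and returns a new list, so no globals are involved.

```python
import re

def replace_names_ci(authors, com):
    """Replace full names or first names with author ids, ignoring case."""
    # Map each lowercased name (full name and first name) to its author id.
    lookup = {}
    for author_id, author in authors.items():
        first, last = author[0], author[1]
        lookup[(first + ' ' + last).lower()] = author_id
        lookup[first.lower()] = author_id
    # Longest names first, so "John Gold" is tried before "John".
    names = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r'\b(?:' + '|'.join(re.escape(n) for n in names) + r')\b',
        re.IGNORECASE)
    res = []
    for post_id, comments in com.items():
        for comment in comments:
            new_comment = pattern.sub(
                lambda m: lookup[m.group(0).lower()], comment)
            res.append({'post_id': post_id, 'comment': new_comment})
    return res

a = {'1330': ('John', 'Gold', '1330'), '0001': ('Matt', 'Wade', '0001'),
     '2112': ('Bob', 'Smith', '2112')}
com = {'6': ['John Gold, getting no points',
             "Matt played in this game? Didn't notice him",
             'Love this shot!']}
for row in replace_names_ci(a, com):
    print(row)
```

Sorting the alternatives by length is what makes the full name win over the bare first name, since the regex engine tries alternatives from left to right.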