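Lookbehind in a for loop

I'm a bit stuck on this problem. I need to use a for loop to find words that end in 'ing' and are preceded by an IN tag. I come from a C and Java background, where this is easy to do, but I can't grasp how to do it in Python!

I've searched around, and here is what I think I need to do: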

for word, tag in list:
    if word.endswith('ing'):
        # use regular expression here which should look like this '(?<=\bIN\b)ing'
        pass

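Now of course there are some problems. First, I need to look at the previous tag, not the previous word, and the regular expression is probably wrong. More importantly, this sounds too complicated. Am I missing something? Is there a way to use the index of the word ending in 'ing' to look at the tag behind it, as I would in Java, for example?

Thank you in advance, and sorry if this is a silly question. This is only about my second attempt at writing Python and I'm still rusty =)

EDIT: To explain more about what I need to do, here is an example of what I am trying to solve. Sometimes pos_tag mistakes a VBG for a noun, so I need to write a method that, given a tagged list (for example [('Cultivate', 'NNP'), ('peace', 'NN'), ('by', 'IN'), ('observing', 'NN'), ('justice', 'NN')]), corrects this problem and returns [('Cultivate', 'NNP'), ('peace', 'NN'), ('by', 'IN'), ('observing', 'VBG'), ('justice', 'NN')]. Notice how observing has changed.

EDIT2: The problem is now solved. Here is the solution:

def transform(li):
    for i in range(len(li)):
        if li[i][0].endswith('ing') and i > 0 and li[i-1][1] == 'IN':
            li[i] = (li[i][0], 'VBG')

Thanks for the help, everyone =)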

2011-01-11 r3x

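What problem are you *actually* trying to solve? – 2011-01-11 22:40:51

It's not very clear what your input/output is. Why are you unpacking two values from each item in your list? Is it a list of tuples? Also, you shouldn't use the variable name `list`, because it shadows the built-in `list` type. – Falmarri 2011-01-11 22:42:09

Try showing an example of the input and the corresponding output. – 2011-01-11 22:43:23

Based on your comments, it sounds like you want this: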

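def transform(li):
    new_li = []
    prev_tag = None  # tag of the previous item; None for the first one
    for word, tag in li:
        # retag an '-ing' word only when it directly follows an IN tag
        if word.endswith('ing') and prev_tag == 'IN':
            tag = 'VBG'
        new_li.append((word, tag))
        prev_tag = tag
    return new_li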

You can also do it in place:

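def transform(li):
    for i in range(len(li)):
        # i > 0 guards the first item, which has no predecessor
        if li[i][0].endswith('ing') and i > 0 and li[i-1][1] == 'IN':
            # tuples are immutable, so replace the whole (word, tag) pair
            li[i] = (li[i][0], 'VBG')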

Note that I renamed list to li. list is the type name for Python lists, and overriding it is a bad idea.
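Another way to look one item back, without tracking state or indices, is to pair each token with its predecessor using zip. This is only a minimal sketch of that idea: the function name retag_ing_after_in and the sample list are illustrative and not part of the answer above.

```python
def retag_ing_after_in(tagged):
    """Return a new list; an '-ing' word right after an 'IN' tag becomes 'VBG'."""
    if not tagged:
        return []
    result = [tagged[0]]  # the first token has no predecessor
    # zip pairs each token with the one before it: (previous, current)
    for (_prev_word, prev_tag), (word, tag) in zip(tagged, tagged[1:]):
        if word.endswith('ing') and prev_tag == 'IN':
            tag = 'VBG'
        result.append((word, tag))
    return result


# Sample data taken from the question's example.
sample = [('Cultivate', 'NNP'), ('peace', 'NN'), ('by', 'IN'),
          ('observing', 'NN'), ('justice', 'NN')]
fixed = retag_ing_after_in(sample)
print(fixed[3])
```

Unlike the loop with prev_tag, this version compares against the original tags rather than the already corrected ones. For the IN rule that makes no difference.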

2011-01-11 22:47:46 marcog

previousWord = ""
previousTag = ""

for word, tag in li:
    if word.endswith('ing'):
        # use regular expression here which should look like this '(?<=\bIN\b)ing'
        # use previousWord and previousTag here
        pass
    previousWord = word
    previousTag = tag
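For comparison, a regular-expression lookbehind only applies once the tagged tokens are flattened into a single string, because a regex matches text, not a list of tuples. The sketch below assumes a made-up "word/TAG" text format that does not appear in the question, and it would break for words containing a space or a slash. It is meant to show why the plain comparison against previousTag is simpler, not to recommend this route.

```python
import re

# Sample data taken from the question's example.
tagged = [('Cultivate', 'NNP'), ('peace', 'NN'), ('by', 'IN'),
          ('observing', 'NN'), ('justice', 'NN')]

# Flatten to "word/TAG word/TAG ..." (an illustrative format, not from the question).
text = ' '.join('{}/{}'.format(word, tag) for word, tag in tagged)

# (?<=/IN ) is a fixed-width lookbehind: the match must directly follow "/IN ".
# The trailing \b keeps "/NN" from matching the start of a longer tag such as "/NNP".
retagged = re.sub(r'(?<=/IN )(\w+ing)/NN\b', r'\1/VBG', text)

# Split back into (word, tag) tuples.
result = [tuple(token.rsplit('/', 1)) for token in retagged.split(' ')]
print(result[3])
```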

2011-01-11 22:44:52 fredley

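Your solution is somewhat driven by having immutable tuples as the data pairs in your list. The easiest approach, then, is to create the new list you ultimately want:

li = [('Cultivate', 'NNP'),
      ('peace', 'NN'),
      ('by', 'IN'),
      ('observing', 'NN'),
      ('justice', 'NN')]

lnew = []

for word, tag in li:
    if word.endswith('ing') and tag == 'NN':
        tag = 'VBG'
    lnew.append((word, tag))

for word, tag in lnew:
    print(word, tag)

If you have thousands upon thousands of these, that is a bit wasteful...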

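If this is your own data and you control the format, you may want to consider using a dictionary instead of a list of tuples. Then you can loop over it more naturally and modify the dictionary in place: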

ld = {'justice': 'NN', 'Cultivate': 'NNP', 'peace': 'NN',
      'observing': 'NN', 'by': 'IN'}

for word, tag in ld.items():
    if word.endswith('ing') and tag == 'NN':
        ld[word] = 'VBG'

On large data sets, the dictionary approach can be faster and more memory-efficient. Consider it.
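One caveat is worth checking before switching to a dictionary: a dictionary holds a single value per key, so a word that occurs more than once keeps only its last tag. The small sketch below uses made-up sample data (not from the question) to show the effect.

```python
# Made-up sample: 'building' appears twice with different tags.
tagged = [('building', 'NN'), ('by', 'IN'), ('building', 'VBG')]

# Later pairs overwrite earlier ones that share the same key.
as_dict = dict(tagged)

print(len(tagged), len(as_dict))  # 3 2
print(as_dict['building'])        # VBG
```

If the data is running text in which words repeat, a list of tuples is probably the safer structure.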

2011-01-11 23:08:14 dawg

This makes the change in place:

for index, (word, _tag) in enumerate(li):
    # index > 0 guards the first item, which has no predecessor
    if word.endswith('ing') and index > 0 and li[index-1][1] == 'IN':
        li[index] = word, 'VBG'

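enumerate lets you iterate over a list in a foreach style while still having access to the current index. I like it a lot, but I sometimes worry that I overuse it and should be using something like for i in range(10): ... instead.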
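As a possible variation, enumerate accepts a start argument, so the loop can begin at the second item and drop the index > 0 check. This is a sketch under that assumption. The function name retag_in_place is illustrative and the sample list comes from the question.

```python
def retag_in_place(li):
    """Retag in place: an '-ing' word directly after an 'IN' tag becomes 'VBG'."""
    # li[1:] is a copy, so assigning into li while looping over it is safe.
    # start=1 makes index line up with positions in the original list.
    for index, (word, _tag) in enumerate(li[1:], start=1):
        if word.endswith('ing') and li[index - 1][1] == 'IN':
            li[index] = (word, 'VBG')


sample = [('Cultivate', 'NNP'), ('peace', 'NN'), ('by', 'IN'),
          ('observing', 'NN'), ('justice', 'NN')]
retag_in_place(sample)
print(sample[3])
```

The slice costs one extra copy of the list, so for very large inputs the index > 0 check may be the better trade.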

2011-01-11 23:40:28 Dunes
