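Python: parsing between two lines that share the same keyword
I know how to parse between two lines when the starting "target word" and the ending "target word" are different.
For example, if I want to parse between X and Y: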
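import sys

# X and Y stand for the start and end marker strings
parse = False
for line in open(sys.argv[1]):
    if Y in line:
        parse = False  # end marker reached: stop printing
    if parse:
        print(line)
    if X in line:
        parse = True  # start marker seen: print from the next line on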
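I'm stuck on a slightly different problem, where the word I want to parse between is the same word at both ends. In this example there are 4 different homolog groups, and I want to extract the human/mouse pairs within each group, so I want to turn this file: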
1:_HomoloGene:_141209.Gene_conserved_in_Mammals
LOC102724657 Homo_sapiens
Gm12569 Mus_musculus
2:_HomoloGene:_141208.Gene_conserved_in_Euarchontoglires
LOC102724737 Homo_sapiens
LOC102636216 Mus_musculus
3:_HomoloGene:_141152.Gene_conserved_in_Euarchontoglires
LOC728763 Homo_sapiens
E030010N07Rik Mus_musculus
E030010N09Rik Mus_musculus
E030010N010Rik Mus_musculus
E030010N08Rik Mus_musculus
LOC102551034 Rattus_norvegicus
4:_HomoloGene:_141054.Gene_conserved_in_Boreoeutheria
LOC102723572 Homo_sapiens
LOC102157295 Canis_lupus_familiaris
LOC102633228 Mus_musculus
into a Homo_sapiens / Mus_musculus comparison like this:
Homo_sapiens Mus_musculus
LOC102724657 Gm12569
LOC102724737 LOC102636216
LOC728763 E030010N07Rik
LOC728763 E030010N09Rik
LOC728763 E030010N010Rik
LOC728763 E030010N08Rik
LOC102723572 LOC102633228
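I don't have any nearly working code to show. This is one example of what I've tried (I have also tried regular expressions and splitting the lines on the word "HomoloGene"):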
import sys

ListOfLines = open(sys.argv[1])
for line in ListOfLines:
    if "HomoloGene" in line:
        # next() pulls a line from the same iterator the for loop is reading
        if "HomoloGene" in next(ListOfLines):
            print(line)
            print("**")
        else:
            print(next(ListOfLines))
Thanks
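One way this could be approached, as a rough sketch rather than a tested solution: treat every line containing "HomoloGene" as a group boundary, collect the human and mouse genes seen since the last boundary, and emit the pairs whenever a new boundary (or the end of the file) is reached. The names `human_mouse_pairs` and `print_table` are made up for illustration. The sketch assumes each gene line is exactly two whitespace-separated fields (gene, species) and that every human gene in a group should be paired with every mouse gene in that group.

```python
import sys
from itertools import product

HUMAN = "Homo_sapiens"
MOUSE = "Mus_musculus"


def human_mouse_pairs(lines):
    """Yield (human_gene, mouse_gene) tuples, one HomoloGene group at a time."""
    genes = {HUMAN: [], MOUSE: []}
    for line in lines:
        if "HomoloGene" in line:
            # A header closes the previous group: emit its pairs, then reset.
            yield from product(genes[HUMAN], genes[MOUSE])
            genes = {HUMAN: [], MOUSE: []}
            continue
        fields = line.split()
        # Keep only well-formed lines for the two species of interest.
        if len(fields) == 2 and fields[1] in genes:
            genes[fields[1]].append(fields[0])
    # The last group has no header after it, so emit it explicitly.
    yield from product(genes[HUMAN], genes[MOUSE])


def print_table(lines, out=sys.stdout):
    """Write a tab-separated Homo_sapiens / Mus_musculus table."""
    print(HUMAN, MOUSE, sep="\t", file=out)
    for human, mouse in human_mouse_pairs(lines):
        print(human, mouse, sep="\t", file=out)
```

It could be run as `print_table(open(sys.argv[1]))`. Groups that lack either a human or a mouse gene simply produce no rows, because the product of an empty list with anything is empty.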
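Don't you think the number of groups could exceed 9? – alexis
Good point. Fixed accordingly. – CDe
s/'if match != None:'/'if match:'/. You also forgot to drop the old definition of 'group', so your code is still broken. – alexis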