Generator function only yields the first item
I have some data in the following format:
data = """
[Data-0]
Data = BATCH
BatProtocol = DIAG-ST
BatCreate = 20010724

[Data-1]
Data = SAMP
SampNum = 357
SampLane = 1

[Data-2]
Data = SAMP
SampNum = 357
SampLane = 2

[Data-9]
Data = BATCH
BatProtocol = VCA
BatCreate = 20010725

[Data-10]
Data = SAMP
SampNum = 359
SampLane = 1

[Data-11]
Data = SAMP
SampNum = 359
SampLane = 2
"""
The structure of each block is:
- [Data-x], where x is a number
- Data = followed by either BATCH or SAMP
- more lines
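I would like to write a function that yields one list per 'batch'. The first item in each list is the text block containing the line Data = BATCH, and the remaining items are the text blocks containing the line Data = SAMP. What I have now is: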
def get_batches(data):
    textblocks = iter([txt for txt in data.split('\n\n') if txt.strip()])
    batch = []
    sample = next(textblocks)
    while True:
        if 'BATCH' in sample:
            batch.append(sample)
        sample = next(textblocks)
        if 'BATCH' in sample:
            yield batch
            batch = []
        else:
            batch.append(sample)
When it is called like this:
batches = get_batches(data)
for batch in batches:
    print batch
    print '_' * 20
it only returns the first 'batch', however:
['[Data-0]\nData = BATCH\nBatProtocol = DIAG-ST\nBatCreate = 20010724',
'[Data-1]\nData = SAMP\nSampNum = 357\nSampLane = 1',
'[Data-2]\nData = SAMP\nSampNum = 357\nSampLane = 2']
____________________
Whereas my expected output is:
['[Data-0]\nData = BATCH\nBatProtocol = DIAG-ST\nBatCreate = 20010724',
'[Data-1]\nData = SAMP\nSampNum = 357\nSampLane = 1',
'[Data-2]\nData = SAMP\nSampNum = 357\nSampLane = 2']
____________________
['[Data-9]\nData = BATCH\nBatProtocol = VCA\nBatCreate = 20010725',
'[Data-10]\nData = SAMP\nSampNum = 359\nSampLane = 1',
'[Data-11]\nData = SAMP\nSampNum = 359\nSampLane = 2']
____________________
What am I missing, or how can I improve my function?
If you want to parse files that look like this, take a look at the ['ConfigParser' module](http://docs.python.org/2/library/configparser.html). – Blender 2013-04-09 19:31:23
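A rough sketch of how that suggestion might look, written for Python 3, where the module is named configparser (the question itself uses Python 2 syntax). The function name get_batches_cfg and the (section name, options dict) return shape are made up for illustration. Note that the parser lowercases option names by default, so BatProtocol comes back as batprotocol.

```python
import configparser  # Python 3 name; the Python 2 module is called ConfigParser


def get_batches_cfg(data):
    """Group INI-style sections into batches (hypothetical helper).

    Each batch is a list of (section_name, options_dict) tuples, and a
    new batch starts at every section whose Data option equals BATCH.
    """
    parser = configparser.ConfigParser()
    parser.read_string(data)

    batches = []
    for section in parser.sections():  # sections come back in file order
        kind = parser.get(section, 'Data', fallback=None)
        # Start a new batch on BATCH, or if no batch has been opened yet.
        if kind == 'BATCH' or not batches:
            batches.append([])
        # Option names are lowercased by the parser's default settings.
        batches[-1].append((section, dict(parser.items(section))))
    return batches
```

This returns parsed options rather than the raw text blocks the question asks for, so it only fits if the individual fields are what is needed downstream.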
Also: instead of 'iter([some listcomp here])', you can write '(some genexp here)'. – DSM 2013-04-09 19:35:43
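One possible fix, sketched under the assumption that blocks are separated by blank lines (as the split('\n\n') in the question implies). In the posted function the last batch is collected but never yielded: when next() runs out of blocks it raises StopIteration, which in Python 2 silently ends the generator. Looping with for and yielding whatever is left after the loop avoids that. The generator expression from the comment above is used as well.

```python
def get_batches(data):
    """Yield one list of text blocks per batch."""
    # Generator expression instead of iter([...]).
    textblocks = (txt for txt in data.split('\n\n') if txt.strip())

    batch = []
    for block in textblocks:
        # A block containing "Data = BATCH" starts a new batch, so hand
        # over the previous one first (if there is one).
        if 'Data = BATCH' in block and batch:
            yield batch
            batch = []
        batch.append(block)

    # The final batch has no following BATCH block to trigger the yield
    # above, so it has to be yielded explicitly.
    if batch:
        yield batch
```

If the data does not begin with a BATCH block, any leading SAMP blocks end up in a first list without a batch header; whether that is acceptable depends on the input.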