Function to process multiple and/or single lines in a file

If I have a file, how should I implement a function so that it can read both single-line and multi-line fields? For example:

TimC 
Tim Cxe 
USA 
http://www.TimTimTim.com 
TimTim facebook! 
ENDBIO 
Charles 
Dwight 
END 
Mcdon 
Mcdonald 
Africa 
     # website in here is empty, but we still need to consider it 
     # bio in here is empty, but we need to include this in the dict 
     # bio can be multiple lines 
ENDBIO 
Moon 
King 
END 
etc

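I was wondering whether anyone could do this using only beginner-level Python keywords (for example, without yield, break or continue).

In my own version I actually defined 4 functions; 3 of the 4 are helper functions.

I would like a function that returns: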

dict = {'TimC': {'name': 'Tim Cxe', 'location': 'USA', 'Web': 'http://www.TimTimTim.com', 'bio': 'TimTim facebook!', 'follows': ['Charles', 'Dwight']}, 'Mcdon': {'name': 'Mcdonald', 'location': 'Africa', 'Web': '', 'bio': '', 'follows': ['Moon', 'King']}}


2010-11-20 John

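Iterate through the file, collecting the various pieces of data, then yield the collected record when you reach the appropriate sentinel.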
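A minimal sketch of that idea, not a definitive implementation. The name `iter_profiles` is made up for illustration, and the code assumes every record is well formed: a username, name, location and website line, then zero or more bio lines up to `ENDBIO`, then the followed usernames up to `END`.

```python
def iter_profiles(lines):
    """Yield (username, profile) pairs, one per record terminated by END."""
    fields = []         # lines seen before ENDBIO
    follows = []        # lines seen between ENDBIO and END
    in_follows = False  # True once ENDBIO has been seen for this record
    for raw in lines:
        line = raw.strip()
        if line == "ENDBIO":
            in_follows = True
        elif line == "END" and in_follows:
            # Sentinel reached: the record is complete, so hand it out.
            username, name, location, web = fields[:4]
            yield username, {
                "name": name,
                "location": location,
                "Web": web,
                "bio": "\n".join(fields[4:]),  # bio may span several lines
                "follows": follows,
            }
            fields, follows, in_follows = [], [], False
        elif in_follows:
            follows.append(line)
        else:
            fields.append(line)
```

Since the generator yields key/value pairs, `dict(iter_profiles(open_file))` would build the whole dictionary in one go.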


2010-11-20 21:35:22

line_meanings = ("name", "location", "web")
result = {}
user = None

def readClean(iterable, sentinel=None):
    # Yield stripped lines until one of them equals the sentinel.
    for line in iterable:
        line = line.strip()
        if line == sentinel:
            break
        yield line

# yourfile is an already-open text file object.
while True:
    line = yourfile.readline()
    if not line:
        break  # end of file
    line = line.strip()
    if not line:
        continue  # skip blank lines between records
    user = result[line] = {}
    # zip stops after the three fixed fields: name, location, web.
    user.update(zip(line_meanings, readClean(yourfile)))
    user['bio'] = list(readClean(yourfile, 'ENDBIO'))
    user['follows'] = set(readClean(yourfile, 'END'))

print(result)

{'Mcdon': {'bio': [''],
           'follows': set(['King', 'Moon']),
           'location': 'Africa',
           'name': 'Mcdonald',
           'web': ''},
 'TimC': {'bio': ['TimTim facebook!'],
          'follows': set(['Charles', 'Dwight']),
          'location': 'USA',
          'name': 'Tim Cxe',
          'web': 'http://www.TimTimTim.com'}}


2010-11-20 21:49:04 nosklo

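import pprint
import sys

def bio_gen(it, sentinel="END"):
    it = iter(it)

    def read_line():
        # Drop any trailing "# comment" and the surrounding whitespace.
        return next(it).partition("#")[0].strip()

    try:
        while True:
            key = read_line()
            ret = {
                'name': read_line(),
                'location': read_line(),
                'website': read_line(),
                'bio': read_line(),
                'follows': []}
            next(it)  # skip the ENDBIO line
            while True:
                line = read_line()
                if line == sentinel:
                    yield key, ret
                    break
                ret['follows'].append(line)
    except StopIteration:
        # Input exhausted. Since Python 3.7 (PEP 479) a StopIteration that
        # escapes a generator becomes a RuntimeError, so finish explicitly.
        return

all_bios = dict(bio_gen(sys.stdin))
pprint.pprint(all_bios)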

{'Mcdon': {'bio': '',
           'follows': ['Moon', 'King'],
           'location': 'Africa',
           'name': 'Mcdonald',
           'website': ''},
 'TimC': {'bio': 'TimTim facebook!',
          'follows': ['Charles', 'Dwight'],
          'location': 'USA',
          'name': 'Tim Cxe',
          'website': 'http://www.TimTimTim.com'}}


2010-11-20 22:40:49

The code doesn't take into account that 'bio' may span more than one line before 'ENDBIO' – nosklo 2010-11-21 00:04:56

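@nosklo, correct. The question is underspecified, but the OP indicated that the bio should be a string, not a list, and I chose not to guess. Once the question is updated to show what should happen with multiple bio lines, I can update my answer – 2010-11-21 03:34:50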
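For what it's worth, here is one way the multi-line bio could be handled while also staying within the beginner subset the question asks for (no yield, break or continue). This is only a sketch under assumptions: `read_profiles` is a hypothetical name, the bio lines are joined into a single string with newlines (the question does not say how they should be combined), and the input is assumed to be well formed, with every record having its `ENDBIO` and `END` markers.

```python
def read_profiles(lines):
    """Return {username: profile} built from an iterable of text lines."""
    # Strip newlines and surrounding whitespace up front.
    cleaned = []
    for raw in lines:
        cleaned.append(raw.strip())

    profiles = {}
    i = 0
    # A record needs at least username, name, location and website lines.
    while i + 3 < len(cleaned):
        username = cleaned[i]
        profile = {
            "name": cleaned[i + 1],
            "location": cleaned[i + 2],
            "Web": cleaned[i + 3],
        }
        i += 4

        # Collect bio lines (possibly several) until the ENDBIO marker.
        bio_lines = []
        while i < len(cleaned) and cleaned[i] != "ENDBIO":
            bio_lines.append(cleaned[i])
            i += 1
        profile["bio"] = "\n".join(bio_lines)
        i += 1  # step over ENDBIO

        # Collect followed usernames until the END marker.
        follows = []
        while i < len(cleaned) and cleaned[i] != "END":
            follows.append(cleaned[i])
            i += 1
        profile["follows"] = follows
        i += 1  # step over END

        profiles[username] = profile
    return profiles
```

It could be called as `read_profiles(open_file.readlines())`. Each inner loop stops on its own marker, which is what removes the need for break.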
