2017-04-14

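Python TXT extractor and organizer

I need to extract some customer details and save them in a new database. All I have is a single TXT file, and we are talking about 5000 customers or more. The txt file stores every customer like this: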

first and last name 
NAME SURNAME    
zip country n. phone number mobile 
United Kingdom  +1111111111 
e-mail 
[email protected] 
guest first and last name 1° 
NAME SURNAME 
guest first and last name 2° 
NAME SURNAME 
name address city province 
NAME SURNAME London London 
zip 
AAAAA 
Cancellation of the reservation. 

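Since the file always looks like this, I thought it should be possible to scrape it. I did some research, and this is what I came up with so far, but it is not quite what I need: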

with open('input.txt') as infile, open('output.txt', 'w') as outfile:
    copy = False  # True while we are inside a customer record
    for line in infile:
        if "first and last name" in line:
            copy = True
        elif "Cancellation of the reservation." in line:
            copy = False
        elif copy:
            outfile.write(line)

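The code works, but it only reads from one marker line to the other and copies the content to the file unchanged. I need something that copies the content in a different format so that I can upload it to the database. The format I need is this: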

first and last name | zip country n. phone number mobile|e-mail|guest first and last name 1°|name address city province|zip 

So in this case I need this:

NAME SURNAME | United Kingdom  +1111111111|[email protected]|NAME SURNAME London London |AAAAA 

for every line of output.txt.

Do you think this is hard to build? Could someone help me? Any suggestion would be much appreciated.

Answer

Here are some useful scraping tools for what you want to do:

data = '''first and last name 
     NAME SURNAME    
     zip country n. phone number mobile 
     United Kingdom  +1111111111 
     e-mail 
     [email protected] 
     guest first and last name 1 
     NAME SURNAME 
     guest first and last name 2 
     NAME SURNAME 
     name address city province 
     NAME SURNAME London London 
     zip 
     AAAAA 
     Cancellation of the reservation. 
     ''' 
# split on space, convert to list 
ldata = data.split() 
# strip leading and trailing white space from each item 
ldata = [i.strip() for i in ldata] 
# split on line break, convert to list 
ndata = data.split('\n') 
ndata = [i.strip() for i in ndata] 
#convert list to string 
sdata = ' '.join(ldata) 

print(ldata)
print(ndata)
print(sdata)

# two examples of split after, split before
name_surname = sdata.split('first and last name')[1].split('zip')[0]
print(name_surname)

country_phone = sdata.split('mobile')[1].split('e-mail')[0]
print(country_phone)

>>> 

['first', 'and', 'last', 'name', 'NAME', 'SURNAME', 'zip', 'country', 'n.', 'phone', 'number', 'mobile', 'United', 'Kingdom', '+1111111111', 'e-mail', '[email protected]', 'guest', 'first', 'and', 'last', 'name', '1', 'NAME', 'SURNAME', 'guest', 'first', 'and', 'last', 'name', '2', 'NAME', 'SURNAME', 'name', 'address', 'city', 'province', 'NAME', 'SURNAME', 'London', 'London', 'zip', 'AAAAA', 'Cancellation', 'of', 'the', 'reservation.'] 
['first and last name', 'NAME SURNAME', 'zip country n. phone number mobile', 'United Kingdom  +1111111111', 'e-mail', '[email protected]', 'guest first and last name 1', 'NAME SURNAME', 'guest first and last name 2', 'NAME SURNAME', 'name address city province', 'NAME SURNAME London London', 'zip', 'AAAAA', 'Cancellation of the reservation.', ''] 
first and last name NAME SURNAME zip country n. phone number mobile United Kingdom +1111111111 e-mail [email protected] guest first and last name 1 NAME SURNAME guest first and last name 2 NAME SURNAME name address city province NAME SURNAME London London zip AAAAA Cancellation of the reservation. 
NAME SURNAME 
United Kingdom +1111111111
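
To go from these tools to the pipe-delimited line asked for in the question, something along the following lines might work. It is only a sketch (Python 3) and rests on a few assumptions: every value sits on the line directly after its label, each record ends with "Cancellation of the reservation.", and the output columns follow the header line given in the question. The names parse_records, to_row and convert_file are made up for illustration, and the file is assumed to be UTF-8 encoded.

```python
# Label lines as they appear in the sample record (degree sign removed).
LABELS = [
    "first and last name",
    "zip country n. phone number mobile",
    "e-mail",
    "guest first and last name 1",
    "guest first and last name 2",
    "name address city province",
    "zip",
]
# Columns to write, in order (assumption: same as the header in the question).
FIELDS = [
    "first and last name",
    "zip country n. phone number mobile",
    "e-mail",
    "guest first and last name 1",
    "name address city province",
    "zip",
]
END_MARKER = "Cancellation of the reservation."


def parse_records(lines):
    """Yield one dict per record, mapping each label to the line after it."""
    record, current = {}, None
    for raw in lines:
        line = raw.strip()
        # Compare labels without the degree sign so "1\u00b0" and "1" both match.
        key = line.replace("\u00b0", "").strip()
        if not line:
            continue
        if key == END_MARKER:
            if record:
                yield record
            record, current = {}, None
        elif key in LABELS:
            current = key          # the next non-label line is this label's value
        elif current is not None:
            record[current] = line
            current = None
    if record:                     # last record had no end marker
        yield record


def to_row(record):
    """Join the chosen fields with '|'; missing fields become empty strings."""
    return "|".join(record.get(field, "") for field in FIELDS)


def convert_file(src, dst):
    """Read records from src and write one pipe-delimited row per record to dst."""
    with open(src, encoding="utf-8") as infile, \
            open(dst, "w", encoding="utf-8") as outfile:
        for record in parse_records(infile):
            outfile.write(to_row(record) + "\n")
```

Calling convert_file('input.txt', 'output.txt') would then write one row per customer. Matching whole label lines instead of substrings avoids confusing the "zip" label with "zip country n. phone number mobile". To drop the guest column, as in the example row in the question, remove that entry from FIELDS.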