2014-10-07 103 views
-1

Python: compare two CSV files and pull data across if a cell matches

File A:

id,desc,name 
12345,blah blah blah,jsmith 
6789,yada yada yada,ckast 
54321,yum yum yum,jpetersen 

File B:

key,id 
AB-873,6789 
CF-395,54321 
HG-713,12345 

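What I want to do is look at each row in File A, check whether its id column matches the id column in File B, and if it does, copy the "name" cell over to File B. So in the end, File B would look like this: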

AB-873,6789,ckast 
CF-395,54321,jpetersen 
HG-713,12345,jsmith 

I know the Python 'csv' module can read individual rows, but I don't know where to go from there. Thanks!

+0

How big are the files? Will they both fit in memory? – dawg 2014-10-07 18:49:33

Answers

0

If you want something simple, this code should work for you:

# Read both files, skipping the header row and any blank lines.
# The 'with' block closes the files once they have been read.
with open('FileA', 'r') as file_a, open('FileB', 'r') as file_b:
    a_lines = [line for line in file_a.readlines()[1:] if line.strip()]
    b_lines = [line for line in file_b.readlines()[1:] if line.strip()]

# Read content of FileA to a table (list of lists)
a_table = []
for line in a_lines:
    a_table.append([w.strip() for w in line.split(',')])

# Read content of FileB in a dictionary.
# The 'id' field as dictionary key for simple look-up.
b_dict = {}
for line in b_lines:
    words = line.split(',')
    b_dict[words[1].strip()] = words[0].strip()

# Do the actual work and save result (rows come out in FileA's order).
with open('result', 'w') as file_result:
    for row in a_table:
        if row[0] in b_dict:
            file_result.write(b_dict[row[0]] + ',' + row[0] + ',' + row[2] + '\n')

I tested it with your sample.
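
Note that the loop above writes rows in File A's order, whereas the expected output in the question keeps File B's order. A minimal sketch of the same dictionary idea with the roles swapped follows: File A becomes the look-up table and File B drives the output. The function name add_names is hypothetical, and the naive split(',') assumes no field contains a comma.

```python
def add_names(a_lines, b_lines):
    """Return File B's data rows with the matching name from File A appended.

    Both arguments are lists of text lines without the header row.
    """
    # Map each id in File A to its name (columns: id, desc, name).
    name_by_id = {}
    for line in a_lines:
        fields = [f.strip() for f in line.split(',')]
        if len(fields) >= 3:
            name_by_id[fields[0]] = fields[2]

    # Walk File B in its own order (columns: key, id).
    result = []
    for line in b_lines:
        fields = [f.strip() for f in line.split(',')]
        if len(fields) >= 2 and fields[1] in name_by_id:
            result.append(','.join([fields[0], fields[1], name_by_id[fields[1]]]))
    return result


# Sample data rows taken from the question.
a_lines = ['12345,blah blah blah,jsmith',
           '6789,yada yada yada,ckast',
           '54321,yum yum yum,jpetersen']
b_lines = ['AB-873,6789', 'CF-395,54321', 'HG-713,12345']
print('\n'.join(add_names(a_lines, b_lines)))
```

With the sample rows this prints the three lines in the same order as the desired File B shown in the question.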

0

With csv, you can do this:

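import csv

# File names; adjust to your own paths (these match the first answer).
fn1, fn2, fn3 = 'FileA', 'FileB', 'result'

with open(fn1, newline='') as fa, open(fn2, newline='') as fb:
    r1, r2 = csv.reader(fa), csv.reader(fb)
    # Strip stray spaces in case the headers have them, as in the sample.
    a_header = [h.strip() for h in next(r1)]
    b_header = [h.strip() for h in next(r2)]
    # Store each file column-wise: {column name: [values, ...]}
    data_a = {k: [] for k in a_header}
    data_b = {k: [] for k in b_header}
    for line in r1:
        for k, v in zip(a_header, line):
            data_a[k].append(v.strip())
    for line in r2:
        for k, v in zip(b_header, line):
            data_b[k].append(v.strip())

# Add a 'name' column to B, looked up by id in A (None if there is no match).
b_header.append('name')
data_b['name'] = []
for e in data_b['id']:
    try:
        v = data_a['name'][data_a['id'].index(e)]
    except ValueError:
        v = None
    data_b['name'].append(v)

with open(fn3, 'w', newline='') as fout:
    writer = csv.writer(fout)
    writer.writerow(b_header)
    # zip(*columns) turns the column lists back into rows.
    writer.writerows(zip(*(data_b[key] for key in b_header)))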
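
On the question of whether both files fit in memory: the .index() look-up above scans File A's id list once per row of File B and keeps every column of both files loaded. A rough sketch of a lighter variant is below, assuming the headers are exactly id,desc,name and key,id with no stray spaces, and that every row is complete. The function name copy_names and the unmatched row ZZ-000 are made up for illustration; csv.DictReader and csv.writer are the standard library classes.

```python
import csv
import io


def copy_names(file_a, file_b, file_out):
    """Stream File B to file_out, adding the 'name' found in File A by 'id'."""
    # Only this id -> name map is held in memory.
    reader_a = csv.DictReader(file_a)
    name_by_id = {row['id'].strip(): row['name'].strip() for row in reader_a}

    # File B is read and written one row at a time.
    reader_b = csv.DictReader(file_b)
    writer = csv.writer(file_out, lineterminator='\n')
    writer.writerow(['key', 'id', 'name'])
    for row in reader_b:
        row_id = row['id'].strip()
        # Unmatched ids get an empty name rather than being dropped.
        writer.writerow([row['key'].strip(), row_id, name_by_id.get(row_id, '')])


# In-memory stand-ins for the two files; ZZ-000 is a hypothetical unmatched row.
file_a = io.StringIO('id,desc,name\n'
                     '12345,blah blah blah,jsmith\n'
                     '6789,yada yada yada,ckast\n')
file_b = io.StringIO('key,id\n'
                     'AB-873,6789\n'
                     'ZZ-000,99999\n'
                     'HG-713,12345\n')
out = io.StringIO()
copy_names(file_a, file_b, out)
print(out.getvalue())
```

With real files, pass objects opened with newline='' in place of the StringIO buffers. The dictionary look-up takes constant time per row, so the total work grows with the combined size of the two files rather than with their product.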