2014-09-22 66 views

Writing a list to CSV. This code:

for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
    record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)

    if (not reference_sequence):
        reference_sequence = record.seq
        reference_name = record_id
        #continue
    print ",".join([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])

makes the terminal output look like this:

7065_8#1,8987_2#53, 
7065_8#1,8987_2#58, 
7065_8#1,8987_2#61, 
7065_8#1,8987_2#62,E-G [246] 
7065_8#1,8987_2#65,N-K [71],Y-D [223] 

I would like to write this output to a CSV file line by line. Any suggestions? With a nested list:

for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein): 
    record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id) 

if (not reference_sequence): 
    reference_sequence = record.seq 
    reference_name  = record_id 
    #continue 
line= ",".join([reference_name, record_id, compare_seqs(reference_sequence, record.seq)]) 
with open(csvfile, "w") as output: 
    writer = csv.writer(output, lineterminator='\n') 
    writer.writerow([line]) 

Answers

1

Collect all the records in a list (i.e. instead of print ','.join(...), do records.append([...])), and then write them all to the file with writerows(records).
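A minimal sketch of that approach, assuming Python 3. The rows below are stand-ins copied from the terminal output in the question, and `write_records` and `output.csv` are hypothetical names used only for illustration; in the real script the rows would be appended inside the SeqIO loop.

```python
import csv

def write_records(records, path):
    """Write a list of rows (each row is a list of fields) to a CSV file."""
    # newline='' lets the csv module control line endings (Python 3)
    with open(path, 'w', newline='') as output:
        writer = csv.writer(output, lineterminator='\n')
        writer.writerows(records)

# Stand-in for the rows that would be built inside the SeqIO loop
records = []
for record_id, diffs in [('8987_2#53', ''), ('8987_2#62', 'E-G [246]')]:
    records.append(['7065_8#1', record_id, diffs])

write_records(records, 'output.csv')
```

Opening the file once, after the loop, avoids overwriting it on every iteration, which is what happens when open(csvfile, "w") is called for each record.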

+0

Also great! Thanks – user3234810 2014-09-22 14:16:15

1

You can use writerows as follows to save the output. There is no need for something like ','.join(); the csv module does that for you.
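To illustrate why joining by hand is unnecessary, here is a small sketch (assuming Python 3, with a sample row taken from the question's output). The last field itself contains a comma, and csv.writer quotes it so that it stays a single column:

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='\n')
# The third field contains a comma, so csv wraps it in quotes
writer.writerow(['7065_8#1', '8987_2#65', 'N-K [71],Y-D [223]'])
line = buffer.getvalue()
print(line, end='')  # 7065_8#1,8987_2#65,"N-K [71],Y-D [223]"
```

A plain ','.join() would emit the same text without the quotes, and a CSV reader would then see four columns instead of three.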

For completeness:

import csv

records = []
for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
    record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)

    if (not reference_sequence):
        reference_sequence = record.seq
        reference_name = record_id
        #continue
    records.append([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])

# csv.writer objects are not context managers, so the file itself goes in the with statement
with open('file.csv', 'w') as fp:
    writer = csv.writer(fp, lineterminator='\n')
    # note that it's not writerow but writerows, which writes multiple rows at once
    writer.writerows(records)
+0

Edited the question above to make it clearer! – user3234810 2014-09-22 14:05:02

+0

You're awesome! Thank you! – user3234810 2014-09-22 14:09:40

+0

You're welcome! Now you can let the community know by accepting the answer! – Kasramvd 2014-09-22 14:12:44

1

You can also write the comma-separated string (with a quote character around each field) directly to the file:

f = open("output.csv", "w")
for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
    record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)

    if (not reference_sequence):
        reference_sequence = record.seq
        reference_name = record_id
        #continue
    # wrap every field in double quotes; quotes inside a field are not escaped
    csvrow = '","'.join([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])
    csvrow = '"' + csvrow + '"'
    f.write(csvrow + '\n')
f.close()

With this approach, you can open the file and check whether data is being written even while the script is still running.
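One caveat: file output is normally buffered, so rows may not show up in the file right away. A hedged sketch of the same row-by-row idea, assuming Python 3 and using csv.writer with quoting=csv.QUOTE_ALL in place of the manual quoting (the sample rows and the `progress.csv` file name are made up for illustration):

```python
import csv

# Stand-in rows; in the real script these would come from the SeqIO loop
rows = [['7065_8#1', '8987_2#53', ''], ['7065_8#1', '8987_2#58', '']]

with open('progress.csv', 'w', newline='') as f:
    # QUOTE_ALL wraps every field in double quotes, like the manual join above
    writer = csv.writer(f, quoting=csv.QUOTE_ALL, lineterminator='\n')
    for row in rows:
        writer.writerow(row)
        f.flush()  # push buffered data to the OS so other programs can see it
```

Flushing after every row costs some speed on large inputs, so it is a trade-off between visibility and throughput.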