2015-04-04 39 views

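How to write output in multiple columns in Python

I want to write output to a file in multiple columns in Python. My code currently generates two columns of output (the phrase and a single count). The code is: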

sem = ["cells", "gene", "factor", "alpha", "receptor", "t", "promoter"]
f2 = open("C:/Python26/Semantics.txt", 'w')
with open("C:/Python26/trigram.txt") as f:
    for x in f:
        x = x.strip().split("$")
        # write the phrase followed by the number of tokens that also occur in sem
        f2.write(" ".join(x) + " " + str(len(set(sem) & set(x))) + "\n")
f2.close()

My file looks like this:

IL-2$gene$expression$and 
IL-2$gene$expression$and$NF-kappa 
IL-2$gene$expression$and$NF-kappa$B 
IL-2$gene$expression$and$NF-kappa$B$activation 
gene$expression$and$NF-kappa$B$activation$through 
expression$and$NF-kappa$B$activation$through$CD28 

My current output:

IL-2 gene expression and 1 
IL-2 gene expression and NF-kappa 1 
IL-2 gene expression and NF-kappa B 1 
IL-2 gene expression and NF-kappa B activation 1 
gene expression and NF-kappa B activation through 1 
expression and NF-kappa B activation through CD28 0 

My desired output:

Token                                               cells  gene  factor  ...  promoter
IL-2 gene expression and                            0      1     0       ...  0
IL-2 gene expression and NF-kappa                   0      1     0       ...  0
IL-2 gene expression and NF-kappa B                 0      1     0       ...  0
IL-2 gene expression and NF-kappa B activation      0      1     0       ...  0
gene expression and NF-kappa B activation through   0      1     0       ...  0
expression and NF-kappa B activation through CD28   0      0     0       ...  0

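I think the code only needs a small change, and that a nested loop would solve it, but I don't know how to write it. My attempt is below, and it does not work: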

sem = ["cells", "b", "expression", "cell", "gene", "factor", "activation", "protein", "activity", "transcription", "alpha", "receptor", "t", "promotor", "mrna", "site", "kinase", "nfkappa", "human"]
f2 = open("C:/Python26/Semantics.txt", 'w')
with open("C:/Python26/trigram.txt") as file:
    for s in sem:
        for lines in file:
            lines = lines.strip().split("$")
            if s == lines:
                f2.write(" ".join(lines) + "\t" + str(len(set(sem) & set(lines))) + "\n")
        f2.write("\n")
f2.close()

http://stackoverflow.com/questions/5676646/fill-out-a-python-string-with-spaces – huxley 2015-04-04 09:17:40
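The padding idea from the linked question can be applied here with `str.ljust`. This is only a minimal sketch, assuming Python 3 and exact, case-sensitive token matches; the helper name `format_rows` and the column width of 52 are made up for illustration and are not from the original post.

```python
# Keyword list taken from the question; each keyword becomes one output column.
sem = ["cells", "gene", "factor", "alpha", "receptor", "t", "promoter"]


def format_rows(lines, keywords, token_width=52):
    """Return padded text rows: the phrase, then one 0/1 flag per keyword.

    format_rows and token_width are hypothetical names chosen for this sketch.
    """
    rows = ["Token".ljust(token_width) + " ".join(keywords)]
    for line in lines:
        tokens = line.strip().split("$")
        if tokens == [""]:
            continue  # skip blank lines
        present = set(tokens)
        # Pad each flag to the width of its column header so the columns line up.
        flags = [str(int(k in present)).ljust(len(k)) for k in keywords]
        rows.append(" ".join(tokens).ljust(token_width) + " ".join(flags))
    # Trailing padding is not needed in the written file.
    return [row.rstrip() for row in rows]


sample = [
    "IL-2$gene$expression$and",
    "expression$and$NF-kappa$B$activation$through$CD28",
]
for row in format_rows(sample, sem):
    print(row)
```

To produce the file from the question, the same rows could be written with `f2.write(row + "\n")` while iterating over the open `trigram.txt`. Note that `ljust` never truncates, so a phrase longer than `token_width` would push its flags out of alignment.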

Answers


pandas.DataFrame

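A DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it as a spreadsheet, a SQL table, or a dict of Series objects.

You can create your DataFrame object, convert it to a string, and then write() that string to your file.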

import pandas

# row labels go into the index, column labels into the columns
index_labels = ['Token', 'cells', 'gene']
column_labels = ['x', 'y', 'z']

values_array = [[1, 2, 3],
                [10, 20, 30],
                [100, 200, 300]]

df = pandas.DataFrame(values_array, index=index_labels, columns=column_labels)
print(df)

Output:

         x    y    z
Token    1    2    3
cells   10   20   30
gene   100  200  300

To save it, first convert the object to a string:

db_as_str = df.to_string() 

with open('my_text_file.txt', 'w') as f: 
    f.write(db_as_str) 

Or save it as is, as CSV:

df.to_csv('my_text_file.txt')
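Applied to the data in the question, the DataFrame approach might look like the sketch below. It assumes Python 3 with pandas installed and exact, case-sensitive token matches; the variable names `lines`, `labels` and `records` are illustrative, and the three sample lines are copied from the question instead of being read from `trigram.txt`.

```python
import pandas

# Keyword list from the question; each keyword becomes one column.
sem = ["cells", "gene", "factor", "alpha", "receptor", "t", "promoter"]

# Sample input lines from the question (normally read from trigram.txt).
lines = [
    "IL-2$gene$expression$and",
    "IL-2$gene$expression$and$NF-kappa",
    "expression$and$NF-kappa$B$activation$through$CD28",
]

labels = []   # one row label (the phrase) per input line
records = []  # one list of 0/1 flags per input line
for line in lines:
    tokens = line.strip().split("$")
    labels.append(" ".join(tokens))
    records.append([int(keyword in tokens) for keyword in sem])

df = pandas.DataFrame(records, index=labels, columns=sem)
df.index.name = "Token"

# to_string() gives an aligned text table that can be passed to f.write().
table = df.to_string()
print(table)
```

If a tab-separated file is preferred over an aligned text table, `df.to_csv(path, sep="\t")` would write the same data with the phrase in the first column.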