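How do I put multiple strings into a list inside a for loop?
I am using a for loop to look up a list of protein IDs in the NCBI protein database, and I am trying to convert those IDs to descriptions. Here is an example: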
import pandas as pd
from Bio import Entrez
from Bio import SeqIO

df = pd.read_csv('ID.txt', header=None)
df.columns = ['protein_ID']  # name the single column 'protein_ID'
lists = df.protein_ID.tolist()  # convert the column into a list of protein IDs

description = ''
for num, line in enumerate(lists):
    handle = Entrez.efetch(db="protein", id=line, rettype="gb", retmode="text")
    record = SeqIO.read(handle, "genbank")
    description += record.description  # concatenates everything into one string
description
It returns one huge string:
'hypothetical protein UR61_C0009G0014 [candidate division WS6 bacterium GW2011_GWE1_34_7]ATPase [candidate division WS6 bacterium GW2011_GWE2_33_157]hypothetical protein UR96_C0034G0007 [candidate division WS6 bacterium GW2011_GWC1_36_11]phosphoenolpyruvate synthase [Candidatus Komeilibacteria bacterium RIFOXYC1_FULL_37_11]'
What I want is a list of strings, one description per line, like this:
[
'hypothetical protein UR61_C0009G0014 [candidate division WS6 bacterium GW2011_GWE1_34_7]',
'ATPase [candidate division WS6 bacterium GW2011_GWE2_33_157]',
'hypothetical protein UR96_C0034G0007 [candidate division WS6 bacterium GW2011_GWC1_36_11]',
'phosphoenolpyruvate synthase [Candidatus Komeilibacteria bacterium RIFOXYC1_FULL_37_11]'
]
How can I achieve this? Thank you very much!
Make 'description' a list ('description = []') and call 'description.append(record.description)' inside the loop. –
Oh, yes, thanks, that was easy! – stevex
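The suggestion in the comments can be sketched as follows. This is a minimal illustration, not the asker's final code: the helper name `collect_descriptions` is hypothetical, and the `SimpleNamespace` objects are stand-ins for the records that `SeqIO.read(handle, "genbank")` returns in the question, so the snippet runs without contacting NCBI.

```python
from types import SimpleNamespace


def collect_descriptions(records):
    """Return one description string per record, in input order.

    `records` is any iterable of objects with a `.description` attribute
    (hypothetical helper; in the question these would be SeqRecord objects).
    """
    descriptions = []  # start with an empty list instead of an empty string
    for record in records:
        descriptions.append(record.description)  # one list element per record
    return descriptions


# Stand-in records (assumption): these replace the Entrez.efetch / SeqIO.read calls.
records = [
    SimpleNamespace(description="hypothetical protein UR61_C0009G0014 "
                                "[candidate division WS6 bacterium GW2011_GWE1_34_7]"),
    SimpleNamespace(description="ATPase [candidate division WS6 bacterium GW2011_GWE2_33_157]"),
]

descriptions = collect_descriptions(records)
for text in descriptions:
    print(text)
```

In the original loop, the same change amounts to replacing `description = ''` with `description = []` and `description += record.description` with `description.append(record.description)`.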