2017-12-18 251 views
-4

Counting certain values in a file in Python

I have a text file like this (this is a sample; the actual file is very large):

[52639 - 2017-12-08 11:56:58,680] INFO __main__.master 251 Finished pre-smap protein tag ('4h02', [], 35000, 665, '67') 
[52639 - 2017-12-08 11:57:37,686] INFO __main__.master 251 Finished pre-smap protein tag ('4nqk', [], 35000, 223, '18') 
[52639 - 2017-12-08 11:58:46,984] INFO __main__.master 251 Finished pre-smap protein tag ('3j60', [], 3500, 1052, '65') 
[52639 - 2017-12-08 12:01:10,073] INFO __main__.master 251 Finished pre-smap protein tag ('4ddg', [], 35000, 541, '38') 
[52639 - 2017-12-08 12:03:37,570] INFO __main__.master 251 Finished pre-smap protein tag ('4ksl', [], 35000, 1303, '68') 

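I want to sum the values that appear just before the last comma on each line. The result would be 665 + 223 + 1052 + 541 + 1303 = 3784.

I can't figure out how to do this. Any help would be appreciated.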

Answers

0

Here, you can try this:

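# Running total of the values just before the last comma
summation = 0

with open("test.txt", "r") as infile:
    for line in infile:
        # Splitting on ", " yields the tuple fields; index 3 is the value to sum.
        # The comma inside the timestamp has no trailing space, so it is not split.
        newLine = line.split(", ")
        summation = summation + int(newLine[3])

print(summation)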

Output:

3784 

The contents of test.txt are structured like this:

[52639 - 2017-12-08 11:56:58,680] INFO __main__.master 251 Finished pre-smap protein tag ('4h02', [], 35000, 665, '67')
[52639 - 2017-12-08 11:57:37,686] INFO __main__.master 251 Finished pre-smap protein tag ('4nqk', [], 35000, 223, '18')
[52639 - 2017-12-08 11:58:46,984] INFO __main__.master 251 Finished pre-smap protein tag ('3j60', [], 3500, 1052, '65')
[52639 - 2017-12-08 12:01:10,073] INFO __main__.master 251 Finished pre-smap protein tag ('4ddg', [], 35000, 541, '38')
[52639 - 2017-12-08 12:03:37,570] INFO __main__.master 251 Finished pre-smap protein tag ('4ksl', [], 35000, 1303, '68')

If you also want to print all the numbers that make up the sum, you can use a list to store each number:

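summation = 0
coefficients = []  # the individual values, kept as strings for printing

with open("test.txt", "r") as infile:
    for line in infile:
        newLine = line.split(", ")
        coefficients.append(newLine[3])
        summation = summation + int(newLine[3])

# Print the terms joined by "+", then "=" and the total on the same line
print("+".join(coefficients), end="=")
print(summation)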

Output:

665+223+1052+541+1303=3784 
+0

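Thanks, vasilis. They voted the question down to -4, probably because they wanted me to add two lines of code, such as "open" or something like that. This is why most of us hate Stack Overflow. They pretend that they are not. Anyway, thank you. – Antonis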

+0

Hi vasilis. Suppose my file is more complex, with lines like this: [52639 - 2017-12-08 11:43:44,850] INFO __main__.master 251 Finished pre-smap protein tag ('4py6', ['R78', 'EDO'], 35000, 33.207404136657715, '16') or [52639 - 2017-12-08 11:43:48,014] INFO __main__.master 251 Finished pre-smap protein tag ('1nw4', ['IMH', 'IPA', 'SO4'], 3500, 153.33520197868347, '64'). Do you have a solution? – Antonis

+0

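@Antonis, the community tends to appreciate questions that show research effort and that challenge its members to work out the best solution. Stack Overflow is also a broad community, ranging from novices to highly skilled members who want to go further. So questions that show a lack of effort, or that are possible duplicates, are not appealing. But in any case, thank you for accepting my answer. –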
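For the more complex lines in the comment above (a non-empty list and a floating-point value), one possible approach is to split from the right instead of from the left, so that commas inside the list no longer shift the index. This is only a minimal sketch: the helper name `value_before_last_comma` is hypothetical, and it assumes the fields are separated by ", " exactly as in the question's sample.

```python
def value_before_last_comma(line):
    """Return the field just before the last comma of a log line, as a float."""
    # Split at most twice, starting from the right: the middle piece is the
    # value we want, regardless of how many commas appear earlier in the line.
    parts = line.rstrip().rsplit(", ", 2)
    return float(parts[1])


# Sample lines taken from the comment, with spacing assumed to match the question
sample_lines = [
    "[52639 - 2017-12-08 11:43:44,850] INFO __main__.master 251 Finished pre-smap protein tag ('4py6', ['R78', 'EDO'], 35000, 33.207404136657715, '16')",
    "[52639 - 2017-12-08 11:43:48,014] INFO __main__.master 251 Finished pre-smap protein tag ('1nw4', ['IMH', 'IPA', 'SO4'], 3500, 153.33520197868347, '64')",
]

total = sum(value_before_last_comma(line) for line in sample_lines)
print(total)
```

Because the result is converted with float(), the same helper also works on the integer values from the original question.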

0
import re

s = """
[52639 - 2017-12-08 11:56:58,680] INFO __main__.master 251 Finished pre-smap protein tag ('4h02', [], 35000, 665, '67')
[52639 - 2017-12-08 11:57:37,686] INFO __main__.master 251 Finished pre-smap protein tag ('4nqk', [], 35000, 223, '18')
[52639 - 2017-12-08 11:58:46,984] INFO __main__.master 251 Finished pre-smap protein tag ('3j60', [], 3500, 1052, '65')
[52639 - 2017-12-08 12:01:10,073] INFO __main__.master 251 Finished pre-smap protein tag ('4ddg', [], 35000, 541, '38')
[52639 - 2017-12-08 12:03:37,570] INFO __main__.master 251 Finished pre-smap protein tag ('4ksl', [], 35000, 1303, '68')
"""

# Capture the integer that precedes the final quoted number and the closing parenthesis.
# A raw string avoids invalid-escape warnings; "+" avoids capturing an empty string.
pattern = r", ([0-9]+), '[0-9]*'\)"

print(sum(int(i) for i in re.findall(pattern, s)))

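Have you tried using the regular expression library? By building a pattern that matches "the number that comes before the quoted number closed by the parenthesis", you can capture all of those numbers, then build a generator that converts them to integers and sum them.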
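Since the question mentions that the real file is very large, the same regular-expression idea can be applied one line at a time rather than to a single big string. The following is a rough sketch under a few assumptions: the names `VALUE_PATTERN` and `sum_values` are hypothetical, the pattern is widened to accept decimal values as well as integers, and lines that do not match are silently skipped.

```python
import re

# Value just before the last comma: digits with an optional decimal part,
# followed by the final quoted field and the closing parenthesis at end of line.
VALUE_PATTERN = re.compile(r",\s*([0-9]+(?:\.[0-9]+)?),\s*'[^']*'\)\s*$")


def sum_values(lines):
    """Sum the captured value from every line that matches VALUE_PATTERN."""
    total = 0.0
    for line in lines:
        match = VALUE_PATTERN.search(line)
        if match:  # skip lines with a different layout
            total += float(match.group(1))
    return total


sample = [
    "[52639 - 2017-12-08 11:56:58,680] INFO __main__.master 251 Finished pre-smap protein tag ('4h02', [], 35000, 665, '67') ",
    "[52639 - 2017-12-08 11:57:37,686] INFO __main__.master 251 Finished pre-smap protein tag ('4nqk', [], 35000, 223, '18') ",
    "[52639 - 2017-12-08 11:58:46,984] INFO __main__.master 251 Finished pre-smap protein tag ('3j60', [], 3500, 1052, '65') ",
    "[52639 - 2017-12-08 12:01:10,073] INFO __main__.master 251 Finished pre-smap protein tag ('4ddg', [], 35000, 541, '38') ",
    "[52639 - 2017-12-08 12:03:37,570] INFO __main__.master 251 Finished pre-smap protein tag ('4ksl', [], 35000, 1303, '68') ",
]

print(sum_values(sample))  # prints 3784.0
```

Because `sum_values` accepts any iterable of lines, an open file object can be passed directly (for example the `infile` from `with open("test.txt") as infile:`), so the file is read line by line instead of being loaded into memory at once.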