Find a pattern in a line and print the values that follow it in brackets

I am trying to extract some information from a file. The file has many lines like the one below:

"names":["DNSCR"],"actual_names":["RADIO_R"],"castime":[2,4,6,8,10] ......

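I want to search each line for names and castime and, when they are found, print the values inside the brackets that follow them. Those values change from line to line. For example, in the line above names is DNSCR and castime is 2,4,6,8,10, but the length may be different on the next line.

I have tried the code below, but it always gives me 10 characters, whereas I need only what is inside the brackets.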

c_req = 10
keywords = ['"names":', '"castime":']

with open('mylogfile.log') as searchfile:
    for line in searchfile:
        for key in keywords:
            left, sep, right = line.partition(key)
            if sep:
                # prints a fixed 10 characters after the key,
                # not the contents of the brackets
                print(key + " = " + right[:c_req])


2017-02-24 Moe Siddig

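Does the file contain inconsistent quotes (as in your example)?

The file actually has consistent quotes. That was a bad copy and paste.

This looks just like JSON. Are there braces around each line? If so, the whole thing is trivial to parse: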

import json 
test = '{"names":["DNSCR"],"actual_names":["RADIO_R"],"castime":[2,4,6,8,10]}' 
result = json.loads(test) 
print(result["names"], result["castime"])

You can also use a library such as pandas to read the whole file into a DataFrame, if the file as a whole is valid JSON.
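
If the lines have no surrounding braces but are otherwise made up of valid JSON members, one option is to add the braces before parsing. This is only a sketch under that assumption; the helper name parse_line is hypothetical, and lines that contain anything else are reported as None rather than parsed.

```python
import json

def parse_line(line):
    """Wrap a brace-less line of JSON members in {} and parse it.

    Assumes the line holds only comma-separated "key": value pairs.
    Returns None when the wrapped text is not valid JSON.
    """
    text = line.strip().rstrip(',')
    try:
        return json.loads('{' + text + '}')
    except json.JSONDecodeError:
        return None

sample = '"names":["DNSCR"],"actual_names":["RADIO_R"],"castime":[2,4,6,8,10]'
record = parse_line(sample)
print(record["names"], record["castime"])
```

Returning None for unparseable lines lets the caller fall back to another approach, such as a regular expression, for those lines only.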


2017-02-24 18:15:29

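No, there are no braces around each line. The lines are actually very long and follow no particular pattern; what I showed here is only the part I am interested in.

If you could show the whole structure, that would help.

Use regular expressions: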

import re

# should contain all lines
lines = ['"names":["DNSCR"],"actual_names":["RADIO_R"],"castime":[2,4,6,8,10]']

# compiling the patterns once is more efficient for large files
names_pattern = re.compile(r'"names":\["(\w+)"\]')
# non-greedy, so the match stops at the first closing bracket
castime_pattern = re.compile(r'"castime":\[(.+?)\]')

names, castimes = [], []

for line in lines:
    names.append(names_pattern.search(line).group(1))
    castimes.append(
        [int(num) for num in castime_pattern.search(line).group(1).split(',')]
    )

Add exception handling and the file opening/reading yourself.
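
The exception handling and file reading mentioned above could look roughly like the following. It is a sketch, not the answerer's code: the function names extract and extract_from_file are hypothetical, and it assumes that lines missing either key, or holding non-integer castime values, should simply be skipped.

```python
import re

NAMES_RE = re.compile(r'"names":\[(.*?)\]')
CASTIME_RE = re.compile(r'"castime":\[(.*?)\]')

def extract(lines):
    """Yield (names, castimes) for each line that contains both keys."""
    for line in lines:
        names_match = NAMES_RE.search(line)
        castime_match = CASTIME_RE.search(line)
        if names_match is None or castime_match is None:
            continue  # skip lines missing either key
        names = [part.strip().strip('"')
                 for part in names_match.group(1).split(',') if part.strip()]
        try:
            castimes = [int(num)
                        for num in castime_match.group(1).split(',') if num.strip()]
        except ValueError:
            continue  # skip lines whose castime values are not integers
        yield names, castimes

def extract_from_file(path):
    """Read path and return the extracted pairs, or [] if it cannot be read."""
    try:
        with open(path) as logfile:
            return list(extract(logfile))
    except OSError as err:
        print("could not read {}: {}".format(path, err))
        return []

demo = [
    '"names":["DNSCR"],"actual_names":["RADIO_R"],"castime":[2,4,6,8,10]',
    'a line without the keys',
    '"names":["FOO", "BAR"],"actual_names":["RADIO_R"],"castime":[1, 2, 3]',
]
for names, castimes in extract(demo):
    print(names, castimes)
```

For the real log, extract_from_file('mylogfile.log') would return the same kind of list. Unlike the \w+ pattern, this version also accepts several names in one pair of brackets.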


2017-02-24 19:12:20

Given mylogfile.log:

"names":["DNSCR"],"actual_names":["RADIO_R"],"castime":[2,4,6,8,10] 
"names":["FOO", "BAR"],"actual_names":["RADIO_R"],"castime":[1, 2, 3]

Use a regular expression and ast.literal_eval.

import ast
import re

keywords = ['"names":', '"castime":']
keywords_name = ['names', 'castime']

d = {}

with open('mylogfile.log') as searchfile:
    for i, line in enumerate(searchfile):
        d['line ' + str(i)] = {}
        for key, key_name in zip(keywords, keywords_name):
            # capture the text between the brackets that follow the key
            inner = re.search(key + r'\[(.*?)\]', line).group(1)
            d['line ' + str(i)][key_name] = ast.literal_eval(inner)
print(d)

# {'line 0': {'names': 'DNSCR', 'castime': (2, 4, 6, 8, 10)},
#  'line 1': {'names': ('FOO', 'BAR'), 'castime': (1, 2, 3)}}

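re.search(key + r'\[(.*?)\]', line).group(1) grabs everything between the [] that follows each key.

ast.literal_eval() then strips the useless quotes and spaces from that string and automatically creates tuples when needed. I also use enumerate to keep track of which line of the log file each result came from.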
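
One consequence of evaluating only the text between the brackets is that a single value comes back bare ('DNSCR') while several values come back as a tuple. If a uniform type is preferable, a possible variation is to keep the brackets in the evaluated text so the result is always a list. This is a sketch only; the helper name bracket_values is hypothetical, and it assumes the brackets are not nested.

```python
import ast
import re

def bracket_values(key, line):
    """Return the list that follows "key": in line, or None if the key is absent.

    Assumes the bracketed value contains no nested brackets.
    """
    match = re.search(r'"' + re.escape(key) + r'":(\[.*?\])', line)
    if match is None:
        return None
    # the captured text includes the brackets, so the result is always a list
    return ast.literal_eval(match.group(1))

line = '"names":["FOO", "BAR"],"actual_names":["RADIO_R"],"castime":[1, 2, 3]'
print(bracket_values('names', line))
print(bracket_values('castime', line))
print(bracket_values('missing', line))
```

Because the pattern includes the opening quote, looking up names does not accidentally match actual_names.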


2017-02-24 19:17:43
