Python：解释cProfile日志

我有一种方法可以确定每个其他人最兼容的人。基本上，在从一个人到一个列表（这里类似的列表决定了兼容性）的dict的项目上有两个嵌套循环，如果compat大于之前对于外部循环人员的最大值，则计算并保存compat 。
因此，我决定通过更新其他人（在内循环中的on）的兼容性来优化性能，因为兼容性是相同的，并且当外循环到达时我不必做相同的计算人2和内在的人1 [使用兼容关系的对称]。
好吧，我结束了慢了20倍。 c配置文件日志很奇怪，因为改进版本的所有操作比未改进代码中的更好（或类似）总时间。所以我绝对坚持找到瓶颈。 :(
能有人给我如何解释这个记录在哪里邪恶指令去咨询Python：解释cProfile日志

日志正常代码：？？

 $ python -m cProfile -s time ./jukebox.py sample.txt 
     92661414 function calls (92661412 primitive calls) in 124.355 CPU seconds 

    Ordered by: internal time 

    ncalls tottime percall cumtime percall filename:lineno(function) 
    10000 93.324 0.009 124.168 0.012 jukebox.py:88(solve_problem_4) 
42900000 16.616 0.000 16.616 0.000 {method 'intersection' of 'set' objects} 
42900000 10.831 0.000 10.831 0.000 {len} 
    6180396 2.212 0.000 2.212 0.000 {method 'append' of 'list' objects} 
    670000 1.185 0.000 1.185 0.000 {method 'items' of 'dict' objects} 
     1 0.170 0.170 124.353 124.353 jukebox.py:1(<module>) 
     1 0.009 0.009 0.013 0.013 heapq.py:31(<module>) 
     1 0.004 0.004 0.004 0.004 bisect.py:1(<module>) 
     1 0.002 0.002 124.355 124.355 {execfile} 
     66 0.001 0.000 0.001 0.000 jukebox.py:18(update_bands) 
     67 0.001 0.000 0.001 0.000 fileinput.py:166(isfirstline) 
     1 0.000 0.000 0.002 0.002 jukebox.py:9(__init__) 
     1 0.000 0.000 0.000 0.000 {open} 
     198 0.000 0.000 0.000 0.000 {method 'strip' of 'str' objects} 
     132 0.000 0.000 0.000 0.000 {method 'split' of 'str' objects} 
     1 0.000 0.000 124.355 124.355 <string>:1(<module>) 
     1 0.000 0.000 0.000 0.000 fileinput.py:240(__iter__) 
     68 0.000 0.000 0.000 0.000 fileinput.py:243(next) 
     1 0.000 0.000 0.000 0.000 {range} 
     1 0.000 0.000 0.000 0.000 {method 'close' of 'file' objects} 
     1 0.000 0.000 0.000 0.000 {isinstance} 
     1 0.000 0.000 0.000 0.000 fileinput.py:80(<module>) 
     1 0.000 0.000 0.000 0.000 fileinput.py:184(FileInput) 
     1 0.000 0.000 0.000 0.000 fileinput.py:197(__init__) 
     2 0.000 0.000 0.000 0.000 {method 'readlines' of 'file' objects} 
     1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 
     4/2 0.000 0.000 0.000 0.000 fileinput.py:292(readline) 
     396 0.000 0.000 0.000 0.000 {method 'setdefault' of 'dict' objects} 
     1 0.000 0.000 0.000 0.000 fileinput.py:91(input) 
     1 0.000 0.000 0.000 0.000 fileinput.py:266(nextfile) 
     1 0.000 0.000 0.000 0.000 jukebox.py:4(Reader) 
     67 0.000 0.000 0.000 0.000 fileinput.py:371(isfirstline)

日志的“优化”之一：

$ python -m cProfile -s time ./jukebox-imp.py sample.txt 
     49761414 function calls (49761412 primitive calls) in 2166.613 CPU seconds 

    Ordered by: internal time 

    ncalls tottime percall cumtime percall filename:lineno(function) 
    10000 2147.248 0.215 2165.759 0.217 jukebox-imp.py:88(solve_problem_4) 
21450000 8.952 0.000 8.952 0.000 {method 'intersection' of 'set' objects} 
21450000 5.951 0.000 5.951 0.000 {len} 
    6180396 2.152 0.000 2.152 0.000 {method 'append' of 'list' objects} 
    660000 1.441 0.000 1.441 0.000 {method 'items' of 'dict' objects} 
     1 0.837 0.837 2166.611 2166.611 jukebox-imp.py:1(<module>) 
    10000 0.015 0.000 0.015 0.000 {method 'keys' of 'dict' objects} 
     1 0.010 0.010 0.013 0.013 heapq.py:31(<module>) 
     1 0.003 0.003 0.003 0.003 bisect.py:1(<module>) 
     1 0.002 0.002 2166.613 2166.613 {execfile} 
     66 0.002 0.000 0.002 0.000 jukebox-imp.py:18(update_bands) 
     1 0.000 0.000 0.000 0.000 {open} 
     198 0.000 0.000 0.000 0.000 {method 'strip' of 'str' objects} 
     132 0.000 0.000 0.000 0.000 {method 'split' of 'str' objects} 
     1 0.000 0.000 2166.613 2166.613 <string>:1(<module>) 
     1 0.000 0.000 0.000 0.000 fileinput.py:240(__iter__) 
     1 0.000 0.000 0.002 0.002 jukebox-imp.py:9(__init__) 
     68 0.000 0.000 0.000 0.000 fileinput.py:243(next) 
     1 0.000 0.000 0.000 0.000 {range} 
     1 0.000 0.000 0.000 0.000 {method 'close' of 'file' objects} 
     1 0.000 0.000 0.000 0.000 {isinstance} 
     1 0.000 0.000 0.000 0.000 fileinput.py:80(<module>) 
     1 0.000 0.000 0.000 0.000 fileinput.py:184(FileInput) 
     1 0.000 0.000 0.000 0.000 fileinput.py:197(__init__) 
     2 0.000 0.000 0.000 0.000 {method 'readlines' of 'file' objects} 
     1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 
     4/2 0.000 0.000 0.000 0.000 fileinput.py:292(readline) 
     396 0.000 0.000 0.000 0.000 {method 'setdefault' of 'dict' objects} 
     67 0.000 0.000 0.000 0.000 fileinput.py:166(isfirstline) 
     1 0.000 0.000 0.000 0.000 fileinput.py:91(input) 
     1 0.000 0.000 0.000 0.000 fileinput.py:266(nextfile) 
     67 0.000 0.000 0.000 0.000 fileinput.py:371(isfirstline) 
     1 0.000 0.000 0.000 0.000 jukebox-imp.py:4(Reader)

// 编辑：
以防万一，我可以亲提供代码。以我卑微的理解，后者绝对没有理由比前者慢20倍。

正常之一：

def solve_problem_4(colleagues): 
MIN_COMPAT = 1 
compat_dict = dict() 

for colleague_1, bands_1 in colleagues.items(): 
    compat_dict[colleague_1] = (0,[]) 
    for colleague_2, bands_2 in colleagues.items(): 
     if colleague_1 == colleague_2: 
      continue 

     compat = len(set(bands_1).intersection(set(bands_2))) 
     if compat > MIN_COMPAT: 
      old_compat,top_colleagues = compat_dict[colleague_1] 
      if compat > old_compat: 
       compat_dict[colleague_1] = (compat,[colleague_2]) 
      elif compat == old_compat: 
       top_colleagues.append(colleague_2) 

return compat_dict

与 “optmized”：

def solve_problem_4(colleagues): 
MIN_COMPAT = 1 
compat_dict = defaultdict(lambda: (0,[])) #change here 
checked_pairs = [] 

for colleague_1, bands_1 in colleagues.items()[:-1]: 
    for colleague_2, bands_2 in colleagues.items(): 
     if colleague_1 == colleague_2 or (colleague_2,colleague_1) in checked_pairs: # change here, exclude used pairs 
      continue 

     checked_pairs += [(colleague_1,colleague_2)] # change here, note down checked pair 
     compat = len(set(bands_1).intersection(set(bands_2))) 

     if compat > MIN_COMPAT: 
      old_compat, top_colleagues = compat_dict[colleague_1] 
      if compat > old_compat: 
       compat_dict[colleague_1] = compat,[colleague_2] 
      elif compat == old_compat: 
       top_colleagues.append(colleague_2) 

      old_compat, top_colleagues = compat_dict[colleague_2] # change here, update symmetric pair 
      if compat > old_compat: # imagine extract method refactoring here ;) 
       compat_dict[colleague_2] = compat,[colleague_1] 
      elif compat == old_compat: 
       top_colleagues.append(colleague_1) 
return compat_dict

来源

2013-03-04 Zakum

应当更清楚，如果你排序cumtime。

来源

2013-03-04 19:13:46 shx2

不幸的是，不改变这种状况，除了：1（）{}的execfile，jukebox-imp.py:1（）和jukebox-imp.py:88(solve_problem_4）移动到顶部，这并不奇怪，因为我正在分析这种方法。 – Zakum 2013-03-04 19:22:47

您应该将功能分解为更小的功能，并对其进行配置。如果您使用图形浏览器，就像fvisconte建议的那样，应该很容易就能找到它。 – shx2 2013-03-05 06:48:06

接受评论制动下来分析的原因。如果checked_pairs是一个列表，则问题是位于O（n）中的checked_pairs'中的（（同事_2，同事_1））。所以我把它改为字典，并得到O（1）查找（平均）。 – Zakum 2013-03-05 15:25:23

或者，在cProfile转储上运行runsnakerun可以提供易于理解的图形视图。

python -m cProfile -o dump.cprofile script.py 
runsnakerun dump.cprofile

来源

2013-03-04 19:56:59 fvisconte

Python：解释cProfile日志

回答

相关问题