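Update
This question has been thoroughly discussed and updated on OR-Exchange, where I cross-posted it: "CPLEX Python API performance overhead?"
Original question
When running CPLEX 12.5.0.0 from the command line: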
cplex -f my_instance.lp
the best integer solution is found in 19056.99 ticks.
But through the Python API, on the very same instance:
import cplex
problem = cplex.Cplex("my_instance.lp")
problem.solve()
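the required time now amounts to 97407.10 ticks (more than 5 times slower).
In both cases, the mode is parallel, deterministic, with up to 2 threads. Wondering whether this poor performance was due to some Python thread overhead, I tried: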
problem = cplex.Cplex("my_instance.lp")
problem.parameters.threads.set(1)
problem.solve()
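This required 46513.04 ticks (i.e., using one core was twice as fast as using two!).
Being new to CPLEX and LP, I find these results confusing. Is there a way to improve the performance of the Python API, or should I switch to a more mature API (i.e., Java or C++)?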
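For reference, the ratios quoted above can be checked directly from the three tick counts. This is only a small arithmetic sketch: the dictionary keys and variable names are mine, and the numbers are copied from the runs described in the question.

```python
# Deterministic tick counts quoted in the question.
ticks = {
    "command line, 2 threads": 19056.99,
    "Python API, 2 threads": 97407.10,
    "Python API, 1 thread": 46513.04,
}

# Express every run relative to the command-line run.
baseline = ticks["command line, 2 threads"]
slowdown = {name: value / baseline for name, value in ticks.items()}

for name, ratio in slowdown.items():
    print(f"{name}: {ticks[name]:.2f} ticks, {ratio:.2f}x the command-line run")

# Ratio between the two Python API runs (2 threads vs 1 thread).
python_two_vs_one = ticks["Python API, 2 threads"] / ticks["Python API, 1 thread"]
print(f"Python API, 2 threads vs 1 thread: {python_two_vs_one:.2f}x")
```

Ticks are CPLEX's deterministic time measure, so these ratios compare the amount of work done rather than wall-clock time.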
Annex
Here are the full details of the 2-thread runs, starting with the (common) preamble:
Tried aggregator 3 times.
MIP Presolve eliminated 2648 rows and 612 columns.
MIP Presolve modified 62 coefficients.
Aggregator did 13 substitutions.
Reduced MIP has 4229 rows, 1078 columns, and 13150 nonzeros.
Reduced MIP has 1071 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.06 sec. (18.79 ticks)
Probing fixed 24 vars, tightened 0 bounds.
Probing time = 0.08 sec. (18.12 ticks)
Tried aggregator 1 time.
MIP Presolve eliminated 87 rows and 26 columns.
MIP Presolve modified 153 coefficients.
Reduced MIP has 4142 rows, 1052 columns, and 12916 nonzeros.
Reduced MIP has 1045 binaries, 7 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.05 sec. (11.67 ticks)
Probing time = 0.01 sec. (1.06 ticks)
Clique table members: 4199.
MIP emphasis: balance optimality and feasibility.
MIP search method: dynamic search.
Parallel mode: deterministic, using up to 2 threads.
Root relaxation solution time = 0.20 sec. (91.45 ticks)
Results from the command line:
GUB cover cuts applied: 1
Clique cuts applied: 3
Cover cuts applied: 2
Implied bound cuts applied: 38
Zero-half cuts applied: 7
Gomory fractional cuts applied: 2
Root node processing (before b&c):
Real time = 5.27 sec. (2345.14 ticks)
Parallel b&c, 2 threads:
Real time = 35.15 sec. (16626.69 ticks)
Sync time (average) = 0.00 sec.
Wait time (average) = 0.00 sec.
------------
Total (root+branch&cut) = 40.41 sec. (18971.82 ticks)
Results from the Python API:
Clique cuts applied: 33
Cover cuts applied: 1
Implied bound cuts applied: 4
Zero-half cuts applied: 10
Gomory fractional cuts applied: 4
Root node processing (before b&c):
Real time = 6.42 sec. (2345.36 ticks)
Parallel b&c, 2 threads:
Real time = 222.28 sec. (95061.73 ticks)
Sync time (average) = 0.01 sec.
Wait time (average) = 0.00 sec.
------------
Total (root+branch&cut) = 228.70 sec. (97407.10 ticks)
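To make the two summaries easier to compare, here is a minimal sketch that parses the cut counts and total ticks out of each log excerpt and prints them side by side. The helper names (`cut_counts`, `total_ticks`) are hypothetical, not part of CPLEX; the script works only on the text pasted above and does not call the solver.

```python
import re

# Relevant lines copied from the two log excerpts above.
CLI_LOG = """\
GUB cover cuts applied: 1
Clique cuts applied: 3
Cover cuts applied: 2
Implied bound cuts applied: 38
Zero-half cuts applied: 7
Gomory fractional cuts applied: 2
Total (root+branch&cut) = 40.41 sec. (18971.82 ticks)
"""

PYTHON_LOG = """\
Clique cuts applied: 33
Cover cuts applied: 1
Implied bound cuts applied: 4
Zero-half cuts applied: 10
Gomory fractional cuts applied: 4
Total (root+branch&cut) = 228.70 sec. (97407.10 ticks)
"""


def cut_counts(log):
    """Map each cut family named in the log to the number of cuts applied."""
    pairs = re.findall(r"^(.+?) cuts applied:\s+(\d+)$", log, re.M)
    return {name: int(count) for name, count in pairs}


def total_ticks(log):
    """Return the deterministic ticks reported on the 'Total' line."""
    match = re.search(r"^Total \(root\+branch&cut\) = .*\(([\d.]+) ticks\)$", log, re.M)
    return float(match.group(1))


cli, py = cut_counts(CLI_LOG), cut_counts(PYTHON_LOG)
for family in sorted(set(cli) | set(py)):
    # A family missing from one log is shown as 0 for that run.
    print(f"{family:<18} CLI={cli.get(family, 0):>3}  Python={py.get(family, 0):>3}")

ratio = total_ticks(PYTHON_LOG) / total_ticks(CLI_LOG)
print(f"Total ticks ratio (Python / CLI): {ratio:.2f}")
```

The root node processing takes almost the same number of ticks in both runs (2345.14 vs. 2345.36), while the cuts applied differ, so the two searches do not follow the same path even though they start from the same preamble.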
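Sorry for the recently deleted "answer", which was meant to redirect readers to the updated discussion. I have since updated my post with the link. Hopefully it will survive ;-) – Aristide 2013-09-14 13:10:01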