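`partition` will be faster than `split`, because it stops checking after the first match. A plain `slice` with `index` will be faster than a regex `slice`.

The regex slice also slows down considerably as the part of the string before the match grows. It becomes slower than the original `split` after about 10 characters and gets much worse from there. If you have a regex without a `+` or `*` match, I think it fares a bit better.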
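As a quick sanity check that the four approaches compared here return the same thing for ordinary input, here is a minimal sketch. The address and the `local_part_*` method names are made up for illustration and are not part of the benchmark. The `nil` guard in the index version is also an addition, since `index` returns `nil` when the string has no `@`.

```ruby
# Hypothetical helpers wrapping the four approaches being compared.
def local_part_split(email)
  email.split('@')[0]            # splits on every '@', then takes the first piece
end

def local_part_partition(email)
  email.partition('@').first     # splits only at the first '@'
end

def local_part_regex(email)
  email[/[^@]+/]                 # first run of non-'@' characters
end

def local_part_index(email)
  at = email.index('@')          # nil when there is no '@'
  at ? email[0, at] : email      # guard added for this sketch only
end

email = 'user@example.com'       # made-up address
results = [
  local_part_split(email),
  local_part_partition(email),
  local_part_regex(email),
  local_part_index(email)
]
p results.uniq                   # a single element means all four agree
```

For input like this all four agree, so the choice between them comes down to speed and readability.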
require 'benchmark'

n = 1000000

# Time four ways of extracting the part of `email` before the '@', n times each.
def bench(n, email)
  printf "\n%s %s times\n", email, n
  Benchmark.bm do |x|
    x.report('split    ') { n.times { email.split('@')[0] } }
    x.report('partition') { n.times { email.partition('@').first } }
    x.report('slice reg') { n.times { email[/[^@]+/] } }
    x.report('slice ind') { n.times { email[0, email.index('@')] } }
  end
end
bench n, '[email protected]'
bench n, '[email protected]'
bench n, '[email protected]'
bench n, '[email protected]'
bench n, '[email protected]omain.com'
bench n, 'a'*254 + '@' + 'b'*253 # rfc limits
bench n, 'a'*1000 + '@' + 'b'*1000 # for other string processing
Results on 1.9.3p484:
[email protected] 1000000 times
                user     system      total        real
split       0.405000   0.000000   0.405000 (  0.410023)
partition   0.375000   0.000000   0.375000 (  0.368021)
slice reg   0.359000   0.000000   0.359000 (  0.357020)
slice ind   0.312000   0.000000   0.312000 (  0.309018)
[email protected] 1000000 times
                user     system      total        real
split       0.421000   0.000000   0.421000 (  0.432025)
partition   0.374000   0.000000   0.374000 (  0.379021)
slice reg   0.421000   0.000000   0.421000 (  0.411024)
slice ind   0.312000   0.000000   0.312000 (  0.315018)
[email protected] 1000000 times
                user     system      total        real
split       0.593000   0.000000   0.593000 (  0.589034)
partition   0.531000   0.000000   0.531000 (  0.529030)
slice reg   0.764000   0.000000   0.764000 (  0.771044)
slice ind   0.484000   0.000000   0.484000 (  0.478027)
[email protected]ously-extra-long-silly-domain.com 1000000 times
                user     system      total        real
split       0.483000   0.000000   0.483000 (  0.481028)
partition   0.390000   0.016000   0.406000 (  0.404023)
slice reg   0.406000   0.000000   0.406000 (  0.411024)
slice ind   0.312000   0.000000   0.312000 (  0.344020)
[email protected]omain.com 1000000 times
                user     system      total        real
split       0.639000   0.000000   0.639000 (  0.646037)
partition   0.609000   0.000000   0.609000 (  0.596034)
slice reg   0.764000   0.000000   0.764000 (  0.773044)
slice ind   0.499000   0.000000   0.499000 (  0.491028)
a<254>@b<253> 1000000 times
                user     system      total        real
split       0.952000   0.000000   0.952000 (  0.960055)
partition   0.733000   0.000000   0.733000 (  0.731042)
slice reg   3.432000   0.000000   3.432000 (  3.429196)
slice ind   0.624000   0.000000   0.624000 (  0.625036)
a<1000>@b<1000> 1000000 times
                user     system      total        real
split       1.888000   0.000000   1.888000 (  1.892108)
partition   1.170000   0.016000   1.186000 (  1.188068)
slice reg  12.885000   0.000000  12.885000 ( 12.914739)
slice ind   1.108000   0.000000   1.108000 (  1.097063)
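2.1.3p242 shows about the same percentage differences but is roughly 10-30% quicker at everything, except the regex slice, where it slows down even more.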
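I doubt there will be a noticeable difference when processing email addresses (unless you are processing millions per second...). But why don't you measure it yourself and find out?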