计算红宝石汉明距离最有效的方法是什么？

在ruby中，计算两个无符号整数（例如汉明距离）之间的位差的最有效方法是什么？计算红宝石汉明距离最有效的方法是什么？

例如，我有一个整数= 2323409845和b = 1782647144.

它们的二进制表示是：

a = 10001010011111000110101110110101 
b = 01101010010000010000100101101000

之间的一个& b为17比特差..

我可以对它们进行逻辑异或，但是这会给我一个不同的整数！= 17，然后我必须迭代结果的二进制表示，并计算1的＃号。

什么是计算位差的最有效方法？

现在，为了计算多个整数序列的位差，答案是否改变？例如。给出2个无符号整数序列：

x = {2323409845,641760420,509499086....} 
y = {uint,uint,uint...}

什么是计算两个序列之间的位差的最有效的方法？

你会遍历序列，还是有更快的方法来计算整个序列的差异一次？

来源

2011-06-18 ch3rryc0ke

谢谢！我只是这样做了，它似乎比下面的方法快了3倍（使用Ruby的优化字符串函数） – ch3rryc0ke

我对这个派对很迟，但你可能想要[这个popcount基准]（http：// dalkescientific。 com/writings/diary/popcnt.cpp）。 '__builtin_popcount'是最慢的方法之一，如果你不[使用编译标志]（http://www.dalkescientific.com/writings/diary/archive/2011/11/02/faster_popcount_update.html） – x1a4

您可以使用Ruby中优化字符串函数做位的计数，而不是纯粹的算术。事实证明，通过一些快速基准测试，速度提高了大约6倍。

def h2(a, b) 
    (a^b).to_s(2).count("1") 
end

H1是计算正常方式，而H2转换XOR成一个字符串，并计算的数字 “1”

基准：

ruby-1.9.2-p180:001:0>> def h1(a, b) 
ruby-1.9.2-p180:002:1*> ret = 0 
ruby-1.9.2-p180:003:1*> xor = a^b 
ruby-1.9.2-p180:004:1*> until xor == 0 
ruby-1.9.2-p180:005:2*> ret += 1 
ruby-1.9.2-p180:006:2*> xor &= xor - 1 
ruby-1.9.2-p180:007:2*> end 
ruby-1.9.2-p180:008:1*> ret 
ruby-1.9.2-p180:009:1*> end 
# => nil 
ruby-1.9.2-p180:010:0>> def h2(a, b) 
ruby-1.9.2-p180:011:1*> (a^b).to_s(2).count("1") 
ruby-1.9.2-p180:012:1*> end 
# => nil 
ruby-1.9.2-p180:013:0>> h1(2323409845, 1782647144) 
# => 17 
ruby-1.9.2-p180:014:0>> h2(2323409845, 1782647144) 
# => 17 
ruby-1.9.2-p180:015:0>> quickbench(10**5) { h1(2323409845, 1782647144) } 
Rehearsal ------------------------------------ 
    2.060000 0.000000 2.060000 ( 1.944690) 
--------------------------- total: 2.060000sec 

     user  system  total  real 
    1.990000 0.000000 1.990000 ( 1.958056) 
# => nil 
ruby-1.9.2-p180:016:0>> quickbench(10**5) { h2(2323409845, 1782647144) } 
Rehearsal ------------------------------------ 
    0.340000 0.000000 0.340000 ( 0.333673) 
--------------------------- total: 0.340000sec 

     user  system  total  real 
    0.320000 0.000000 0.320000 ( 0.326854) 
# => nil 
ruby-1.9.2-p180:017:0>>

来源

2011-06-18 15:47:53 Dogbert

谢谢吨，我发现这也快得多。按照您的建议，使用内置的字符串函数进行大约21K的比较需要大约3秒钟，而传统方式需要花费两倍的时间 – ch3rryc0ke

韦格纳的算法：

def hamm_dist(a, b) 
    dist = 0 
    val = a^b 

    while not val.zero? 
    dist += 1 
    val &= val - 1 
    end 
    dist 
end 

p hamm_dist(2323409845, 1782647144) # => 17

来源

2011-06-18 15:33:54

％的建议mu太短，我写了一个简单的C扩展名使用__builtin_popcount，并使用基准验证它至少比ruby的优化字符串函数快3倍..

我看着以下两个教程：

在我的程序：

require './FastPopcount/fastpopcount.so' 
include FastPopcount 

def hamming(a,b) 
    popcount(a^b) 
end

然后在目录包含我的计划，我创建一个文件夹 “PopCount” 具有以下文件。

extconf.rb：

# Loads mkmf which is used to make makefiles for Ruby extensions 
require 'mkmf' 

# Give it a name 
extension_name = 'fastpopcount' 

# The destination 
dir_config(extension_name) 

# Do the work 
create_makefile(extension_name)

popcount。C：

// Include the Ruby headers and goodies 
#include "ruby.h" 

// Defining a space for information and references about the module to be stored internally 
VALUE FastPopcount = Qnil; 

// Prototype for the initialization method - Ruby calls this, not you 
void Init_fastpopcount(); 

// Prototype for our method 'popcount' - methods are prefixed by 'method_' here 
VALUE method_popcount(int argc, VALUE *argv, VALUE self); 

// The initialization method for this module 
void Init_fastpopcount() { 
    FastPopcount = rb_define_module("FastPopcount"); 
    rb_define_method(FastPopcount, "popcount", method_popcount, 1); 
} 

// Our 'popcount' method.. it uses the builtin popcount 
VALUE method_popcount(int argc, VALUE *argv, VALUE self) { 
    return INT2NUM(__builtin_popcount(NUM2UINT(argv))); 
}

然后在popcount目录下运行：

红宝石extconf.rb 使

然后运行该程序，有你有它做汉明距离....最快的方法在红宝石。

来源

2011-06-19 01:29:24 ch3rryc0ke

如果您打算遵循基于c的路径，最好将编译器标记-msse4.2添加到您的makefile中。这允许编译器生成基于硬件的popcnt指令，而不是使用表生成popcount。在我的系统上，这大约快了2.5倍。

来源

2017-08-14 14:13:17 williambellwisdo

计算红宝石汉明距离最有效的方法是什么？

回答

相关问题