简单的哈希函数技术

我很新来哈希在Java中，我一直在卡住的几个部分。我有一个400个物品的清单（并存储在1.5x = 600的列表中），其中物品ID的范围从1-10k。我一直在寻找一些散列函数，并且我最初将这些例子复制到只使用折叠的数据包中。我注意到我已经得到了大约50-60％的空节点，这显然太多了。我还注意到，只是将id改为600会使其减少到50％的零值。简单的哈希函数技术

我现在的哈希函数看起来像，并用以作为丑，因为它是，它仅在空值下降1％，从一个简单的改装，以1.32的平均列表长度...

public int getHash(int id) 
    { 
     int hash = id; 

     hash <<= id % 3; 
     hash += id << hash % 5; 

     /* let's go digit by digit! */   
     int digit; 
     for(digit = id % 10; 
      id != 0; 
      digit = id % 10, id /= 10) 
     { 
     if (digit == 0) /* prevent division by zero */ 
      continue; 
     hash += digit * 2; 
     } 

     hash >>= 5; 
     return (hash % 600); 
    }

什么是一些好的技术来创建简单散列函数？

来源

2011-04-12 Kevin Evans

我会保持简单。将元素的id作为散列码返回，并让散列表担心在需要时重新散列它。 你的目标应该是为你的对象设置一个独一无二的哈希码。

Java的HashMap中使用以下方法重散列：

/** 
* Applies a supplemental hash function to a given hashCode, which 
* defends against poor quality hash functions. This is critical 
* because HashMap uses power-of-two length hash tables, that 
* otherwise encounter collisions for hashCodes that do not differ 
* in lower bits. Note: Null keys always map to hash 0, thus index 0. 
*/ 
static int hash(int h) { 
    // This function ensures that hashCodes that differ only by 
    // constant multiples at each bit position have a bounded 
    // number of collisions (approximately 8 at default load factor). 
    h ^= (h >>> 20)^(h >>> 12); 
    return h^(h >>> 7)^(h >>> 4); 
}

来源

2011-04-12 23:54:47 corsiKa

有一篇不错的评论文章here。此外，Wikipedia article on hash functions是一个很好的概述。它建议使用卡方检验来评估散列函数的质量。

来源

2011-04-12 23:51:40

简单的哈希函数技术

回答

相关问题