高效的散列code()实现高效、code

2023-09-10 23:53:29 作者:沉浸在离心的边缘 -

我经常自动生成一个类的散code()方法使用的IntelliJ IDEA和一般方法的形式为:

I often auto-generate an class's hashCode() method using IntelliJ IDEA and typically the method takes the form:

result = 31 * result + ...

我的问题是什么是乘以31的目的是什么?我知道这是一个素数,但为什么挑31具体一点吗?另外,如果实施散code()的特别小/大数据集的人会解决这个问题不同?

My question is what is the purpose of multiplying by 31? I know this is a prime number but why pick 31 specifically? Also, if implementing a hashCode() for a particularly small / large dataset would people approach this problem differently?

推荐答案

31乘快,因为JIT可以将其转换为5位向左移位和减法:

Multiplying by 31 is fast because the JIT can convert it to a shift left by 5 bits and a subtract:

x * 31 == (x << 5) - x

没有任何特殊的额外的信息,我会坚持这种做法。这是相当快的,并有可能最终获得相当良好的分布式哈希codeS,而且也很容易得到正确的:)

Without any particular extra information, I'd stick to this approach. It's reasonably fast and likely to end up with reasonably well-distributed hash codes, and it's also easy to get right :)

数据集的大小并不重要,但如果你有关于特定值的额外信息,你就可以用(例如,它总是甚至)的工作,那么你的可以的能设计一个更好的哈希函数。我会等待,直到它是一个真正的问题,第一,虽然:)

The size of the dataset doesn't really matter, but if you have particular extra information about the values you'll be work with (e.g. "it's always even") then you may be able to design a better hash function. I'd wait until it's an actual problem first though :)