什么是实现这个复合GetHash code的最佳方式()方式、GetHash、code

2023-09-02 01:46:46 作者:淡若輕風

我有一个简单的类:

public class TileName {
    int Zoom, X, Y;

    public override bool Equals (object obj)
    {
        var o = obj as TileName;
        return (o != null) && (o.Zoom == Zoom) && (o.X == X) && (o.Y == Y);
    }

    public override int GetHashCode ()
    {
        return (Zoom + X + Y).GetHashCode();
    }
}

我很好奇,如果我会得到更好的分配散列codeS,如果我不是做这样的:

I was curious if I would get a better distribution of hash codes if I instead did something like:

    public override int GetHashCode ()
    {
        return Zoom.GetHashCode() + X.GetHashCode() + Y.GetHashCode();
    }

本类将被用作词典关键的,所以我要确保有一个体面的分配。

This class is going to be used as a Dictionary key, so I do want to make sure there is a decent distribution.

推荐答案

想通过乔恩斯基特in这太回答,最好的做法是挑选一些质数和单散列codeS,再总结的一切行动繁殖这些。

Like described by Jon Skeet in this SO answer, it is best practice to pick some prime numbers and multiply these with the single hash codes, then sum everything up.

public int GetHashCode()
{
    unchecked
    {
        int hash = 17;
        // Maybe nullity checks, if these are objects not primitives!
        hash = hash * 23 + Zoom.GetHashCode();
        hash = hash * 23 + X.GetHashCode();
        hash = hash * 23 + Y.GetHashCode();
        return hash;
    }
}

XOR 哈希的问题是:

如果 X 等于那么你的哈希将只是放大,因为那样的话 X ^ Y = X ^ X = 0 保存 XOR 是一个对称的运营商,它会产生完全相同的哈希对象 [变焦= 3,X = 5,Y = 7 ] [变焦= 3,X = 7,Y = 5] [变焦= 7,X = 5 ,Y = 3] 等 if X is equal to Y then your hash will be just Zoom, because then X ^ Y = X ^ X = 0 holds xor is a symmetric operator, it will produce the exact same hashes for the objects [Zoom = 3, X = 5, Y = 7], [Zoom = 3, X = 7, Y = 5], [Zoom = 7, X = 5, Y = 3] etc.

这些事实使得异或方法更可能引起碰撞。

These facts make the xor-method more likely to cause collisions.

在除了JONS后,可以考虑使用选中背景下,明确地忽略溢出。因为像 MSDN 说:

In addition to Jons post, consider using a unchecked context, for explicitly ignoring overflows. Because like the MSDN says:

如果没有检查选中是   所使用的,恒定的前pression使用   默认溢出检查编译   时间,这是最好的。否则,如果   前pression是非恒定的,   运行时的溢出检查取决于   其他因素如编译器选项   和环境配置。

If neither checked nor unchecked is used, a constant expression uses the default overflow checking at compile time, which is checked. Otherwise, if the expression is non-constant, the run-time overflow checking depends on other factors such as compiler options and environment configuration.

因此​​,尽管通常溢出将不加以控制,则可能是失败somewhen在一些环境或建有一些编译器选项。但在这种情况下,要明确规定不检查这些溢出。

So while usually overflows will be unchecked, it may be that it fails somewhen in some environment or built with some compiler option. But in this case you want to explicitly not check these overflows.

更新:

顺便说一句: someInt.GetHash code()返回 someInt 。这样,当然可能的最快和没有一个单一的碰撞完美散列分布。要不然你映射一个int一个int哈希? :)所以我想说:你的第一个方法:

By the way: someInt.GetHashCode() returns someInt. Like this, it is of course the fastest possible and a perfect hash distribution without a single collision. How else would you map an int to an int-hash? :) So what I wanted to say: Your first approach:

return (Zoom + X + Y).GetHashCode();

和你的第二个:

return Zoom.GetHashCode() + X.GetHashCode() + Y.GetHashCode();

是完全一样的。你甚至必须调用 GetHash code 无一不是很可能有冲突。甚至犹有过的 XOR 的方法,如果你很可能有小的整数值的所有三个整数。

are exactly the same. You dont even have to call GetHashCode and both are very likely to have collisions. Maybe even worse than the xor method, if you very likely have small integer values for all three ints.

更新2:

正如我在注释ChaosPandions写道发布:如果你只是有这三个INT值, X 缩放相对较小的数字(小于1000或10000),这其中可能也有好的哈希生成器:

As I wrote in the comment to ChaosPandions post: If you just have those three int values, and X, Y and Zoom are relatively small numbers (smaller than 1000 or 10000) this one may be also a good hash generator:

public int GetHashCode()
{
    return (X << 16) ^ (Y << 8) ^ Zoom;
}

这只是分布在散列值(例如,在大端为便于阅读)的位:

It just distributes the bits in the hash value (example in big-endian for readability):

00000000 00000000 00000011 00110001    X = 817
00000000 00000000 00011011 11111010    Y = 7162
00000000 00000000 00000010 10010110    Zoom = 662

00000011 00110001 00000000 00000000    X << 16
00000000 00011011 11111010 00000000    Y << 8
00000000 00000000 00000010 10010110    Zoom

00000011 00101010 11111000 10010110    (X << 16) ^ (Y << 8) ^ Zoom