.NET Tuple and Equals performance

2023-09-02 11:54:01 Author: 男权Meng!

This is something I had not noticed until today. Apparently, the .NET implementation of the widely used tuple classes (Tuple<T1>, Tuple<T1, T2>, etc.) incurs boxing penalties for value types when equality-based operations are performed.

Here is roughly how the class is implemented in the framework (source via ILSpy):

public class Tuple<T1, T2> : IStructuralEquatable 
{
    public T1 Item1 { get; private set; }
    public T2 Item2 { get; private set; }

    public Tuple(T1 item1, T2 item2)
    {
        this.Item1 = item1;
        this.Item2 = item2;
    }

    public override bool Equals(object obj)
    {
        return this.Equals(obj, EqualityComparer<object>.Default);
    }

    public override int GetHashCode()
    {
        return this.GetHashCode(EqualityComparer<object>.Default);
    }

    public bool Equals(object obj, IEqualityComparer comparer)
    {
        if (obj == null)
        {
            return false;
        }

        var tuple = obj as Tuple<T1, T2>;
        return tuple != null 
            && comparer.Equals(this.Item1, tuple.Item1) 
            && comparer.Equals(this.Item2, tuple.Item2);
    }

    public int GetHashCode(IEqualityComparer comparer)
    {
        int h1 = comparer.GetHashCode(this.Item1);
        int h2 = comparer.GetHashCode(this.Item2);

        return (h1 << 5) + h1 ^ h2;
    }
}

The problem I see is that it causes a two-stage boxing and unboxing. Take an Equals call: first, comparer.Equals boxes the items, because the non-generic IEqualityComparer.Equals takes object parameters; second, EqualityComparer<object> calls the non-generic Equals(object), which in turn has to unbox the item back to its original type internally.
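To make the two call paths concrete, here is a minimal sketch that compares two DateTime values both ways. The class and method names (BoxingDemo, EqualsViaObjectComparer, EqualsViaTypedComparer) are made up for illustration and are not part of the framework; only the comparer calls themselves mirror the code above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical demo class, for illustration only.
public static class BoxingDemo
{
    // Non-generic path, as in the original Tuple code: the interface method
    // is Equals(object, object), so both DateTime arguments are converted
    // to object (boxed) at the call site.
    public static bool EqualsViaObjectComparer(DateTime a, DateTime b)
    {
        IEqualityComparer comparer = EqualityComparer<object>.Default;
        return comparer.Equals(a, b);
    }

    // Generic path: EqualityComparer<DateTime>.Default.Equals takes
    // DateTime parameters, so the values are passed without boxing.
    public static bool EqualsViaTypedComparer(DateTime a, DateTime b)
    {
        return EqualityComparer<DateTime>.Default.Equals(a, b);
    }
}
```

Both methods return the same answer for the same inputs; the difference is only in how the arguments reach the comparer.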

Instead, why wouldn't they do something like this:

public override bool Equals(object obj)
{
    var tuple = obj as Tuple<T1, T2>;
    return tuple != null
        && EqualityComparer<T1>.Default.Equals(this.Item1, tuple.Item1)
        && EqualityComparer<T2>.Default.Equals(this.Item2, tuple.Item2);
}

public override int GetHashCode()
{
    int h1 = EqualityComparer<T1>.Default.GetHashCode(this.Item1);
    int h2 = EqualityComparer<T2>.Default.GetHashCode(this.Item2);

    return (h1 << 5) + h1 ^ h2;
}

public bool Equals(object obj, IEqualityComparer comparer)
{
    var tuple = obj as Tuple<T1, T2>;
    return tuple != null
        && comparer.Equals(this.Item1, tuple.Item1)
        && comparer.Equals(this.Item2, tuple.Item2);
}

public int GetHashCode(IEqualityComparer comparer)
{
    int h1 = comparer.GetHashCode(this.Item1);
    int h2 = comparer.GetHashCode(this.Item2);

    return (h1 << 5) + h1 ^ h2;
}

I was surprised to see equality implemented this way in the .NET tuple class. I was using a tuple type as the key in one of my dictionaries.

Is there any reason why this has to be implemented as shown in the first code block? It's a bit discouraging to use this class in that case.
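For the dictionary-key case specifically, one possible workaround is to leave Tuple alone and hand the dictionary a comparer that uses the typed EqualityComparer<T>.Default instances. The sketch below assumes a hypothetical helper named TupleComparer<T1, T2>; it is not a framework type, and the hash combination simply reuses the formula from the code above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the framework.
public sealed class TupleComparer<T1, T2> : IEqualityComparer<Tuple<T1, T2>>
{
    public static readonly TupleComparer<T1, T2> Instance = new TupleComparer<T1, T2>();

    public bool Equals(Tuple<T1, T2> x, Tuple<T1, T2> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;

        // Typed comparers take T1/T2 directly, so value types are not boxed.
        return EqualityComparer<T1>.Default.Equals(x.Item1, y.Item1)
            && EqualityComparer<T2>.Default.Equals(x.Item2, y.Item2);
    }

    public int GetHashCode(Tuple<T1, T2> obj)
    {
        if (obj == null) return 0;

        int h1 = EqualityComparer<T1>.Default.GetHashCode(obj.Item1);
        int h2 = EqualityComparer<T2>.Default.GetHashCode(obj.Item2);

        // Same combination as in the question's code.
        return ((h1 << 5) + h1) ^ h2;
    }
}
```

Passing TupleComparer<DateTime, DateTime>.Instance to the Dictionary<Tuple<DateTime, DateTime>, TValue> constructor makes lookups go through this comparer instead of Tuple's own Equals and GetHashCode.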

I don't think code refactoring and avoiding duplication should have been the major concerns. The same non-generic, boxing implementation sits behind IStructuralComparable too, but since IStructuralComparable.CompareTo is used less, it is not often a problem.

I benchmarked the above two approaches against a third approach, which is still less taxing than the original, like this (only the essentials):

public override bool Equals(object obj)
{
    return this.Equals(obj, EqualityComparer<T1>.Default, EqualityComparer<T2>.Default);
}

public bool Equals(object obj, IEqualityComparer comparer)
{
    return this.Equals(obj, comparer, comparer);
}

private bool Equals(object obj, IEqualityComparer comparer1, IEqualityComparer comparer2)
{
    var tuple = obj as Tuple<T1, T2>;
    return tuple != null
        && comparer1.Equals(this.Item1, tuple.Item1)
        && comparer2.Equals(this.Item2, tuple.Item2);
} 

I ran this on a couple of Tuple<DateTime, DateTime> fields with 1,000,000 Equals calls. This is the result:

1st approach (original .NET implementation) - 310 ms

2nd approach - 60 ms

3rd approach - 130 ms

The default implementation is about 5 times slower than the fastest approach (310 ms versus 60 ms).
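The benchmark code itself is not shown, so the following is only a rough sketch of how such a measurement could be set up with Stopwatch. EqualsBenchmark, BoxedEquals, TypedEquals and Measure are hypothetical names, and the two comparison methods are stand-ins for the 1st and 2nd approaches rather than the actual Tuple members. Timings from a harness like this will vary by machine and runtime, and the delegate call adds the same overhead to both paths.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical benchmark harness, for illustration only.
public static class EqualsBenchmark
{
    // Stand-in for the 1st approach: items go through the non-generic comparer.
    public static bool BoxedEquals(Tuple<DateTime, DateTime> x, Tuple<DateTime, DateTime> y)
    {
        IEqualityComparer comparer = EqualityComparer<object>.Default;
        return comparer.Equals(x.Item1, y.Item1)
            && comparer.Equals(x.Item2, y.Item2);
    }

    // Stand-in for the 2nd approach: items go through the typed comparers.
    public static bool TypedEquals(Tuple<DateTime, DateTime> x, Tuple<DateTime, DateTime> y)
    {
        return EqualityComparer<DateTime>.Default.Equals(x.Item1, y.Item1)
            && EqualityComparer<DateTime>.Default.Equals(x.Item2, y.Item2);
    }

    // Times `iterations` calls of the given comparison on two equal tuples.
    public static TimeSpan Measure(
        Func<Tuple<DateTime, DateTime>, Tuple<DateTime, DateTime>, bool> equals,
        int iterations)
    {
        var a = Tuple.Create(new DateTime(2020, 1, 1), new DateTime(2020, 6, 1));
        var b = Tuple.Create(new DateTime(2020, 1, 1), new DateTime(2020, 6, 1));

        equals(a, b); // warm-up call so JIT compilation is not timed

        bool allEqual = true;
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            allEqual &= equals(a, b);
        }
        stopwatch.Stop();

        if (!allEqual) throw new InvalidOperationException("Tuples were expected to compare equal.");
        return stopwatch.Elapsed;
    }
}
```

A call such as EqualsBenchmark.Measure(EqualsBenchmark.BoxedEquals, 1000000) returns the elapsed time for one path; running it for both methods gives a comparison in the spirit of the numbers above.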

Recommended answer

You wondered if it 'has to' be implemented that way. In short, I would say no: there are many functionally equivalent implementations.

But why does the existing implementation make such explicit use of EqualityComparer<object>.Default? It may just be a case of the person who wrote this mentally optimizing for the 'wrong' thing, or at least a different thing than your scenario of speed in an inner loop. Depending on their benchmark, it may appear to be the 'right' thing.

But what benchmark scenario could lead them to make that choice? Well, the optimization they targeted seems to be minimizing the number of EqualityComparer<T> generic instantiations. They might have chosen this because each generic instantiation comes with memory or load-time costs. If so, we can guess that their benchmark scenario was based on app startup time or memory usage rather than some tight-loop scenario.
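The instantiation argument can be made visible with a small sketch. InstantiationDemo and its two methods are hypothetical names used only for illustration. The point is that the original approach always asks for the single closed type EqualityComparer<object>, while the typed approach asks for a separate closed comparer type for every distinct T.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, for illustration only.
public static class InstantiationDemo
{
    // Original approach: regardless of T1 and T2, only the comparer
    // for object is ever requested.
    public static Type SharedComparerType()
    {
        return EqualityComparer<object>.Default.GetType();
    }

    // Typed approach: each distinct T yields its own closed comparer type,
    // which the runtime has to create on first use.
    public static Type TypedComparerType<T>()
    {
        return EqualityComparer<T>.Default.GetType();
    }
}
```

Comparing InstantiationDemo.TypedComparerType<int>() with InstantiationDemo.TypedComparerType<DateTime>() yields two different Type objects, whereas SharedComparerType() is the same for every tuple.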

Here is one piece of knowledge that supports the theory (found by using confirmation bias :) ): the method bodies of EqualityComparer<T> implementations cannot be shared if T is a struct. Excerpted from http://blogs.microsoft.co.il/sasha/2012/09/18/runtime-representation-of-genericspart-2/

When the CLR needs to create an instance of a closed generic type, such as List, it creates a method table and EEClass based on the open type. As always, the method table contains method pointers, which are compiled on the fly by the JIT compiler. However, there is a crucial optimization here: compiled method bodies on closed generic types that have reference type parameters can be shared. [...] The same idea does not work for value types. For example, when T is long, the assignment statement items[size] = item requires a different instruction, because 8 bytes must be copied instead of 4. Even larger value types may even require more than one instruction; and so on.