为什么Java 6的阵列#排序(对象[])从归并变化insertionsort为小数组?数组、阵列、对象、Java

2023-09-11 02:10:51 作者:妳我即是江湖

Java 6的在 Arrays.java 归并实现使用的插入排序,如果数组长度小于某个阈值。这个值是硬codeD〜7的算法是递归的,这最终会发生多次为一个大阵。该规范归并排序算法只使用合并排序一路下跌,直到有只有并没有这样做, 1元素列表中的

Java 6's mergesort implementation in Arrays.java uses an insertion-sort if the array length is less than some threshold. This value is hard-coded to 7. As the algorithm is recursive, this eventually happens many times for a large array. The canonical merge-sort algorithm does not do this, just using merge-sort all the way down until there is only 1 element in the list.

这是一个最优化?如果是这样,它是如何应该帮助?为什么 7 ?插入排序(甚至< = 7 的东西)增加大阵大幅度排序需要比较的数量 - 这样会增加成本来排序,其中的compareTo()电话是缓慢的。

Is this an optimisation? If so, how is it supposed to help? And why 7? The insertion sort (even of <=7 things) increases the number of comparisons required to sort a large array dramatically - so will add cost to a sort where compareTo() calls are slow.

(X轴是阵列的大小,y轴#比较,对不同的值 INSERTIONSORT_THRESHOLD

(x-axis is size of array, y-axis is # of comparisons, for different values of INSERTIONSORT_THRESHOLD)

推荐答案

是的,这是故意的。虽然大澳归并的小于二次排序,如插入排序,它的操作比较复杂,因而更慢。

Yes this is intentional. While the Big-O of mergesort is less than that of quadratic sorts such as insertion sort, the operations it does are more complex and thus slower.

考虑排序的长度8归并排序的数组使得〜14递归调用本身除了7合并操作。每次递归调用贡献一些不平凡的开销运行时。每个合并操作涉及全部循环,指标变量必须初始化,递增,并且相比,临时数组必须复制等,所有你能想到超过300简单的操作。

Consider sorting an array of length 8. Merge sort makes ~14 recursive calls to itself in addition to 7 merge operations. Each recursive call contributes some non-trivial overhead to the run-time. Each merge operation involves a loop where index variables must be initialized, incremented, and compared, temporary arrays must be copied, etc. All in all, you can expect well over 300 "simple" operations.

在另一方面,插入排序是固有的简单和使用大约8 ^ 2 = 64的操作是要快得多。

On the other hand, insertion sort is inherently simple and uses about 8^2=64 operations which is much faster.

想想看,这种方式。当你解决了10个号码用手列表,你用归并排序?不,因为你的大脑在做简单的事情像像插入排序要好得多。但是,如果我给你一年的时间进行排序100000号的列表,你可能会更倾向于排序合并。

Think about it this way. When you sort a list of 10 numbers by hand, do you use merge sort? No, because your brain is much better at doing simple things like like insertion sort. However if I gave you a year to sort a list of 100,000 numbers, you might be more inclined to merge sort it.

至于幻数7,它是根据经验得出的是最佳的。

As for the magic number 7, it is empirically derived to be optimal.

编辑:在8元一个标准的插入排序,最坏的情况下会导致〜36比较。在规范归并排序,你有〜24比较。添加从方法调用和操作的复杂性的开销,插入排序应该会更快。此外,如果你平均情况下,插入排序将使少得多的比较比36。

In a standard insertion sort of 8 elements, the worst case scenario leads to ~36 comparisons. In a canonical merge sort, you have ~24 comparisons. Adding in the overhead from the method calls and complexity of operations, insertion sort should be faster. Additionally if you look at the average case, insertion sort would make far fewer comparisons than 36.