为什么在近排序的列表,这快速排序的原因堆栈溢出排序的列表?列表、堆栈、原因、快速

2023-09-11 22:43:14 作者:再来一次

我目前正在写一个Java的快速排序算法,整数随机数组排序,然后用System.nanoTime时间他们()。这些数组的大小是10的幂,从10 ^ 3日开始,并在10 ^ 7结束。此外,随机列表有不同的特性。我整理纯随机列表,列出了一些相同的值(fewUnique),反向排序列表,分类列表和近排序的列表。

I'm currently writing a quick sort algorithm in Java to sort random arrays of integers and then timing them using System.nanoTime(). The size of these arrays are powers of ten, starting from 10^3 and ending at 10^7. Furthermore, the random lists have different properties. I am sorting purely random lists, lists with some identical values (fewUnique), reversed sorted lists, sorted lists and nearly sorted lists.

排序工作。它执行一个快速排序递归阵列上,直到它需要30个元素的阵列或更小,在这种情况下,它执行一个插入排序排序

The sort works. It performs a quick sort recursively on the array until it needs to sort 30 elements or less of the array, in which case it performs an insertion sort.

所有被罚款10 ^ 3和10 ^ 4,但一旦我得到了10 ^ 5个值,只将随机,一些独特的和随机的列表进行排序,但会产生一个堆栈溢出错误时排序近排序,排序的列表。

All was fine for 10^3 and 10^4 but once I got to 10^5 values it would only sort the random, few unique and random lists but would incur a Stack Overflow error when sorting nearly sorted and sorted lists.

我不相信在名单中产生,因为堆栈溢出的排序算法中出现的方式,问题在于(这是由编译器所指的线是findPivot()方法中的第一个)。此外,它经常会1和6列表之间崩溃之前进行排序。因此,它必须有某种方式,算法本身拥有近排序和分类列表交互。此外,新一代的逆转名单涉及调用$ C $下创建一个排序的列表(再倒转的话)。

I don't believe the issue lies in the way the lists are generated because the Stack Overflow occurs within the sorting algorithm (the line which is being referred to by the compiler is the first within the findPivot() method). Also, it will often sort between 1 and 6 lists before crashing. It must therefore be some way that the algorithm itself interacts with nearly sorted and sorted lists. Also, the generation for reversed lists involves invoking the code for creating a sorted list (and then reversing it).

另外,我发现它不太可能的问题仅仅是具有到,由于某种原因,通过递归调用分区关闭阵列的部分中在几乎排序和排序的列表显著更多次,而不是其他列表的算法类型,因为它可以随机列表具有10 ^ 7值进行排序,这将需要多得多的分区将比近排序的列表10 ^ 5的值。

Also, I find it unlikely that the issue is just to the algorithm having to, for some reason, invoke the partitioning off of sections of the array by recursion in a nearly sorted and sorted list significantly more times rather than the other list types, as it can sort random lists with 10^7 values, which would require far more partitioning than would a nearly sorted list with 10^5 values.

我知道一定有事情做这些列表类型与我的快速排序的递归是如何相互作用的,但如果有人能揭示它的光,将是巨大的。我已经把code这两个在下面全面和随机列表生成器的快速排序算法。

I realise it must have something to do with how these list types interact with the recursion of my quick sort, but if someone could shed light on it that would be great. I've put the code to both the quick sort algorithm in full and the random list generators below.

快速排序算法

/**
 * Performs a quick sort given the indexes of the bounds of an integer array
 * @param arr The array to be sorted
 * @param highE The index of the upper element
 * @param lowE The index of the lower element
 */
public static void quickSort(int[] arr, int highE, int lowE)
{       
    //Only perform an action if arr.length > 30, otherwise insertion sort [recursive base case])
    if (lowE + 29 < highE)
    {
        //Get the element and then value of the pivot
        int pivotE = findPivot(arr, highE, lowE);
        int pivotVal = arr[pivotE], storeE = lowE;

        //Swap the pivot and the last value.
        swapElements(arr, pivotE, highE);

        //For each element in the selection that is not the pivot, check to see if it is lower than the pivot and if so, move it to the leftmost untouched element.
        for (int i = lowE; i < highE; i++)
        {
            if (arr[i] < pivotVal)
            {
                swapElements(arr, storeE, i);

                //Increment storeE so that the element that is being switched moves to the right location
                storeE++;
            }
        }

        //Finally swap the pivot into its proper position and recrusively call quickSort on the lesser and greater portions of the array
        swapElements(arr, storeE, highE);                   
        //Lesser
        quickSort(arr, storeE - 1, lowE);
        //Greater
        quickSort(arr, highE, storeE + 1);
    }
    else
    {
        insertSort(arr, highE, lowE);
    }
}




/**
 * Finds the pivot element
 * @param arr The array to be sorted
 * @param highE The index of the top element
 * @param lowE The index of the bottom element
 * @return The index of the pivot.
 */
public static int findPivot(int[] arr, int highE, int lowE)
{
    //Finds the middle element
    int mid = (int) Math.floor(lowE + (highE - lowE) / 2);

    //Returns the value of the median of the first, middle and last elements in the array.
    if ((arr[lowE] >= arr[mid]) && (arr[lowE] >= arr[highE])) 
    {
        if (arr[mid] > arr[highE]) {return mid;}
        else {return highE;}
    }
    else if ((arr[mid] >= arr[lowE]) && (arr[mid] >= arr[highE])) 
    {
        if (arr[lowE] > arr[highE]) {return lowE;}
        else {return highE;}
    }
    else 
    {
        if (arr[lowE] > arr[mid]) {return lowE;}
    }

    return mid;
}




/**
 *Performs an insertion sort on part of an array
 * @param arr The array to be sorted.
 * @param highE The index of the top element.
 * @param lowE The index of the low element.
 */
public static void insertSort(int[] arr, int highE, int lowE)
{
    //Sorts elements lowE to i in array, with i being gradually incremented.
    for (int i = lowE + 1; i <= highE; i++)
    {
        for (int j = i; j > lowE; j--)
        {
            if (arr[j] < arr[j - 1])
            {
                swapElements(arr, j, j-1);
            }
            else {break;}
        }
    }
}

随机选择发电机

/**
 * Creates a random list
 * @param arr The array to place the list inside of
 */
public static void randomList(int[] arr)
{
    //Places a random number at each element of the array

    for (int i = 0; i < arr.length; i++)
    {
        arr[i] = (int) Math.floor(Math.random() * RAND_MAX);
    }
}




/**
 * Creates a nearly sorted list of random numbers
 * @param arr the array to place the list inside of
 */
public static void nearSortList(int[] arr)
{
    //Creates a sorted list in arr
    sortList(arr);



    int swaps = (int) (Math.ceil(Math.pow((Math.log(arr.length)), 2.0)));

    //The two values to be switched each time
    int a, b;

    //Performs a number of swaps equal to swaps [log(N) ^ 2] rounded up, with numbers switched no more than ln(N) places away
    for (int i = 0; i < swaps; i++)
    {
        a = (int) Math.floor(Math.random() * arr.length);

        b = (int) (a + Math.random() * 2 * Math.log(arr.length) - Math.log(arr.length));

        //Accounts for cases in which b is either greater or smaller than the array bounds
        if (b < 0)
        {
            b = -b;
        }
        else if (b >= arr.length)
        {
            b = -1 * (arr.length - b);
        }

        swapElements(arr, a, b);
    }
}




/**
 * Creates a random list with many unique values in
 * @param arr the array to place the list inside of
 */
public static void fewUniqueList(int[] arr)
{
    int[] smallArr = new int[(int) Math.floor(Math.pow(arr.length, 9.0 / 10.0))];


    //Creates a smaller array of random values
    randomList(smallArr);



    //Fills the larger list up with values from the smaller list, ensuring aproximately N / N ^ (9/10) instances of each number in the array and ensuring, at most, there are N ^ (9/10) (rounded down) unique values in the large array
    for (int i = 0; i < arr.length; i++)
    {
        arr[i] = smallArr[(int) Math.floor(Math.random() * smallArr.length)];
    }
}




/**
 * Creates a reversed list of random numbers
 * @param arr the array to place the list inside of
 */
public static void reversedList(int[] arr)
{
    //Creates a sorted list in arr
    sortList(arr);




    //Switches each ith elements with its mirror on the other end of the array until the value of i reaches the middle of the array
    for (int i = 0; i < (int) (arr.length / 2.0); i++)
    {
        swapElements(arr, i, arr.length - 1 - i);
    }
}




/**
 * Creates a sorted list of random numbers using a merge sort
 * @param arr the array to place the list inside of
 */
public static void sortList(int[] arr)
{
    //Creates a random list in arr
    randomList(arr);

    Arrays.sort(arr);
}

编辑: [已关闭]

编辑2:

我已经替换为以下code中的基本递归调用,以便只调用中最小的两个分区的EJPs推荐的,它仍然是不固定的问题。

I've replaced the basic recursive calls with the following code in order to only call the smallest of the two partitions at EJPs recommendation and it still is not fixing the issue.

if (storeE - 1 - lowE < highE - storeE + 1)
{
    //Lesser
    quickSort(arr, storeE - 1, lowE);
    //Greater
    quickSort(arr, highE, storeE + 1);
}
else
{
    //Greater
    quickSort(arr, highE, storeE + 1);
    //Lesser
    quickSort(arr, storeE - 1, lowE);
}

修改3:

好了,现在明显的是递归深度远远大于我给它的信用为近排序,排序的列表。但现在我需要工作,为什么是这样的话,为什么随机只列出了28个深度为10 ^ 7价值观,但近排序,排序的列表具有深度超过3000

Ok, it is now evident that the recursion depth is far greater than I gave it credit for for nearly sorted and sorted lists. But now I need to work out why this is the case, and why random lists only had a depth of 28 for 10^7 values but nearly sorted and sorted lists have depths of over 3000

推荐答案

对于一个随机序列,你可以关闭分区中的数据的大规模块。 但是,对于一个(几乎)排序后的数组,你会大多是被分割掉1元的时间。

For a random array, you could partition off massive chunks of the data. But for a (nearly) sorted array, you'd mostly be partitioning off 1 element at a time.

因此​​,对于一个排序后的数组,您的堆栈大小将结束是相同的阵列的大小,同时,对于一个随机阵列,它更可能是关于该大小的对数。

So, for a sorted array, your stack size would end up being the same as the size of the array, while, for a random array, it's much more likely to be about a logarithm of that size.

所以,即使随机序列比近排序的高很多,这并不奇怪,较小的一个抛出一个异常,但更大的一个没有。

So, even if the random array is much larger than a nearly sorted one, it's not surprising that the smaller one throws an exception, but the larger one doesn't.

修改您的code

在修复方面,由于 EJP指出,你应该做的更小的分区第一,限制堆栈的生长。但是,这本身不会解决问题,Java不支持尾调用优化(当然,这是可选的实现,因为我知道这个问题)。

In terms of a fix, as EJP pointed out, you should do the smaller partition first to limit stack growth. But this in itself won't fix the problem as Java doesn't support tail-call optimization (well, it's optional for an implementation, as I understand that question).

下面一个相当简单的解决方法是把你的功能,成为一个while循环,基本上是硬编码尾调用优化。

A fairly simple fix here is to throw your function into a while-loop, essentially hard-coding the tail-call optimization.

要提供一个更好地了解我的意思是:

To give a better idea of what I mean:

public static void quickSort(int[] arr, int highE, int lowE)
{
    while (true)
    {
        if (lowE + 29 < highE)
        {
            ...
            quickSort(arr, storeE - 1, lowE);

            // not doing this any more
            //quickSort(arr, highE, storeE + 1);

            // instead, simply set the parameters to their new values
            // highE = highE;
            lowE = storeE + 1;
        }
        else
        {
            insertSort(arr, highE, lowE);
            return;
        }
    }
}

好了,现在你有基本的想法,这会更好看(功能等同于上述情况,只是更简洁):

Well, now that you have the basic idea, this would look better (functionally equivalent to the above, just more concise):

public static void quickSort(int[] arr, int highE, int lowE)
{
    while (lowE + 29 < highE)
    {
        ...
        quickSort(arr, storeE - 1, lowE);
        lowE = storeE + 1;
    }
    insertSort(arr, highE, lowE);
}

这当然是不实际做较小的一个第一,但我会留给你弄清楚(看来你已经有一个如何做到这一点公平的想法)。

This of course doesn't actually do the smaller one first, but I'll leave that to you to figure out (seems you already have a fair idea of how to do this).

这是如何工作

对于一些由值...

您当前的code做到这一点:(缩进表示函数调用里面发生了什么 - 从而增加缩进表示递归)

Your current code does this: (an indent indicates what happens inside that function call - thus increasing indentation means recursion)

quickSort(arr, 100, 0)
   quickSort(arr, 49, 0)
      quickSort(arr, 24, 0)
         insertion sort
      quickSort(arr, 49, 26)
         insertion sort
   quickSort(arr, 100, 51)
      quickSort(arr, 76, 0)
         insertion sort
      quickSort(arr, 100, 74)
         insertion sort

修改后的code做到这一点:

The modified code does this:

quickSort(arr, 100, 0)
   quickSort(arr, 49, 0)
      quickSort(arr, 24, 0)
         break out of the while loop
         insertion sort
   lowE = 26
   break out of the while loop
      insertion sort
lowE = 51
run another iteration of the while-loop
    quickSort(arr, 76, 0)
      break out of the while loop
      insertion sort
lowE = 74
break out of the while loop
   insertion sort

增加堆栈大小

不知道你是否考虑过这一点,或者是否将与您的参数,但你总是可以考虑简单的随着堆栈大小的 -Xss 命令行参数。

Not sure whether you've considered this, or whether it would work with your parameters, but you can always consider simply increasing the stack size with the -Xss command-line parameter.