是什么让这个桶排序功能的慢?功能

2023-09-11 03:27:43 作者:曲终人尽散╮物是已人非

该函数被定义为

void bucketsort(Array& A){
  size_t numBuckets=A.size();
  iarray<List> buckets(numBuckets);

  //put in buckets
  for(size_t i=0;i!=A.size();i++){
    buckets[int(numBuckets*A[i])].push_back(A[i]);
  }

  ////get back from buckets
  //for(size_t i=0,head=0;i!=numBuckets;i++){
  //size_t bucket_size=buckets[i].size();
  //for(size_t j=0;j!=bucket_size;j++){
  //  A[head+j] = buckets[i].front();
  //  buckets[i].pop_front();
  //}
  //head += bucket_size;
  //}
 for(size_t i=0,head=0;i!=numBuckets;i++){
   while(!buckets[i].empty()){
     A[head]          = buckets[i].back();
     buckets[i].pop_back();
     head++;
   }
 }

  //inseration sort
  insertionsort(A);
}

其中,列表只是列表&LT;双&GT; 在STL

阵列的内容是随机产生的 [0,1)。理论上桶排序应该比快速排序更快的大尺寸为它的为O(n),但它未能如下面的图。

The content of array are generate randomly in [0,1). Theoretically bucket sort should be faster than quicksort for large size for it's O(n),but it fails as in the following graph.

我用谷歌perftools 来分析它在千万双阵列。据报道如下

I use google-perftools to profile it on a 10000000 double array. It reports as follow

看来我不应该使用STL的名单,但我不知道为什么?这确实 std_List_node_base_M_hook 吗?我应该写list类自己?

It seems I should not use STL list,but I wonder why? Which does std_List_node_base_M_hook do? Should I write list class myself?

PS:实验和改进  我曾尝试只留下投入桶的codeS,这说明,大部分时间都是用在建立桶。 下面的改进是由: - 使用STL向量桶和储备合理的空间桶 - 使用两个辅助数组存储在建筑物桶所使用的信息,从而避免了使用链表,如在以下code

PS:The experiment and improvement I have tried just leave the codes of putting in buckets and this explained that most time is used on building up buckets. The following improvement is made: - Use STL vector as buckets and reserve reasonable space for buckets - Use two helper array to store the information used in building buckets,thus avoiding the use of linked list,as in following code

void bucketsort2(Array& A){
  size_t    numBuckets = ceil(A.size()/1000);
  Array B(A.size());
  IndexArray    head(numBuckets+1,0),offset(numBuckets,0);//extra end of head is used to avoid checking of i == A.size()-1

  for(size_t i=0;i!=A.size();i++){
    head[int(numBuckets*A[i])+1]++;//Note the +1
  }
  for(size_t i=2;i<numBuckets;i++){//head[1] is right already
    head[i] += head[i-1];
  }

  for(size_t i=0;i<A.size();i++){
    size_t  bucket_num         = int(numBuckets*A[i]);
    B[head[bucket_num]+offset[bucket_num]] = A[i];
    offset[bucket_num]++;
  }
  A.swap(B);

  //insertionsort(A);
  for(size_t i=0;i<numBuckets;i++)
    quicksort_range(A,head[i],head[i]+offset[i]);
}

结果在下面的图 其中,线开始使用清单列表,水桶,用向量作为水桶矢量开始,使用辅助arrays.By默认插入排序开始2是用在最后的,有的使用快速排序的桶大小大。 请注意清单和列表中,只有放,载体,储备8和载体,储备2几乎重叠。 我会尽量体积小并保留足够的存储空间。

The result in the following graph where line start with list using list as buckets,start with vector using vector as buckets,start 2 using helper arrays.By default insertion sort is used at last and some use quick sort as the bucket size is big. Note "list" and "list,only put in" ,"vector,reserve 8" and "vector,reserve 2" nearly overlap. I will try small size with enough memory reserved.

推荐答案

通过

iarray<List> buckets(numBuckets);

你基本上是创建列表的很多事,可以花费你很多尤其是在内存访问,它在理论上线性的,但是,这不是在实践中的情况。

you are basically creating a LOT of lists and that can cost you a lot especially in memory access which it theoretically linear but that's not the case in practice.

尽量减少桶的数量。

要验证我的说法,分析你的code的速度只有创建列表。

To verify my assertion analyse your code speed with only the creation of the lists.

另外要遍历你不应该使用 .size(),而是

Also to iterate over the elements of the lists you should not use .size() but rather

//get back from buckets
for(size_t i=0,head=0;i!=numBuckets;i++)
  while(!buckets[i].empty())
  {
    A[head++] = buckets[i].front();
    buckets[i].pop_front();
  }

在一些实施 .size()可以在O(N)。不可能的,但...

In some implementations .size() can be in O(n). Unlikely but...

经过一番研究,我发现, 这个页面解释什么是$ C $下的std :: _ List_node_base ::挂钩。

After some research I found this page explaining what is the code for std::_List_node_base::hook.

好像它是仅在一个列表中的给定的位置插入的元件。不应该花大钱..

Seems it is only to insert an element at a given place in a list. Shouldn't cost a lot..

 
精彩推荐
图片推荐