Binary search with a hint

2023-09-11 03:16:43 Author: 国产祖宗

I have a simple std::vector containing some numbers, sorted in ascending order. I want to look up an element; so far I use:

return std::lower_bound(vec.begin(), vec.end(), needle);

where needle is the element I am looking for. My vector tends to be quite long (millions of elements), but most of the time its contents are fairly predictable, in the sense that if the first element is zero and the last element is N, then the elements in between have values close to (N * index) / vec.size().

Is there a variant of lower_bound that accepts a hint (similar to how std::map::emplace_hint() does), such as:

assert(!vec.empty());
std::vector<int>::iterator hint = vec.begin() + std::min(vec.size() - 1,
    (needle * vec.size()) / vec.back());
if(*hint > needle)
    return std::lower_bound(vec.begin(), hint, needle);
else
    return std::lower_bound(hint, vec.end(), needle);

This will work, but lower_bound ignores the fact that the hint is close to the solution. It will most likely start by splitting the interval in half (looking where we know the needle most likely isn't), taking unnecessarily many steps. I know there is an algorithm that starts with a step of 1, doubles it until it overshoots the needle, and then does a binary search in the resulting interval.

I forgot the name of the algorithm. Is it implemented in the STL?
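
For reference, the doubling scheme described in the question is usually called exponential (or galloping) search. Below is a minimal sketch of a hinted lower bound built on it. The name `hinted_lower_bound` and its signature are made up for illustration and are not part of the standard library; it assumes the vector is sorted in ascending order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a standard function): behaves like
// std::lower_bound on a sorted vector, but starts at a guessed index
// and gallops away from it in doubling steps before bisecting.
template <typename T>
typename std::vector<T>::const_iterator
hinted_lower_bound(const std::vector<T>& vec, std::size_t hint, const T& needle)
{
    const std::size_t n = vec.size();
    if (n == 0) return vec.end();
    if (hint >= n) hint = n - 1;  // clamp an out-of-range hint

    // Invariant: elements before lo are < needle,
    //            elements at or after hi are >= needle.
    std::size_t lo = 0, hi = n, step = 1;
    if (vec[hint] < needle) {
        // The answer lies to the right of the hint: gallop right.
        lo = hint + 1;
        while (step < n - hint && vec[hint + step] < needle) {
            lo = hint + step + 1;
            step *= 2;
        }
        hi = (step < n - hint) ? hint + step : n;
    } else {
        // The answer lies at or to the left of the hint: gallop left.
        hi = hint;
        while (step <= hint && !(vec[hint - step] < needle)) {
            hi = hint - step;
            step *= 2;
        }
        lo = (step <= hint) ? hint - step + 1 : 0;
    }
    assert(lo <= hi);

    // Ordinary binary search inside the bracket found by galloping.
    return std::lower_bound(vec.begin() + static_cast<std::ptrdiff_t>(lo),
                            vec.begin() + static_cast<std::ptrdiff_t>(hi),
                            needle);
}
```

If the hint is off by d positions, the galloping phase takes about log2(d) probes and the final bisection another log2(d), so a good hint costs far fewer comparisons than bisecting the whole vector. The hint itself could be computed as in the question, e.g. `(needle * vec.size()) / vec.back()`, provided `vec.back()` is non-zero and `needle` is non-negative.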

Recommended answer

I think the algorithm you're looking for is called interpolation search, which is a variation on binary search that, instead of looking at the midpoint of the array, linearly interpolates between the array endpoints to guess where the key should be. On data that's structured the way that yours is, the expected runtime is O(log log n), exponentially faster than a standard binary search.

There is no standard implementation of this algorithm in C++, but (as a totally shameless plug) I happened to have coded this one up in C++. My implementation is available online if you're interested in seeing how it works.
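
To give a rough idea of the technique, here is a minimal sketch of interpolation search written as a lower bound. This is not the implementation referred to above. The name `interpolation_lower_bound` is hypothetical, and the sketch assumes a sorted `std::vector<int>` small enough that the 64-bit intermediate product cannot overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a standard function): returns an iterator to
// the first element >= needle, like std::lower_bound, but picks each
// probe by linear interpolation between the current endpoints.
std::vector<int>::const_iterator
interpolation_lower_bound(const std::vector<int>& vec, int needle)
{
    // Invariant: elements before lo are < needle,
    //            elements at or after hi are >= needle.
    std::size_t lo = 0, hi = vec.size();
    while (lo < hi) {
        const std::int64_t first = vec[lo];
        const std::int64_t last = vec[hi - 1];
        if (needle <= first) break;            // vec[lo] is the answer
        if (needle > last) { lo = hi; break; } // whole range is too small

        // Here first < needle <= last, so last - first > 0.
        // Assumption: (needle - first) * span fits in 64 bits.
        const std::int64_t span = static_cast<std::int64_t>(hi - 1 - lo);
        const std::size_t mid = lo + static_cast<std::size_t>(
            (needle - first) * span / (last - first));
        assert(mid >= lo && mid < hi);

        if (vec[mid] < needle)
            lo = mid + 1;  // answer is strictly to the right of mid
        else
            hi = mid;      // vec[mid] >= needle, answer is at or before mid
    }
    return vec.begin() + static_cast<std::ptrdiff_t>(lo);
}
```

Each iteration shrinks the range by at least one element, so the loop always terminates. On badly skewed data it can degrade towards a linear number of probes, which is why the O(log log n) figure only holds in expectation for roughly evenly spread values.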

Hope this helps!