Given a 2D array sorted in increasing order from left to right and top to bottom, what is the best way to search for a target number?

2023-09-10 22:31:22 Author: 你说的后来没来


I was recently given this interview question and I'm curious what a good solution to it would be.

Say I'm given a 2d array where all the numbers in the array are in increasing order from left to right and top to bottom.

What is the best way to search and determine if a target number is in the array?

Now, my first inclination is to utilize a binary search since my data is sorted. I can determine if a number is in a single row in O(log N) time. However, it is the 2 directions that throw me off.

Another solution I thought may work is to start somewhere in the middle. If the middle value is less than my target, then I can be sure it is in the left square portion of the matrix from the middle. I then move diagonally and check again, reducing the size of the square that the target could potentially be in until I have honed in on the target number.

Does anyone have any good ideas on solving this problem?

Example array:

Sorted left to right, top to bottom.

1 2 4 5 6  
2 3 5 7 8  
4 6 8 9 10  
5 8 9 10 11  

Solution

Here's a simple approach:

1. Start at the bottom-left corner.
2. If the target is less than that value, it must be above us, so move up one.
3. Otherwise we know that the target can't be in that column, so move right one.
4. Goto 2.
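
As an illustration, here is a minimal Python sketch of that walk, using the example array from the question (the function name and the boolean return value are my own choices; the answer itself only gives the steps above):

```python
def saddleback_search(matrix, target):
    """Walk from the bottom-left corner, discarding one row or one
    column per step. Assumes every row (left to right) and every
    column (top to bottom) is sorted in increasing order."""
    if not matrix or not matrix[0]:
        return False
    row, col = len(matrix) - 1, 0          # bottom-left corner
    while row >= 0 and col < len(matrix[0]):
        value = matrix[row][col]
        if target == value:
            return True
        elif target < value:
            row -= 1                       # target must be above us
        else:
            col += 1                       # target can't be in this column
    return False

grid = [[1, 2, 4, 5, 6],
        [2, 3, 5, 7, 8],
        [4, 6, 8, 9, 10],
        [5, 8, 9, 10, 11]]
print(saddleback_search(grid, 7))   # True
print(saddleback_search(grid, 12))  # False
```

Each loop iteration eliminates either one row or one column, which is what gives the O(N+M) bound mentioned below.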

For an NxM array, this runs in O(N+M). I think it would be difficult to do better. :)

Edit: Lots of good discussion. I was talking about the general case above; clearly, if N or M are small, you could use a binary search approach to do this in something approaching logarithmic time.

Here are some details, for those who are curious:

History

This simple algorithm is called a Saddleback Search. It's been around for a while, and it is optimal when N == M. Some references:

David Gries, The Science of Programming. Springer-Verlag, 1989.
Edsger W. Dijkstra, The Saddleback Search. Note EWD-934, 1985.

However, when N < M, intuition suggests that binary search should be able to do better than O(N+M): For example, when N == 1, a pure binary search will run in logarithmic rather than linear time.

Worst-case bound

Richard Bird examined this intuition that binary search could improve the Saddleback algorithm in a 2006 paper:

Richard S. Bird, Improving Saddleback Search: A Lesson in Algorithm Design, in Mathematics of Program Construction, pp. 82--89, volume 4014, 2006.

Using a rather unusual conversational technique, Bird shows us that for N <= M, this problem has a lower bound of Ω(N * log(M/N)). This bound makes sense, as it gives us linear performance when N == M and logarithmic performance when N == 1.

Algorithms for rectangular arrays

One approach that uses a row-by-row binary search looks like this:

1. Start with a rectangular array where N < M. Let's say N is rows and M is columns.
2. Do a binary search on the middle row for value. If we find it, we're done.
3. Otherwise we've found an adjacent pair of numbers s and g, where s < value < g.
4. The rectangle of numbers above and to the left of s is less than value, so we can eliminate it.
5. The rectangle below and to the right of g is greater than value, so we can eliminate it.
6. Go to step (2) for each of the two remaining rectangles.
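
Here is a hedged Python sketch of those steps (the helper names are mine, and for simplicity it always splits on the middle row rather than picking the shorter dimension):

```python
from bisect import bisect_left

def rectangle_search(matrix, value):
    """Divide-and-conquer as described above: binary-search the middle
    row, discard the two rectangles that cannot contain the value, and
    recurse on the two that remain."""
    if not matrix or not matrix[0]:
        return False

    def search(top, bottom, left, right):
        if top > bottom or left > right:
            return False                   # empty rectangle
        mid = (top + bottom) // 2
        row = matrix[mid]
        # First column j in [left, right] with row[j] >= value.
        j = bisect_left(row, value, left, right + 1)
        if j <= right and row[j] == value:
            return True
        # row[left..j-1] < value < row[j..right], so discard the rectangle
        # above-and-left of column j-1 and the one below-and-right of column j.
        return (search(top, mid - 1, j, right) or       # upper-right remainder
                search(mid + 1, bottom, left, j - 1))   # lower-left remainder

    return search(0, len(matrix) - 1, 0, len(matrix[0]) - 1)
```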

In terms of worst-case complexity, this algorithm does log(M) work to eliminate half the possible solutions, and then recursively calls itself twice on two smaller problems. We do have to repeat a smaller version of that log(M) work for every row, but if the number of rows is small compared to the number of columns, then being able to eliminate all of those columns in logarithmic time starts to become worthwhile.

This gives the algorithm a complexity of T(N,M) = log(M) + 2 * T(N/2, M/2), which Bird shows to be O(N * log(M/N)).
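
As a rough, non-rigorous way to see where that bound comes from, the recurrence can be unrolled until the row count reaches one, assuming both dimensions roughly halve at each level and that a one-row subproblem is just a single binary search:

```latex
\begin{aligned}
T(N, M) &= \log M + 2\,T(N/2,\, M/2) \\
        &= \log M + 2\log(M/2) + 4\,T(N/4,\, M/4) \\
        &= \cdots \\
        &= \sum_{i=0}^{\log_2 N - 1} 2^i \log\frac{M}{2^i} \;+\; N \cdot T(1,\, M/N).
\end{aligned}
```

The sum comes to roughly N * log(M/N) plus a linear term, and the N single-row searches at the leaves contribute another N * log(M/N), which matches the stated O(N * log(M/N)) bound.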

Another approach posted by Craig Gidney describes an algorithm similar to the approach above: it examines a row at a time using a step size of M/N. His analysis shows that this results in O(N * log(M/N)) performance as well.
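
The description above leaves the details open, so the following Python sketch is only one speculative way to realize a per-row step size of M/N, not Gidney's published algorithm: walk down the rows, jump left in blocks of about M/N columns, and binary-search only the final block.

```python
from bisect import bisect_left

def stepped_row_search(matrix, target):
    """Speculative sketch: saddleback-style walk that jumps left in
    blocks of ~M/N columns, then binary-searches the last block.
    Assumes rows and columns are sorted in increasing order."""
    if not matrix or not matrix[0]:
        return False
    n, m = len(matrix), len(matrix[0])
    step = max(1, m // n)                  # block width, ~M/N
    c = m - 1                              # target can only be in columns <= c
    for row in matrix:
        # Jump left while the whole leading block is already too large.
        while c >= 0 and row[max(0, c - step + 1)] > target:
            c -= step
        if c < 0:
            return False                   # everything remaining is > target
        lo = max(0, c - step + 1)
        j = bisect_left(row, target, lo, c + 1)   # search the surviving block
        if j <= c and row[j] == target:
            return True
        c = j - 1                          # columns >= j stay > target in lower rows
    return False
```

Because the column pointer only ever moves left, there are at most about N block jumps in total, plus one O(log(M/N)) binary search per row, which lines up with the O(N * log(M/N)) figure.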

Performance Comparison

Big-O analysis is all well and good, but how well do these approaches work in practice? The chart below examines four algorithms for increasingly "square" arrays:

(The "naive" algorithm simply searches every element of the array. The "recursive" algorithm is described above. The "hybrid" algorithm is an implementation of Gidney's algorithm. For each array size, performance was measured by timing each algorithm over fixed set of 1,000,000 randomly-generated arrays.)

Some notable points:

As expected, the "binary search" algorithms offer the best performance on rectangular arrays and the Saddleback algorithm works the best on square arrays. The Saddleback algorithm performs worse than the "naive" algorithm for 1-d arrays, presumably because it does multiple comparisons on each item. The performance hit that the "binary search" algorithms take on square arrays is presumably due to the overhead of running repeated binary searches.

Summary

Clever use of binary search can provide O(N * log(M/N)) performance for both rectangular and square arrays. The O(N + M) "saddleback" algorithm is much simpler, but suffers from performance degradation as arrays become increasingly rectangular.