2D峰值寻找算法在O(n)的最坏情况下的时间?峰值、算法、最坏、情况下

2023-09-12 21:19:55 作者:遗失的流年。

我在做这个的课程,从麻省理工学院的算法。在第一个讲座教授presents以下问题: -

2D阵列中的峰是一个值,使得所有这4邻居都小于或等于它,即。为

A [1] [J] 是本地最大的,

  A [1 + 1] [J] LT = A [1] [J]。
&功放;&安培;一个[I-1] [j]的&其中; = A [1] [j]的
&功放;&安培; A [1] [J + 1]; = A [1] [j]的
&功放;&安培;一个[I + 1] [J-1]; = A [1] [j]的
 

现在给出的N×N二维数组,找到一个峰值到数组中。

这个问题可以在迎刃而解O(N ^ 2)时间遍历所有元素,并返回一个高峰。

//课程:

不过它可以通过使用分而治之的解决方案作为解释的这里。

但他们都表示,存在一个 O(N)时间算法,解决了这个问题。请建议我们如何能够解决 O(N)这个问题的时间。

PS(对于那些谁知道蟒蛇)课程的工作人员解释的方法的这里(问题1-5。峰发现证明),并在他们的习题集也提供了一些蟒蛇code。但解释的做法是完全不明显,很难破译。蟒蛇code也同样混乱。所以,我抄了下面的那些谁知道蟒蛇,并能告诉什么算法正在从code使用的code中的主要部分。

 高清algorithm4(问题,bestSeen =无,rowSplit = TRUE,跟踪=无):
    #如果是空的,我们就大功告成了
    如果problem.numRow&所述; = 0或problem.numCol&所述; = 0:
        返回None

    子问题= []
    分= []

    如果row​​Split:
        #递归子问题将涉及的行数的一半
        中期= problem.numRow // 2

        关于两个子#信息
        (subStartR1,subNumR1)=(0,MID)
        (subStartR2,subNumR2)=(中间+ 1,problem.numRow  - (中间+ 1))
        (subStartC,subNumC)=(0,problem.numCol)

        subproblems.append((subStartR1,subStartC,subNumR1,subNumC))
        subproblems.append((subStartR2,subStartC,subNumR2,subNumC))

        #得到的分列中的所有位置的列表
        分=双重交叉​​([MID],范围(problem.numCol))
    其他:
        #递归子问题将涉及的列数的一半
        中期= problem.numCol // 2

        关于两个子#信息
        (subStartR,subNumR)=(0,problem.numRow)
        (subStartC1,subNumC1)=(0,MID)
        (subStartC2,subNumC2)=(中间+ 1,problem.numCol  - (中间+ 1))

        subproblems.append((subStartR,subStartC1,subNumR,subNumC1))
        subproblems.append((subStartR,subStartC2,subNumR,subNumC2))

        #得到的分列中的所有位置的列表
        分=双重交叉​​(范围(problem.numRow),[MID])

    #找出最大的分割行或列中
    bestLoc = problem.getMaximum(分频器,跟踪)
    邻居= problem.getBetterNeighbor(bestLoc,跟踪)

    #更新,我们已经看到,到目前为止在此基础上新的最大最好的
    如果bestSeen是无或problem.get(邻居)> problem.get(bestSeen):
        bestSeen =邻居
        如果不跟踪是无:trace.setBestSeen(bestSeen)

    当我们知道我们#回报已经找到了高峰
    如果邻居== bestLoc和problem.get(bestLoc)> = problem.get(bestSeen):
        如果不跟踪是无:trace.foundPeak(bestLoc)
        返回bestLoc

    #找出哪些子问题包含了我们所见过的最大数量
    #远远的,递归,对行和分裂分裂之间交替
    #在列
    分= problem.getSubproblemContaining(子问题,bestSeen)
    newBest = sub.getLocationInSelf(问题,bestSeen)
    如果不跟踪是无:trace.setProblemDimensions(子)
    结果= algorithm4(分,newBest,不rowSplit,跟踪)
    返回problem.getLocationInSelf(子,结果)

#Helper方法
高清双重交叉(List1中,list2中):
    
    返回所有对与来自第一列表中的一个项目,并从一个项
    第二个列表。 (两表中笛卡尔乘积。)

    在code是等效于下面的列表COM prehension:
        返回[(A,B)为List1中的b在list2中]
    但更容易阅读和分析,我们包括更明确的code。
    

    答案= []
    为List1中:
        对B在list2中:
            answer.append((A,B))
    返回答案
 

解决方案 让我们假设数组的宽度大于高度,否则我们将分裂的另一个方向。 拆分阵列分为三个部分:中央纵队,左侧和右侧 穿过中心柱和两个相邻列,寻找最大。 如果它在中部列 - 这是我们的高峰期 如果它在左边,执行这个算法的子阵 left_side + central_column 如果它在右边,运行该算法的子阵 right_side + central_column

为什么这个作品:

排序模板 含C 模板代码

有关的情况下的最大因素是在中央柱 - 明显。如果不是的话,我们可以从最大步增加元素绝对不会越过中央行,所以高峰肯定会存在相应的一半。

为什么这是O(n):

第3步需要小于或等于 max_dimension 迭代和 max_dimension 在对每两个算法步骤至少半。这给 N + N / 2 + N / 4 + ... O(N)。重要的细节:我们分裂的最大方向。对于方形阵列,这意味着拆分的方向将交替出现。这是从你链接到PDF中最后一次尝试的差

的说明:我不知道它究竟是在算法中你给了code匹配,这可能会或可能不会是一个不同的方法

I was doing this course on algorithms from MIT. In the very first lecture the professor presents the following problem:-

A peak in a 2D array is a value such that all it's 4 neighbours are less than or equal to it, ie. for

a[i][j] to be a local maximum,

a[i+1][j] <= a[i][j] 
&& a[i-1][j] <= a[i][j]
&& a[i][j+1] <= a[i][j]
&& a[i+1][j-1] <= a[i][j]

Now given an NxN 2D array, find a peak in the array.

This question can be easily solved in O(N^2) time by iterating over all the elements and returning a peak.

However it can be optimized to be solved in O(NlogN) time by using a divide and conquer solution as explained here.

But they have said that there exists an O(N) time algorithm that solves this problem. Please suggest how can we solve this problem in O(N) time.

PS(For those who know python) The course staff has explained an approach here (Problem 1-5. Peak-Finding Proof) and also provided some python code in their problem sets. But the approach explained is totally non-obvious and very hard to decipher. The python code is equally confusing. So I have copied the main part of the code below for those who know python and can tell what algorithm is being used from the code.

def algorithm4(problem, bestSeen = None, rowSplit = True, trace = None):
    # if it's empty, we're done 
    if problem.numRow <= 0 or problem.numCol <= 0:
        return None

    subproblems = []
    divider = []

    if rowSplit:
        # the recursive subproblem will involve half the number of rows
        mid = problem.numRow // 2

        # information about the two subproblems
        (subStartR1, subNumR1) = (0, mid)
        (subStartR2, subNumR2) = (mid + 1, problem.numRow - (mid + 1))
        (subStartC, subNumC) = (0, problem.numCol)

        subproblems.append((subStartR1, subStartC, subNumR1, subNumC))
        subproblems.append((subStartR2, subStartC, subNumR2, subNumC))

        # get a list of all locations in the dividing column
        divider = crossProduct([mid], range(problem.numCol))
    else:
        # the recursive subproblem will involve half the number of columns
        mid = problem.numCol // 2

        # information about the two subproblems
        (subStartR, subNumR) = (0, problem.numRow)
        (subStartC1, subNumC1) = (0, mid)
        (subStartC2, subNumC2) = (mid + 1, problem.numCol - (mid + 1))

        subproblems.append((subStartR, subStartC1, subNumR, subNumC1))
        subproblems.append((subStartR, subStartC2, subNumR, subNumC2))

        # get a list of all locations in the dividing column
        divider = crossProduct(range(problem.numRow), [mid])

    # find the maximum in the dividing row or column
    bestLoc = problem.getMaximum(divider, trace)
    neighbor = problem.getBetterNeighbor(bestLoc, trace)

    # update the best we've seen so far based on this new maximum
    if bestSeen is None or problem.get(neighbor) > problem.get(bestSeen):
        bestSeen = neighbor
        if not trace is None: trace.setBestSeen(bestSeen)

    # return when we know we've found a peak
    if neighbor == bestLoc and problem.get(bestLoc) >= problem.get(bestSeen):
        if not trace is None: trace.foundPeak(bestLoc)
        return bestLoc

    # figure out which subproblem contains the largest number we've seen so
    # far, and recurse, alternating between splitting on rows and splitting
    # on columns
    sub = problem.getSubproblemContaining(subproblems, bestSeen)
    newBest = sub.getLocationInSelf(problem, bestSeen)
    if not trace is None: trace.setProblemDimensions(sub)
    result = algorithm4(sub, newBest, not rowSplit, trace)
    return problem.getLocationInSelf(sub, result)

#Helper Method
def crossProduct(list1, list2):
    """
    Returns all pairs with one item from the first list and one item from 
    the second list.  (Cartesian product of the two lists.)

    The code is equivalent to the following list comprehension:
        return [(a, b) for a in list1 for b in list2]
    but for easier reading and analysis, we have included more explicit code.
    """

    answer = []
    for a in list1:
        for b in list2:
            answer.append ((a, b))
    return answer

解决方案

Let's assume that width of the array is bigger than height, otherwise we will split in another direction. Split the array into three parts: central column, left side and right side. Go through the central column and two neighbour columns and look for maximum. If it's in the central column - this is our peak If it's in the left side, run this algorithm on subarray left_side + central_column If it's in the right side, run this algorithm on subarray right_side + central_column

Why this works:

For cases where the maximum element is in the central column - obvious. If it's not, we can step from that maximum to increasing elements and will definitely not cross the central row, so a peak will definitely exist in the corresponding half.

Why this is O(n):

step #3 takes less than or equal to max_dimension iterations and max_dimension at least halves on every two algorithm steps. This gives n+n/2+n/4+... which is O(n). Important detail: we split by the maximum direction. For square arrays this means that split directions will be alternating. This is a difference from the last attempt in the PDF you linked to.

A note: I'm not sure if it exactly matches the algorithm in the code you gave, it may or may not be a different approach.