子为O(n ^ 2)算法的计算区间套?区间、算法

2023-09-10 23:39:57 作者:没心没肺

我们有如下形式的间隔列表 [A 我,B 我] 。对于每一个区间,我们要计算嵌套在它内部的其他间隔的数量。

We have a list of intervals of the form [ai, bi]. For each interval, we want to count the number of other intervals that are nested within it.

例如,如果我们有两个区间, A = [1,4] B = [2,3] 。那么对于 B计数 0 ,因为对 B没有嵌套间隔;和计数 A 1 B 配合在 A

For example, if we had two intervals, A = [1,4] and B = [2,3]. Then the count for B would be 0 as there are no nested intervals for B; and the count for A would be 1 as B fits within A.

我的问题是,是否存在一个子 - 为O(n 2 )算法针对此问题在哪里 N 是间隔数?

My question is, does there exist a sub- O(n2) algorithm for this problem where n is the number of intervals?

编辑:在这里的条件是间隔满足。区间的结束点是浮点数。对于一个的下限我S / B 我的为0,上限是什么最大浮动的。此外,还有的条件,一个我< b 我,所以没有时间间隔长度为0。

Here are the conditions the intervals meet. The end points of the intervals are floating point numbers. The lower limit for the ai's/bi's is 0 and the upper limit is whatever max float is. Also, there is the condition that ai < bi, so no intervals of length 0.

推荐答案

是的,这是可能的。

我们将借用典型的计算几何扫描线的把戏。

We will borrow the typical computational geometry "scan line" trick.

首先,让我们回答一个简单的(但密切相关的)问题。相反,报告有多少时间间隔每个区间包含的,让我们每一个报告是有多少时间间隔的包含在的。因此,对于只有两个间隔的​​例子,间隔 I 0 = [1,4] 的值为零,因为它包含在零区间,而 I1 = [2,3] 有,因为它是包含在一个区间值之一。

First, let's answer an easier (but closely related) question. Instead of reporting how many other intervals each interval contains, let's report how many intervals each is contained in. So for your example with only two intervals, interval I0 = [1,4] has value zero because it is contained in zero intervals, while I1 = [2,3] has value one because it is contained in one interval.

您会在一分钟内看到(一)为什么这个问题是比较容易和(二)如何导致对原来的问题的答案。

You will see in a minute (a) why this question is easier and (b) how it leads to the answer for the original question.

要解决这个问题更容易:采取一切开始的和结束的点 - 所有的一个我和b 我 - 放他们进入的主列表。称此列表中的事件的每个元素。因此,事件会像区间I 37 启动或区间I 23 结束了。

To solve this easier question: Take all starting and ending points -- all of the ai and bi -- and put them into a master list. Call each element of this list an "event". So an event would be something like "interval I37 started" or "interval I23 ended".

排序事件列表,并且为了处理它。

Sort this list of events and process it in order.

正如你处理事件的清单,维持活动间隔的组S.的间隔是积极的,如果我们遇到了它的启动事件,但不是它的结束事件;也就是,如果我们说的时间间隔内

As you process the list of events, maintain a set S of "active intervals". An interval is "active" if we have encountered its start event but not its ending event; that is, if we are within that interval.

现在,每当我们看到一个结束事件B Ĵ,我们已经准备好来计算多少间隔包含I Ĵ(= [A Ĵ,B Ĵ])。我们需要做的是考察活动的时间间隔的集合S,并确定其中有多少的Ĵ之前启动。这就是我们的答案多少时间间隔包含区间I Ĵ。

Now, whenever we see an ending event bj, we are ready to compute how many intervals contain Ij (= [aj, bj]). All we need to do is examine the set S of active intervals and determine how many of them started before aj. That is our answer for how many intervals contain interval Ij.

要有效地做到这一点,让小号本身的排序条件为出发点;例如,通过使用自平衡二叉树

To do this efficiently, keep S itself sorted by starting point; e.g., by using a self-balancing binary tree.

排序的事件列表是O(2N日志2N)=为O(n log n)的。添加或从自平衡二叉树移除元素是O(log n)的。问自平衡二叉树有多少个元素小于X?也是O(log n)的。因此,这整个算法是O(n log n)的。

Sorting the list of events is O(2n log 2n) = O(n log n). Adding or removing an element from a self-balancing binary tree is O(log n). Asking "how many elements of the self-balancing binary tree are less than x?" is also O(log n). Therefore this entire algorithm is O(n log n).

所以,这解决简单的问题。称之为简单的算法。现在,为你居然问。

So, that solves the easy question. Call that the "easy algorithm". Now for what you actually asked.

想想数线延伸至无穷大,环绕着为负无穷,以及定义的间隔用b 我&LT;一个我开始在我,延伸到无穷大,换到负无穷大,并最终在B处我。

Think of the number line as extending to infinity and wrapping around to -infinity, and define an interval with bi < ai to start at ai, stretch to infinity, wrap to minus infinity, and end at bi.

有关的任​​何区间I Ĵ = [A Ĵ,B Ĵ],定义补(I Ĵ )为区间[B Ĵ,一个Ĵ。 (例如,在区间[2,3]开始于并结束于3;如此补体([2,3])= [3,2]开始于3,延伸到无穷大,换行到-infinity,并结束于2。)

For any interval Ij = [aj, bj], define Complement(Ij) as the interval [bj, aj]. (For example, the interval [2, 3] starts at 2 and ends at 3; so Complement([2,3]) = [3,2] starts at 3, stretches to infinity, wraps to -infinity, and ends at 2.)

观察到间隔I包含间隔J当且仅当补体(J)包含补(Ⅰ)。 (证明这一点。)

Observe that interval I contains interval J if and only if Complement(J) contains Complement(I). (Prove this.)

所以,我们可以简单地通过运行易算法所设定的所有时间间隔的互补的回答你原来的问题。也就是说,启动扫描,-infinity用含活动间隔的集合S的所有的时间间隔(因为所有的补充含有无穷/ -infinity)。守小号排序结束点(即起点补)。

So, we can answer your original question simply by running the "easy algorithm" on the set of complements of all of the intervals. That is, start your scan at -infinity with the set S of "active intervals" containing all intervals (because all complements contain infinity/-infinity). Keep S sorted by end point (i.e. start point of complement).

排序都开始点和结束点并处理它们的顺序。当你遇到一个出发点区间I Ĵ(= [A Ĵ,B Ĵ]),实际上是打的终点它的补...所以我删除Ĵ从S,查询S,看看有多少它的终点(即补充开始点)来b Ĵ之前,并报告为因为我的答案Ĵ。如果以后遇到终点I Ĵ,你遇到它的补的起点,所以你需要将其添加回活跃区间的集合S。

Sort all start points and end points and process them in order. When you encounter a starting point for interval Ij (= [aj, bj]), you are actually hitting the end point of its complement... So remove Ij from S, query S to see how many of its endpoints (i.e. complement start points) come before bj, and report that as the answer for Ij. If you later encounter the end point of Ij, you are encountering the start point of its complement, so you need to add it back into the set S of active intervals.

这最后的算法是O(n log n)的出于同样的原因易算法了。

This final algorithm is O(n log n) for the same reasons the "easy algorithm" was.

[更新]

澄清一点,一个校正,一个评论...

One clarification, one correction, one comment...

澄清:当然,自平衡二叉树必须增加,使得每个子树知道有多少元素它包含。否则,你不能回答有多少个元素都小于X?此增加是简单的维护,但它是不是每一个实现提供;例如在C ++ 的std ::设为不,我的知识。

Clarification: Of course, the "self-balancing binary tree" has to be augmented such that each sub-tree knows how many elements it contains. Otherwise, you cannot answer "how many elements are less than x?" This augmentation is straightforward to maintain, but it is not something that every implementation provides; e.g. the C++ std::set does not, to my knowledge.

更正:你的没有的要添加的任何元素重新以积极的区间的集合S;其实,这样做可能会导致错误的答案。例如,如果间隔是刚刚[1,2]和[3,4],则命中1(和从该组中删除[1,2]),然后用2(和添加再次插入),然后3 ......而且,由于2'; 4,你会得出这样的结论[3,4]包含[1,2]。这是不对的。

Correction: You do not want to add any elements back in to the set S of active intervals; in fact, doing so can result in the wrong answer. For example, if the intervals are just [1,2] and [3,4], you would hit 1 (and remove [1,2] from the set), then 2 (and add it back in again), then 3... And since 2<4, you would conclude that [3,4] contains [1,2]. Which is wrong.

从概念上讲,你已经处理了所有的启动事件为补一次;这就是为什么小号开始在其内部将所有的时间间隔。因此,所有你需要担心的是结束点;你的没有的要添加至S的任何元素,直到永远。

Conceptually, you already processed all of the "start events" for the complement intervals; that is why S begins will all intervals inside of it. So all you need to worry about are the ending points; you do not want to add any elements to S, ever.

换句话说,而不是具有间隔环绕,你能想到的[B 我,一个我](其中b 我>一我)的意思[B 我 - 无穷远,一个我]不带环绕。逻辑仍然有效,但加工较为明显:首先,你处理所有的无所谓 - 无穷大条款(即终点),然后你处理别人(即起点)

Put another way, instead of having the intervals wrap around, you can think of [bi,ai] (where bi > ai) as meaning [bi - infinity, ai] with no wrap-around. The logic still works, but the processing is more clear: First you process all of the "whatever - infinity" terms (i.e. the end points), then you process the others (i.e. the start points).

通过这种修正,我是pretty的肯定,我的解决方案的实际工作。这一提法也延伸 - 我认为 - 您具有正常和落后的区间集中在一个输入的情况下

With this correction, I am pretty sure my solution actually works. This formulation also extends -- I think -- to the case where you have both normal and "backward" intervals together in one input.

注释:这个问题是棘手的,因为如果你要的枚举的设定包含每个时间间隔内的所有间隔,输出本身可以为O(n ^ 2)。因此,任何的工作方法有以某种方式计算的间隔,甚至没有能够识别它们: - )

Comment: This problem is tricky because if you have to enumerate the set of all intervals contained within every interval, the output itself can be O(n^2). So any working approach has to somehow count the intervals without even being able to identify them :-).