算法来找到最大总和在重叠的时间间隔的序列总和、间隔、序列、算法

2023-09-10 22:52:00 作者:漫步あ雨中

我试图解决的问题有间隔数轴上,每一个pre定义得分列表。我需要返回最大可能的总成绩。

The problem I am trying to solve has a list of intervals on the number line, each with a pre-defined score. I need to return the maximum possible total score.

美中不足的是,该区间重叠,重叠的区间,我可以只用一个。下面是一个例子。

The catch is that the intervals overlap, and of the overlapping intervals I can use only one. Here is an example.

Intervals   - Score  
   0- 5     -  15  
   4- 9     -  18  
  10-15     -  12  
   8-21     -  19  
  25-30     -  25    

在这里,间隔0-5,4-9和8-21重叠。 该区间10-15和8-21也有重叠。 最大的一笔是55(18 + 12 + 25)。

Here, the intervals 0-5, 4-9 and 8-21 overlap. The intervals 10-15 and 8-21 also overlap. The maximum sum would be 55 (18+12+25).

,这里要注意,我们选择了第一批重叠,即使它不具有最高分数的三间隔的间隔4-9是很重要的。

It is important to note here that we select the interval 4-9 of the first batch of overlapping intervals even though it does not have the highest score of the three.

这是因为在选择的时间间隔8-21将prevent我们从使用间隔:10-15以后,从而降低整体总和(在此情况下,整体的总和将是19 + 25 = 44)。

This is because selecting the interval 8-21 would prevent us from using the interval 10-15 later on, thereby reducing the overall sum (In that case, the overall sum would be 19+25=44).

我要寻找一个O(nlogn)或者O(n)的解决这个问题。我认为,动态规划可以用,但我可能是错的。可能有人提出一个解决方案/算法(S),可以做的伎俩吗?

I am looking for an O(nlogn) or O(n) solution to this problem. I think dynamic programming can be used, but I may be wrong. Could someone suggest a solution/algorithm(s) that could do the trick here?

编辑:间隔是没有特定的顺序。

The intervals are in no particular order.

推荐答案

这是间隔调度加权变化;它是可解的 O(N日志N)与动态规划。

This is a weighted variation of interval scheduling; it's solvable in O(N log N) with dynamic programming.

让的间隔 G(启动,停止,得分),让他们可以通过排序停止。为简单起见,我们假设现在所有的停止是独一无二的。

Let an interval be g(start, stop, score), and let them be sorted by stop. For simplicity, let's assume for now that all stop is unique.

最好的[I] 是最好的成绩时,我们允许使用,我们可以得到政[1],...,政[I] 。我们不必使用,当然所有这些,以及通常我们不能因为间隔我们使用了子集必须是非重叠的。

Let best[i] be the best score we can get when we're allowed to use g[1], ..., g[i]. We don't have to use them all, of course, and generally we can't because the subset of intervals we use must be non-overlapping.

很明显最好的[0] = 0 。也就是说,因为我们不能使用任何的间隔,最好的成绩,我们可以得到的是0。 对于任何 1&LT; = K&LT; = N ,我们有: 最好的[K] = MAX(最好[K-1],最好的研究[J] + G [K] .score),其中 Ĵ是最大的指数,这样政[J] .stop&LT;政[K]。开始Ĵ可能为零) Clearly best[0] = 0. That is, since we can't use any interval, the best score we can get is 0. For any 1 <= k <= N, we have: best[k] = max( best[k-1], best[j] + g[k].score ), where j is the largest index such that g[j].stop < g[k].start (j may be zero)

这是,因为我们允许使用政[1],...政[K] ,我们能做的最好的是更好的得分王这两个选项:

That is, given that we're allowed to use g[1], ... g[k], the best we can do is the better scoring of these two options:

我们不包括政[K] 。因此,此选项的得分最好[K-1] 。 ...因为这是我们可以用政[1],...政[K-1] 请最好的 We do not include g[k]. Thus, the score of this option is best[k-1]. ... because that's the best we can do with g[1], ... g[k-1]

(注意动态规划体现在上式的最优子和重叠的子组件)。

(Note the optimal substructure and overlapping subproblems components of dynamic programming embodied in the above equation).

总体问题的答案是最好[N] ,即最好成绩时,我们可以使用所有的基因,我们可以得到的。哎呀,我说的基因?我的意思是时间间隔。

The overall answer to the question is best[N], i.e. the best score we can get when we're allowed to use all the genes. Oops, did I say genes? I mean intervals.

这是 O(N日志N),因为:

排序的所有时段采用 O(N日志N) 查找Ĵ每个 K O(日志N)使用二进制搜索 Sorting all the intervals takes O(N log N) Finding j for each k is O(log N) using binary search

如果几个基因可以有相同的停止的值,然后什么都没有改变:你还是要寻找最右侧的Ĵ。在如Python中这是很容易与 bisect_right 。在Java中,其中标准库中的二进制搜索不保证其索引的情况下返回的关系,你可以(在许多选项)与线性搜索(对 O(N)最坏情况的性能),或另一系列二进制搜​​索以发现最右索引

If several genes can have the same stop values, then nothing changed: you still have to search for the rightmost j. In e.g. Python this is easy with bisect_right. In Java where the standard library binary search doesn't guarantee which index is returned in case of ties, you can (among many options) follow it with a linear search (for O(N) worst-case performance), or another series of binary searches to find the right most index.

抱歉没我再说一遍基因呢?我的意思是时间间隔。

Oops did I say genes again? I mean intervals.

Extension二进制搜索,找到键值 的第一个和最后一个索引 Extension of binary search to find the first and last index of the key value
 
精彩推荐