5排序数组的中位数中位数、数组

2023-09-10 22:59:35 作者:╰︶梦落ペ

我试图找到5阵列排序中位数的解决方案。这是一个面试问题。

I am trying to find the solution for median of 5 sorted arrays. This was an interview questions.

我能想到的解决方案是合并5阵列,然后找到中位数[O(L + M + N + O + P)。

The solution I could think of was merge the 5 arrays and then find the median [O(l+m+n+o+p)].

我知道,2排序相同大小的数组,我们可以在日志(2N)做到这一点。 [通过比较两个阵列的中值,然后抛出1半每个阵列的并重复该过程。 ..查找位数可以在有序阵列固定的时间..所以我认为这是不记录(N)? ..什么是时间复杂度为这个?

I know that for 2 sorted arrays of same size we can do it in log(2n). [by comparing the median of both arrays and then throwing out 1 half of each array and repeating the process]. .. Finding median can be constant time in sorted arrays .. so I think this is not log(n) ? .. what is the time complexity for this ?

1]有5阵列类似的解决方案。如果数组是相同的大小,有没有更好的解决办法呢?

1] Is there a similar solution for 5 arrays . What if the arrays are of same size , is there a better solution then ?

2]我想因为这是要求5,会有一些解决方案,对于N排序数组?

2] I assume since this was asked for 5, there would be some solution for N sorted arrays ?

感谢您的指针。

一些澄清/问题,我问回给面试官: 是相同的长度阵列 =>没有 我想会有的数组的值重叠 =>是

Some clarification/questions I asked back to the interviewer: Are the arrays of same length => No I guess there would be an overlap in the values of arrays => Yes

作为练习,我觉得2阵列逻辑犯规延伸。这是一个尝试: 应用2阵列的上述逻辑,说3个数组: [3,7,9] [4,8,15] [2,3,9] ...位数7,8,3 扔元素[3,7,9] [4,8] [3,9] ..位数7,6,6 扔元素[3,7] [8] [9] ..medians 5,8,9 ... 扔元素[7] [8] [9] ..位数= 8 ...这似乎不正确?

As an exercise, I think the logic for 2 arrays doesnt extend . Here is a try: Applying the above logic of 2 arrays to say 3 arrays: [3,7,9] [4,8,15] [2,3,9] ... medians 7,8,3 throw elements [3,7,9] [4,8] [3,9] .. medians 7,6,6 throw elements [3,7] [8] [9] ..medians 5,8,9 ... throw elements [7] [8] [9] .. median = 8 ... This doesnt seem to be correct ?

排序元素的合并=> [2,3,4,7,8,9,15] =>预期中值= 7

The merge of sorted elements => [2,3,4,7,8,9,15] => expected median = 7

推荐答案

(这是你的想法,两个数组的推广。)

(This is a generalization of your idea for two arrays.)

如果您先来看看五个阵列的五个位数,显然中位必须在最小和最大的五个中位数。

If you start by looking at the five medians of the five arrays, obviously the overall median must be between the smallest and the largest of the five medians.

证明是这样的:如果一个是中位数的分,和b为中位数的最大值,那么每个阵列具有小于一半的元件小于且小于其一半元素大于b,的。结果如下:

Proof goes something like this: If a is the min of the medians, and b is the max of the medians, then each array has less than half of its elements less than a and less than half of its elements greater than b. Result follows.

所以,在阵列中包含,扔掉数不到;在包含数组b,扔掉数大于b,......但只有扔掉两个数组的元素数相同。

So in the array containing a, throw away numbers less than a; in the array containing b, throw away numbers greater than b... But only throw away the same number of elements from both arrays.

也就是说,如果一个是从它的数组的开始Ĵ元件,以及b是从它的数组末尾k个元素,则来自一个的阵列和最后分钟扔掉第一分钟(J,K)元素( J,K)从步骤b的数组元素。

That is, if a is j elements from the start of its array, and b is k elements from the end of its array, you throw away the first min(j,k) elements from a's array and the last min(j,k) elements from b's array.

迭代,直到你下降到1或2个元素总量。

Iterate until you are down to 1 or 2 elements total.

每个这些操作(即找到一个排序的数组的中位数从一开始或数组的结束扔掉k个元素)是恒定的时间。所以每次迭代是不变的时间。

Each of these operations (i.e., finding median of a sorted array and throwing away k elements from the start or end of an array) is constant time. So each iteration is constant time.

每个迭代扔掉(以上)的一半来自至少一个数组中的元素,你只能做的log(n)次,每次五阵列的......所以整体算法的log(n)。

Each iteration throws away (more than) half the elements from at least one array, and you can only do that log(n) times for each of the five arrays... So the overall algorithm is log(n).

[更新]

由于Himadri乔杜里指出,在评论中,我的解决方案是不完整的;有很多的细节和角落的情况下可担心的。因此,充实的东西敲了一下......

As Himadri Choudhury points out in the comments, my solution is incomplete; there are a lot of details and corner cases to worry about. So, to flesh things out a bit...

有关每五个阵列R的定义其低级位数为R [N / 2-1]和它的上部正中为R [N / 2],其中n是阵列中的元素数(和数组是从0开始的索引,并除以2回合下来)。

For each of the five arrays R, define its "lower median" as R[n/2-1] and its "upper median" as R[n/2], where n is the number of elements in the array (and arrays are indexed from 0, and division by 2 rounds down).

让一个是最小的较低位数的,和b是最大的上部中位数。如果有多个阵列具有最小较低的中位数和/或多个阵列的最大上限位,选择A和B从不同阵列(这是其中的一角案件之一)。

Let "a" be the smallest of the lower medians, and "b" be the largest of the upper medians. If there are multiple arrays with the smallest lower median and/or multiple arrays with the largest upper median, choose a and b from different arrays (this is one of those corner cases).

现在,借用Himadri的建议:删除所有元素最多的和包括的一个从它的阵列,并应用到所有元素的和包括的b从它的阵列,同时注意除去从两个阵列元件的数目相同。需要注意的是,a和b可以是在相同的阵列;但是,如果是这样,他们不能具有相同的值,因为否则我们将已经能够从不同的阵列选择其中之一。因此,它是确定的,如果此步骤卷起从同一阵列的开始和结束扔掉元素

Now, borrowing Himadri's suggestion: Erase all elements up to and including a from its array, and all elements down to and including b from its array, taking care to remove the same number of elements from both arrays. Note that a and b could be in the same array; but if so, they could not have the same value, because otherwise we would have been able to choose one of them from a different array. So it is OK if this step winds up throwing away elements from the start and end of the same array.

迭代,只要你有三个或多个阵列。但是,一旦你到一个或两个数组,你必须改变你的策略是排他性的,而不是包容性;你只擦除最多的但不包括的一个上下的但不包括的湾继续这样下去,只要双方剩下的一个或两个数组中至少有三个要素(确保你取得进步)。

Iterate as long as you have three or more arrays. But once you are down to just one or two arrays, you have to change your strategy to be exclusive instead of inclusive; you only erase up to but not including a and down to but not including b. Continue like this as long as both of the remaining one or two arrays has at least three elements (guaranteeing you make progress).

最后,将减少到几例,其中最棘手的是两个阵列剩余,其中一个具有一个或两个元素。现在,如果我问你:给定一个排序的数组加上一个或两个额外的元素,找到所有元素的中位数,我觉得你可以做,在一定的时间。 (同样,也有一堆细节敲定,但其基本思路是增加一个或两个元素的数组不推着正中非常多。)

Finally, you will reduce to a few cases, the trickiest of which is two arrays remaining, one of which has one or two elements. Now, if I asked you: "Given a sorted array plus one or two additional elements, find the median of all elements", I think you can do that in constant time. (Again, there are a bunch of details to hammer out, but the basic idea is that adding one or two elements to an array does not "push the median around" very much.)