从各整数,使得给定的n组,它们成对距离的总和为最小选择的整数整数、总和、最小、距离

2023-09-11 06:11:09 作者:稳重熟男

我不知道这是一个众所周知的问题,或者是还原到一个,但这里是我在挣扎,因为过去的几天。

I don't know if this a well known problem or is reducible to one, but here is what I am struggling with since last few days.

让X_1,X_2,X_3 .. X_n是整数集。

Let X_1, X_2, X_3 .. X_n be sets of integers.

在每集X_i,我们有米(人人平等套)元素x_i,J:J = 1 ... M

In each set X_i, we have m (equal for all sets) elements x_i,j : j = 1...m.

我的目标是挑选的单的每一集,元素使得所有选定的元素之间两两差之和尽可能少的。

My goal is to pick a single element from each set, such that the sum of pairwise differences between all the selected elements is least possible.

我开始了与最小化的串行差异(即,来自连续集中选择元件之间差之和)中的DP .The向前计算是如下的总和的一个稍微不同的问题:

I started off with a slightly different problem of minimizing the sum of serial-differences (that is, sum of differences between elements selected from consecutive sets).The forward computation in DP is as follows:

让F(I,J)是成本,从盘我选择的元素学家

Let F(i, j) be the cost to select element j from set i.

F(I,J)= min_k F(I - 1,k)的+ | X(ⅰ-1中,k) - X(I,J)|

F(i, j) = min_k F(i - 1, k) + |x(i-1,k) - x(i, j)|

即,我们选择在previous组k个元素,然后评估在当前组中选择第j个要素的成本。当前步骤的成本等于在组中的第k个元素之间的绝对差的i-1,并在集合j个元素i。

That is, we select k-th element in the previous set and then evaluate the cost of selecting j-th element in the current set. The cost of current step is equal to the absolute difference between the k-th element in set i-1 and j-th element in set i.

不过,我真正的问题不依赖于集X_1,... X_n的顺序。从本质上说,动态规划是给我的东西,但正如我所说,这是寻找产业链的元素,从每个组,使得连环差之和最小的一个相关但不同的问题。

However, my real problem does not depend on the order of the sets X_1, ... X_n. In essence, the dynamic programming is giving me something, but as I said, it is a related but different problem of finding the "chain" of elements, one from each set, such that the sum of "serial" differences is minimized.

这是一个基本的分析,我看到一个天真的解决这个问题将是产生N组的所有排列,然后利用动态规划的每个置换解决。这是棘手的,但是当n足够小 - 我们甚至可以采取这种愚蠢的详尽办法

From a basic analysis, I see that a naive solution to this problem would be to generate all permutations of the n sets and then solve it using dynamic programming for each permutation. This is intractable, but when n is small enough - we can even take this dumb exhaustive approach.

我无法找出如果这个问题可以用一个多项式算法在所有来解决,或者如果它可以减少到已知NP难题/完整的问题之一在这种情况下,我会去的近似算法通过将其建模为一个二次规划。

I am unable to figure out if this problem can be solved using a polynomial algorithm at all, or if it can be reduced to one of the known NP-hard/complete problems in which case, I will go for an approximate algorithm by modeling it as a quadratic program.

请告诉我或点我到一些阅读材料。

Please advise me or point me to some reading material.

感谢。

添加一个例子这里讨论的方便:

Adding an example here for the convenience of discussion:

X = [[11,24,38],[12,21,33],[13,22],[15,23]]

X = [[11, 24, 38], [12, 21, 33], [13, 22], [15, 23]]

解决方案是24,21,22,23(可能是无关紧要的,但我的DP给了我11,12,13,15,我的这个例子特别失败我DP)。

The solution is 24, 21, 22, 23 (possibly irrelevant, but my DP gives me 11, 12, 13, 15, I have constructed this example particularly to fail my DP.).

我想这是不是NP完全问题。我们可以从DP扩展的解决方案如下:[不知道是否正确,但看起来它]:

I guess this is not an NP-complete problem. The solution that we can expand from the DP is as follows [Not sure if correct, but seems like it]:

该溶液包含来自每一组的至少一种元素。

The solution contains at least one element from each set.

那么,就让我们选择(pferably $ P $最小,如果大小不同)的任何设置,并从列表中删除。

So, let us select any set (preferably smallest, if the sizes vary), and remove it from the list.

让我们把它叫做X \ XP中,其中XP是集合中删除。

Lets call it X\Xp where Xp is the set removed.

现在的每个元素x在\ XP中,建立了一套新的X'= X \ XP的ü{X}。

Now for each element x in \Xp, construct a new set X' = X\Xp U {x}.

我们解决了DP M-时间,并计算目标函数(成对距离的总和),然后我们就可以选择其中最好的。

We solve the DP m-times and compute the objective function (sum of pairwise distances) and then we can pick the best of them.

在DP本身需要O(NM ^ 2),并运行此M倍,所以我们有O(纳米^ 3)。

The DP itself takes O(nm^2) and we run this m-times, so we have O(nm^3).

推荐答案

我不认为这是NP完全问题,因为我认为我们可以发现和认识了一个解决方案,最多为O(n ^2米^ 2)猜测

I don't think this is NP-complete, because I think we can find and recognize a solution with at most O(n^2m^2) guesses.

这样才不会担心的关系,从整数到实数,并添加极少量的抖动。我认为这是随机的,但我认为你可以选择确定性抖动来达到同样的效果。

So as not to worry about ties, go from integers to reals and add a very small amount of jitter. I think of this as random, but I think you could chose deterministic jitter to achieve the same effect.

认为问题为,从布置在实线的彩色点的集合中选择的,以便选择一个点的每种颜色的,并尽量减少它们之间的绝对差之和

Think of the problem as that of choosing from a collection of coloured points laid out on the real line so as to pick one point of each colour and minimise the sum of absolute differences between them.

当n为偶数,我认为只能这样了。该组解析点的中值的两个中央点之间发生。在解集的每个点,没有点它和两个中心点之间的颜色相同的(或者我们可以改善溶液)。出于同样的原因,每个点是最接近点颜色来得到,如果我们从溶液中集合中删除它的n-1个点的中值

I consider only the case when n is even. The median of the set of solution points occurs between the two central points. For each point in the solution set, there is no point of the same colour between it and the two centre points (or we could improve the solution). For the same reason, each point is the closest point of that colour to the median of the n-1 points we get if we remove it from the solution set.

用我们的(纳米)^ 2猜测我们猜测的解集两个中心点的身份。这给我们留下的n-2点找到的,我们可以减少的可能性两个每种颜色,最接近点到两个中央点这两点的每一侧的

Using our (nm)^2 guesses we guess the identity of the two central points in the solution set. This leaves us n-2 points to find, and we can reduce the possibilities to two of each colour, the closest point to the two central points on each side of those two points.

现在考虑从溶液中取出设定一个点形成的中位数。如果我们以删除该点右侧的两个中央点,中值是这两个点的左边。如果我们以删除该点是在左侧,中值是这两个点的右侧。在一个解集,我们只是删除了点更接近新的中位数比任何其他点的颜色的,并且新值是进一步的两个中央点从它。因此,我们可以从同一颜色的其它候选区分 - 它是取其两者的最接近进一步的两个中央点

Now consider the median formed by removing one point from the solution set. If the point we remove is on the right side of the two central points, the median is the left of those two points. If the point we remove is on the left side, the median is the right of those two points. In a solution set, the point we have just removed is closer to the new median than any other point of that colour, and that new median is the further of the two central points from it. So we can distinguish it from the other candidate of the same colour - it is whichever of the two is closest to the further of the two central points.

因此​​通过至多为O(n ^ 2 *平方公尺)猜测,我们可以找到一个可行的解决方案对每个获得集,并从这些解决方案套件,我们选择一个具有最小的目标是让全球最低。每次猜测需要一些工作 - 也许O(M),因此这很可能是为O(n ^ 2立方公尺)具有完全天真的实现 - 也许大多是理论方法

Therefore by making at most O(n^2*m^2) guesses we can find a possible solution set for each get, and from these solution sets we choose the one with smallest objective to get the global minimum. Each guess takes some work - maybe O(m) so this may well be an O(n^2m^3) with a completely naive implementation - perhaps mostly a theoretical approach.

也许这是好得是真实的,但我们变成了简单的数据检查每个点与最接近它对方颜色的点证明这一点?一个参数是,如果我们有两个点,我们可以在溶液中作为最接近最远的两个点的识别每个点,则该点也必须最接近对中的另一点。因此,无论是猜测点​​在对和查找最近的点,它可能工作。这开始看起来很像的叶夫根尼·Kluev的解决方案,证明

Perhaps this is too good to be true, but can we turn this into a proof of simply checking each point in the data together with the point of each other colour closest to it? An argument would be that if we have two points and we can recognize each point in the solution as closest to the furthest of the two points then this point must also be closest to the other point in the pair. So guessing either point in the pair and finding the closest points to it might work. This begins to look a lot like a proof of Evgeny Kluev's solution