Best match in a bipartite graph (e.g. associating labels with points on a plot)

2023-09-11 02:27:08 Author: 日久不生情

I am trying to extract semantics from graphical xy plots where the points are plotted and some or all of them have a label. The label is plotted "near the point", so that a human can normally understand which label goes with which point. For example, in this plot it is clear which label (number) belongs to which point (*), and an algorithm based on Euclidean distance would work. (The labels and points have no semantic ordering, e.g. a scatterplot.)

 *1
    *2

        *3

      *4

In congested plots the authoring software/human may place the label in different directions to avoid overlap. For example in

1**2
 **4
 3

A human reader can normally work out which label is associated with which point.

One solution I'd accept would be to create a Euclidean distance matrix and shuffle the rows to get the minimum of a function (e.g. the summed squares of the distances on the diagonal or other heuristic). In the second example (with the points labelled a,b,c,d clockwise from the NW corner) we have a distance matrix (to 1 d.p.)

             a   b   c   d
 1ab2    1  1.0 2.0 2.2 1.4    
  dc4    2  2.0 1.0 1.4 2.2
  3      3  2.0 2.2 1.4 1.0
         4  2.2 1.4 1.0 2.0

and we need to label a1 b2 c4 d3. Swapping rows 3 and 4 gives the minimum sum of the diagonal. Here's a more complex example where simply picking the nearest may fail

 *1*2*5
  **4
  3 *6
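The row-shuffling idea from the four-point example above can be checked by brute force. The following is only a sketch: the (column, row) coordinates are my reading of the ASCII plot, and the names `labels`, `points` and `best_assignment` are made up for illustration. It tries every permutation, so it is only practical for a handful of points, and it assumes equally many labels and points.

```python
from itertools import permutations
from math import dist

# (column, row) grid positions read off the plot " 1ab2 / dc4 / 3" (an assumption).
labels = {"1": (1, 0), "2": (4, 0), "3": (2, 2), "4": (4, 1)}
points = {"a": (2, 0), "b": (3, 0), "c": (3, 1), "d": (2, 1)}


def best_assignment(labels, points):
    """Return ({point: label}, cost) minimising the summed squared distances."""
    label_names = list(labels)
    point_names = list(points)
    best, best_cost = None, float("inf")
    # Each permutation of the labels is one way of "shuffling the rows".
    for perm in permutations(label_names):
        cost = sum(
            dist(labels[lab], points[pt]) ** 2
            for lab, pt in zip(perm, point_names)
        )
        if cost < best_cost:
            best, best_cost = dict(zip(point_names, perm)), cost
    return best, best_cost


assignment, cost = best_assignment(labels, points)
print(assignment, round(cost, 1))
```

With these coordinates the search should reproduce the pairing a1 b2 c4 d3 given above. The factorial running time is the reason a polynomial-time method is needed for larger plots.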

If this is solved then I shall need to go to cases where the number of labels may be smaller or larger than the number of points.

If the algorithm is standard, then I would appreciate a pointer to open-source Java (e.g. JAMA or Apache maths).

NOTE: The SO answer "Associating nearby points with a path" doesn't quite work as an answer here, because in that question the path through the points is given.

Recommended answer

You have a complete bipartite graph in which one part is the numbers and the other part is the points. The weight of each edge in this graph is the Euclidean distance between the number and the point it connects. Your task is to find a matching with minimal total weight.

This is a known problem (the assignment problem), and it has a well-known solution called the Hungarian algorithm:

From Wikipedia:

We are given a nonnegative n×n matrix, where the element in the i-th row and j-th column represents the cost of assigning the j-th point to the i-th number. We have to find an assignment of the point to the numbers that has minimum cost. If the goal is to find the assignment that yields the maximum cost, the problem can be altered to fit the setting by replacing each cost with the maximum cost subtracted by the cost.

The algorithm is easier to describe if we formulate the problem using a bipartite graph. We have a complete bipartite graph G=(S, T; E) with n number vertices (S) and n point vertices (T), and each edge has a nonnegative cost c(i,j). We want to find a perfect matching with minimum cost. The Hungarian method is a combinatorial optimization algorithm which solves the assignment problem in polynomial time and which anticipated later primal-dual methods.
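As a rough illustration of the minimum-cost matching described in the quote, here is a compact O(n^3) sketch in Python. It uses the row/column-potential (shortest augmenting path) formulation of the Hungarian method, which is a different presentation from the line-covering steps but solves the same problem. It is not the code from any of the linked articles; the function name `hungarian` is hypothetical, and the sketch assumes the matrix has no more rows than columns.

```python
def hungarian(cost):
    """Return a list where result[i] is the column assigned to row i.

    Minimises the total cost. Assumes len(cost) <= len(cost[0]).
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0] * (n + 1)      # row potentials (1-based)
    v = [0] * (m + 1)      # column potentials (1-based)
    p = [0] * (m + 1)      # p[j] = row currently matched to column j (0 = none)
    way = [0] * (m + 1)    # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift the potentials so that at least one new edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:      # reached a free column: augment
                break
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    result = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            result[p[j] - 1] = j - 1
    return result


# Rows are labels 1..4, columns are points a..d (distance matrix from the question).
distances = [
    [1.0, 2.0, 2.2, 1.4],
    [2.0, 1.0, 1.4, 2.2],
    [2.0, 2.2, 1.4, 1.0],
    [2.2, 1.4, 1.0, 2.0],
]
print(hungarian(distances))
```

Because the sketch accepts fewer rows than columns, it also covers the case of fewer labels than points if the labels are used as rows; for more labels than points, the matrix can be transposed first.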

For the detailed algorithm and code, you can take a look at the TopCoder article; this PDF may also be useful.

There is also a media file that describes it. (This video explains why the Hungarian algorithm works.)

Algorithm:

Step 1: Prepare a cost matrix. If the cost matrix is not square, add dummy rows (or columns) with zero-cost elements until it is.
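Step 1 is also what handles plots with more labels than points, or the reverse. A minimal sketch of the padding follows; the helper name `pad_to_square` and the `fill` parameter are hypothetical.

```python
def pad_to_square(cost, fill=0):
    """Return a square copy of cost, padded with dummy rows/columns of `fill`."""
    rows = len(cost)
    cols = len(cost[0]) if rows else 0
    size = max(rows, cols)
    # Extend every existing row with dummy columns...
    padded = [list(row) + [fill] * (size - cols) for row in cost]
    # ...then append whole dummy rows if there were fewer rows than columns.
    padded += [[fill] * size for _ in range(size - rows)]
    return padded


# Two labels, three points: one dummy label row is added.
print(pad_to_square([[1, 2, 3], [4, 5, 6]]))
```

A real row or column that ends up matched to a dummy one is simply left unassigned in the final result.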

Step 2: Subtract the minimum element of each row from all the elements of that row.

Step 3: Further modify the resulting matrix by subtracting the minimum element of each column from all the elements of that column. This gives the modified matrix.
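Steps 2 and 3 can be sketched in a few lines. The function name `reduce_rows_and_columns` and the 3x3 example matrix are made up for illustration; they do not come from the question.

```python
def reduce_rows_and_columns(cost):
    """Apply the row reduction (step 2) and then the column reduction (step 3)."""
    # Step 2: subtract each row's minimum from that row.
    reduced = [[x - min(row) for x in row] for row in cost]
    # Step 3: subtract each column's minimum from that column.
    n_cols = len(reduced[0])
    col_mins = [min(row[j] for row in reduced) for j in range(n_cols)]
    return [[row[j] - col_mins[j] for j in range(n_cols)] for row in reduced]


example = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
print(reduce_rows_and_columns(example))
```

After both reductions every row and every column contains at least one zero, which is what the line-covering step works on.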

Step 4: Draw the minimum number of horizontal and vertical lines needed to cover all the zeros in the resulting matrix. Let the minimum number of lines be N. There are two possible cases.

Case 1: If N = n, where n is the order of the matrix, an optimal assignment can be made. Make the assignment to get the required solution.

Case 2: If N is less than n, proceed to step 5.
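Step 4 does not say how to find the minimum number of lines. One standard way, swapped in here as an assumption rather than taken from the answer, is to compute a maximum matching on the zero entries and derive the cover from it using König's theorem. The name `min_line_cover` is hypothetical, and the sketch assumes a square matrix with exact (integer) zeros.

```python
def min_line_cover(matrix):
    """Return (covered_rows, covered_cols) covering every zero with the fewest lines."""
    n = len(matrix)
    match_col = [-1] * n  # match_col[j] = row matched to column j, or -1

    def try_row(i, seen):
        # Augmenting-path search for a maximum matching on the zero entries.
        for j in range(n):
            if matrix[i][j] == 0 and not seen[j]:
                seen[j] = True
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(n):
        try_row(i, [False] * n)

    matched_rows = {r for r in match_col if r != -1}
    # König's construction: walk alternating paths from the unmatched rows.
    visited_rows = {i for i in range(n) if i not in matched_rows}
    visited_cols = set()
    stack = list(visited_rows)
    while stack:
        i = stack.pop()
        for j in range(n):
            if matrix[i][j] == 0 and j not in visited_cols:
                visited_cols.add(j)
                r = match_col[j]
                if r != -1 and r not in visited_rows:
                    visited_rows.add(r)
                    stack.append(r)
    # Cover = rows that were not reached, plus columns that were reached.
    return set(range(n)) - visited_rows, visited_cols


rows, cols = min_line_cover([[2, 0, 2], [1, 0, 5], [0, 0, 0]])
print(rows, cols, len(rows) + len(cols))
```

For this illustrative matrix two lines suffice (N = 2 < n = 3), so it falls under case 2.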

Step 5: Determine the smallest uncovered element in the matrix (an element not covered by any of the N lines). Subtract this minimum from all uncovered elements and add it to the elements at the intersections of the horizontal and vertical lines. This gives the second modified matrix.
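The adjustment in step 5 is mechanical once the covering lines are known. A small sketch, with the covered rows and columns passed in as sets; the function name `adjust` and the example matrix are illustrative assumptions.

```python
def adjust(matrix, covered_rows, covered_cols):
    """Apply step 5 and return the new matrix (the input is left unchanged)."""
    n = len(matrix)
    smallest = min(
        matrix[i][j]
        for i in range(n) if i not in covered_rows
        for j in range(n) if j not in covered_cols
    )
    result = [row[:] for row in matrix]
    for i in range(n):
        for j in range(n):
            if i not in covered_rows and j not in covered_cols:
                result[i][j] -= smallest   # uncovered element
            elif i in covered_rows and j in covered_cols:
                result[i][j] += smallest   # intersection of two lines
    return result


# Row 2 and column 1 are the covering lines for this matrix.
print(adjust([[2, 0, 2], [1, 0, 5], [0, 0, 0]], {2}, {1}))
```

Elements covered by exactly one line are left as they are, which is why existing zeros on a single line survive the adjustment.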

Step 6: Repeat steps 4 and 5 until case 1 of step 4 is reached.

Step 7 (making the zero assignments): Examine the rows successively until a row with exactly one zero is found. Circle (o) this zero to make the assignment, then mark a cross (x) over all other zeros in the column of the circled zero, showing that they cannot be considered for future assignment. Continue in this manner until all the zeros have been examined. Repeat the same procedure for the columns.

Step 8: Repeat step 7 successively until one of the following situations arises: (i) no unmarked zero is left, in which case the process ends; or (ii) more than one unmarked zero lies in some row or column, in which case circle one of the unmarked zeros arbitrarily and mark a cross over the remaining zeros in its row and column. Repeat the process until no unmarked zero is left in the matrix.

Step 9: Exactly one circled zero is thus obtained in each row and each column of the matrix. The assignment corresponding to these circled zeros is the optimal assignment.
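The circle-and-cross procedure of steps 7 to 9 can be sketched as follows. The name `assign_zeros` is hypothetical, "circling" a zero is modelled by recording it in a dictionary, and "crossing" by dropping the other zeros in its row and column. The arbitrary tie-break in step 8 is implemented here as "take the smallest (row, column) pair", which is my own choice; in degenerate matrices an unlucky arbitrary pick may leave a row unassigned, so treat this as an illustration rather than a robust implementation.

```python
def assign_zeros(matrix):
    """Return {row: column} chosen among the zero entries of a square matrix."""
    n = len(matrix)
    free = {(i, j) for i in range(n) for j in range(n) if matrix[i][j] == 0}
    chosen = {}
    while free:
        pick = None
        # Step 7: look for a row with exactly one unmarked zero...
        for line in range(n):
            in_row = [z for z in free if z[0] == line]
            if len(in_row) == 1:
                pick = in_row[0]
                break
        # ...then for a column with exactly one unmarked zero.
        if pick is None:
            for line in range(n):
                in_col = [z for z in free if z[1] == line]
                if len(in_col) == 1:
                    pick = in_col[0]
                    break
        # Step 8 (ii): otherwise circle one unmarked zero arbitrarily.
        if pick is None:
            pick = min(free)
        chosen[pick[0]] = pick[1]
        # Cross out every other zero in the same row or column.
        free = {z for z in free if z[0] != pick[0] and z[1] != pick[1]}
    return chosen


print(assign_zeros([[1, 0, 1], [0, 0, 4], [0, 1, 0]]))
```

The resulting row-to-column pairs are then read back against the original cost matrix to obtain the total cost of the assignment.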

For details, see the Wikipedia article and http://www.ams.jhu.edu/~castello/362/Handouts/hungarian.pdf