在设定分区更好的结果比差分分区、差分、结果

2023-09-11 03:15:39 作者:怪难瘦

分区问题被称为是NP难。根据问题的具体实例,我们可以尝试动态编程或一些启发式像差分(也称为卡马卡尔-卡普算法)。

Partition problem is known to be NP-hard. Depending on the particular instance of the problem we can try dynamic programming or some heuristics like differencing (also known as Karmarkar-Karp algorithm).

后者似乎是用大数字的情况下非常有用(是什么使动态规划顽固性),但是并不总是完美的。什么是有效的方法,找到一个更好的解决方案(随机的,禁忌搜索,其他的近似)?

The latter seems to be very useful for the instances with big numbers (what makes dynamic programming intractable), however not always perfect. What is an efficient way to find a better solution (random, tabu search, other approximations)?

PS:这个问题有它背后的一些故事。还有一个挑战约翰尼去购物可在SPOJ自2004年7月至今,面临的挑战已经解决了1087用户,但只有11人得分比正确的卡氏 - 卡普算法实现更好的(与目前的得分王,卡氏 - 卡普给11.796614分)。如何做的更好? (答案支持接受提交最想要的,但请不要随便显露你的code)

PS: The question has some story behind it. There is a challenge Johnny Goes Shopping available at SPOJ since July 2004. Till now, the challenge has been solved by 1087 users, but only 11 of them scored better than correct Karmarkar-Karp algorithm implementation (with current scoring, Karmarkar-Karp gives 11.796614 points). How to do better? (Answers supported by accepted submission most wanted but please do not reveal your code.)

推荐答案

对于不管它的价值,一个简单的,未优化Python实现了完成卡马卡尔卡普(CKK)的搜索结果在[Korf88]程序 - 修改只是稍微在给定的时间限制(例如,4.95秒),之后摆脱困境的搜索,并返回迄今为止发现的最好的解决方案 - 足以得分 14.204234 的SPOJ问题,击败得分卡马卡尔 - 卡普。 在撰写本文时,这是#3对排名 ( 看到编辑#2以下的)

For whatever it's worth, a straightforward, unoptimized Python implementation of the "complete Karmarkar Karp" (CKK) search procedure in [Korf88] -- modified only slightly to bail out of the search after a given time limit (say, 4.95 seconds) and return the best solution found so far -- is sufficient to score 14.204234 on the SPOJ problem, beating the score for Karmarkar-Karp. As of this writing, this is #3 on the rankings (see Edit #2 below)

的科尔夫的CKK算法稍微更可读presentation可以在[Mert99]中找到。

A slightly more readable presentation of Korf's CKK algorithm can be found in [Mert99].

编辑#2 - 应用Karmarkar-我已经实现叶夫根Kluev的混合启发式卡普直到号码的列表低于某个阈值,然后切换到确切维茨-萨尼子集枚举法[HS74](一简要描述可在[Korf88]中找到)。由于怀疑,我的Python实现需要降低切换阈值与他的C ++实现。对于一些试验和错误,我发现37阈值是让我的计划,在限定时间内完成最大。然而,即使在较低的门槛,我是能够实现的得分 15.265633 ,为第二不够好位置。

EDIT #2 - I've implemented Evgeny Kluev's hybrid heuristic of applying Karmarkar-Karp until the list of numbers is below some threshold and then switching over to the exact Horowitz-Sahni subset enumeration method [HS74] (a concise description may be found in [Korf88]). As suspected, my Python implementation required lowering the switchover threshold versus his C++ implementation. With some trial and error, I found that a threshold of 37 was the maximum that allowed my program to finish within the time limit. Yet, even at that lower threshold, I was able to achieve a score of 15.265633, good enough for second place.

我还试图将这种混合KK / HS方法进CKK树搜索,基本上是使用HS作为一个非常积极的和昂贵的剪枝策略。在平原CKK,我无法找到,即使是匹配KK / HS方法的切换阈值。但是,使用的CKK和HS的ILDS(见下文)的搜索策略(附25阈值)进行修剪,我能得到一个非常小的收获超过了previous得分,最高为 15.272802 。它也许不应该是不足为奇的CKK + ILDS将超越普通CKK在这种情况下,因为它会在设计上提供的投入更加多样化的HS阶段。

I further attempted to incorporate this hybrid KK/HS method into the CKK tree search, basically by using HS as a very aggressive and expensive pruning strategy. In plain CKK, I was unable to find a switchover threshold that even matched the KK/HS method. However, using the ILDS (see below) search strategy for CKK and HS (with a threshold of 25) to prune, I was able to yield a very small gain over the previous score, up to 15.272802. It probably should not be surprising that CKK+ILDS would outperform plain CKK in this context since it would, by design, provide a greater diversity of inputs to the HS phase.

编辑#1 - 我已经尝试了两种精炼的基础CKK算法:

EDIT #1 - I've tried two further refinements to the base CKK algorithm:

提高即可差异搜索(ILDS)[Korf96]这是一个替代的搜索树中的路径自然DFS排序。它有一种倾向,比普通深度优先搜索较早前探索更多样化的解决方案。

"Improved Limited Discrepancy Search" (ILDS) [Korf96] This is an alternative to the natural DFS ordering of paths within the search tree. It has a tendency to explore more diverse solutions earlier on than regular Depth-First Search.

加快-2-通数分区〔Cerq12]此概括的从内4个级别的叶节点到节点的节点中CKK修剪标准内5 1,6和7以上的叶节点的水平。

"Speeding up 2-Way Number Partitioning" [Cerq12] This generalizes one of the pruning criteria in CKK from nodes within 4 levels of the leaf nodes to nodes within 5, 6, and 7 levels above leaf nodes.

在我的试验的情况下,两者通常在减少探索节点的数量(在后者的情况下),并在到达更好的解决方案更快(在前者的情况下)设置在原始CKK明显好处这些改进的。然而,SPOJ问题结构的范围内,没有这些足以提高我的成绩。

In my test cases, both of these refinements generally provided noticeable benefits over the original CKK in reducing the number of nodes explored (in the case of the latter) and in arriving at better solutions sooner (in the case of the former). However, within the confines of the SPOJ problem structure, neither of these were sufficient to improve my score.

鉴于此SPOJ问题的异质性(即:5秒的时间限制,只有一个特定的和未公开的问题的实例),它是很难给出什么实际上可能提高分数的意见 * 。例如,我们应该继续追求备用搜索排序策略(如:很多文件被惠勒Ruml 上市这里)?或者我们应该尽量纳入某种形式的局部改进启发式的解决方案通过CKK,以帮助修剪发现?或者,也许我们应该完全放弃CKK为基础的方法,并尝试了动态规划方法?怎么样PTAS?不知道更多有关在SPOJ问题使用的实例的具体形状,这是很难猜测什么样的方式会产生最大的效益。每个人都有其长处和短处,根据给定实例的特定属性。

Given the idiosyncratic nature of this SPOJ problem (i.e.: 5-second time limit and only one specific and undisclosed problem instance), it is hard to give advice on what may actually improve the score*. For example, should we continue to pursue alternate search ordering strategies (e.g.: many of the papers by Wheeler Ruml listed here)? Or should we try incorporating some form of local improvement heuristic to solutions found by CKK in order to help pruning? Or maybe we should abandon CKK-based approaches altogether and try for a dynamic programming approach? How about a PTAS? Without knowing more about the specific shape of the instance used in the SPOJ problem, it's very difficult to guess at what kind of approach would yield the most benefit. Each one has its strengths and weaknesses, depending on the specific properties of a given instance.

* 的除了简单地运行相同的东西快,比如说,通过实现在C ++中,而不是Python的。的

[Cerq12] Cerquides,耶稣和佩德罗Meseguer旅馆。 加快2路号码的分区。 ECAI。 2012,DOI: 10.3233 / 978-1-61499-098-7-223

[Cerq12] Cerquides, Jesús, and Pedro Meseguer. "Speeding Up 2-way Number Partitioning." ECAI. 2012, doi:10.3233/978-1-61499-098-7-223

[HS74]霍洛维茨,埃利斯和萨尔塔杰萨尼。 计算分区,应用背包问题。的杂志的ACM(JACM)21.2 (1974):277-292

[HS74] Horowitz, Ellis, and Sartaj Sahni. "Computing partitions with applications to the knapsack problem." Journal of the ACM (JACM) 21.2 (1974): 277-292.

[Korf88]科尔夫,理查E.(1998年),一个完整的任意时间算法对于一些分区,人工智能106(2):181-203,DOI: 10.1016 / S0004-3702(98)00086-1 ,

[Korf88] Korf, Richard E. (1998), "A complete anytime algorithm for number partitioning", Artificial Intelligence 106 (2): 181–203, doi:10.1016/S0004-3702(98)00086-1,

[Korf96]科尔夫,理查德·E提高即可出入搜索。 AAAI / IAAI,卷。 1. 1996年。

[Korf96] Korf, Richard E. "Improved limited discrepancy search." AAAI/IAAI, Vol. 1. 1996.

[Mert99]梅尔滕斯,斯蒂芬(1999年),一个完整的随时算法平衡号分区,的arXiv: CS / 9903011

[Mert99] Mertens, Stephan (1999), A complete anytime algorithm for balanced number partitioning, arXiv:cs/9903011