我在找的一套和解算法实现。问题是以下几点:有两套与确定一些相对紧凑的值(例如UUID或MD5 / SHA1 /不管哈希)坐在不同的机器上的元素。这些集合的不同之处较少的因素,我想同步这些集在传输的数据量极少。大多数谷歌搜索线索这里的。这是GPL的实施似乎是最先进的方法来工作。问题是,我不能在我的应用程序使用GPL下的code。最有可能我会使用类似nzmath重新实现它自己,但也许还有其他的实现(preferably Python或C / C ++),或者有其他更好的算法?
I'm looking for implementations of set reconciliation algorithm. The problem is following: there are two sets with elements identified by some relatively compact value (e.g. UUID or MD5/SHA1/whatever hash) sitting on different machines. These sets differ in relatively few elements and I want to synchronize these sets while transferring minimal amount of data. Most of googling leads here. This is GPL'd implementation of what seems to be the state-of-art approach to the task. The problem is that I can't use GPL'd code in my app. Most likely I'll have to reimplement it myself using something like nzmath, but maybe there are other implementations (preferably Python or C/C++), or maybe there are other nicer algorithms?
这code是从我的头,从而覆盖任何许可证本网站适用于code样本。
This code is out of my head, and thus covered by whatever license applies for code samples in this site.
# given two finite sequences of unique and hashable data,
# return needed opcodes and data needed for reconciliation
def set_reconcile(src_seq, dst_seq):
"Return required operations to mutate src_seq into dst_seq"
src_set= set(src_seq) # no-op if already of type set
dst_set= set(dst_seq) # ditto
for item in src_set - dst_set:
yield 'delete', item
for item in dst_set - src_set:
yield 'create', item
使用方法如下:
Use as follows:
for opcode, datum in set_reconcile(machine1_stuff, machine2_stuff):
if opcode == 'create':
# act accordingly
elif opcode == 'delete':
# likewise
else:
raise RuntimeError, 'unexpected opcode'