Best compression algorithm for a sequence of integers

2023-09-11 01:50:11 Author: 江湖人

I have a large array of integers that mostly fall into continuous ranges, e.g. 1-100, 110-160, etc. All integers are positive. What would be the best algorithm to compress this? I tried the deflate algorithm, but that gives me only 50% compression. Note that the algorithm cannot be lossy.

All numbers are unique and strictly increasing.

Also, if you can point me to a Java implementation of such an algorithm, that would be great.

Recommended answer

We have recently written research papers that survey the best schemes for this problem. Please see:

Daniel Lemire and Leonid Boytsov, Decoding billions of integers per second through vectorization, Software: Practice & Experience 45 (1), 2015. http://arxiv.org/abs/1209.2137

Daniel Lemire, Nathan Kurz, Leonid Boytsov, SIMD Compression and the Intersection of Sorted Integers, Software: Practice and Experience (to appear). http://arxiv.org/abs/1401.6399

Both papers include an extensive experimental evaluation.

You can find a complete implementation of all techniques in C++11 online: https://github.com/lemire/FastPFor and https://github.com/lemire/SIMDCompressionAndIntersection
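Since the question asks for Java, here is a minimal sketch of the basic idea these schemes build on: store the gaps between consecutive sorted values (differential coding) and write each gap in as few bytes as possible (variable-byte coding). This is not the SIMD code from the papers or from the libraries above. The class name `DeltaVarint` and its methods are made up for illustration, and the sketch assumes the input is strictly increasing and positive, as stated in the question.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of FastPFor or any library.
final class DeltaVarint {

    /** Encodes a strictly increasing array of positive ints as variable-byte gaps. */
    public static byte[] compress(int[] sorted) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, sorted.length);        // element count goes first
        int previous = 0;
        for (int value : sorted) {
            writeVarint(out, value - previous); // gap to the previous value
            previous = value;
        }
        return out.toByteArray();
    }

    /** Reverses compress(): reads the count, then rebuilds values by summing gaps. */
    public static int[] decompress(byte[] data) {
        int[] pos = {0};                        // read cursor shared with readVarint
        int count = readVarint(data, pos);
        int[] result = new int[count];
        int previous = 0;
        for (int i = 0; i < count; i++) {
            previous += readVarint(data, pos);
            result[i] = previous;
        }
        return result;
    }

    // Writes 7 bits per byte, low bits first; the high bit marks "more bytes follow".
    private static void writeVarint(ByteArrayOutputStream out, int v) {
        while ((v & ~0x7F) != 0) {
            out.write((v & 0x7F) | 0x80);
            v >>>= 7;
        }
        out.write(v);
    }

    private static int readVarint(byte[] data, int[] pos) {
        int result = 0;
        int shift = 0;
        while (true) {
            int b = data[pos[0]++] & 0xFF;
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    public static void main(String[] args) {
        // The ranges from the question: 1-100 followed by 110-160.
        int[] input = new int[151];
        int k = 0;
        for (int v = 1; v <= 100; v++) input[k++] = v;
        for (int v = 110; v <= 160; v++) input[k++] = v;

        byte[] packed = compress(input);
        System.out.println("raw bytes: " + input.length * 4 + ", packed bytes: " + packed.length);
        System.out.println("round trip ok: " + Arrays.equals(input, decompress(packed)));
    }
}
```

With mostly consecutive values almost every gap is 1, so each element costs one byte instead of four. The schemes surveyed in the papers go further by packing small gaps into a few bits each; running a general-purpose compressor such as deflate over the gap bytes may also shrink the long runs of identical gaps.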

There is also a C library: https://github.com/lemire/simdcomp and …