德布鲁因样序列`2 ^ N - 1`:它是如何构建的?它是、序列、布鲁

2023-09-11 02:53:22 作者:继续颓废着╮

我期待在进入找到N位整数的中澳日志基地2(LG (N))与位操作黑客的乘法和查找操作。

I'm looking at the entry Find the log base 2 of an N-bit integer in O(lg(N)) operations with multiply and lookup from Bit Twiddling hacks.

我可以很容易地看到第二个算法,该条目适用

I can easily see how the second algorithm in that entry works

static const int MultiplyDeBruijnBitPosition2[32] = 
{
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};
r = MultiplyDeBruijnBitPosition2[(uint32_t)(v * 0x077CB531U) >> 27];

该计算 LOG2ñ,其中 N 被称为是2的幂在这种情况下 0x077CB531 是一个普通的德布鲁因的序列,剩下的就是显而易见的。

which calculates log2 n where n is known to be a power of 2. In this case 0x077CB531 is an ordinary De Bruijn sequence, and the rest is obvious.

然而,在该条目中的第一算法

However, the first algorithm in that entry

static const int MultiplyDeBruijnBitPosition[32] = 
{
  0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
  8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
};

v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;

r = MultiplyDeBruijnBitPosition[(uint32_t)(v * 0x07C4ACDDU) >> 27];

看起来更棘手的我有点。我们首先通过捕捉 v 到最近的更大 2 ^ N - 1 值。这 2 ^ N - 1 的值,然后通过 0x07C4ACDD ,在这种情况下的行为方式相同的DeBruijn乘以序列中的previous算法做了。

looks a bit more tricky to me. We begin by snapping v to a nearest greater 2^n - 1 value. This 2^n - 1 value is then multiplied by 0x07C4ACDD, which in this case acts the same way as the DeBruijn sequence in the previous algorithm did.

我的问题是:我们如何构建这个神奇的 0x07C4ACDD 序列?即我们如何构建一个序列,可以用来当由 2的n次方乘以生成唯一的指标 - 1 价值?对于 2的n次方乘数,这只是一个普通的德布鲁因的序列,因为我们可以在上面看到,所以很清楚其中 0x077CB531 是从哪里来的。但对于 2 ^ N - 1 乘数 0x07C4ACDD ?我觉得我缺少明显的东西在这里。

My question is: how do we construct this magic 0x07C4ACDD sequence? I.e. how do we construct a sequence that can be used to generate unique indices when multiplied by a 2^n - 1 value? For 2^n multiplier it is just an ordinary De Bruijn sequence, as we can see above, so it is clear where 0x077CB531 came from. But what about 2^n - 1 multiplier 0x07C4ACDD? I feel like I'm missing something obvious here.

PS 要澄清我的问题:我真的不想找一个算法来生成这些序列。我在一些更多或更少的琐碎性更感兴趣(如果有的话),使 0x07C4ACDD 的工作,因为我们希望它的工作。对于 0x077CB531 ,使得它的工作性质是pretty的显而易见的:它包含存储的顺序与1位步进的所有5位的组合(这基本上是什么德布鲁因序列)。

P.S. To clarify my question: I'm not really looking for an algorithm to generate these sequences. I'm more interested in some more-or-less trivial property (if one exists) that makes 0x07C4ACDD work as we want it to work. For 0x077CB531 the property that makes it work is pretty obvious: it contains all 5-bit combinations "stored" in the sequence with 1-bit stepping (which is basically what De Bruijn sequence is).

0x07C4ACDD ,在另一方面,是不是德布鲁因序列本身。于是,被他们瞄准施工时是什么属性 0x07C4ACDD (除了非建设性的,应该作出上述算法的工作)?有人也提出了上述算法莫名其妙。所以,他们可能知道这​​个方法是可行的,并且相应的序列存在。他们怎么知道的?

The 0x07C4ACDD, on the other hand, is not a De Bruijn sequence by itself. So, what property were they aiming for when constructing 0x07C4ACDD (besides the non-constructive "it should make the above algorithm work")? Someone did come up with the above algorithm somehow. So they probably knew that the approach is viable, and that the appropriate sequence exists. How did they know that?

例如,如果我是构造算法的任意 v ,我会做

For example, if I were to construct the algorithm for an arbitrary v, I'd do

v |= v >> 1;
v |= v >> 2;
...

第一。然后,我只是做 + v 打开 v 成2(我们假设它不是动力溢出)。然后,我申请了第一个算法。最后我会做 - R的来获得最终的答案。然而,这些人设法优化它:他们淘汰了领先的 + v 和后 - R的步骤简单通过改变乘法器和重排表。他们怎么知道这是可能的吗?这背后的优化数学?

first. Then I'd just do ++v to turn v into a power of 2 (let's assume it doesn't overflow). Then I'd apply the first algorithm. And finally I'd do --r to obtain the final answer. However, these people managed to optimize it: they eliminated the leading ++v and the trailing --r steps simply by changing the multiplier and rearranging the table. How did they know it was possible? What is the math behind this optimization?

推荐答案

n阶在K个符号的德布鲁因序列(和k ^ N长度)有一切可能的正长字出现连续的字符属性失败,有人与环包裹。例如,在k的= 2,n = 2时的情况下,可能的话是00,01,10,11,和一个德布鲁因序列是0011 00,01,11将出现在它,10与环绕。此属性自然意味着左移一德布鲁因的序列(有两个功率相乘),并以它的上n位产生一个唯一的编号为两个乘数每个电源。那么你只需要一个查找表来确定是哪一个。它的工作原理上类似的原理来这比两个幂小的数字,但在这种情况下,幻数是不是一个德布鲁因序列,但类似物。该定义属性只是更改为一切可能的正长字显示为长度为n的第一个M个子,模2 ^ n的总和。这个属性是所有需要的算法工作。他们只是用这种另类​​的幻数,以加快算法。我也为好。

A De Bruijn sequence of order n over k symbols (and of k^n length) have a property that every possible n-length word appears as consecutive characters in it, some of them with cyclic wrapping. For example, in the case of k=2, n=2, the possible words are 00, 01, 10, 11, and a De Bruijn sequence is 0011. 00, 01, 11 appears in it, 10 with wrapping. This property naturally means that left shifting a De Bruijn sequence (multiplying with power of two) and taking its upper n bits results in a unique number for each power of two multiplier. Then you only need a lookup table to determine which one it is. It works on a similar principle to numbers which are one less than power of two, but the magic number in this case is not a De Bruijn sequence, but an analogue. The defining property simply changes to "every possible n-length word appears as the sum of the first m subsequences of length n, mod 2^n". This property is all that is needed for the algorithm to work. They simply used this different class of magic numbers to speed up the algorithm. I did as well.

施工德布鲁因数的一种可能的方法是德Bruijn的曲线图的汉弥尔顿路径的生成,Wikipedia提供这样的曲线图的一个例子。在这种情况下,节点2 ^ 5 = 32位整数,有向边缘是它们之间的转换,其中的过渡是根据边缘的标签,0或1。有可能一个左移位和二元或操作是直接模拟到2 ^ N-1型幻数,这可能是值得探讨的,但是这不是一种人们通常构造这样的算法。

One possible method of construction of De Bruijn numbers is the generation of a Hamiltonian path of the De Bruijn graph, Wikipedia provides an example of such a graph. In this case, the nodes are 2^5=32-bit integers, the directed edges are transitions between them, where a transition is a left shift and a binary or operation according to the label of the edge, 0 or 1. There might be a direct analogue to 2^n-1 type magic numbers, it might be worth exploring, but this isn't a way people usually construct such algorithms.

在实践中你可以尝试不同的构造,特别是如果你想让它表现在一个稍微不同的方式。例如,零点算法领先/落后号的比特摆弄黑客页上的执行只能返回在[0..31]值。它需要额外检查的0的情况,其中有32个0。这项检查需要一个分支,并可以在某些CPU太慢。

In practice you might try to construct it differently, especially if you want it to behave in a tad different manner. For example, the implementation of leading/trailing number of zeros algorithms on the bit twiddling hacks page can only return values in [0..31]. It needs additional checking for the case of 0, which has 32 zeros. This check requires a branching and can be way too slow on some CPUs.

我做的方式,我用了一个64元的查找表,而不是32,产生的随机幻数,并为他们每个人我建立了一个查找表的两个输入电源,检查它的正确性(注入),然后验证它所有的32位数字。我继续,直到我遇到了正确的幻数。由此产生的数字不履行出现在每个可能的正长字的属性,因为只有33的数字显示,这是唯一的全部33个可能的输入。

The way I did it, I used a 64 element lookup table instead of 32, generated random magic numbers, and for each of them I built up a lookup table with power of two inputs, checked its correctness (injectivity), then verified it for all 32-bit numbers. I continued till I encountered a correct magic number. The resulting numbers do not fulfill a property of "every possible n-length word appears", since only 33 numbers appear, which are unique for all 33 possible input.

穷举蛮力搜索声音缓慢,尤其是好神奇的数字是罕见的,但如果我们两个值作为输入的第一个测试已知功率,该表很快就会被填满,拒绝是速度快,废品率是非常高的。我们只需要清除每个幻数表后。从本质上说我利用的废品率较高的算法来构建幻数。

Exhaustive brute force search sounds slow, especially if good magic numbers are rare, but if we first test known power of two values as inputs, the table is filled quickly, rejection is fast, and the rejection rate is very high. We only need to clear the table after each magic number. In essence I exploited a high rejection rate algorithm to construct magic numbers.

由此产生的算法

int32 Integer::numberOfLeadingZeros (int32 x)
{
    static int32 v[64] = {
        32, -1, 1, 19, -1, -1, -1, 27, -1, 24, 3, -1, 29, -1, 9, -1,
        12, 7, -1, 20, -1, -1, 4, 30, 10, -1, 21, -1, 5, 31, -1, -1,
        -1, -1, 0, 18, 17, 16, -1, -1, 15, -1, -1, -1, 26, -1, 14, -1,
        23, -1, 2, -1, -1, 28, 25, -1, -1, 13, 8, -1, -1, 11, 22, 6};
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    x *= 0x749c0b5d;
    return v[cast<uint32>(x) >> 26];
}

int32 Integer::numberOfTrailingZeros (int32 x)
{
    static int32 v[64] = {
        32, -1, 2, -1, 3, -1, -1, -1, -1, 4, -1, 17, 13, -1, -1, 7,
        0, -1, -1, 5, -1, -1, 27, 18, 29, 14, 24, -1, -1, 20, 8, -1,
        31, 1, -1, -1, -1, 16, 12, 6, -1, -1, -1, 26, 28, 23, 19, -1,
        30, -1, 15, 11, -1, 25, 22, -1, -1, 10, -1, 21, 9, -1, -1, -1};
    x &= -x;
    x *= 0x4279976b;
    return v[cast<uint32>(x) >> 26];
}

对于你的问题,他们是怎么知道的,他们可能没有。他们尝试,试图改变的事情,就像我一样。毕竟,这不是一个大舒展想象力的2 ^ n-1个输入可能工作2 ^ n个输入具有不同的幻数和查找表来代替。

As for your question of how did they know, they probably didn't. They experimented, tried to change things, just like me. After all, it isn't a big stretch of imagination that 2^n-1 inputs might work instead of 2^n inputs with different magic number and lookup table.

在这里,我做了我的魔数发生器code的简化版本。它会检查所有可能的幻数在5分钟内,如果我们只检查了两个输入电源,发现1024幻数。检查对其他输入是毫无意义的,因为它们减少到2 ^ N-1的形式反正。不构造表,但它是平凡的,一旦你知道的神奇数字。

Here, I made a simplified version of my magic number generator code. It checks all possible magic numbers in 5 minutes if we check only for power of two inputs, finding 1024 magic numbers. Checking against other inputs are pointless since they are reduced to 2^n-1 form anyway. Does not construct the table, but it is trivial once you know the magic number.

#include <Frigo/all>
#include <Frigo/all.cpp>

using namespace Frigo::Lang;
using namespace std;

class MagicNumberGenerator
{

    public:

        static const int32 log2n = 5;
        static const int32 n = 1 << log2n;
        static const bool tryZero = false;

        MagicNumberGenerator () {}

        void tryAllMagic ()
        {
            for( int32 magic = 0; magic < Integer::MAX_VALUE; magic++ ){
                tryMagic(magic);
            }
            tryMagic(Integer::MAX_VALUE);
            for( int32 magic = Integer::MIN_VALUE; magic < 0; magic++ ){
                tryMagic(magic);
            }
        }

        bool tryMagic (int32 magic)
        {
            //  clear table
            for( int32 i = 0; i < n; i++ ){
                table[i] = -1;
            }
            //  try for zero
            if( tryZero and not tryInput(magic, 0) ){
                return false;
            }
            //  try for all power of two inputs, filling table quickly in the process
            for( int32 i = 0; i < 32; i++ ){
                if( not tryInput(magic, 1 << i) ){
                    return false;
                }
            }
            //  here we would test all possible 32-bit inputs except zero, but it is pointless due to the reduction to 2^n-1 form
            //  we found a magic number
            cout << "Magic number found: 0x" << Integer::toHexString(magic) << endl;
            return true;
        }

        bool tryInput (int32 magic, int32 x)
        {
            //  calculate good answer
            int32 leadingZeros = goodNumberOfLeadingZeros(x);
            //  calculate scrambled but hopefully injective answer
            x |= x >> 1;
            x |= x >> 2;
            x |= x >> 4;
            x |= x >> 8;
            x |= x >> 16;
            x *= magic;
            x = Integer::unsignedRightShift(x, 32 - log2n);
            //  reject if answer is not injective
            if( table[x] != -1 ){
                return table[x] == leadingZeros;
            }
            //  store result for further injectivity checks
            table[x] = leadingZeros;
            return true;
        }

        static int32 goodNumberOfLeadingZeros (int32 x)
        {
            int32 r = 32;
            if( cast<uint32>(x) & 0xffff0000 ){
                x >>= 16;
                r -= 16;
            }
            if( x & 0xff00 ){
                x >>= 8;
                r -= 8;
            }
            if( x & 0xf0 ){
                x >>= 4;
                r -= 4;
            }
            if( x & 0xc ){
                x >>= 2;
                r -= 2;
            }
            if( x & 0x2 ){
                x >>= 1;
                r--;
            }
            if( x & 0x1 ){
                r--;
            }
            return r;
        }

        int32 table[n];

};

int32 main (int32 argc, char* argv[])
{
    if(argc||argv){}
    measure{
        MagicNumberGenerator gen;
        gen.tryAllMagic();
    }
}