如何使用最少的比特,以产生一个任意范围内的非偏压随机数偏压、随机数、范围内、如何使用

2023-09-11 02:37:52 作者:仰望辉煌。

由于的鸽子洞的原则,你不能只用

 输出=分钟+(RAND()%(INT)(最大 - 最小+ 1))
 

要生成一个公正,统一,结果。该answer一个类似的问题提供了一个解决方案,但它是非常浪费的消耗随机比特的条款。

此地无人生还 区块链随机数的原罪与救赎

例如,如果随机范围源的低,则具有以产生从所述源的第二值的机会可以是相当高的。另类,使用较大的源范围本质上是一种浪费了。

虽然我敢肯定,可以得出一个最佳的来源范围的大小,不解决这个问题,有可能是一个更好的算法,而不是优化这一块。

我的想法已被证明的答案产生偏见的结果。

这是发生在我的方法是

消费必要以覆盖期望的范围的比特的最小数目。 如果该值超出了所需的范围只能扔掉一位,消费多了一个。 在必要时重复操作。 解决方案

通常的做法,以消除偏见是扔掉那些期望的范围之外的数字。如上所述,这是一种浪费。有可能通过启动比特的数量较多,并在同一时间产生多个随机数,以减少浪费;你可以实现输入到输出的范围之间更好的匹配。

例如采取模具的辊。输出有6个可能性。天真的方法将采取3个随机位产生的每一个随机数。第一个例子演示了鸽子洞的问题。

 高清pigeon_die(total_bit_count):
    对我的xrange(total_bit_count // 3):
        位= random.getrandbits(3)
        产生出1 +位* 6 // 8

1:832855
2:417835
3:416012
4:833888
5:416189
6:416554
总共3333333
最大/最小2.00448063998
 

的第二个例子是浪费的方法常用。你可以看到,它产生从随机位相同数量较少的随机数,但偏置被消除。

 高清wasteful_die(total_bit_count):
    对我的xrange(total_bit_count // 3):
        位= random.getrandbits(3)
        如果bits< 6:
            产生1 +位

1:417043
2:415812
3:417835
4:416012
5:416645
6:417243
总共2500590
最大/最小1.00486517946
 

最后一个例子需要13比特的时间和产生5随机数从它。这会产生较幼稚的做法,甚至更多的数字!

 高清optimized_die(total_bit_count):
    对我的xrange(total_bit_count // 13):
        位= random.getrandbits(13)
        如果bits< 6 ** 5:
            对于j的范围(5):
                产生出1 +位%6
                位// = 6

1:608776
2:608849
3:608387
4:608119
5:607855
6:608559
总共3650545
最大/最小1.00163525841
 

13比特的选择是由以2的幂的底的对数6和选择,这就是一个最接近的整数。

 高清waste_list(N):
    在范围位(1,31):
        潜力=将Math.log(2 **位,N)
        数= INT(潜在)
        如果计数> 0:
            废=潜能 - 计数
            产生的废物,位计数

废物,位计数排序(waste_list(6)):
    打印位,计数,废
    如果位== 3:
        打破

13 5 0.029086494049
26 10 0.0581729880981
8 3 0.0948224578763
21 8 0.123908951925
3 1 0.160558421704
 

正如你所看到的,有4个选择较简单的3位更好。

Because of the pigeon hole principle, you can't just use

output = min + (rand() % (int)(max - min + 1))

to generate an unbiased, uniform, result. This answer to a similar question provides one solution but it is highly wasteful in terms of random bits consumed.

For example, if the random range of the source is low, then the chances of having to generate a second value from the source can be quite high. Alternative, using a larger source range is inherently wasteful too.

While I'm sure an optimal source range size could be derived, that doesn't address the question that there may be a better algorithm, rather than optimizing this one.

[EDIT] The my idea has been shown in the answers to produce biased results.

An approach that occurs to me is

Consume the minimum number of bits necessary to cover the desired range. If that value is outside the desired range only throw away one bit and consume one more. Repeat as necessary.

解决方案

The common approach to eliminating bias is to throw away numbers that are outside of the desired range. As noted, this is wasteful. It's possible to minimize the waste by starting with a larger number of bits and generating multiple random numbers at the same time; you can achieve a better match between the range of inputs to outputs.

For example take a roll of a die. The output has 6 possibilities. The naive approach would take 3 random bits for each random number produced. The first example demonstrates the pigeon-hole problem.

def pigeon_die(total_bit_count):
    for i in xrange(total_bit_count // 3):
        bits = random.getrandbits(3)
        yield 1 + bits * 6 // 8

1 : 832855
2 : 417835
3 : 416012
4 : 833888
5 : 416189
6 : 416554
total 3333333
max/min 2.00448063998

The second example is the wasteful approach commonly used. You can see that it generates fewer random number from the same number of random bits, but the bias is eliminated.

def wasteful_die(total_bit_count):
    for i in xrange(total_bit_count // 3):
        bits = random.getrandbits(3)
        if bits < 6:
            yield 1 + bits

1 : 417043
2 : 415812
3 : 417835
4 : 416012
5 : 416645
6 : 417243
total 2500590
max/min 1.00486517946

The final example takes 13 bits at a time and generates 5 random numbers from it. This generates even more numbers than the naive approach!

def optimized_die(total_bit_count):
    for i in xrange(total_bit_count // 13):
        bits = random.getrandbits(13)
        if bits < 6**5:
            for j in range(5):
                yield 1 + bits % 6
                bits //= 6

1 : 608776
2 : 608849
3 : 608387
4 : 608119
5 : 607855
6 : 608559
total 3650545
max/min 1.00163525841

The choice of 13 bits was made by taking the logarithm base 6 of powers of 2 and choosing the one that was closest to an integer.

def waste_list(n):
    for bit in range(1, 31):
        potential = math.log(2**bit, n)
        count = int(potential)
        if count > 0:
            waste = potential - count
            yield waste, bit, count

for waste, bit, count in sorted(waste_list(6)):
    print bit, count, waste
    if bit == 3:
        break

13 5 0.029086494049
26 10 0.0581729880981
8 3 0.0948224578763
21 8 0.123908951925
3 1 0.160558421704

As you can see, there are 4 choices better than the simple 3 bits.