随机选择

2023-09-11 02:25:48 作者:张望的时光

由于两个整数N和N(N> = N> 0),我该如何生成随机的[0,N)选择(没有重复!)长度= N? 例如。定N = 5,n = 3的可能的解决方案是(3,0,2)或(2,4,1)等。

Given two integer numbers N and n (N >= n > 0), how do I generate random selection (without repetition!) of [0, N) with length = n? E.g. Given N = 5, n = 3 possible solutions are (3,0,2) or (2,4,1), etc.

有一个限制,即$用幼稚的做法的对$ pvents:内存的使用必须是O(N),而不是O(N)

There is a restriction that prevents of using naive approach: memory usage must be O(n), not O(N).

/ * 在天真的做法我的意思是大小= N临时数组多数民众赞成最初填入数字0..N-1的顺序使用。必须N项从这个数组随机选择。 * /

/* Under naive approach I mean usage of temporary array of size=N that's initially filled in with numbers 0..N-1 in order. Required n items are selected randomly from this array. */

推荐答案

经过的所有数m从0到N,决定是否包括米设置为遇到。需要更新包括基于已经处理过的数字下一个数的概率。

Go through all the numbers m from 0 to N, deciding whether to include m in the set as encountered. You need to update the probability of including the next number based on the numbers already treated.

让的申请此想法给出的实例,其中n = 3和N = 5。首先考虑M = 0。存在剩余3个数字,和5的可能性,所以0是在该组的概率是3/5。使用一个随机数发生器来决定包括数或没有。现在考虑M = 1。如果您在设置包含0,那么你有剩余的2号和4种可能性,所以应该纳入的概率2/4,但如果0不包括在内,你有剩余的3号和4种可能性,因此1应包括以概率3/4。这继续直到所需的3个数字被包括在该组

Let's apply this idea to the example given, with n=3 and N=5. First consider m=0. There are 3 numbers remaining, and 5 possibilities, so 0 is in the set with probability 3/5. Use a random number generator to decide to include the number or not. Now consider m=1. If you included 0 in the set, then you have 2 numbers remaining and 4 possibilities, so it should be included with probability 2/4, but if 0 is not included, you have 3 numbers remaining and 4 possibilities and thus 1 should be included with probability 3/4. This continues until the required 3 numbers are included in the set.

下面是用Python实现:

Here's an implementation in Python:

from __future__ import division
import random

def rand_set(n, N):
    nums_included=set()
    for m in range(N):
        prob = (n-len(nums_included)) / (N-m)
        if random.random() < prob:
            nums_included.add(m)
    return nums_included

您可以(并且可能应该)在测试中添加看的时候,你已经在你设置了足够的数量,并打破循环的早期。

You could (and probably should) add in a test to see when you've got enough numbers in your set, and break out of the loop early.

该号码存储在一组,而不同规模从0到n,所以使用的存储是 O(N)。其他的一切用不变的空间,所以它的整体 O(N)

The numbers are stored in a set, which varies in size from 0 to n, so the storage used is O(n). Everything else uses constant space, so it's overall O(n).

编辑其实,你可以去远一点这种方法,因此它需要不断的空间。在Python,才使基于上述发电机:

EDIT Actually, you can go a little further with this approach, so that it takes constant space. In Python, just make a generator based on the above:

def rand_set_iter(n, N):
    num_remaining = n
    m = 0
    while num_remaining > 0:
        prob = num_remaining / (N-m)
        if random.random() < prob:
            num_remaining -= 1
            yield m
        m += 1

在这里,我已经先行一步,并使用了while循环而非for循环。存储结果,你当然需要使用 O(N)的空间。但是,如果你需要做的,通过数字的迭代,发电机版本则是在 O(1)

Here, I've gone ahead and used a while loop instead of the for loop. To store the results, you'll of course need to use O(n) space. But if all you need to do is iterate through the numbers, the generator version does it in O(1).

对于没有生成一种语言,你可以滚你自己的发电机,重复调用的函数和更新静态或全局变量。

For a language without generators, you can roll your own generator, calling a function repeatedly and updating a static or global variable.

相关推荐