予有产生m个不同随机数在该范围的两个方法[0..N-1]
I have two methods of generating m distinct random numbers in the range [0..n-1]
方法1:
//C++-ish pseudocode
int result[m];
for(i = 0; i < m; ++i)
{
int r;
do
{
r = rand()%n;
}while(r is found in result array at indices from 0 to i)
result[i] = r;
}
方法2:
//C++-ish pseudocode
int arr[n];
for(int i = 0; i < n; ++i)
arr[i] = i;
random_shuffle(arr, arr+n);
result = first m elements in arr;
第一种方法是更有效的,当n是大于米大得多,而第二个是更有效的,否则。但更大不是一个严格的概念是什么呢? :)
The first method is more efficient when n is much larger than m, whereas the second is more efficient otherwise. But "much larger" isn't that strict a notion, is it? :)
问:的什么公式n和m我应该用它来确定方法1或方法2是否会更有效呢? (在运行时间数学期望计)
Question: What formula of n and m should I use to determine whether method1 or method2 will be more efficient? (in terms of mathematical expectation of the running time)
纯粹数学:
让我们计算的数量兰特()
在这两种情况下函数调用,并比较结果:
Pure mathematics:
Let's calculate the quantity of rand()
function calls in both cases and compare the results:
案例1:
让我们来看看电话上的步骤的数学期望 I = K
,当你已经选择了K个。概率得到一些有一个兰特()
通话等于 P =(NK)/ N
。我们需要知道这样的呼叫量,导致在获得了一些我们没有还的数学期望。
Case 1:
let's see the mathematical expectation of calls on step i = k
, when you already have k numbers chosen. The probability to get a number with one rand()
call is equal to p = (n-k)/n
. We need to know the mathematical expectation of such calls quantity which leads to obtaining a number we don't have yet.
的概率使用 1
呼叫 P
得到它。使用 2
要求 - Q * P
,其中 Q = 1 - P
。在一般的情况下,概率恰好在 N
要求是(Q ^(N-1))* P 。因此,数学期望为
总和[N * Q ^(N-1)* P],N = 1 - &GT; INF
。这和等于 1 / P> / code>(由Wolfram Alpha的证明)。
The probability to get it using 1
call is p
. Using 2
calls - q * p
, where q = 1 - p
. In general case, the probability to get it exactly after n
calls is (q^(n-1))*p
. Thus, the mathematical expectation is
Sum[ n * q^(n-1) * p ], n = 1 --> INF
. This sum is equal to 1/p
(proved by wolfram alpha).
所以,上步 I = K
将执行 1 / P = N /(NK)
通话在兰特()
的功能。
So, on the step i = k
you will perform 1/p = n/(n-k)
calls of the rand()
function.
现在让我们来整体概括:
Now let's sum it overall:
总和[N /(N - K),K = 0 - &GT;米 - 1 = N * T
- 的兰特
的方法1. 拨打该号码
在这里, T =总和[1 /(N - K),K = 0 - &GT;米 - 1
Sum[ n/(n - k) ], k = 0 --> m - 1 = n * T
- the number of rand
calls in method 1.
Here T = Sum[ 1/(n - k) ], k = 0 --> m - 1
案例2:
下面兰特()
被称为内 random_shuffle
N - 1
(在大多数实现)次。
Here rand()
is called inside random_shuffle
n - 1
times (in most implementations).
现在,选择的方法,我们来比较一下这两个值: N * T' N - 1
。
因此,要选择适当的方法,计算出 T
如上所述。如果 T&LT; (N - 1)/ N
最好是使用第一种方法。否则使用第二种方法。
Now, to choose the method, we have to compare these two values: n * T ? n - 1
.
So, to choose the appropriate method, calculate T
as described above. If T < (n - 1)/n
it's better to use the first method. Use the second method otherwise.