Say I have y distinct values and I want to select x of them at random. What's an efficient algorithm for doing this? I could just call rand() x times, but the performance would be poor if x, y were large.
Robert Floyd invented a sampling algorithm for just such situations. It's generally superior to shuffling then grabbing the first x elements, since it needs only x random numbers and a set of x entries rather than touching all y values. As originally written it assumes values from 1..N, but it's trivial to produce 0..N-1, and/or use non-contiguous values by simply treating the values it produces as subscripts into a vector/array/whatever.
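In pseudocode, the algorithm runs like this (stealing from Jon Bentley's Programming Pearls column "A Sample of Brilliance"). Here N is the number of values to choose from (y in the question) and M is the number to select (x).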
    initialize set S to empty
    for J := N-M + 1 to N do
        T := RandInt(1, J)
        if T is not in S then
            insert T in S
        else
            insert J in S
That last bit (inserting J if T is already in S) is the tricky part, but the bottom line is that it assures precisely the correct mathematical probability of inserting J, so it produces correct, unbiased results.
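For anyone who wants something runnable, here is a minimal Python sketch of the pseudocode, not a definitive implementation. The names `floyd_sample` and `sample_values` are made up for illustration, and the shift to 0-based indices is my own adaptation so the results can be used directly as subscripts.

```python
import random


def floyd_sample(n, m, rng=random):
    """Return a set of m distinct integers drawn from 0..n-1 (hypothetical helper)."""
    if not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n")
    chosen = set()
    # J runs over N-M+1 .. N, exactly as in the 1-based pseudocode.
    for j in range(n - m + 1, n + 1):
        t = rng.randint(1, j)  # randint is inclusive on both ends
        # If t was already picked, insert j instead; j cannot be in the
        # set yet, because every earlier pick was at most j - 1.
        chosen.add(j if t in chosen else t)
    # Shift from 1..n down to 0..n-1 so the values work as list indices.
    return {v - 1 for v in chosen}


def sample_values(values, m, rng=random):
    """Pick m elements of a list by treating the sampled numbers as subscripts."""
    return [values[i] for i in floyd_sample(len(values), m, rng)]


if __name__ == "__main__":
    print(sample_values(["a", "b", "c", "d", "e", "f"], 3))
```

The loop calls the random number generator exactly m times and the set never holds more than m entries. Note that the result is an unordered selection; if you also need the picks in random order, shuffle the returned list afterwards.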