给定一个未知的长列表,通过扫描就只有1次返回它的随机项目它的、项目、列表

2023-09-11 02:34:10 作者:腥風持枪者

由于未知的长列表,通过扫描就只有1次返回它的随机项目。

Given an unknown length list, return a random item in it by scanning it only 1 time.

我的想法:

一个类似的算法是水库采样(其他用户发布的)。但是,实在是太复杂,因为它需要运行兰特(),并保持k个节点每次迭代。

A similar algorithm is Reservoir Sampling (posted by others). But, it is too complicated because it needs to run rand() and keep k nodes each iteration.

有没有更好的解决办法? O(n)的时间和O(1)空间?

Is there a better solution? O(n) time and O(1) space?

推荐答案

你为什么对水库取样?碰巧你用K来那样做= 1,有轻微的优化(例如,你不需要选择1出k的,由于k = 1),但它是正确的做法。您可以尝试通过保持在同一时间处理一个固定的窗口,优化,做数学题,找出概率相同,如果你要选择任何在你的窗口,而不是一个你等,以最大限度地减少兰特的项目()调用在昂贵的更复杂的算法,但是你要风回到水库无论如何采样多跌少。

Why are you against reservoir sampling? You happen to be doing it with k = 1. There are minor optimizations (e.g. you don't need to select 1 out of the k, since k = 1) but it's the right approach. You could try to optimize by keeping processing a fixed window at a time, do the math to figure out with equal probability if you should choose any of the items in your window instead of the one you have, etc. to minimize rand() calls at the expensive of a more complicated algorithm, but you're going to wind up back at reservoir sampling more or less anyhow.