测试一个黑盒子的随机性,输出随机的64位浮点数黑盒子、随机性、测试、浮点数

2023-09-11 05:54:09 作者:独奏夜上弦

我得到这个面试问题,需要写一个函数吧。我失败了。

I got this interview question and need to write a function for it. I failed.

由于它是一个电话采访的问题,我觉得没有什么我应该code真正需要的是完美的随机测试。

Because it is a phone interview question, I don't think what I am supposed to code really need to be perfect random tester.

任何想法?

如何写一些code是在像30分钟,合理的随机性测试仪在接受采访?

How to write some code to be a reasonable randomness tester within like 30 minutes during an interview?

修改

在这个问题上的分布的均匀分布的

The distribution in this question is uniformly distributed

推荐答案

由于这是一个面试问题,我认为面试官正在寻找在两个方面评估:

As this is an interview question, I think the interviewers are looking to assess in two ways:

能够理解问题的要求,真的是。 能够想一些code,将满足这些要求。

这可能是某些设置一个很好的面试问题,尤其是当面试官愿意在必要时提出问题的,并提示考生。

This could be a really good interview question in certain settings, especially if the interviewer were willing to prompt the candidate with questions as and when necessary.

在理解问题的要求方面,它帮助,如果你知道这是一个非常棘手的问题,见证的 在 PJS的回答提到死硬的测试。从根本上说,我认为考生需要展示的两件事情AP preciation:

In terms of understanding the requirements of the question, it helps if you know that this is a really difficult problem, witness the Diehard tests mentioned in pjs's answer. Fundamentally I think a candidate would need to demonstrate appreciation of two things:

(一)的整体的分布的的数字应该匹配所需分布(我假设它是均匀的,这种情况下,但随着@pjs指出,在评论这个假设应该明确)。

(a) The overall distribution of the numbers should match the desired distribution (I'm assuming it is uniform in this case, but as @pjs points out in comments this assumption should be made explicit).

(二)的绘制每个号码应的独立的从previous数得出。

(b) Each number drawn should be independent from the previous numbers drawn.

使用半小时,code的东西在接受电话采访时,你不能走得很远。如果我回答这个问题,我会尝试建议是这样的:

With half an hour to code something up in a phone interview, you can't go very far. If I were answering this question I would try to suggest something like:

(一)的要测试的分布情况,拿出一套用于浮点数同等大小的垃圾箱,并计算落入每个箱的编号。绘制直方图和眼球它(绘图数据始终是一个好主意)。为了扩大这一点,你可以使用一个卡方检验,如 Amit的答案描述。

(a) To test the distribution, come up with a set of equal-sized bins for the floating point numbers, and count the numbers that fall into each bin. Plot a histogram and eyeball it (plotting the data is always a good idea). To extend this, you could use a chi-squared test, as described in amit's answer.

然而,如评价这里所讨论的,和 的

However, as discussed in the comments, and here

用卡的主要问题方检验是间隔的数量和大小的选择。虽然经验可以帮助产生好的结果的规则,也没有灵丹妙药的各种应用程序。

The main problem with chi squared test is the choice of number and size of the intervals. Although rules of thumb can help produce good results, there is no panacea for all kinds of applications.

为此,可以使用柯尔莫哥洛夫 - 斯米尔诺夫测试。本次测试背后的想法是,订购的数据,如果你的曲线应该是一个不错的选择对完美的有序数据(被称为累积分布)。为均匀分布的完美有序数据是直线:您所期望的数据的第10个百分是通过范围的方式的10%,20是通过等的范围内的方式20%。所以,编程,你可以整理数据,绘制其对理想值,你应该得到一条直线。还有一种正规的,量化的统计测试可以应用,这是基于实际和理想值之间的差

To this end, the Kolmogorov-Smirnov test can be used. The idea behind this test is that if you a plot of the ordered data should be a good fit against the perfect ordered data (known as the cumulative distribution). For a uniform distribution the perfect ordered data is a straight line: you expect the 10th percentile of the data to be 10% of the way through the range, the 20th to be 20% of the way through the range and so on. So, programmatically, you could sort the data, plot it against the ideal value and you should get a straight line. There is also a formal, quantitative statistical test you can apply, which is based on the differences between the actual and ideal values.

(B)的为了测试独立性,有多种方法。 自相关在不同的时间滞后是相当明显的:在何种程度上是在时间的值吨的类似的值在时间的 T + 1 的,例如。该运行测试是另一个好一个:你将所有的号码为1或0取决于他们是否落在高于或低于中位数,然后运行的长度的分布可以用于构建一个统计测试。的运行试验也可以用在一个方向上,以测试运行或另一个,如所描述的这里和这里(这可能是更有用你的情况)。两者都具有相当简单的实现,只要你有公式手!

(b) To test independence, there are multiple approaches. Autocorrelation at various time lags is one fairly obvious one: to what extent is the value at time t similar to the value at time t+1, for example. The runs test is another nice one: you convert all the numbers into 1 or 0 depending on whether they fall above or below the median, and then the distribution of the length of runs can be used to construct a statistical test. The runs test can also be used to test for runs in one direction or another, as described here and here (this might be more useful in your case). Both of these have fairly straightforward implementations so long as you have the formulas to hand!

除了死忠测试,其他的好来源讨论的随机数发生器包括这里和这里。

Apart from the diehard tests, other good sources discussing random number generators include here and here.

 
精彩推荐
图片推荐